c++ - Signed int overflow-underflow cause undefined behaviour but how does the compiler anticipate this? - Stack Overflow

Signed int arithmetic operations can overflow and underflow, and when that happens, that is undefined behavior as per the C++ standard (and the C standard). At that point the program can be expected to do potentially anything.

I've noticed many questions on SO where undefined behavior occurs and the program behaves in an unexpected but deterministic manner, e.g.:

C output changed by adding a printf

In the context of signed int overflow, how does the compiler generate code such that when an overflow occurs it intentionally behaves weirdly? Is there a cmp injected to determine if some flag is set post an iadd or something?

Surely the compiler isn't intentionally generating code that checks for overflow and then intentionally executes another code path?

I'm probably missing something here, but any explanation about this would be great to have.

asked Nov 19, 2024 at 1:50 by Always Becurious
  • 2 IMHO, I would expect that the compiler doesn't emit any code that performs overflow checking, unless it's super obvious. For example, adding 1 to a number may overflow, but most of the time it won't. Some processors may have status flags that indicate that a register operation, like addition, overflowed. But trying to access this status register and bit is difficult in C++. – Thomas Matthews Commented Nov 19, 2024 at 2:02
  • 2 I recommend you search your compiler's documentation for overflow detection. Some may have pragmas or switches to detect overflow. The question is, should the compiler test every arithmetic operation or only a few select operations? This detection will slow down the compilation and the executable. – Thomas Matthews Commented Nov 19, 2024 at 2:04
  • 3 “Undefined behavior” means only that the C++ standard doesn’t tell you what the program should do. It doesn’t mean that bad things will happen, and it doesn’t mean that random things will happen. – Pete Becker Commented Nov 19, 2024 at 2:49
  • 1 -(a - b) == (b - a); without under/overflow this can be assumed to be true. The compiler does not insert checks, but it can assume the value is true because otherwise the value would be undefined. – 463035818_is_not_an_ai Commented Nov 19, 2024 at 11:39
  • 2 @AlwaysBecurious or consider if (condition) 42 / 0; the compiler is free to emit code that assumes condition is false because that's the only way to execute the code such that it has defined behavior – 463035818_is_not_an_ai Commented Nov 23, 2024 at 16:24
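
As the second comment suggests, compilers do document overflow-detection facilities: gcc and clang, for example, provide __builtin_add_overflow, which reports the overflow explicitly instead of leaving it undefined. A minimal sketch (a compiler extension, not standard C++):

#include <climits>
#include <cstdio>

int main() {
    int a = INT_MAX, b = 1, sum;
    // gcc/clang builtin: computes a + b as if with infinite precision,
    // stores the (possibly wrapped) result in sum, and returns true
    // if the mathematical result did not fit in an int.
    if (__builtin_add_overflow(a, b, &sum))
        std::puts("overflow detected");
    else
        std::printf("sum = %d\n", sum);
}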

2 Answers


In the example you link, the compiler is optimizing by assuming undefined behavior can't happen. In that question, overflow is reported only if adding two negative numbers produced a positive, or adding two positives produced a negative. Since neither outcome can occur without signed integer overflow having already happened (which is undefined behavior), the compiler simply assumes the if test can never pass, and eliminates that code path entirely from the compiled output, leaving only the code for the else path.
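
The check in that question has roughly this shape (a sketch of the pattern only; add_and_report is a hypothetical name, not the code from the linked question):

#include <cstdio>

void add_and_report(int a, int b) {
    int sum = a + b;  // if this overflows, behavior is already undefined
    // This sign test can only be true if the addition overflowed, i.e.
    // only after undefined behavior has occurred, so the optimizer is
    // free to assume it is false and drop the first branch entirely.
    if ((a > 0 && b > 0 && sum < 0) || (a < 0 && b < 0 && sum > 0))
        std::puts("overflow!");
    else
        std::printf("%d\n", sum);
}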

So it's not doing any additional work at all at runtime to achieve this result. It's identifying a way to improve performance, the same as it would if it saw:

if (0) {
   x();
} else {
   y();
}

and eliminated the if (0) path, compiling code that unconditionally called y();. It's not required to produce code that responds correctly in the presence of undefined behavior, so it just pretends it can't happen and optimizes accordingly. No code checks for the undefined behavior, no extra work is done, it just produces simple code that behaves incorrectly because the undefined behavior it assumed couldn't happen, actually did.
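
A classic self-contained instance of the same idea (the folding described is typical of gcc and clang at -O2, not something the standard mandates):

// x + 1 < x can only hold if x + 1 overflowed; since signed overflow is
// undefined, optimizing compilers typically fold this to `return false`.
bool increment_overflows(int x) {
    return x + 1 < x;
}

Compiling with gcc's -fwrapv, which defines signed overflow as two's-complement wraparound, restores the naive behavior: the function then returns true for INT_MAX.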

The authors of clang and gcc interpret the phrase "the Standard imposes no requirements" as being synonymous with "nobody has any right to care what an implementation does". Given code like:

unsigned mul_mod_65536(unsigned short x, unsigned short y)
{
    return (x*y) & 0xFFFFu;  // x and y promote to int; the multiply
                             // overflows (UB) once the product exceeds INT_MAX
}
unsigned char arr[32775];
void test(unsigned short n)
{
    unsigned result = 0;
    for (unsigned short i=32768; i<n; i++)
        result = mul_mod_65536(i, 65535);  // overflows for every i >= 32769
    if (n < 32770)
        arr[n] = result;  // arr is unsigned char, so only the low byte is stored
}

the gcc optimizer will examine the loop and classify its behavior based upon the value of n:

  • If n is less than 32769, the loop won't execute at all, and result will be zero.

  • If n is exactly 32769, the loop will execute exactly once, with i == 32768; the product 32768 * 65535 = 0x7FFF8000 just barely avoids overflow, and result becomes 0x7FFF8000u & 0xFFFFu, i.e. 0x8000u.

  • In all other cases, "the Standard imposes no requirements", i.e. "nobody has any right to care what the implementation does".

Then, when evaluating the if statement, the gcc optimizer will recognize that in every case where the Standard imposes any requirements, the code should store the least significant byte of result to arr[n], and that byte happens to be zero in all such cases (result is either 0u or 0x8000u). Thus, the test function will be transformed into an unconditional arr[n] = 0; even for values of n of 32775 and above, where that store lands outside the bounds of arr.
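
In other words, the compiled result behaves as if the source had been (a sketch of the effect of the transformation, not literal gcc output):

void test(unsigned short n)
{
    // Every case the Standard constrains stores a zero byte, so the loop
    // and the comparison are dropped and the store happens unconditionally,
    // including for n >= 32775, where the write lands outside arr.
    arr[n] = 0;
}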

Note that such behavior is inconsistent with the pre-existing language the C89 Standard was chartered to describe. According to the published Rationale, the authors of the Standard expected that implementations where e.g. (ushort1 * ushort2) & 0xFFFFu; wouldn't be treated as equivalent to ((unsigned)ushort1 * ushort2) & 0xFFFFu; would become vanishingly rare. The reason the Standard didn't mandate such treatment is that the authors never imagined that compiler writers would interpret the lack of a mandate as an invitation to gratuitously deviate from almost universally expected practices.
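
For reference, the unsigned-arithmetic form mentioned above, written out as a full function (mul_mod_65536_fixed is a hypothetical name): casting one operand to unsigned keeps the multiplication in unsigned arithmetic, which wraps modulo UINT_MAX + 1 by definition rather than overflowing:

unsigned mul_mod_65536_fixed(unsigned short x, unsigned short y)
{
    // (unsigned)x * y is computed entirely in unsigned int, so large
    // products wrap with fully defined behavior instead of invoking UB.
    return ((unsigned)x * y) & 0xFFFFu;
}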
