c - How do I determine the evaluation order of printf? - Stack Overflow

I have read many questions on this topic already:

  • What is the order of evaluation of printf(..) parameters?
  • What is the order of evaluation of functions in printf?
  • What is the order of evaluation in printf() for pointers?
  • (and others)

Fairly unanimously, the conclusion is that "this is undefined/unspecified behavior." I do agree with that, and in any sensible program, I will try to avoid situations like these.

However, many universities (including mine) conduct pen-and-paper exams in which we need to evaluate these ambiguous statements. How do I proceed?

Can we at least narrow down on this behaviour for at least one compiler and architecture? In my case:

gcc version 13.3.0 (Ubuntu 13.3.0-6ubuntu2~24.04)
Target: x86_64-linux-gnu

Running on Ubuntu 24.04 LTS (Noble Numbat) on a 12th generation Intel CPU (Alder Lake).


Example 1

#include <stdio.h>

int main() {
    int x;

    x = 20;
    printf("%d, %d\n", x++, x);

    x = 20;
    printf("%d, %d\n", x, x++);
}

Output:

20, 21
21, 20

I expected to get 20, 20 for the first one, but I clearly don't.


Example 2

#include <stdio.h>

int a() {
    printf("a has run!\n");
    return 7;
}

int b() {
    printf("b has run!\n");
    return 13;
}

int main() {
    printf("%d, %d\n", a(), b());
}

Output

b has run!
a has run!
7, 13

Asked Jan 31 at 11:05 by eccentricOrange; edited Feb 3 at 16:47 by Peter Mortensen.
  • If you want a specific order, use temp vars (example 2): int ta=a(); int tb=b(); printf("%d, %d\n",ta,tb); – Wiimm Commented Jan 31 at 12:43
  • 22 You already know the answer to the printf() behaviour, your post leads with it. But what you're really/also asking is "what to do when faced with a silly question in a uni exam?" The answer to that isn't really programming-related, so perhaps academia.stackexchange would fit better. I would say when you realize you know more than the teacher, you have two choices: a) answer according to the course materials, or b) explain the caveats and how the assumptions of the question are off. You may need to pick based on what you know about the teacher and other circumstances ... – ilkkachu Commented Jan 31 at 19:53
  • 1 @ilkkachu Not really. I did know the solution for the specific examples I presented, but I was not able to find (or derive) a rule for evaluating any general example. E.g. something with 4 inputs and 1 pre-decrement operation. I wanted to understand how to evaluate these as a real compiler would (which I've since learnt is impossible). I have a reasonable prof who would accept an answer stating "unspecified/undefined" and he'd never ask something like this. But the question paper is usually designed by someone else, so I wanted clarity on how it works. I'll deal with the academia side of it :) – eccentricOrange Commented Feb 1 at 11:25
  • 5 @eccentricOrange The rule is whatever the prof who set the exam thinks the rule is, since you are being examined on what that prof thinks C is, rather than what it actually is. How you go about finding out clearly what that is, without disrupting the academic environment, is fundamentally a people problem for academia, not a technical question. – user1937198 Commented Feb 1 at 15:53
  • @eccentricOrange The mental model I use for foo(a, b, c, d), where a, b, c, d may be objects, expressions or function calls, is that they could all evaluate simultaneously. If that creates a conflict or a non-uniform result, we risk undefined behavior (UB) and alternate code should be employed. Good luck with Prof. – chux Commented Feb 2 at 17:35

4 Answers

TL;DR:

  • Example 1 contains undefined and unspecified behavior both at once.
  • Example 2 contains unspecified behavior.

We cannot predict the results in either case, even when using the same compiler for the same target. A compiler is free to produce different results for unspecified behavior on a case-by-case basis.


Detailed answer:

The order of evaluation of function arguments is unspecified behavior, meaning that some compilers will evaluate them left to right, others right to left, and in theory in any other order. Unspecified behavior means that the implementation picks one of several permitted behaviors, but it need not document which one, and it is free to choose differently from case to case.

Can we at least narrow down on this behaviour for at least one compiler and architecture?

No, because mainstream compilers evaluate function arguments differently from case to case. The order can depend on the number of arguments, their types, whether optimizations are enabled, whether the function is inlined, and so on. There are numerous different behaviors within the same compiler for the same target.

Additionally, if an expression contains a side effect, such as writing to x, that is unsequenced relative to another side effect on x or to a value computation using x, the behavior is not merely unspecified but undefined. That means a bug: a potential crash, the whole program misbehaving, and so on.

So printf("%d, %d\n", x++, x); is an example where x++ and x are evaluated in an unspecified order and are also unsequenced relative to each other. The x++ has a side effect that is unsequenced relative to the value computation of x, so the code also has undefined behavior.

This is different from a situation like this:

int x (void)
{
  static int i=0;
  return i++;
}

printf("%d, %d\n", x(), x());

This is still unspecified behavior, so it can print either 0, 1 or 1, 0. However, each function call comes with sequence points, which guarantee that the side effect of one call (the i++) is complete before the other call uses the value of i. It is therefore not undefined behavior, so the code won't crash or produce completely unexpected results.

Can we at least narrow down on this behaviour for at least one compiler and architecture?

No.

The reason for not specifying any order for evaluating function call arguments or subexpressions generally is not so that one compiler can choose right to left and another can choose left to right. It is to allow optimization. Consider this code:

printf("%g\n", a*b + c*d);
y = foo(e+4, a*b);
z = foo(c*d, f+4);

When execution arrives at y = foo(e+4, a*b);, the program has already computed a*b. If this particular C implementation had a fixed left-to-right order for arguments, it would have to calculate e+4 before a*b (see footnote 1). But a*b was already calculated, and we do not want to have to calculate it again. So we do not want to require left-to-right order. On the other hand, if it had a fixed right-to-left order, then, in z = foo(c*d, f+4);, we would have to calculate c*d after f+4. But here c*d was already calculated, and we do not want to have to calculate it again. So we do not want to require right-to-left order. We want to allow whichever order is best for optimization.

A primary reason no sequencing is specified for evaluation of subexpressions is to allow compilers to reorder evaluation freely for optimization. A single particular C implementation will evaluate subexpressions in different orders according to each situation. If the value of a variable happens to be left in a register from evaluation of a prior subexpression, the compiler may prefer to reuse that value to evaluate some other subexpression using it before it starts evaluating an unrelated subexpression.

However, many universities (including mine) conduct pen-and-paper exams in which we need to evaluate these ambiguous statements. How do I proceed?

If an exam question requires some sequencing on expression evaluation that is not specified by the C standard and the exam does not add any premises that would resolve it, then it is a faulty question, and you should tell the people responsible for the exam that it is faulty.

printf("%d, %d\n", x++, x);

The behavior of this code is not defined by the C standard because it violates this rule in clause 6.5.1 of the C 2024 standard (and similarly in previous versions):

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

Specifically, x++ has the side effect of modifying x, and the other x performs a “value computation” on x (it gets the value of x). This side effect and this value computation are unsequenced (no rule in the C standard specifies a sequencing for them), so the behavior is not defined.

printf("%d, %d\n", a(), b());

Evaluation of the arguments of this printf includes evaluation of a() and b(). Since a() and b() are function calls, there is a sequence point before each call (per C 2024 6.5.3.3). A sequence point creates some sequencing that separates evaluations into things before the sequence point and things after the sequence point. However, there is no specification of what eligible things go before the call to a and what eligible things go after the call to a, and similarly for b.

The consequence of this is that a() will be evaluated either entirely before b(), because a() is evaluated before the sequence point in calling b, or entirely after b(), because b() is evaluated before the sequence point in calling a, but the C standard does not specify which. Possibly a C implementation could specify which order it uses in this case, but there is little benefit to that.

To draw out this case further, suppose a and b are:

int x;

int a(void) { return x++; }
int b(void) { return x; }

Then printf("%d, %d\n", a(), b()); contains in its entirety, like printf("%d, %d\n", x++, x);, both a side effect on x and a value computation of x. However, in printf("%d, %d\n", a(), b());, these are sequenced: Either x++ is before x or x is before x++, although we do not know which. This is fundamentally different. This case is:

  • x++ occurs either entirely before or entirely after x.

The other case is:

  • No sequencing at all is specified for x++ and x.

You might ask how can there be no sequencing for x++ and x? Evaluating x is just a load of the value of x; it has to occur before or after the modification of x. However, consider a C implementation that supports 32-bit int but runs on hardware that has only 16-bit operations. C is an old language, and there were implementations like this, and the C standard was written to be flexible. In this C implementation, x++ consists of multiple operations which can occur in various orders:

  1. Load the low two bytes of x.
  2. Add 1 to the low two bytes of x.
  3. Load the high two bytes of x.
  4. Add the carry from the addition to the high two bytes.
  5. Store the result of adding 1 to the low two bytes.
  6. Store the result of adding the carry.

Those can occur in order 123456 or 213456 or 213465 or 132456 or 132465 or 312456 or 312465 or 125346, and so on. Consider the order 125346. After 125, the low two bytes of x have been changed in memory, but the high two bytes have not. If the separate x expression is evaluated at this point, it will get two bytes from the pre-update x and two bytes from the post-update x. For example, if x is 0x0000FFFF (65,535), x++ updates it to 0x00010000 (65,536). But if x is evaluated after steps 125 and before 346, it will get the post-update 0x0000 from the low two bytes and the pre-update 0x0000 from the high two bytes, making 0x00000000.

So printf("%d, %d\n", x++, x); can print “65535, 0”.

Another possibility is the order 123465. Then if the separate evaluation of x is after 12346 and before 5, it gets the post-update 0x0001 from the high two bytes and the pre-update 0xFFFF from the low two bytes, making 0x0001FFFF. Then printf("%d, %d\n", x++, x); can print “65535, 131071”.

This possibility of mixing parts of two evaluations is why the C standard leaves the behavior in these situations undefined rather than specifying that the evaluations are indeterminately sequenced.

Note that for simple expressions, like x++ and x, a compiler could note that the same object is being used, and perhaps the C standard could have written a rule that these are indeterminately sequenced, rather than being undefined. However, in more complicated code, we may have pointers p and q, and the compiler may not know whether (*p)++ and *q will refer to the same object or not. So the rule here is that the compiler may freely reorder whatever operations it uses to implement (*p)++ and *q, even if they consist of operations on sub-parts of objects as described above, and it is the programmer’s responsibility to ensure that (*p)++ and *q either do not refer to the same object or are sequenced relative to each other.

Footnote

1 Since these arguments have no observable behavior by themselves, the rules about observable behavior would allow reordering them for optimization anyway, regardless of rules about sequencing. However, this example is just for illustration. We could replace this with a more complicated example that did involve observable behavior.

Fairly unanimously, the conclusion is that "this is undefined behavior." I do agree with that, and in any sensible program, I will try to avoid situations like these.

Unanimous decision indeed!

However, many universities (including mine) conduct pen-and-paper exams in which we need to evaluate these ambiguous statements. How do I proceed?

If by pen-and-paper you mean writing the answer, just produce the proper explanations.

If you must select an answer from a list of choices, I hope one of the options for Example 1 is undefined behavior. For Example 2 you should select unspecified behavior, i.e., either b has run! a has run! 7, 13 or a has run! b has run! 7, 13; these are the only possibilities.

Otherwise none of the answers is correct: unless you know what they teach and are willing to select an invalid but expected answer, do not check any, and prepare to challenge the validity of the test (teaming up with other students).

Looking at how a compiler for an ARM platform should process the two function calls below should make it clear why compilers are given flexibility:

void foo(int,int);
int bar(void);
extern int x;

...
  foo(x,bar());
  foo(bar(),x);

The ARM calling convention documents specify that on entry to any function whose arguments and return type are all 32 bits or smaller, registers R0 to R3 must be loaded with the arguments; any that don't have any assigned meaning will be ignored. On return, R0 will hold the return value (if any), and R1-R3 are unspecified.

The most efficient way of processing both of the above calls to foo would be to generate code that calls bar, then loads x, and then calls foo. Even though one of the calls has the x argument listed before the bar() argument, and the other has it listed after, calling bar() before loading x will be more efficient than waiting until after x is loaded to call bar. If it were necessary to load x first, a compiler would need to generate code that saves the value loaded from x somewhere before it calls bar, and then load that value back after the call.

Note that the authors of the Standard weren't trying to invite compilers to behave in a gratuitously nonsensical fashion in cases where the sequence of operations might be observable. Their decision to waive jurisdiction over such issues was more likely motivated by a lack of anything they really wanted to say about it.
