Tuesday, April 08, 2008

C++ evaluation order gotcha

My girlfriend was looking through an introductory C++ programming book the other day, but while she was comparing actual results with expected results, she saw a discrepancy and I was called in to figure out what was going on.

The program is pretty simple:

#include <iostream>

int f(int x, int y)
{
    std::cout << "f()\n";
    return x + y;
}

int main()
{
    std::cout << "Calling f()\n";
    std::cout << "Result is: " << f(3, 4);
    std::cout << "\n";
    return 0;
}
The book indicated that it should print out the following, which indeed it does (using bcc32 6.10, g++ 3.4.4 and cl 9):
Calling f()
f()
Result is: 7

The question my astute girlfriend has is: why is 'f()' printed out before 'Result is: ', when 'Result is: ' appears earlier in the expression? In other words, why doesn't it print out:

Calling f()
Result is: f()
7
... as you might expect, following the logic from top to bottom and left to right?

The answer comes down to how C++'s designers have, in their infinite wisdom, decided to implement I/O. The left shift operator, '<<', is overloaded and so the critical expression gets parsed as a set of function calls, roughly like this:

operator<<(operator<<(std::cout, "Result is :"), f(3, 4));
This is somewhat explained on the MSDN page allegedly describing Order of Evaluation (C++).

So, the most nested things need to get evaluated first, that's pretty unavoidable. Because of the associativity of the '<<' operator, things on the left end of the line get printed first, followed by each successive item.

Here's where the subtlety is, though: that most noble order, C++ definers, have decreed that the order of evaluation of arguments is beneath their dignity; that is to say, order of argument evaluation is undefined.

However, there are some general things we can say about C++ as implemented on x86. It usually defaults to the C calling convention, which pushes arguments on the stack from right to left, in order to support varargs. Since the arguments need to be pushed from right to left, the compiler will generally take a hint and evaluate them from right to left too. That's where our strange out of order result comes from.

Delphi isn't much less guilty on this specific front, however: it doesn't have a standard in the first place, and it also generally evaluates arguments from right to left. That's often more efficient when starved for registers: usually one will want to use EAX, EDX and ECX - the registers available for Delphi in the default calling convention - for purposes of evaluating the excess arguments (if more than three), which will end up pushed on the stack. If the first argument, which will ultimately end up in EAX, got evaluated first, and so on, many valuable registers would be tied up only to calculate values that end up pushed on the stack, no longer needing a register. The alternatives, spilling to the stack or evaluating to a temporary, are wasteful.

However, for the specific instance of I/O - Writeln - Delphi breaks up multiple outputs into separate calls. So, our code, in Delphi:

 Writeln('Result is: ', f(3, 4));
get translated, behind the scenes, into something vaguely like:
Write0LString(Output, 'Result is: ');
WriteLong(Output, f(3, 4));
WriteLn(Output);
... with much less confusion for the beginner.

3 comments:

Xepol said...

My personal favorite C order issue is how you can do something like this:

int f(int* x) { return (*x+=1); }
int main()
{
int i=1;
printf('%d %d',f(&i),i);
}

I have found that most C programmers will not accurately predict the outcome.

Barry Kelly said...

@xepol - I shouldn't expect that they would predict the outcome, as it is undefined. Argument evaluation order is undefined, so the 'i' that gets printed second could be evaluated before or after the call to f(). Both '2 1' and '2 2' are possible outcomes.

Kyle said...

I ran into a similar issue today, with:

int n = 0;
cout << n++ << n << n++ << endl;

Which prints "120". I was expecting "110".