Thursday, August 30, 2012

C Gotchas - The Next Generation

I just got burned by this one.  Here's the code:

modulus = a + (TRUE == display_warning_message)?1:0 + num_faults;

Let's say a is 1, display_warning_message is TRUE and num_faults is 16. What is modulus?

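If you said 18, you're wrong: it's 1.

Let's try again.  Same code, but a is 2, display_warning_message is TRUE and num_faults is 12.  What's modulus?

If you said 15, you're wrong - it's 1 again.

But why?

Because parentheses, man, parentheses.  The + operator binds tighter than ?:, so everything to the left of the ? becomes the condition and everything to the right of the : becomes the else branch.  Here's what C sees:

modulus = (a + (TRUE == display_warning_message)) ? 1 : (0 + num_faults);

Since a plus 1 is nonzero in both examples, you get 1 and num_faults never enters the picture.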

Here's what you wanted:

modulus = a + ((TRUE == display_warning_message)?1:0) + num_faults;
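To see the difference for yourself, here is a small sketch that wraps each version of the expression in a function.  The function names are made up for illustration, and it assumes TRUE is a macro equal to 1 (the usual definition, but check your own headers).

```c
/* Assumption: TRUE is 1, as in most embedded headers. */
#ifndef TRUE
#define TRUE 1
#endif

/* The expression exactly as written in the post. */
int modulus_as_written(int a, int display_warning_message, int num_faults)
{
    return a + (TRUE == display_warning_message)?1:0 + num_faults;
}

/* The same expression with the compiler's grouping spelled out:
 * + binds tighter than ?:, so the whole sum is the condition. */
int modulus_as_parsed(int a, int display_warning_message, int num_faults)
{
    return (a + (TRUE == display_warning_message)) ? 1 : (0 + num_faults);
}

/* What was actually wanted: the ternary is just one term of the sum. */
int modulus_as_intended(int a, int display_warning_message, int num_faults)
{
    return a + ((TRUE == display_warning_message) ? 1 : 0) + num_faults;
}
```

Under C's precedence rules, modulus_as_written(1, TRUE, 16) and modulus_as_parsed(1, TRUE, 16) both return 1, while modulus_as_intended(1, TRUE, 16) returns 18.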

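You see, ?: sits near the bottom of C's precedence table, so it soaks up as much of the expression on either side as it can.  The compiler has a similar 'biggest chunk' rule for splitting source text into tokens (often called maximal munch): if there's ambiguity about where a token ends, it grabs as large a chunk of characters as it can.  This is the prototypical example: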

c = a+++b;

How does the C compiler see this?  Is it:

c = a + (++b);

Or:

c = (a++) + b;

The answer is to work left to right and apply the biggest chunk rule.  The 'c =' is not ambiguous, so we don't need to worry about it.  The next ambiguous part is the '+++' after a.  Starting at the left, the biggest valid token we can pick is ++, which leaves a single + behind.  So the compiler reads a ++ + b: post-increment on a, then add b.  That means the second interpretation of the code is correct.
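If you'd rather check than trust me, here is a minimal sketch.  The function name and the out-parameters are invented for this example; they just let you inspect a and b after the expression runs.

```c
/* Evaluates c = a+++b on local copies and reports what happened to a and b.
 * Hypothetical helper, not from the original code. */
int triple_plus(int a, int b, int *a_after, int *b_after)
{
    int c = a+++b;   /* tokenized as a ++ + b, i.e. (a++) + b */

    *a_after = a;    /* a was post-incremented */
    *b_after = b;    /* b is untouched */
    return c;
}
```

Called with a = 1 and b = 2, it returns 3 and leaves a at 2 and b at 2.  Had the compiler read it as a + (++b), you would get 4, with a still 1 and b bumped to 3.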

But you should never have to learn the biggest chunk rule.  Instead, just use parentheses to disambiguate your code.  Parentheses always win in Order of Operations Paper/Rock/Scissors, which makes it a very boring game.