收录日期:2019/08/18 17:23:34 时间:2011-02-16 23:07:16 标签:c++,optimization,compiler-construction

The code I have looks like this (all uses of done shown):

bool done = false;
for(int i = 0; i < big; i++)
{
  ...
  for(int j = 0; j < wow; j++)
  {
    ...
    if(foo(i,j))
    {
       done = true;
       break;
    }
    ...
  }
  if(done) break;
  ...
}

will any compilers convert it to this:

for(int i = 0; i < big; i++)
{
  ...
  for(int j = 0; j < wow; j++)
  {
    ...
    if(foo(i,j))
      goto __done; // same as a labeled break if we had it
    ...
  }
  ...
}
__done:;

Note: While I'm mostly interested in if the if(done)break; gets bypassed and removed as dead code, I'm also interested in if it and done gets removed altogether.

Obviously this depends on the compiler. The best thing to do when you're unsure is to view the compiler's assembly output (all popular compilers have a switch for this). Even if you aren't familiar with assembly, you can at least compare the debug version with the optimized version.

That being said, this is one of the few situations where goto is NOT a bad idea. Feel free to use it to break out of inner loops.

Edit

Just tried the following in VS2010 and it does indeed optimize the outer conditional:

bool done = false;
for(int i = 0; i < 10; i++)
{
    for(int j = 0; j < 10; j++)
    {
        if(i == 7 && j == 3)
        {
            done = true;
            break;
        }
    }
    if(done) break;
}
return 0;

GNU compiler does just that, starting with optimization level -O1 (I am using gcc 4.5.1 on x86_64)

call    _Z3fooii  // al = foo(i,j)
testb   %al, %al
jne .L14
...

where .L14 is the label placed exactly where you placed __done:

A better question might be: which modern compiler does not perform this optimization?

I'm not trying to be snarky, but...does it matter? In general, I think it's best to let compilers to their job, and that job is to produce the "best" (note that "best" may vary depending on your needs) compiled code given your source code. Any performance considerations in your code should be identified with a profiler and good working knowledge of algorithmic complexity.

If you're just curious, then disregard this comment. But if your intention is to somehow optimize your code, then I think there are much better avenues.

I've tried GCC 4.2.1 with the following:

// Prevent optimizing out calls to foo and loop unrolling:
extern int big, wow;
bool foo(int,int);

void
bar()
{
    int done = false;
    for(int i = 0; i < big; i++)
    {
        for(int j = 0; j < wow; j++)
        {
            if(foo(i,j))
            {
                done = true;
                break;
            }
        }
        if(done)
            break;
    }
}

...and it falls through straight to postamble with -O3:

  33:   e8 fc ff ff ff          call   34 <bar()+0x34> ; call to foo*
  38:   84 c0                   test   %al,%al
  3a:   74 e5                   je     21 <bar()+0x21> ; next loop iteration
  3c:   83 c4 10                add    $0x10,%esp
  3f:   5b                      pop    %ebx
  40:   5e                      pop    %esi
  41:   5d                      pop    %ebp
  42:   c3                      ret

* This is from an unlinked object file, call 34 is actually call to foo.