Discussion:
compilers that warn on broken loops?
mathog
2017-07-12 17:47:33 UTC
Every once in a while I accidentally write a loop with no exit, often
because an edit removed an increment. Here is an example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void){
int i;
char string[100];
strcpy(string,"test");
char *ptr = string;
for(i=0; i<4; ptr++){ // note missing i++, loop is broken
(*ptr) += 2;
}
fprintf(stdout,"Result: %s\n",string);
exit(EXIT_SUCCESS);
}


Compiled like:

gcc -Wall --std=c99 -pedantic -o test test.c

there are no warnings whatsoever (gcc 5.3.1).

Even Clang's static analysis does not complain:

scan-build gcc -g -O0 -Wall -std=c99 -pedantic -o test test.c

Why no warnings on these broken loops? Would it be that hard for the
compilers to test for a loop with no exit path?

Oddly the first compile (above) results in a program which actually runs
without segfaulting on my test machine, although the output from the 6th
position on is variable. Changing the optimization level changes the
behavior.

./test
Result: vguv+>3B]BJ>�

Anyway, it would be nice if compilers would warn on these, because
sometimes they "sort of work", as in, not crashing anywhere near the
problem loop, which can make them tedious to find (for instance, by
having to run the program under valgrind).

Regards,

David Mathog
h***@gmail.com
2017-07-12 18:11:32 UTC
Post by mathog
Every once in a while I accidentally write a loop with no exit, often
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void){
int i;
char string[100];
strcpy(string,"test");
char *ptr = string;
for(i=0; i<4; ptr++){ // note missing i++, loop is broken
(*ptr) += 2;
}
fprintf(stdout,"Result: %s\n",string);
exit(EXIT_SUCCESS);
}
(snip)

Thanks to Alan Turing:

https://en.wikipedia.org/wiki/Halting_problem

you might be able to get the easy cases, but not all of them.

In this case, you might consider aliasing, and that ptr could
point to i, or at least notice that it isn't easy for the compiler
to figure out that it can't.
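
To make the "not all of them" part concrete, here is a small sketch of a loop whose exit nobody can prove in general. The function name collatz_steps is made up for this illustration. Every variable in the condition is modified in the body, so a simple "loop variable never changes" check has nothing to flag, yet whether the loop ends for every starting value is the open Collatz problem.

```c
/* Hypothetical illustration: counts the Collatz steps needed to reach 1.
 * The caller must pass n >= 1 (n == 0 would loop forever on 0 -> 0).
 * Unsigned arithmetic wraps on overflow for very large n; that is ignored
 * here because the point is only the shape of the loop. */
unsigned long collatz_steps(unsigned long n)
{
    unsigned long steps = 0;

    while (n != 1) {                 /* no known proof this always exits */
        if (n % 2 == 0) {
            n = n / 2;               /* even: halve */
        } else {
            n = 3 * n + 1;           /* odd: triple and add one */
        }
        ++steps;
    }
    return steps;
}
```

Calling collatz_steps(6) walks 6, 3, 10, 5, 16, 8, 4, 2, 1 and returns 8. A compiler could still warn about the easy case in the original post without solving this one.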
Ben Bacarisse
2017-07-12 19:26:55 UTC
Post by h***@gmail.com
Post by mathog
Every once in a while I accidentally write a loop with no exit, often
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void){
int i;
char string[100];
strcpy(string,"test");
char *ptr = string;
for(i=0; i<4; ptr++){ // note missing i++, loop is broken
(*ptr) += 2;
}
fprintf(stdout,"Result: %s\n",string);
exit(EXIT_SUCCESS);
}
(snip)
https://en.wikipedia.org/wiki/Halting_problem
you might be able to get the easy cases, but not all of them.
In this case, you might consider aliasing, and that ptr could
point to i, or at least notice that it isn't easy for the compiler
to figure out that it can't.
As it happens, the compiler is allowed to assume that *ptr and i don't
alias. It will probably do so for optimisation (most likely i
won't be reloaded from memory every time i < 4 is evaluated), and so it
could do so for the purposes of issuing a warning.
--
Ben.
David Thompson
2017-08-21 18:47:01 UTC
On Wed, 12 Jul 2017 20:26:55 +0100, Ben Bacarisse
Post by Ben Bacarisse
Post by h***@gmail.com
Post by mathog
Every once in a while I accidentally write a loop with no exit, often
<snip>
Post by Ben Bacarisse
Post by h***@gmail.com
Post by mathog
int i;
char string[100];
strcpy(string,"test");
char *ptr = string;
for(i=0; i<4; ptr++){ // note missing i++, loop is broken
(*ptr) += 2;
}
<snip>
Post by Ben Bacarisse
Post by h***@gmail.com
In this case, you might consider aliasing, and that ptr could
point to i, or at least notice that it isn't easy for the compiler
to figure out that it can't.
As it happens, the compiler is allowed to assume that *ptr and i don't
alias. It will probably do so for optimisation (most likely i
won't be reloaded from memory every time i < 4 is evaluated), and so it
could do so for the purposes of issuing a warning.
Well, in this case the compiler can easily prove the value of ptr by
dataflow, but if you're saying it can assume ptr mustn't point to i
_because_ of the different type, that's wrong. Although _most_
wrong-type accesses are UB, it is permitted to access (the bytes of)
an object of any type (if addressable at all i.e. not register) using
a pointer to plain/signed/unsigned char; 6.5p7. Although if you
thereby create a trap representation (only if the type _has_ trap
representation(s) on this implementation) _and_ subsequently read the
trap rep (as its true type) _then_ it's UB.
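
A minimal sketch of why the type difference alone proves nothing, assuming an int without padding bits (the function name loop_ends_via_alias is invented for this illustration). The loop condition tests i, and i is never assigned by name inside the loop, yet the loop terminates because the bytes of i are written through an unsigned char pointer, which 6.5p7 permits.

```c
#include <stddef.h>

/* Hypothetical illustration: returns the number of iterations the loop ran.
 * Each pass copies one byte of `four` into `i` through char pointers.
 * Until the byte holding the value 4 is copied, i still reads as 0, so the
 * loop ends after 1 pass on a little-endian machine and after sizeof(int)
 * passes on a big-endian one. */
int loop_ends_via_alias(void)
{
    int i = 0;
    const int four = 4;
    unsigned char *dst = (unsigned char *)&i;
    const unsigned char *src = (const unsigned char *)&four;
    size_t k = 0;
    int iterations = 0;

    while (i < 4) {          /* i is only ever changed through dst */
        dst[k] = src[k];     /* legal byte-wise access to an int object */
        ++k;
        ++iterations;
    }
    return iterations;
}
```

So a warning based purely on "i does not appear on the left of an assignment" would be a false positive here. In the original example the compiler can rule this out only because ptr is known to point into string.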
s***@casperkitty.com
2017-08-21 19:46:17 UTC
Post by David Thompson
Post by mathog
int i;
char string[100];
strcpy(string,"test");
char *ptr = string;
for(i=0; i<4; ptr++){ // note missing i++, loop is broken
(*ptr) += 2;
}
Well, in this case the compiler can easily prove the value of ptr by
dataflow, but if you're saying it can assume ptr mustn't point to i
_because_ of the different type, that's wrong. Although _most_
wrong-type accesses are UB, it is permitted to access (the bytes of)
an object of any type (if addressable at all i.e. not register) using
a pointer to plain/signed/unsigned char; 6.5p7. Although if you
thereby create a trap representation (only if the type _has_ trap
representation(s) on this implementation) _and_ subsequently read the
trap rep (as its true type) _then_ it's UB.
No need for fancy data-flow analysis. Nothing in the above code takes
the address of "i", and nothing could be reachable from anywhere that the
address of "i" could be taken and still be valid.
Ben Bacarisse
2017-08-21 20:18:59 UTC
Post by David Thompson
On Wed, 12 Jul 2017 20:26:55 +0100, Ben Bacarisse
Post by Ben Bacarisse
Post by h***@gmail.com
Post by mathog
Every once in a while I accidentally write a loop with no exit, often
<snip>
Post by Ben Bacarisse
Post by h***@gmail.com
Post by mathog
int i;
char string[100];
strcpy(string,"test");
char *ptr = string;
for(i=0; i<4; ptr++){ // note missing i++, loop is broken
(*ptr) += 2;
}
<snip>
Post by Ben Bacarisse
Post by h***@gmail.com
In this case, you might consider aliasing, and that ptr could
point to i, or at least notice that it isn't easy for the compiler
to figure out that it can't.
As it happens, the compiler is allowed to assume that *ptr and i don't
alias. It will probably do so for optimisation (most likely i
won't be reloaded from memory every time i < 4 is evaluated), and so it
could do so for the purposes of issuing a warning.
Well, in this case the compiler can easily prove the value of ptr by
dataflow, but if you're saying it can assume ptr mustn't point to i
_because_ of the different type, that's wrong. Although _most_
wrong-type accesses are UB, it is permitted to access (the bytes of)
an object of any type (if addressable at all i.e. not register) using
a pointer to plain/signed/unsigned char; 6.5p7.
Yup, good spot. Thanks. I wasn't paying attention to the actual types in
the example!

<snip>
--
Ben.
bartc
2017-07-12 18:44:27 UTC
Post by mathog
Every once in a while I accidentally write a loop with no exit, often
for(i=0; i<4; ptr++){ // note missing i++, loop is broken
(*ptr) += 2;
}
fprintf(stdout,"Result: %s\n",string);
exit(EXIT_SUCCESS);
}
gcc -Wall --std=c99 -pedantic -o test test.c
there are no warnings whatsoever (gcc 5.3.1).
scan-build gcc -g -O0 -Wall -std=c99 -pedantic -o test test.c
Why no warnings on these broken loops? Would it be that hard for the
compilers to test for a loop with no exit path?
Oddly the first compile (above) results in a program which actually runs
without segfaulting on my test machine, although the output from the 6th
position on is variable. Changing the optimization level changes the
behavior.
./test
Result: vguv+>3B]BJ>�
Anyway, it would be nice if compilers would warn on these, because
sometimes they "sort of work", as in, not crashing anywhere near the
problem loop, which can make them tedious to find. For instance, having
to run in valgrind.
As far as the language is concerned, there are no official loop
variables, and endless loops are allowed. So it is hard to divine your
intention. And as has been pointed out, in the general case it is hard
to determine if the loop will exit.

At one time I used macros like this to write my loops:

#define FORN(i,n) for (i=0; i<(n); ++i)
#define FORN(i,n) for (int i=0; i<(n); ++i) // alternate (C99); use one definition or the other
#define FORAB(i,a,b) for (i=(a); i<=(b); ++i)

and used them like this:

FORN(i,10) printf("%d\n",i);

Then they are more immune to such errors. But anything more unusual, such
as including ptr++ within the loop header, is harder to do.
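
For what it is worth, the ptr++ case can be squeezed into the same style with the comma operator. This is only a sketch: FORN_PTR and add_two are names invented here, not macros I actually used.

```c
/* Hypothetical macro: steps a counter i from 0 to n-1 and walks the
 * pointer p from start in lockstep, so neither increment can be lost
 * in a later edit of the loop header. */
#define FORN_PTR(i, n, p, start) \
    for ((i) = 0, (p) = (start); (i) < (n); ++(i), ++(p))

/* Adds 2 to each of the first n chars of s. */
void add_two(char *s, int n)
{
    int i;
    char *ptr;

    FORN_PTR(i, n, ptr, s) {
        *ptr += 2;
    }
}
```

Applied to a buffer holding "test" with n equal to 4, add_two leaves "vguv" in the buffer. The cost is one more macro for every loop shape, which is why this does not scale very far.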
--
bartc
mathog
2017-07-12 20:19:24 UTC
Post by bartc
As far as the language is concerned, there are no official loop
variables,
Official, no; implicit, yes. In a for loop there is a specific section
which is tested at each iteration. If that consists of a test like

i < count

then it implicitly defines i and count as "loop variables". If neither i
nor count are modified within the loop, and there are no other exit
mechanisms, and these are local variables, then it is almost certainly
an error. Why would the loop test be present if there was no way for it
to ever be false?
Post by bartc
and endless loops are allowed.
Here we need to distinguish between this, which is fine

while (1){
// something in here can break out
}

and this, which isn't so wonderful

while (1){
//nothing here can break out
}

In the second case I would want a warning. There may be a handful of
cases where a program is literally supposed to run forever in a single
loop, with no possible way to stop short of pulling the plug, but that
isn't all that common.

Regards,

David Mathog
David Brown
2017-07-13 08:53:16 UTC
Post by mathog
Post by bartc
As far as the language is concerned, there are no official loop
variables,
Official, no, implicit yes. In a for loop there is a specific section
which is tested at each iteration, so if that consists of a test like
i < count
which implicitly defines i and count as "loop variables". If neither i
nor count are modified within the loop, and there are no other exit
mechanisms, and these are local variables, then it is almost certainly
an error. Why would the loop test be present if there was no way for it
to ever be false?
Post by bartc
and endless loops are allowed.
Here we need to distinguish between this, which is fine
while (1){
// something in here can break out
}
and this, which isn't so wonderful
while (1){
//nothing here can break out
}
In the second case I would want a warning. There may be a handful of
cases where a program is literally supposed to run forever in a single
loop, with no possible way to stop short of pulling the plug, but that
isn't all that common.
It is /very/ common to have such loops. Pretty much every embedded
system (of which there are bazillions, mostly programmed in C) runs such
a loop. On other systems, loops like that may also be used (with
"pulling the plug" being "killed by the OS").
Keith Thompson
2017-07-12 21:03:47 UTC
Post by mathog
Every once in a while I accidentally write a loop with no exit, often
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void){
int i;
char string[100];
strcpy(string,"test");
char *ptr = string;
for(i=0; i<4; ptr++){ // note missing i++, loop is broken
(*ptr) += 2;
}
fprintf(stdout,"Result: %s\n",string);
exit(EXIT_SUCCESS);
}
gcc -Wall --std=c99 -pedantic -o test test.c
there are no warnings whatsoever (gcc 5.3.1).
scan-build gcc -g -O0 -Wall -std=c99 -pedantic -o test test.c
Why no warnings on these broken loops? Would it be that hard for the
compilers to test for a loop with no exit path?
Perhaps I'm using a more recent version of clang.

$ clang --version
clang version 3.8.1-12ubuntu1 (tags/RELEASE_381/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang -std=c11 -Wall -c c.c
c.c:10:13: warning: variable 'i' used in loop condition not modified in loop body [-Wfor-loop-analysis]
for(i=0; i<4; ptr++){ // note missing i++, loop is broken
^
1 warning generated.
$

More generally, a clever compiler could notice that i is set to 0 and
never modified, and warn that `i<4` is always true. But no compiler is
as clever as we want it to be.
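
For reference, a sketch of the loop with the missing increment restored, wrapped in a function so it can be tried on its own (the name shifted_test and the out parameter are made up for this illustration). With i++ back in the header, the variable in the condition is modified on every iteration and the loop runs exactly four times.

```c
#include <string.h>

/* Hypothetical wrapper around the loop from the original post.
 * out must have room for at least 5 chars ("test" plus the terminator). */
void shifted_test(char *out)
{
    int i;
    char *ptr = out;

    strcpy(out, "test");
    for (i = 0; i < 4; i++, ptr++) {   /* i++ restored */
        (*ptr) += 2;
    }
}
```

After the call, out holds "vguv", which matches the first four characters of the garbage output shown in the original post.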
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
mathog
2017-07-12 22:28:40 UTC
Post by Keith Thompson
Perhaps I'm using a more recent version of clang.
$ clang --version
clang version 3.8.1-12ubuntu1 (tags/RELEASE_381/final)
Yes, it is more recent. The one in CentOS 6.9 on the test system
here is:

clang version 3.4.2 (tags/RELEASE_34/dot2-final)
Target: x86_64-redhat-linux-gnu
Thread model: posix

Anyway, it is nice to see that at least one tool has this sort of warning.

Regards,

David Mathog