Discussion:
Infinite loops have UB
Add Reply
Noob
2017-07-07 20:21:01 UTC
Reply
Permalink
Raw Message
Hello,

Apparently, C11 allows compilers to remove infinite loops
by considering them UB? This is new in C11?
An iteration statement whose controlling expression is not a constant
expression,156) that performs no input/output operations, does not
access volatile objects, and performs no synchronization or atomic
operations in its body, controlling expression, or (in the case of a
for statement) its expression-3, may be assumed by the implementation
to terminate.157)>
157) This is intended to allow compiler transformations such as
removal of empty loops even when termination cannot be proven.
N1528: Why undefined behavior for infinite loops?
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1528.htm

Are compilers allowed to eliminate infinite loops?
https://stackoverflow.com/questions/2178115/are-compilers-allowed-to-eliminate-infinite-loops

C Compilers Disprove Fermat's Last Theorem
https://blog.regehr.org/archives/140

Regards.
James R. Kuyper
2017-07-07 20:59:57 UTC
Reply
Permalink
Raw Message
Post by Noob
Hello,
Apparently, C11 allows compilers to remove infinite loops
by considering them UB? ...
4p2 describes three ways the standard uses to identify undefined behavior:
"If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a
constraint or runtimeconstraint is violated, the behavior is undefined.
Undefined behavior is otherwise indicated in this International Standard
by the words ‘‘undefined behavior’’ or by the omission of any explicit
definition of behavior. There is no difference in emphasis among these
three; they all describe ‘‘behavior that is undefined’’."
Post by Noob
... This is new in C11?
True.
Post by Noob
An iteration statement whose controlling expression is not a constant
expression,156) that performs no input/output operations, does not
access volatile objects, and performs no synchronization or atomic
operations in its body, controlling expression, or (in the case of a
for statement) its expression-3, may be assumed by the implementation
to terminate.157)>
6.8.5p6 does not contain the word "shall", nor the phrase "undefined
behavior". It does provide an explicit definition of the behavior: the
loop may terminate even if it would not actually do so according to the
ordinary semantics of such a loop. This is no general permission to do
anything the implementation wants, as would be the case with UB. It only
gives permission to terminate the loop.

Since 6.8.5p6 doesn't use any of the three ways listed in 4p2 for
identifying UB, the concept of UB doesn't have anything to do with this.
6.8.5p6 is not listed in Annex J.2, "Undefined behavior".

One way to think about this is that a loop meeting the requirements of
6.8.5p6 doesn't have any observable behavior associated with execution
of the body of the loop. That means an implementation is free to
optimize away any unobservable behavior that nominally would be
associated with that loop (5.1.2.3p6). The loop body therefore could
require no more than exactly 0 seconds to execute. Now, what is the
value of 0*infinity? Any mathematician can tell you that this is not
meaningful question. The C standard is therefore within it's rights to
give implementations permission to make an infinite loop with a 0-time
body take only a finite (or even 0) amount of time to execute.

Implementations are not required to perform that optimization, nor are
they required to document their choice. This should, therefore, qualify
as "unspecified behavior" (3.4.4p1), with only two choices provided:
terminate after a finite amount of time, or never terminate. This is,
however, not explicitly stated, and 6.8.5p6 isn't in the list of
unspecified behaviors in Annex J.1.
h***@gmail.com
2017-07-08 10:38:52 UTC
Reply
Permalink
Raw Message
On Friday, July 7, 2017 at 2:00:15 PM UTC-7, James R. Kuyper wrote:

(snip, someone wrote)
Post by James R. Kuyper
An iteration statement whose controlling expression is not a constant
expression,156) that performs no input/output operations, does not
access volatile objects, and performs no synchronization or atomic
operations in its body, controlling expression, or (in the case of a
for statement) its expression-3, may be assumed by the implementation
to terminate.157)>
(snip)
Post by James R. Kuyper
One way to think about this is that a loop meeting the requirements of
6.8.5p6 doesn't have any observable behavior associated with execution
of the body of the loop. That means an implementation is free to
optimize away any unobservable behavior that nominally would be
associated with that loop (5.1.2.3p6). The loop body therefore could
require no more than exactly 0 seconds to execute. Now, what is the
value of 0*infinity? Any mathematician can tell you that this is not
meaningful question. The C standard is therefore within it's rights to
give implementations permission to make an infinite loop with a 0-time
body take only a finite (or even 0) amount of time to execute.
OK, so the compiler can optimize out finite loops that don't
do anything, either.

for(i=0;i<1000000;i++) ;

so if one hopes for a little delay, the optimizer might remove it.
Post by James R. Kuyper
Implementations are not required to perform that optimization, nor are
they required to document their choice. This should, therefore, qualify
terminate after a finite amount of time, or never terminate. This is,
however, not explicitly stated, and 6.8.5p6 isn't in the list of
unspecified behaviors in Annex J.1.
Ben Bacarisse
2017-07-08 14:04:48 UTC
Reply
Permalink
Raw Message
Post by h***@gmail.com
(snip, someone wrote)
An iteration statement whose controlling expression is not a constant
expression,156) that performs no input/output operations, does not
access volatile objects, and performs no synchronization or atomic
operations in its body, controlling expression, or (in the case of a
for statement) its expression-3, may be assumed by the implementation
to terminate.157)>
<snip>
Post by h***@gmail.com
OK, so the compiler can optimize out finite loops that don't
do anything, either.
for(i=0;i<1000000;i++) ;
so if one hopes for a little delay, the optimizer might remove it.
This is not an "infinite loop" so it's really a side case. Even without
the text in question, a conforming C compiler can deduce that the loop
terminates and can replace the code with i = 1000000; (provided i has a
type that is wide enough).

<snip>
--
Ben.
James Kuyper
2017-07-09 01:45:26 UTC
Reply
Permalink
Raw Message
Post by h***@gmail.com
(snip, someone wrote)
Post by James R. Kuyper
An iteration statement whose controlling expression is not a constant
expression,156) that performs no input/output operations, does not
access volatile objects, and performs no synchronization or atomic
operations in its body, controlling expression, or (in the case of a
for statement) its expression-3, may be assumed by the implementation
to terminate.157)>
(snip)
Post by James R. Kuyper
One way to think about this is that a loop meeting the requirements of
6.8.5p6 doesn't have any observable behavior associated with execution
of the body of the loop. That means an implementation is free to
optimize away any unobservable behavior that nominally would be
associated with that loop (5.1.2.3p6). The loop body therefore could
require no more than exactly 0 seconds to execute. Now, what is the
value of 0*infinity? Any mathematician can tell you that this is not
meaningful question. The C standard is therefore within it's rights to
give implementations permission to make an infinite loop with a 0-time
body take only a finite (or even 0) amount of time to execute.
OK, so the compiler can optimize out finite loops that don't
do anything, either.
for(i=0;i<1000000;i++) ;
True, and perfectly ordinary. What's extraordinary is that 6.8.5p6
allows the same thing to be done for a loop that should never terminate.
Post by h***@gmail.com
so if one hopes for a little delay, the optimizer might remove it.
While trying to use C to create busy loops is a common practice, the C
standard provides no support for the idea. Unless each pass through your
loop has observable behavior that necessarily takes a finite amount of
time to achieve, an implementation is always free to optimize it away.
Ben Bacarisse
2017-07-07 21:22:28 UTC
Reply
Permalink
Raw Message
Post by Noob
Apparently, C11 allows compilers to remove infinite loops
by considering them UB?
That's a wild overstatement. C compilers can assume that loops without
and side effects and a non-constant controlling expression terminate.
Post by Noob
This is new in C11?
Yes, but many compilers did this already.
Post by Noob
An iteration statement whose controlling expression is not a constant
expression,156) that performs no input/output operations, does not
access volatile objects, and performs no synchronization or atomic
operations in its body, controlling expression, or (in the case of a
for statement) its expression-3, may be assumed by the implementation
to terminate.157)>
157) This is intended to allow compiler transformations such as
removal of empty loops even when termination cannot be proven.
N1528: Why undefined behavior for infinite loops?
That's not a helpful title! It's an internal document and those it is
primarily intended for will know the context; but infinite loops are not
UB per se.
Post by Noob
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1528.htm
Well worth reading though.

<snip>
--
Ben.
s***@casperkitty.com
2017-07-08 05:29:49 UTC
Reply
Permalink
Raw Message
Post by Ben Bacarisse
That's a wild overstatement. C compilers can assume that loops without
and side effects and a non-constant controlling expression terminate.
That ignores the $50,000 question: what is a compiler allowed to do on the
basis of such an assumption?

For example, suppose a compiler given the following code knows that function
foo() cannot exit and has no side-effects, but it cannot tell what argument
values, if any, would cause it to return an even number. Given the code:

long long foo(long long);
void test(long long arg)
{
while(arg & 1)
arg = foo(arg);
printf("The result was ");
printf(arg & 1 ? "odd" : "even");
printf(", more specifically %lld", arg);
}

what, if anything, would test() be allowed to output if given a value of
"arg" which would cause the "while" loop to run forever? I would see
significant value in allowing it to perform the first printf even if the
loop never terminates, since it should not be dependent in any way upon
anything that happens in the loop. Modern compiler philosophy would
suggest that because the second printf can't be reached when "arg" is odd,
a compiler should be allowed to assume it's even. That, however, seems
rather sketchier to me.
Pascal J. Bourguignon
2017-07-08 09:46:49 UTC
Reply
Permalink
Raw Message
Post by s***@casperkitty.com
Post by Ben Bacarisse
That's a wild overstatement. C compilers can assume that loops without
and side effects and a non-constant controlling expression terminate.
That ignores the $50,000 question: what is a compiler allowed to do on the
basis of such an assumption?
For example, suppose a compiler given the following code knows that function
foo() cannot exit and has no side-effects, but it cannot tell what argument
long long foo(long long);
void test(long long arg)
{
while(arg & 1)
arg = foo(arg);
printf("The result was ");
printf(arg & 1 ? "odd" : "even");
printf(", more specifically %lld", arg);
}
what, if anything, would test() be allowed to output if given a value of
"arg" which would cause the "while" loop to run forever? I would see
significant value in allowing it to perform the first printf even if the
loop never terminates, since it should not be dependent in any way upon
anything that happens in the loop. Modern compiler philosophy would
suggest that because the second printf can't be reached when "arg" is odd,
a compiler should be allowed to assume it's even. That, however, seems
rather sketchier to me.
Those rules about UB vs. infinite loops or other dead-code sound to me
like if C wanted to do lazy evaluation. This is ridiculous, C is not
Haskell!

I would rather have a rule in the standard that any source instruction
should be translated independently.
--
__Pascal J. Bourguignon__
mailto:***@informatimago.com
https://www.informatimago.com
Ben Bacarisse
2017-07-08 11:05:23 UTC
Reply
Permalink
Raw Message
Post by s***@casperkitty.com
Post by Ben Bacarisse
That's a wild overstatement. C compilers can assume that loops without
an[y] side effects and a non-constant controlling expression terminate.
That ignores the $50,000 question: what is a compiler allowed to do on the
basis of such an assumption?
"Ignores" is a bit strong. I'm not ignoring that question, I'm just not
addressing it because I am not an expert in compiler optimisation.

For that reason alone, I'll leave your example for others to comment on.

<snip>
--
Ben.
Manfred
2017-07-08 15:38:37 UTC
Reply
Permalink
Raw Message
Post by Ben Bacarisse
Post by Noob
N1528: Why undefined behavior for infinite loops?
That's not a helpful title! It's an internal document and those it is
primarily intended for will know the context; but infinite loops are not
UB per se.
Post by Noob
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1528.htm
Well worth reading though.
Giving a look at that document, where it addresses "The technical issue:
loop optimization", I have a problem with its notion of "observable
behaviour":

for (p = q; p != 0; p = p -> next) {
++count;
}
for (p = q; p != 0; p = p -> next) {
++count2;
}

Could these two loops be merged and replaced by the following loop?

for (p = q; p != 0; p = p -> next) {
++count;
++count2;
}

The document focuses on termination of the first loop, but to me there
appears to be another issue before that:

Consider a separate thread that can observe count and count2 (these are
explicitly globals, multithreading is considered just in the subsequents
words)
Even with finite loops, the first version (two separate loops) generates
the following observable states only:

count == count2 == 0
count > count2
count == count2 == <total length of the list>

The single-loop version generates the following observable states:
count == count2+1
count == count2 == <any integer between 0 and length of list>

In particular, the state
count == count2 == <any intermediate between 1 and length of list-1>

cannot be generated by the abstract machine, so, how can such an
optimization comply with the observable behaviour rule?
(According to the observable behaviour n1570 5.1.2.3-6 let's assume
synchronized access to count and count2, and that the comparison of
count and count2 affects program output)

Thanks in advance for any explanation
Manfred
2017-07-08 15:43:31 UTC
Reply
Permalink
Raw Message
Post by Manfred
(According to the observable behaviour n1570 5.1.2.3-6 let's assume
synchronized access to count and count2, and that the comparison of
count and count2 affects program output)
Sorry, my bad: syncronization here rules out this optimization
David Brown
2017-07-11 20:22:15 UTC
Reply
Permalink
Raw Message
Post by Manfred
Post by Manfred
(According to the observable behaviour n1570 5.1.2.3-6 let's assume
synchronized access to count and count2, and that the comparison of
count and count2 affects program output)
Sorry, my bad: syncronization here rules out this optimization
Exactly. count and count2, despite being globals, can be changed any
number of times and in any order - unless they are volatile, atomic, or
otherwise synchronized (which, as you say, rules out this optimisation).
Kenny McCormack
2017-07-11 20:27:08 UTC
Reply
Permalink
Raw Message
Post by David Brown
Post by Manfred
Post by Manfred
(According to the observable behaviour n1570 5.1.2.3-6 let's assume
synchronized access to count and count2, and that the comparison of
count and count2 affects program output)
Sorry, my bad: syncronization here rules out this optimization
Exactly. count and count2, despite being globals, can be changed any
number of times and in any order - unless they are volatile, atomic, or
otherwise synchronized (which, as you say, rules out this optimisation).
It is my understanding that we are not allowed to use the word "global"
in this newsgroup.

(Like the word "stack" and a few others...)
--
"Unattended children will be given an espresso and a free kitten."
David Brown
2017-07-12 22:03:24 UTC
Reply
Permalink
Raw Message
Post by Kenny McCormack
Post by David Brown
Post by Manfred
Post by Manfred
(According to the observable behaviour n1570 5.1.2.3-6 let's assume
synchronized access to count and count2, and that the comparison of
count and count2 affects program output)
Sorry, my bad: syncronization here rules out this optimization
Exactly. count and count2, despite being globals, can be changed any
number of times and in any order - unless they are volatile, atomic, or
otherwise synchronized (which, as you say, rules out this optimisation).
It is my understanding that we are not allowed to use the word "global"
in this newsgroup.
(Like the word "stack" and a few others...)
/You/ are perhaps not allowed to use these words - /I/ am because I know
what I mean by them, and most others here know what is meant. Those
that are interested in the fine details can internally translate words
like "global" and "stack" into the exact phrases used in the standards
as and when they feel it is relevant.

Tim Rentsch
2017-07-10 20:26:53 UTC
Reply
Permalink
Raw Message
Post by Ben Bacarisse
Post by Noob
Apparently, C11 allows compilers to remove infinite loops
by considering them UB?
That's a wild overstatement. C compilers can assume that loops without
and side effects and a non-constant controlling expression terminate.
I agree it's an overstatement. I wouldn't go so far as to call it
a wild overstatement.
Post by Ben Bacarisse
[...] infinite loops are not UB per se.
They aren't in the sense that the Standard defines the term, but
in some cases this rule in the Standard allows the same range of
consequences as for undefined behavior. The problem is that
assuming that a loop will terminate may give rise to an unsound
logic, in which any deduction is valid. If any deduction is
valid, we can conclude anything we like about how the program
is supposed to behave. Practically speaking that's no different
than undefined behavior.

IMO it's a huge mistake to take a program construct that has
well-defined semantics, and then modify it so that in some cases
what the semantics are is indistinguishable from what would be
allowed for undefined behavior. What is worse, there is no clear
demarcation between cases that might have this property and those
that don't. The only safe way to treat it is to assume that all
potentially infinite loops (ie, that meet the criteria given in
the Standard's rule) have undefined behavior.
s***@casperkitty.com
2017-07-10 21:23:01 UTC
Reply
Permalink
Raw Message
Post by Tim Rentsch
IMO it's a huge mistake to take a program construct that has
well-defined semantics, and then modify it so that in some cases
what the semantics are is indistinguishable from what would be
allowed for undefined behavior. What is worse, there is no clear
demarcation between cases that might have this property and those
that don't. The only safe way to treat it is to assume that all
potentially infinite loops (ie, that meet the criteria given in
the Standard's rule) have undefined behavior.
If the Standard had specified that code may be re-ordered, even across an
infinite loop, provided that no computation's result is observably moved
before the computation, do you think that would have achieved the desired
effect? Given:

int test(long long x, long long y)
{
long long q=x;
while(q)
q=function_with_no_side_effects(q);
otherFunction1();
otherFunction2(q * y);
}

such a rule would permit a compiler to overlap the execution of
otherFunction1() with that of the loop without regard for whether it
terminates (since it has no dependency upon anything within the loop).
It would allow the execution of otherFunction2 to be overlapped if y
is zero and it can prove that the loop will terminate (since such
overlapping would not be observable). If the loop didn't terminate,
however, the call to otherFunction2() would have a data dependency upon
a value computed within the loop, thus precluding overlap by making it
observable.
Keith Thompson
2017-07-10 22:07:30 UTC
Reply
Permalink
Raw Message
Post by Tim Rentsch
Post by Ben Bacarisse
Post by Noob
Apparently, C11 allows compilers to remove infinite loops
by considering them UB?
That's a wild overstatement. C compilers can assume that loops without
and side effects and a non-constant controlling expression terminate.
I agree it's an overstatement. I wouldn't go so far as to call it
a wild overstatement.
Post by Ben Bacarisse
[...] infinite loops are not UB per se.
They aren't in the sense that the Standard defines the term, but
in some cases this rule in the Standard allows the same range of
consequences as for undefined behavior. The problem is that
assuming that a loop will terminate may give rise to an unsound
logic, in which any deduction is valid. If any deduction is
valid, we can conclude anything we like about how the program
is supposed to behave. Practically speaking that's no different
than undefined behavior.
IMO it's a huge mistake to take a program construct that has
well-defined semantics, and then modify it so that in some cases
what the semantics are is indistinguishable from what would be
allowed for undefined behavior. What is worse, there is no clear
demarcation between cases that might have this property and those
that don't. The only safe way to treat it is to assume that all
potentially infinite loops (ie, that meet the criteria given in
the Standard's rule) have undefined behavior.
IMHO it was also a mistake (perhaps a huge one) to define this
in vague terms such as "may be assumed by the implementation
to terminate". It should have been described in terms of the
permissible behavior of the program (including undefined behavior
in some cases if that's the intent).

I'm not pleased that this:

const int keep_going = 1;
while (keep_going) {
;
}
puts("How did we get here?");

can legally print "How did we get here?", but I'm less pleased by the
fact that it's described in such vague terms.

Context: N1570 6.8.5p6

An iteration statement whose controlling expression is not
a constant expression, that performs no input/output
operations, does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.

A footnote says:

This is intended to allow compiler transformations such as
removal of empty loops even when termination cannot be proven.

This was added in C11.

(If you want a loop to be unconditionally infinite, make sure the
controlling expression is a constant expression. `while (1)` and
`for (;;)` still work as expected.)
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2017-07-10 23:10:45 UTC
Reply
Permalink
Raw Message
This is intended to allow compiler transformations such as
removal of empty loops even when termination cannot be proven.
This was added in C11.
(If you want a loop to be unconditionally infinite, make sure the
controlling expression is a constant expression. `while (1)` and
`for (;;)` still work as expected.)
IMHO, a much more significant benefit of the rule is that it allows
independent calculations to be performed in parallel if at least one
of the calculations has no effects which would be observable during
the execution of the other. That can offer some major performance
benefits even if compilers are required to honor data dependencies,
and would not generally be problematic if compilers honor data
dependencies. Given:

void test(long long x)
{
while(x & 1)
x = function_with_no_side_effects(x);
printf("Hey there!\n");
printf("Zero times x = %d\n", x*0);
printf("X and 1%d\n", x & 1);
}

There are many cases where allowing code with no dependencies upon
computations done in a loop (e.g. the first printf) to run in parallel
with it could allow major performance gains. Extending that permission
to cases where the computations in a loop are irrelevant to program
output (e.g. the second printf) would sometimes offer benefits, but
not as often as in the no-dependencies case. Such extension would also
increase the risk of "astonishing" behavior.

Extending the permission even further, to cases where a compiler could
show that it could determine everything it needed to determine what a
program would output *if the loop terminates*, would sometimes allow
still more optimizations, but increase the likelihood that a program
which will search for some problem's solution until it finds one might
behave in arbitrary fashion when no solution exists.

If the Standard defined a "__stdc_dummy_loop_side_effect()" intrinsic, then
it might be practical to require that any loop which might be endless must
include some side-effect (since code which had no other reason to include
a side-effect could use the dummy one, which wouldn't impede efficiency as
much as a volatile write, dummy function call, etc. As of yet, however,
all Standard-defined ways of generating "side-effects" would impose a
needless run-time cost.
Tim Rentsch
2017-07-10 23:45:49 UTC
Reply
Permalink
Raw Message
Post by Keith Thompson
Post by Tim Rentsch
Post by Ben Bacarisse
Post by Noob
Apparently, C11 allows compilers to remove infinite loops
by considering them UB?
That's a wild overstatement. C compilers can assume that loops without
and side effects and a non-constant controlling expression terminate.
I agree it's an overstatement. I wouldn't go so far as to call it
a wild overstatement.
Post by Ben Bacarisse
[...] infinite loops are not UB per se.
They aren't in the sense that the Standard defines the term, but
in some cases this rule in the Standard allows the same range of
consequences as for undefined behavior. The problem is that
assuming that a loop will terminate may give rise to an unsound
logic, in which any deduction is valid. If any deduction is
valid, we can conclude anything we like about how the program
is supposed to behave. Practically speaking that's no different
than undefined behavior.
IMO it's a huge mistake to take a program construct that has
well-defined semantics, and then modify it so that in some cases
what the semantics are is indistinguishable from what would be
allowed for undefined behavior. What is worse, there is no clear
demarcation between cases that might have this property and those
that don't. The only safe way to treat it is to assume that all
potentially infinite loops (ie, that meet the criteria given in
the Standard's rule) have undefined behavior.
IMHO it was also a mistake (perhaps a huge one) to define this
in vague terms such as "may be assumed by the implementation
to terminate". It should have been described in terms of the
permissible behavior of the program (including undefined behavior
in some cases if that's the intent). [...]
Wholeheartedly agree.
Loading...