Discussion:
Representation of _Bool
Keith Thompson
2021-05-24 02:14:09 UTC
As promised, I've studied what the C standard says about the
requirements for the representation of _Bool. I've referred to the
C11 standard and to drafts of C17 and C2x (N2596). C11 and C17 do
not differ in this area as far as I can tell, but there are some
new things in the C2x proposal.

An object declared as type _Bool is large enough to store the values
0 and 1.

_Bool is an unsigned integer type.

The rank of _Bool shall be less than the rank of all other standard
integer types. This implies that the range of values of _Bool is
a subrange of the range of values of unsigned char. A _Bool object
cannot store a value less than 0 or greater than UCHAR_MAX.

When any scalar value is converted to _Bool, the result is 0 if the
value compares equal to 0; otherwise, the result is 1. This makes
it difficult, but not impossible, to store a value other than 0
or 1 in a _Bool object, but it can be done (or at least attempted)
via type-punning using a union with _Bool and unsigned char members.

C11 footnote: "While the number of bits in a _Bool object is at least
CHAR_BIT, the width (number of sign and value bits) of a _Bool may be
just 1 bit." This acknowledges that _Bool *may* have more than one
value bit, and therefore may represent values other than 0 and 1.
N2596 drops the parenthesized clause (probably because _Bool has
no sign bit).

N2596 adds a macro BOOL_WIDTH to <limits.h>, "width for an object
of type _Bool". It is *at least* 1, implying again that it can
be greater than 1. (I don't see any implementation that defines
BOOL_WIDTH.)

(N2596 also changes the definitions of false and true in <stdbool.h>
so they're of type _Bool rather than int. This doesn't affect
representation.)

Conclusions:

sizeof (_Bool) >= 1. It may be greater than 1, but that would
be weird. If sizeof (_Bool) > 1, then it must have padding bits.

_Bool has no sign bit.

_Bool has *at least* one value bit. It may have more, but no more
than CHAR_BIT of them.

The standard allows some variations in how _Bool is represented.
C programmers would be well advised to avoid writing code for which
this matters.

A conforming implementation may do any of the following (I'll assume
for brevity that CHAR_BIT==8):

* _Bool has 8 value bits. Any value from 0 to 255 inclusive
is valid. Storing a value other than 0 or 1 can be done via
type punning using a union of a _Bool and an unsigned char.

* _Bool has 1 value bit and 7 padding bits, with 254 trap
representations. Using type punning to store a value other than
0 or 1 in a _Bool object, and then accessing that object's value,
results in undefined behavior.

* _Bool has 1 value bit, 7 padding bits, and no trap representations.
Since padding bits by definition do not contribute to the value,
only the value bit's value is relevant. Using type punning to store
a value other than 0 or 1 in a _Bool object gives it a value of 0
if the value is even, 1 if the value is odd.

Other variations are possible (and arguably silly). For example, _Bool
might have 4 value bits and 4 padding bits, or it might be bigger than
1 byte. I expect that kind of thing only on the DeathStation 9000.

Here's a small program that attempts to explore how an implementation
represents objects of type _Bool:

#include <stdio.h>
#include <limits.h>

union U {
    _Bool b;
    unsigned char rep;
};

int main(void) {
    union U obj;
    _Bool b;
    for (obj.rep = 0; obj.rep <= 3; obj.rep ++) {
        printf("obj.b = %d, which is %s, obj.rep = %d",
               obj.b, obj.b ? "true " : "false", obj.rep);
        b = obj.b;
        printf(" ... b = %d, which is %s\n", b, b ? "true " : "false");
    }
}

Using gcc 11.1.0, on Ubuntu 20.02 x86_64, I get this output:
obj.b = 0, which is false, obj.rep = 0 ... b = 0, which is false
obj.b = 1, which is true , obj.rep = 1 ... b = 1, which is true
obj.b = 2, which is true , obj.rep = 2 ... b = 2, which is true
obj.b = 3, which is true , obj.rep = 3 ... b = 3, which is true

This mostly looks like _Bool has 8 value bits, but if that were the
case, then I *think* that the value of b would always be 0 or 1.
The rules of simple assignment (b = obj.b) specify that the value
of the right operand is converted to the type of the assignment
expression. Converting *any* scalar value to _Bool yields 0 or 1,
even if the value is already of type _Bool. So I conclude that
for gcc, 2 and 3 (and probably anything other than 0 or 1) are
trap representations for _Bool, and that _Bool has 1 value bit,
7 padding bits, and 254 trap representations.

It's possible that the intent is for _Bool to have 8 value bits and the
gcc authors' interpretation of the requirements for simple assignment
differs from mine. (I won't presume to say who's right.)

Using clang 12.0.0 on the same system, I get:
obj.b = 0, which is false, obj.rep = 0 ... b = 0, which is false
obj.b = 1, which is true , obj.rep = 1 ... b = 1, which is true
obj.b = 0, which is false, obj.rep = 2 ... b = 0, which is false
obj.b = 1, which is true , obj.rep = 3 ... b = 1, which is true

All bits other than the low-order one are ignored. This is
consistent with _Bool having 1 value bit, 7 padding bits, and no
trap representations. It's also consistent with 2 and 3 being
trap representations, since that would cause undefined behavior.
It's not consistent with _Bool having more than 1 value bit.

When implementers add support for BOOL_WIDTH, they'll have to decide
explicitly how many value bits _Bool has.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
Ben Bacarisse
2021-05-24 11:11:17 UTC
Post by Keith Thompson
sizeof (_Bool) >= 1. It may be greater than 1, but that would
be weird. If sizeof (_Bool) > 1, then it must have padding bits.
I don't understand how you draw that last conclusion.

<cut>
Post by Keith Thompson
Here's a small program that attempts to explore how an implementation
#include <stdio.h>
#include <limits.h>
union U {
_Bool b;
unsigned char rep;
};
When doing this kind of thing, my preference is to write

union U {
_Bool b;
unsigned char rep[sizeof (_Bool)];
};

even when it's very unlikely that the size will be > 1. It makes the
purpose so very clear.
--
Ben.
Richard Damon
2021-05-24 11:43:10 UTC
Post by Ben Bacarisse
Post by Keith Thompson
sizeof (_Bool) >= 1. It may be greater than 1, but that would
be weird. If sizeof (_Bool) > 1, then it must have padding bits.
I don't understand how you draw that last conclusion.
rank(_Bool) < rank(unsigned char) so
max value of _Bool <= UCHAR_MAX so
max number of value bits in _Bool is CHAR_BIT

_Bool has sizeof(_Bool)*CHAR_BIT bits in it, and only CHAR_BIT of them
can be value bits.

if sizeof(_Bool) > 1 there are bits left over that aren't value or sign
(since it doesn't have any, being an unsigned type) bits, so must be
padding bits.
Ben Bacarisse
2021-05-24 16:27:53 UTC
Post by Richard Damon
Post by Ben Bacarisse
Post by Keith Thompson
sizeof (_Bool) >= 1. It may be greater than 1, but that would
be weird. If sizeof (_Bool) > 1, then it must have padding bits.
I don't understand how you draw that last conclusion.
rank(_Bool) < rank(unsigned char) so
max value of _Bool <= UCHAR_MAX so
max number of value bits in _Bool is CHAR_BIT
Ah, yes. Thanks.
--
Ben.
Vir Campestris
2021-05-25 20:20:24 UTC
Post by Richard Damon
rank(_Bool) < rank(unsigned char) so
max value of _Bool <= UCHAR_MAX so
max number of value bits in _Bool is CHAR_BIT
_Bool has sizeof(_Bool)*CHAR_BIT bits in it, and only CHAR_BIT of them
can be value bits.
if sizeof(_Bool) > 1 there are bits left over that aren't value or sign
(since it doesn't have any, being an unsigned type) bits, so must be
padding bits.
Once upon a time I was working with a TI graphics processor which had
arbitrary sized values - very handy if you were doing as we were and had
a 3-bit greyscale display.

The difference between two adjacent _bits_ was 1.
Between two adjacent _bytes_ it was 8.

Single bit booleans make absolute sense on a device like that.

And yes, we were working in C. And some things didn't port.

Andy
--
Under the hood there was a 32-bit bus, and setting 1 bit was a
read-modify-write - but not even visible at assembler level.
Keith Thompson
2021-05-24 20:15:11 UTC
Post by Ben Bacarisse
Post by Keith Thompson
sizeof (_Bool) >= 1. It may be greater than 1, but that would
be weird. If sizeof (_Bool) > 1, then it must have padding bits.
I don't understand how you draw that last conclusion.
[answered elsethread]
Post by Ben Bacarisse
<cut>
Post by Keith Thompson
Here's a small program that attempts to explore how an implementation
#include <stdio.h>
#include <limits.h>
union U {
_Bool b;
unsigned char rep;
};
When doing this kind of thing, my preference is to write
union U {
_Bool b;
unsigned char rep[sizeof (_Bool)];
};
even when it's very unlikely that the size will be > 1. It makes the
purpose so very clear.
Good point.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
Tim Rentsch
2021-05-24 13:49:19 UTC
Post by Keith Thompson
As promised, I've studied what the C standard says about the
requirements for the representation of _Bool. I've referred to the
C11 standard and to drafts of C17 and C2x (N2596). C11 and C17 do
not differ in this area as far as I can tell, but there are some
new things in the C2x proposal.
Thank you, this looks good (and nice to have C17 and C2x included).
Post by Keith Thompson
[...]
The rank of _Bool shall be less than the rank of all other standard
integer types. This implies that the range of values of _Bool is
a subrange of the range of values of unsigned char. A _Bool object
cannot store a value less than 0 or greater than UCHAR_MAX.
AFAICT the width of _Bool is permitted to be greater than the
width of an extended unsigned integer type whose width is less
than CHAR_BIT. It seems weird to allow that, but I don't see
anything that forbids it.
Post by Keith Thompson
[...]
Here's a small program that attempts to explore how an implementation
[..program..]
[..gcc results and analysis..] So I conclude that
for gcc, 2 and 3 (and probably anything other than 0 or 1) are
trap representations for _Bool, and that _Bool has 1 value bit,
7 padding bits, and 254 trap representations.
Yes I believe that's right.
Post by Keith Thompson
It's possible that the intent is for _Bool to have 8 value bits and the
gcc authors' interpretation of the requirements for simple assignment
differs from mine. (I won't presume to say who's right.)
Other evidence suggests gcc takes the width of _Bool to be 1.
See below.
Post by Keith Thompson
[..clang results and analysis..]
It's not consistent with _Bool having more than 1 value bit.
There could be another value bit that is not adjacent to the low
order bit, with a padding bit in between. Of course, it is highly
unlikely that that is the case.
Post by Keith Thompson
When implementers add support for BOOL_WIDTH, they'll have to decide
explicitly how many value bits _Bool has.
I think other parts of the language necessitate the decision
having been made, even without BOOL_WIDTH. Both gcc and
clang take the width of _Bool to be 1, as may be seen by
compiling the following program:

struct {
_Bool just_checking : 2;
} test;
Keith Thompson
2021-05-24 20:30:36 UTC
Post by Tim Rentsch
Post by Keith Thompson
As promised, I've studied what the C standard says about the
requirements for the representation of _Bool. I've referred to the
C11 standard and to drafts of C17 and C2x (N2596). C11 and C17 do
not differ in this area as far as I can tell, but there are some
new things in the C2x proposal.
Thank you, this looks good (and nice to have C17 and C2x included).
Post by Keith Thompson
[...]
The rank of _Bool shall be less than the rank of all other standard
integer types. This implies that the range of values of _Bool is
a subrange of the range of values of unsigned char. A _Bool object
cannot store a value less than 0 or greater than UCHAR_MAX.
AFAICT the width of _Bool is permitted to be greater than the
width of an extended unsigned integer type whose width is less
than CHAR_BIT. It seems weird to allow that, but I don't see
anything that forbids it.
For example, an implementation might have an extended integer type
_Nybble with 4 value bits. I'll look into that.
Post by Tim Rentsch
Post by Keith Thompson
[...]
Here's a small program that attempts to explore how an implementation
[..program..]
[..gcc results and analysis..] So I conclude that
for gcc, 2 and 3 (and probably anything other than 0 or 1) are
trap representations for _Bool, and that _Bool has 1 value bit,
7 padding bits, and 254 trap representations.
Yes I believe that's right.
Post by Keith Thompson
It's possible that the intent is for _Bool to have 8 value bits and the
gcc authors' interpretation of the requirements for simple assignment
differs from mine. (I won't presume to say who's right.)
Other evidence suggests gcc takes the width of _Bool to be 1.
See below.
Post by Keith Thompson
[..clang results and analysis..]
It's not consistent with _Bool having more than 1 value bit.
There could be another value bit that is not adjacent to the low
order bit, with a padding bit in between. Of course, it is highly
unlikely that that is the case.
Post by Keith Thompson
When implementers add support for BOOL_WIDTH, they'll have to decide
explicitly how many value bits _Bool has.
I think other parts of the language necessitate the decision
having been made, even without BOOL_WIDTH. Both gcc and
clang take the width of _Bool to be 1, as may be seen by
struct {
_Bool just_checking : 2;
} test;
Right, C11 6.7.2.1p4:

The expression that specifies the width of a bit-field shall be an
integer constant expression with a nonnegative value that does not
exceed the width of an object of the type that would be specified
were the colon and expression omitted.

I even cited the footnote on that paragraph, but missed the paragraph
itself.

Both clang and gcc (latest versions) complain that the width of the bit
field exceeds that of its type, so they both have a width of 1 for
_Bool. I'll spend some time later looking into the question of trap
representations.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
Philipp Klaus Krause
2021-05-25 09:40:44 UTC
Post by Keith Thompson
I'll spend some time later looking into the question of trap
representations.
AFAIK, for some compilers _Bool has trap representations, i.e. if you
(e.g. via memcpy) put a value other than false or true into a bool, you
get undefined behaviour when reading that value: On architectures where
it is faster, a jumptable might be used instead of a conditional jump
for an if/else construct. Or the code generation for a switch (a + b + c
+ d) will assume that the possible range is 0 to 4 if a to d are bool,
and generate a jump table just covering that range.

Philipp
jacobnavia
2021-05-24 16:40:07 UTC
In a related issue, I have thought about implementing boolean arrays, where

_Bool tab[8];

sizeof(tab) == 1

I.e. represent boolean arrays as just arrays of 1 bit or bit-arrays.
There are many advantages to bit arrays.

For instance they could replace the usual representation

int32_t flags;

#define FLAG_SOMETHING 1

if (flags & FLAG_SOMETHING)

etc. They would be very compact, making it possible to store boolean
values efficiently. The only problem that I saw was:

sizeof(tab[2]) == ????? (0.125?)

so I dropped this idea, I just did not know what I should return for that.

jacob
David Brown
2021-05-24 16:55:22 UTC
Post by jacobnavia
In a related issue, I have thought about implementing boolean arrays, where
_Bool tab[8];
sizeof(tab) == 1
I.e. represent boolean arrays as just arrays of 1 bit or bit-arrays.
There are many advantages to bit arrays.
For instance they could replace the usual representation
int32_t flags;
#define FLAG_SOMETHING 1
if (flags & FLAG_SOMETHING)
etc. They would be very compact, making it possible to store boolean
sizeof(tab[2]) == ????? (0.125?)
so I dropped this idea, I just did not know what I should return for that.
jacob
Such compact arrays of booleans have their advantages and their
use-cases, but they also have their disadvantages - sometimes they will
be slower, sometimes faster than a normal boolean array.

But one thing that I am confident about, is that it would not be
conforming to standard C - in particular, you couldn't take the address
of "tab[1]" and consider it to point to a normal bool variable.

So this would have to be either an extension to C, or a feature of your
container library. (Or it could be both - you could have an extension,
and your container library could have conditional compilation that spots
your compiler and uses the extension to implement it efficiently, while
falling back to standard C for other compilers.)
Bart
2021-05-24 17:31:28 UTC
Post by jacobnavia
In a related issue, I have thought about implementing boolean arrays, where
_Bool tab[8];
sizeof(tab) == 1
I.e. represent boolean arrays as just arrays of 1 bit or bit-arrays.
There are many advantages to bit arrays.
For instance they could replace the usual representation
etc. They would be very compact, making it possible to store boolean
sizeof(tab[2]) == ????? (0.125?)
so I dropped this idea, I just did not know what I should return for that.
sizeof returns a size in bytes, so it would just be rounded up to the
nearest byte:

_Bool tab[50];

sizeof(tab) would be 7 (56 bits) (unless you want to pad it up to 64).
sizeof(tab[3]) would be 1

To work with bits, or to find out the length of the array, you will need
a new operator, such as bitsof():

bitsof(tab) would be 50
bitsof(tab[3]) would be 1

Where it would be troublesome however, is that C works with arrays using
pointers. Then you will need bit-pointers, and it starts getting messy
(I've done it).
Post by jacobnavia
int32_t flags;
#define FLAG_SOMETHING 1
if (flags & FLAG_SOMETHING)
Short bit sequences up to 32 or 64 bits are a different matter. You
don't need bit-arrays and bit-pointers for this. I do it with bit
operations, for example:

uint32_t flags; // signed is not so appropriate

#define BIT_SOMETHING 0 // bit numbers not masks

if (flags.[BIT_SOMETHING]) // yields 0/1, not 0/non-0

But while flags can occupy more than one bit, it gets harder to combine
multiple unrelated flags, like bits 0, 3 and 10, compared with masks.
m137
2025-01-17 02:47:49 UTC
Hi Keith,

Thank you for posting this. I noticed that the newer drafts of C23
(N2912 onwards, I think) have replaced the term "trap representation"
with "non-value representation":
- **Trap representation** was last defined in [N2731
3.19.4(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=)
as "an object representation that need not represent a value of the
object type."
- **Non-value representation** is most recently defined in [N3435
3.26(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23)
as "an object representation that does not represent a value of the
object type."

The definition of non-value representation rules out object
representations that represent a value of the object type from being
non-value representations. So it seems to be stricter than the
definition of trap representation, which does not seem to rule out such
object representations from being trap representations. Is this
interpretation correct?

If so, what happens to the 254 trap representations that GCC and Clang
reserve for `_Bool`? Assuming a width of 1, each of those 254 object
representations represents a value in `_Bool`'s domain (the half whose
value bit is 1 represents the value `true`, while the other half whose
value bit is 0 represents the value `false`), so they cannot be thought
of as non-value representations (since a non-value representation must
be an object representation that **does not** represent a value of the
object type).

I've been stuck on this for quite some time, so would be grateful for
any guidance you could provide.


Thank you
David Brown
2025-01-17 09:18:25 UTC
Post by m137
Hi Keith,
Thank you for posting this.
When, where? No attribution; referenced article is expired from this
Eternal September server, which has decently long retention times.
Post by m137
I noticed that the newer drafts of C23
(N2912 onwards, I think) have replaced the term "trap representation"
That is correct. Probably because "trap representation" insinuates
that such a representation *must* produce a trap, or else the
implementation has no right to specify such a representation.
Yes, I believe that is the reason. Earlier standards make it clear in
the definitions that accessing a "trap representation" does not imply
"performing a trap" - but the term is easily misunderstood for those
that read parts of the standard without referring back to the definitions.
Implementations are not obliged to produce traps in relation to
non-value representations. Since the behaviors in question are
undefined, they may do so.
Agreed. I can't see any differences in the semantics here - only the
term used has changed.
Post by m137
If so, what happens to the 254 trap representations that GCC and Clang
reserve for `_Bool`? Assuming a width of 1, each of those 254 object
GCC and Clang specify trap representations for _Bool? Where is this
found in their documentation?
I can't answer for clang - I don't know it in high enough detail. But
gcc certainly considers the use of _Bool representations other than 0 or
1 as undefined behaviour (except when accessed through a char type
lvalue - such as by memcpy). I once had a bug where data memcpy'ed into
a struct containing a bool resulted in a bool that had something other
than 0 or 1 in the memory byte - and that led to both the "true" and
"false" paths being followed in some code that used it.

That was, of course, perfectly good code generation for undefined behaviour.

But to be clear - it was UB, not a trap. Pre-C23 usage of "trap
representations" is UB and may or may not perform a trap, depending on
the implementation (including any flags it is given). With C23, it
would now have the more appropriate name "non-value representation" and
exactly the same effect.

gcc now has a new "hardbool" feature, implemented as a type attribute:

<https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-hardbool-type-attribute>

This lets you create new types that act much like booleans, except that
you can specify the true and false representations directly, and that
other representations are actually trapping - they are checked at
runtime and lead to a call to __builtin_trap().
Post by m137
representations represents a value in `_Bool`'s domain (the half whose
value bit is 1 represents the value `true`, while the other half whose
value bit is 0 represents the value `false`), so they cannot be thought
of as non-value representations (since a non-value representation must
be an object representation that **does not** represent a value of the
object type).
In an integer type, it is indeed possible for the padding bits to be
nonzero, without changing the value given by the value bits.
However, how that works is not specified; it's up to an implementation,
and doesn't have to be documented.
An implementation could say that the padding bits don't mean anything;
they can have any value whatsoever and so the situation is as you
say: the bool representations with a 0 in the value bit are all false,
and those with a 1 are all true.
However, an implementation can also say that certain patterns of
bits are non-value representations.
One example given is the possibility of parity bits. Suppose some
integer type has one padding bit which behaves as a parity bit. Then
suppose whenever that bit has incorrect parity, the representation is
deemed a non-value representation.
With regard to bool (say, one implemented in 8 bits), an implementation
can assert that if there is a nonzero value in any padding bit, the
result is a non-value representation. Then, only 0 and 1 are valid;
all other byte codes are non-value representations.
Implementations determine their own rules for how configurations of
padding bits may, on their own, or in interaction with configurations
of value bits, give rise to non-value representations.
All good stuff, nicely written.
Tim Rentsch
2025-01-17 18:39:38 UTC
Post by m137
Hi Keith,
Thank you for posting this.
Normally followup postings include a reference of some sort to the
article being replied to.
Post by m137
I noticed that the newer drafts of C23
(N2912 onwards, I think) have replaced the term "trap representation"
- **Trap representation** was last defined in [N2731
3.19.4(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=)
as "an object representation that need not represent a value of the
object type."
- **Non-value representation** is most recently defined in [N3435
3.26(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23)
as "an object representation that does not represent a value of the
object type."
The definition of non-value representation rules out object
representations that represent a value of the object type from being
non-value representations. So it seems to be stricter than the
definition of trap representation, which does not seem to rule out such
object representations from being trap representations. Is this
interpretation correct?
No. Except for using a different name, there is no difference
between "trap representation" and "non-value representation".
Post by m137
If so, what happens to the 254 trap representations that GCC and Clang
reserve for `_Bool`? Assuming a width of 1, each of those 254 object
representations represents a value in `_Bool`'s domain (the half whose
value bit is 1 represents the value `true`, while the other half whose
value bit is 0 represents the value `false`), so they cannot be thought
of as non-value representations (since a non-value representation must
be an object representation that **does not** represent a value of the
object type).
I don't know that either gcc or clang have any trap representations
for _Bool. Furthermore whether they do could depend on either which
version or what compiler options are being used.

Let's assume 8-bit chars, and also that the width of _Bool is 1
(which is optional before C23 and required in C23). Here is what
can be said about the 256 states of a _Bool object.

1. All zero bits must be a legal value for 0.

2. There must be at least one combination of bits that is a legal
value for 1 (and since it must be distinct from the all-zero
value for 0, must have at least one bit set to 1).

3. The remaining 254 possible combinations of bit settings can be
any mixture of legal values and trap representations, which are also
known as non-value representations starting in C23.

4. Considering the set of legal value bit settings, there must be at
least one bit position that is 0 in all cases where the value is
0, and is 1 in all cases where the value is 1.

5. Accessing any representation corresponding to a legal value has
well-defined behavior, and yields 0 or 1 depending on the setting of
the bit (or bits) mentioned in #4.

6. Accessing any trap/non-value representation is undefined behavior
and might do anything at all. It might appear to work. It might
work in some cases but not others. It might yield a value that is
neither 0 nor 1. It might abort the program. It might cause the
computer the program is running on to run a different operating
system (of course this outcome isn't very likely, but as far as the
C standard is concerned it cannot be ruled out).

Does this answer all your questions?
m137
2025-01-19 02:11:37 UTC
Post by Tim Rentsch
Post by m137
Hi Keith,
Thank you for posting this.
Normally followup postings include a reference of some sort to the
article being replied to.
Hi Tim,

Sorry for the confusion, I am new to the platform and hadn't realised that I
need to quote Keith's post in my reply.
Post by Tim Rentsch
Post by m137
I noticed that the newer drafts of C23
(N2912 onwards, I think) have replaced the term "trap representation"
- **Trap representation** was last defined in [N2731
3.19.4(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=)
as "an object representation that need not represent a value of the
object type."
- **Non-value representation** is most recently defined in [N3435
3.26(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23)
as "an object representation that does not represent a value of the
object type."
The definition of non-value representation rules out object
representations that represent a value of the object type from being
non-value representations. So it seems to be stricter than the
definition of trap representation, which does not seem to rule out such
object representations from being trap representations. Is this
interpretation correct?
No. Except for using a different name, there is no difference
between "trap representation" and "non-value representation".
The reason I thought they were different was because the definition of
trap representation uses the phrase "need not", which seemed more
permissive than the "does not" in the definition of non-value
representation.
Post by Tim Rentsch
Let's assume 8-bit chars, and also that the width of _Bool is 1
(which is optional before C23 and required in C23). Here is what
can be said about the 256 states of a _Bool object.
1. All zero bits must be a legal value for 0.
2. There must be at least one combination of bits that is a legal
value for 1 (and since it must be distinct from the all-zero
value for 0, must have at least one bit set to 1).
3. The remaining 254 possible combinations of bit settings can be
any mixture of legal values and trap representations, which are also
known as non-value representations starting in C23.
4. Considering the set of legal value bit settings, there must be at
least one bit position that is 0 in all cases where the value is
0, and is 1 in all cases where the value is 1.
5. Accessing any representation corresponding to a legal value has
well-defined behavior, and yields 0 or 1 depending on the setting of
the bit (or bits) mentioned in #4.
6. Accessing any trap/non-value representation is undefined behavior
and might do anything at all. It might appear to work. It might
work in some cases but not others. It might yield a value that is
neither 0 nor 1. It might abort the program. It might cause the
computer the program is running on to run a different operating
system (of course this outcome isn't very likely, but as far as the
C standard is concerned it cannot be ruled out).
Does this answer all your questions?
Yes, thank you for taking the time to reply, I really appreciate it.
Just to clarify, since padding bits do not count towards the value being
represented, in point (2) above, it would have to be the value bit
specifically that is set to 1; and similarly in point (4), the bit
position that is being referred to is the value bit. Is this correct?

--
Tim Rentsch
2025-01-19 04:37:25 UTC
[...]
Post by m137
Hi Tim,
Sorry for the confusion, I am new to the platform and hadn't realised
that I need to quote Keith's post in my reply.
No worries. Glad you are up to speed now.
Post by m137
Post by Tim Rentsch
Let's assume 8-bit chars, and also that the width of _Bool is 1
(which is optional before C23 and required in C23). Here is what
can be said about the 256 states of a _Bool object.
1. All zero bits must be a legal value for 0.
2. There must be at least one combination of bits that is a legal
value for 1 (and since it must be distinct from the all-zero
value for 0, must have at least one bit set to 1).
3. The remaining 254 possible combinations of bit settings can be
any mixture of legal values and trap representations, which are also
known as non-value representations starting in C23.
4. Considering the set of legal value bit settings, there must be at
least one bit position that is 0 in all cases where the value is
0, and is 1 in all cases where the value is 1.
5. Accessing any representation corresponding to a legal value has
well-defined behavior, and yields 0 or 1 depending on the setting of
the bit (or bits) mentioned in #4.
6. Accessing any trap/non-value representation is undefined behavior
and might do anything at all. It might appear to work. It might
work in some cases but not others. It might yield a value that is
neither 0 nor 1. It might abort the program. It might cause the
computer the program is running on to run a different operating
system (of course this outcome isn't very likely, but as far as the
C standard is concerned it cannot be ruled out).
Does this answer all your questions?
Yes, thank you for taking the time to reply, I really appreciate it.
Just to clarify, since padding bits do not count towards the value being
represented, in point (2) above, it would have to be the value bit
specifically that is set to 1; and similarly in point (4), the bit
position that is being referred to is the value bit. Is this correct?
Yes, I think that's right, but we can't always tell which bit is the
value bit just by looking at the set of legal values. Consider an
implementation where _Bool 0 is represented by all zeros and _Bool 1
is represented by all ones, and every combination that includes both
zeros and ones (which is everything else) is a trap representation.
The width of _Bool must be 1, but which bit is the value bit? We
can't tell. Fortunately the C standard says how different types are
encoded is implementation defined (if not defined explicitly), so we
can consult the documentation to see which bit of _Bool is the value
bit.
James Kuyper
2025-01-17 19:10:11 UTC
Reply
Permalink
Post by m137
Hi Keith,
Thank you for posting this.
When, where? No attribution; referenced article is expired from this
Eternal September server, which has decently long retention times.
While Google Groups has stopped archiving new messages, it retains all
of the messages it previously archived, including the ones for this
thread. It was started on 2021-05-23 by Keith Thompson.
Keith Thompson
2025-01-17 21:34:53 UTC
Reply
Permalink
Post by m137
Hi Keith,
Thank you for posting this.
The message being referred to is one I posted Sun 2021-05-23, with
Message-ID <***@nosuchdomain.example.com>. It's visible on
Google Groups at
<https://groups.google.com/g/comp.lang.c/c/4FUlV_XkmXg/m/OG8WeUCfAwAJ>.

As others have suggested, please include attribution information when
posting a followup. You don't need to quote the entire message,
but provide at least some context, particularly when the parent
message is old.

This is an update to that message.
Post by m137
I noticed that the newer drafts of C23
(N2912 onwards, I think) have replaced the term "trap representation"
- **Trap representation** was last defined in [N2731
3.19.4(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=)
as "an object representation that need not represent a value of the
object type."
- **Non-value representation** is most recently defined in [N3435
3.26(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23)
as "an object representation that does not represent a value of the
object type."
The definition of non-value representation rules out object
representations that represent a value of the object type from being
non-value representations. So it seems to be stricter than the
definition of trap representation, which does not seem to rule out such
object representations from being trap representations. Is this
interpretation correct?
I don't believe so. As far as I can tell, a "non-value
representation" (C23 and later) is exactly the same thing as a "trap
representation" (C17 and earlier). The older term was probably
considered unclear, since it could imply that a trap is required.
In fact, reading an object with a trap/non-value representation
has undefined behavior, which can include yielding the value you
might have expected.
Post by m137
If so, what happens to the 254 trap representations that GCC and Clang
reserve for `_Bool`?
I see no evidence in gcc's documentation that gcc treats
representations other than 0 or 1 as trap/non-value representations.
I see only two references to "trap representation", one for signed
integer types (saying that there are no trap representations)
and one regarding type-punning via unions. There are no relevant
references to "padding bits".

I'm less familiar with clang's documentation, but I see no reference
to "trap representation" or "non-value representation".

We can get some information about this by running a test program.
See below.
Post by m137
Assuming a width of 1, each of those 254 object
representations represents a value in `_Bool`'s domain (the half whose
value bit is 1 represents the value `true`, while the other half whose
value bit is 0 represents the value `false`), so they cannot be thought
of as non-value representations (since a non-value representation must
be an object representation that **does not** represent a value of the
object type).
Reading an object with a non-value representation has undefined
behavior. If the observed value happens to be a valid value of the
object's type, that's still consistent with undefined behavior.
*Everything* is consistent with undefined behavior.
Post by m137
I've been stuck on this for quite some time, so would be grateful for
any guidance you could provide.
Editions of the C standard earlier than C23 were not entirely
clear about the representation of _Bool. (C90 does not have _Bool
or bool. C99 through C17 have _Bool as a keyword, with bool as
a macro defined in <stdbool.h>. C23 has bool as a keyword, with
_Bool as an alternate spelling.)

In C99 and later, _Bool/bool is required to be an unsigned integer
type large enough to hold the values 0 and 1. Its size must be at
least CHAR_BIT bits (which is at least 8). The *rank* of _Bool is
less than the rank of all other standard integer types.

The rank implies that the range of values is a subset of the
range of values of any other unsigned integer type. The rank does
*not* imply anything about relative sizes. unsigned char has a
higher rank than bool, but bool could have additional padding bits
making sizeof(bool)>1. (Probably no implementation does this.)
unsigned char has no padding bits.

C11 implies that _Bool can have more than one value bit, which
means it could represent values greater than 1 (but no more than
0..UCHAR_MAX).

C23 (I'm using the N3096 draft) tightens the requirements, saying
that bool has exactly one value bit and (sizeof(bool)*CHAR_BIT)-1
padding bits -- again implying that sizeof(bool) might be greater
than 1, but forbidding values greater than 1.

Typically in C17 and earlier, and always in C23, _Bool/bool will
have exactly 1 value bit and CHAR_BIT-1 padding bits. Padding bits
do not contribute to the value of an object (so 0 and 1 are the
only possible values), but non-zero padding bits *may or may not*
create trap/non-value representations. (A gratuitously exotic
implementation might use a representation other than 00000001 for
true, but 00000000 is guaranteed to be a representation for 0/false.)

As far as I can tell, the standard is silent on whether a bool object
with non-zero padding bits is a trap/non-value representation or not.

I wrote a test program to explore how bool is treated. It uses
memcpy to set the representation of a bool object and then prints
the value of that object. Source is at the bottom of this message.

If bool has no non-value representations, then the values of the
CHAR_BIT-1 padding bits must be ignored when reading a bool object,
and the value of such an object is determined only by its single
value bit, 0 or 1. If it does have non-value representations,
then reading such an object has undefined behavior.

With gcc 14.2.0, with "-std=c23", all-zeros is treated as false
when used in a condition and all other representations are treated
as true. Converting the value of a bool object to another integer
type yields the value of its full 8-bit representation. If a bool
object holds a representation other than 00000000 or 00000001,
it compares equal to both `true` and `false`.

This implies that bool has 1 value bit and 7 padding bits (as
required by C23) and that it has 2 value representations and 254
trap representations. The observed behavior for the non-value
representations is the result of undefined behavior. (gcc -std=c23
sets __STDC_VERSION__ to 202000L, not 202311L. The documentation
acknowledges that support for C23 is experimental and incomplete.)

With clang 19.1.4, with "-std=c23", the behavior is consistent
with bool having no non-value representations. The 7 padding bits
do not contribute to the value of a bool object. Any bool object
with 0 as the low-order bit is treated as false in a condition and
yields 0 when converted to another integer type. Any bool object
with 1 as the low-order bit is treated as true, and yields 1 when
converted to another integer type. I presume the intent is for bool
to have 256 value representations and no non-value representations
(with the padding bits ignored as required), but it's also consistent
with bool having non-value representations and the observed behavior
being undefined. It's not possible to determine with a test program
whether the output is the result of undefined behavior or not.

As far as I can tell, the question of whether bool has non-value
representations is unspecified but not implementation-defined,
meaning that an implementation is not required to document its
choice.

#include <stdio.h>
#include <string.h>
#include <limits.h>
#if __STDC_VERSION__ < 202311L
#include <stdbool.h>
#endif
int main() {
    printf("__STDC_VERSION__ = %ldL\n", __STDC_VERSION__);
#if __STDC_VERSION__ < 202311L
    puts("Older than C23, using <stdbool.h>");
#else
    puts("C23 or later, using bool directly");
#endif
    printf("sizeof (unsigned char) = %zu, sizeof (bool) = %zu\n",
           sizeof (unsigned char), sizeof (bool));

    const bool no = false;
    const bool yes = true;
    unsigned char uc;
    memcpy(&uc, &no, 1);
    printf("false is represented as %d\n", (int)uc);
    memcpy(&uc, &yes, 1);
    printf("true is represented as %d\n", (int)uc);

    for (int i = 0; i <= UCHAR_MAX; i ++) {
        const unsigned char uc = i;
        bool b;
        memcpy(&b, &uc, 1);
        const unsigned char value = b;
        printf("uc = 0x%02x b = 0x%02x b is %s, b%sfalse, b%strue\n",
               (unsigned)uc,
               value,
               b ? "truthy" : "falsy ",
               b == false ? "==" : "!=",
               b == true ? "==" : "!=");
    }
}
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
Tim Rentsch
2025-01-18 20:17:02 UTC
Reply
Permalink
Post by Keith Thompson
Post by m137
Hi Keith,
Thank you for posting this.
The message being referred to is one I posted Sun 2021-05-23, with
Google Groups at
<https://groups.google.com/g/comp.lang.c/c/4FUlV_XkmXg/m/OG8WeUCfAwAJ>.
As others have suggested, please include attribution information when
posting a followup. You don't need to quote the entire message,
but provide at least some context, particularly when the parent
message is old.
This is an update to that message.
Post by m137
I noticed that the newer drafts of C23
(N2912 onwards, I think) have replaced the term "trap representation"
- **Trap representation** was last defined in [N2731 3.19.4(1)]
(https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=)
as "an object representation that need not represent a value of the
object type."
- **Non-value representation** is most recently defined in
[N3435 3.26(1)]
(https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23)
as "an object representation that does not represent a value of the
object type."
The definition of non-value representation rules out object
representations that represent a value of the object type from
being non-value representations. So it seems to be stricter than
the definition of trap representation, which does not seem to rule
out such object representations from being trap representations.
Is this interpretation correct?
I don't believe so. As far as I can tell, a "non-value
representation" (C23 and later) is exactly the same thing as a
"trap representation" (C17 and earlier). The older term was
probably considered unclear, since it could imply that a trap is
required. In fact, reading an object with a trap/non-value
representation has undefined behavior, which can include yielding
the value you might have expected.
Post by m137
If so, what happens to the 254 trap representations that GCC and
Clang reserve for `_Bool`?
I see no evidence in gcc's documentation that gcc treats
representations other than 0 or 1 as trap/non-value representations.
I see only two references to "trap representation", one for signed
integer types (saying that there are no trap representations) and
one regarding type-punning via unions. There are no relevant
references to "padding bits".
I'm less familiar with clang's documentation, but I see no reference
to "trap representation" or "non-value representation".
We can get some information about this by running a test program.
See below.
Post by m137
Assuming a width of 1, each of those 254
object representations represents a value in `_Bool`'s domain (the
half whose value bit is 1 represents the value `true`, while the
other half whose value bit is 0 represents the value `false`), so
they cannot be thought of as non-value representations (since a
non-value representation must be an object representation that
**does not** represent a value of the object type).
Reading an object with a non-value representation has undefined
behavior. If the observed value happens to be a valid value of
the object's type, that's still consistent with undefined
behavior. *Everything* is consistent with undefined behavior.
Post by m137
I've been stuck on this for quite some time, so would be grateful
for any guidance you could provide.
Editions of the C standard earlier than C23 were not entirely
clear about the representation of _Bool. (C90 does not have _Bool
or bool. C99 through C17 have _Bool as a keyword, with bool as
a macro defined in <stdbool.h>. C23 has bool as a keyword, with
_Bool as an alternate spelling.)
In C99 and later, _Bool/bool is required to be an unsigned integer
type large enough to hold the values 0 and 1. Its size must be at
least CHAR_BIT bits (which is at least 8). The *rank* of _Bool is
less than the rank of all other standard integer types.
The rank implies that the range of values is a subset of the
range of values of any other unsigned integer type. The rank does
*not* imply anything about relative sizes. unsigned char has a
higher rank than bool, but bool could have additional padding bits
making sizeof(bool)>1. (Probably no implementation does this.)
unsigned char has no padding bits.
C11 implies that _Bool can have more than one value bit, which
means it could represent values greater than 1 (but no more than
0..UCHAR_MAX).
C23 (I'm using the N3096 draft) tightens the requirements, saying
that bool has exactly one value bit and (sizeof(bool)*CHAR_BIT)-1
padding bits -- again implying that sizeof(bool) might be greater
than 1, but forbidding values greater than 1.
Typically in C17 and earlier, and always in C23, _Bool/bool will
have exactly 1 value bit and CHAR_BIT-1 padding bits. Padding bits
do not contribute to the value of an object (so 0 and 1 are the
only possible values), but non-zero padding bits *may or may not*
create trap/non-value representations. (A gratuitously exotic
implementation might use a representation other than 00000001 for
true, but 00000000 is guaranteed to be a representation for 0/false.)
As far as I can tell, the standard is silent on whether a bool object
with non-zero padding bits is a trap/non-value representation or not.
There are no conditions other than the rules for how integer
types are represented. As long as those conditions are met an
implementation is free to make any set of object representations
be a trap representation (and I assume that hasn't changed for
C23, not counting the change that the width of _Bool must be
one under C23).
Post by Keith Thompson
I wrote a test program to explore how bool is treated. It uses
memcpy to set the representation of a bool object and then prints
the value of that object. Source is at the bottom of this message.
If bool has no non-value representations, then the values of the
CHAR_BIT-1 padding bits must be ignored when reading a bool object,
and the value of such an object is determined only by its single
value bit, 0 or 1. If it does have non-value representations,
then reading such an object has undefined behavior.
With gcc 14.2.0, with "-std=c23", all-zeros is treated as false
when used in a condition and all other representations are treated
as true. Converting the value of a bool object to another integer
type yields the value of its full 8-bit representation. If a bool
object holds a representation other than 00000000 or 00000001,
it compares equal to both `true` and `false`.
This implies that bool has 1 value bit and 7 padding bits (as
required by C23) and that it has 2 value representations and 254
trap representations. The observed behavior for the non-value
representations is the result of undefined behavior. (gcc -std=c23
sets __STDC_VERSION__ to 202000L, not 202311L. The documentation
acknowledges that support for C23 is experimental and incomplete.)
With clang 19.1.4, with "-std=c23", the behavior is consistent
with bool having no non-value representations. The 7 padding bits
do not contribute to the value of a bool object. Any bool object
with 0 as the low-order bit is treated as false in a condition and
yields 0 when converted to another integer type. Any bool object
with 1 as the low-order bit is treated as true, and yields 1 when
converted to another integer type. I presume the intent is for bool
to have 256 value representations and no non-value representations
(with the padding bits ignored as required), but it's also consistent
with bool having non-value representations and the observed behavior
being undefined. It's not possible to determine with a test program
whether the output is the result of undefined behavior or not.
As far as I can tell, the question of whether bool has non-value
representations is unspecified but not implementation-defined,
meaning that an implementation is not required to document its
choice.
6.2.6.1 paragraph 2 says objects other than bitfields are composed
of contiguous sequences of one or more bytes, the number, order,
and encoding of which are either explicitly specified or
implementation-defined. Which object representations are legal
values and which are non-value/trap representations should be
part of the encoding, and hence implementation defined.
Post by Keith Thompson
#include <stdio.h>
#include <string.h>
#include <limits.h>
#if __STDC_VERSION__ < 202311L
#include <stdbool.h>
#endif
int main() {
    printf("__STDC_VERSION__ = %ldL\n", __STDC_VERSION__);
#if __STDC_VERSION__ < 202311L
    puts("Older than C23, using <stdbool.h>");
#else
    puts("C23 or later, using bool directly");
#endif
    printf("sizeof (unsigned char) = %zu, sizeof (bool) = %zu\n",
           sizeof (unsigned char), sizeof (bool));
    const bool no = false;
    const bool yes = true;
    unsigned char uc;
    memcpy(&uc, &no, 1);
    printf("false is represented as %d\n", (int)uc);
    memcpy(&uc, &yes, 1);
    printf("true is represented as %d\n", (int)uc);
    for (int i = 0; i <= UCHAR_MAX; i ++) {
        const unsigned char uc = i;
        bool b;
        memcpy(&b, &uc, 1);
        const unsigned char value = b;
        printf("uc = 0x%02x b = 0x%02x b is %s, b%sfalse, b%strue\n",
               (unsigned)uc,
               value,
               b ? "truthy" : "falsy ",
               b == false ? "==" : "!=",
               b == true ? "==" : "!=");
    }
}
I was surprised to discover that running this program (as C11,
under gcc 8.4.0) with the last 'false' changed to 'no' and the
last 'true' changed to 'yes' gave a different result, namely,
except for value==0 and value==1 there were no "==" for the
b comparisons.
m137
2025-01-19 02:30:02 UTC
Reply
Permalink
Post by Keith Thompson
The message being referred to is one I posted Sun 2021-05-23, with
Google Groups at
<https://groups.google.com/g/comp.lang.c/c/4FUlV_XkmXg/m/OG8WeUCfAwAJ>.
As others have suggested, please include attribution information when
posting a followup. You don't need to quote the entire message,
but provide at least some context, particularly when the parent
message is old.
Hi Keith,

Sorry for the confusion, I am new to the platform and had not realised
that I needed to quote your post in my reply.
Post by Keith Thompson
Post by m137
The definition of non-value representation rules out object
representations that represent a value of the object type from being
non-value representations. So it seems to be stricter than the
definition of trap representation, which does not seem to rule out such
object representations from being trap representations. Is this
interpretation correct?
I don't believe so. As far as I can tell, a "non-value
representation" (C23 and later) is exactly the same thing as a "trap
representation" (C17 and earlier). The older term was probably
considered unclear, since it could imply that a trap is required.
In fact, reading an object with a trap/non-value representation
has undefined behavior, which can include yielding the value you
might have expected.
The reason I thought they were different was because the definition of
trap representation uses the phrase "need not", i.e. a trap
representation is an object representation that **need not** represent a
value of the object type. I read this as saying that a trap
representation could be an object representation that represents a value
of the object type, **or** it could be one that does not. This seemed
more permissive than the definition of non-value representation, which
uses the phrase "does not", i.e. a non-value representation is an object
representation that *does not* represent a value of the object type. I
took that as meaning that object representations that do represent a
value of the object type (such as those 254 representations of `_Bool`,
assuming a width of 1) are excluded from being classed as non-value
representations. But I understand now that that is not the case.
Post by Keith Thompson
Editions of the C standard earlier than C23 were not entirely
clear about the representation of _Bool.
Yes, confusingly, I could not find anything about the width of a `_Bool`
in C99, and C11 and C17 only talk about it in a footnote all the way
down in 6.7.2.1:

- C11 final draft, footnote 122:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf#page=131
- C17 final draft, footnote 124:
https://web.archive.org/web/20181230041359/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf#page=100
Post by Keith Thompson
Typically in C17 and earlier, and always in C23, _Bool/bool will
have exactly 1 value bit and CHAR_BIT-1 padding bits. Padding bits
do not contribute to the value of an object (so 0 and 1 are the
only possible values), but non-zero padding bits *may or may not*
create trap/non-value representations. (A gratuitously exotic
implementation might use a representation other than 00000001 for
true, but 00000000 is guaranteed to be a representation for 0/false.)
As far as I can tell, the standard is silent on whether a bool object
with non-zero padding bits is a trap/non-value representation or not.
I wrote a test program to explore how bool is treated. It uses
memcpy to set the representation of a bool object and then prints
the value of that object. Source is at the bottom of this message.
If bool has no non-value representations, then the values of the
CHAR_BIT-1 padding bits must be ignored when reading a bool object,
and the value of such an object is determined only by its single
value bit, 0 or 1. If it does have non-value representations,
then reading such an object has undefined behavior.
With gcc 14.2.0, with "-std=c23", all-zeros is treated as false
when used in a condition and all other representations are treated
as true. Converting the value of a bool object to another integer
type yields the value of its full 8-bit representation. If a bool
object holds a representation other than 00000000 or 00000001,
it compares equal to both `true` and `false`.
This implies that bool has 1 value bit and 7 padding bits (as
required by C23) and that it has 2 value representations and 254
trap representations. The observed behavior for the non-value
representations is the result of undefined behavior. (gcc -std=c23
sets __STDC_VERSION__ to 202000L, not 202311L. The documentation
acknowledges that support for C23 is experimental and incomplete.)
With clang 19.1.4, with "-std=c23", the behavior is consistent
with bool having no non-value representations. The 7 padding bits
do not contribute to the value of a bool object. Any bool object
with 0 as the low-order bit is treated as false in a condition and
yields 0 when converted to another integer type. Any bool object
with 1 as the low-order bit is treated as true, and yields 1 when
converted to another integer type. I presume the intent is for bool
to have 256 value representations and no non-value representations
(with the padding bits ignored as required), but it's also consistent
with bool having non-value representations and the observed behavior
being undefined. It's not possible to determine with a test program
whether the output is the result of undefined behavior or not.
Compiling the last snippet in this article:
https://www.trust-in-soft.com/resources/blogs/2016-06-16-trap-representations-and-padding-bits,
with Clang 19.1.0 and options "-std=c23 -O3 -pedantic" seems to show
that Clang does treat `_Bool` as having 2 value representations and 254
non-value representations (see here:
https://gcc.godbolt.org/z/4jK9d69P8).

Thank you so much for taking the time to provide such a thorough
analysis. It really clears things up for me.

--
Kenny McCormack
2025-01-19 09:31:18 UTC
Reply
Permalink
Post by m137
Post by Keith Thompson
The message being referred to is one I posted Sun 2021-05-23, with
Google Groups at
<https://groups.google.com/g/comp.lang.c/c/4FUlV_XkmXg/m/OG8WeUCfAwAJ>.
As others have suggested, please include attribution information when
posting a followup. You don't need to quote the entire message,
but provide at least some context, particularly when the parent
message is old.
Hi Keith,
Sorry for the confusion, I am new to the platform and had not realised
that I needed to quote your post in my reply.
You don't need to (in the sense that the world would end if you don't do
it), but it makes things easier for your readers if you do.

It is common in CLC to way overstate the case for various things. This
seems to be an instance of that.
--
Res ipsa loquitur.
m137
2025-01-21 00:16:40 UTC
Reply
Permalink
Post by Kenny McCormack
You don't need to (in the sense that the world would end if you don't do
it), but it makes things easier for your readers if you do.
Hi Kenny,

Thanks for letting me know. I like the quotes as they make it easier to
see which parts of a post are being addressed in a follow-up post.

--

m137
2025-01-19 02:08:54 UTC
Reply
Permalink
Post by m137
Hi Keith,
Thank you for posting this.
When, where? No attribution; referenced article is expired from this
Eternal September server, which has decently long retention times.
Hi Kaz,

Sorry for the confusion, I am new to the platform and had not realised
that I needed to quote Keith's post in my text.
Post by m137
I noticed that the newer drafts of C23
(N2912 onwards, I think) have replaced the term "trap representation"
That is correct. Probably because "trap representation" insinuates
that such a representation *must* produce a trap, or else the
implementation has no right to specify such a representation.
Implementations are not obliged to produce traps in relation to
non-value representations. Since the behaviors in question are
undefined, they may do so.
Thanks, I was wondering about this.
Post by m137
If so, what happens to the 254 trap representations that GCC and Clang
reserve for `_Bool`? Assuming a width of 1, each of those 254 object
GCC and Clang specify trap representations for _Bool? Where is this
found in their documentation?
It is not documented (see this thread for GCC:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88662). But I think it can
be inferred from the code snippets in Keith's OP and most recent post.
GCC seems to treat all object representations of `_Bool` other than
0b00000000 and 0b00000001 as trap/non-value representations.
I am not sure about Clang, but compiling the last snippet in this
article:
https://www.trust-in-soft.com/resources/blogs/2016-06-16-trap-representations-and-padding-bits with
Clang 19.1.0 and options "-std=c23 -O3 -pedantic" seems to show that
Clang treats `_Bool` as having 254 non-value representations (see here:
https://gcc.godbolt.org/z/4jK9d69P8).
In an integer type, it is indeed possible for the padding bits to be
nonzero, without changing the value given by the value bits.
However, how that works is not specified; it's up to an implementation,
and doesn't have to be documented.
An implementation could say that the padding bits don't mean anything;
they can have any value whatsoever and so the situation is as you
say: the bool representations with a 0 in the value bit are all false,
and those with a 1 are all true.
However, an implementation can also say that certain patterns of
bits are non-value representations.
One example given is the possibility of parity bits. Suppose some
integer type has one padding bit which behaves as a parity bit.  Then
suppose whenever that bit has incorrect parity, the representation is
deemed a non-value representation.
With regard to bool (say, one implemented in 8 bits), an implementation
can assert that if there is a nonzero value in any padding bit, the
result is a non-value representation. Then, only 0 and 1 are valid;
all other byte codes are non-value representations.
Implementations determine their own rules for how configurations of
padding bits may, on their own, or in interaction with configurations
of value bits, give rise to non-value representations.
Thank you, I really appreciate you taking the time to reply.

--
Keith Thompson
2025-01-19 02:28:26 UTC
Reply
Permalink
***@gmail.com (m137) writes:
[...]
Post by m137
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88662). But I think it can
be inferred from the code snippets in Keith's OP and most recent post.
GCC seems to treat all object representations of `_Bool` other than
0b00000000 and 0b00000001 as trap/non-value representations.
I am not sure about Clang, but compiling the last snippet in this
https://www.trust-in-soft.com/resources/blogs/2016-06-16-trap-representations-and-padding-bits with
Clang 19.1.0 and options "-std=c23 -O3 -pedantic" seems to show that
https://gcc.godbolt.org/z/4jK9d69P8).
Interesting.

Here's a program based on that snippet:

#include <stdio.h>
#include <string.h>

int f(bool *b) {
    if (*b)
        return 1;
    else
        return 0;
}

int main(void) {
    bool arg;
    unsigned char uc = 123;
    memcpy(&arg, &uc, 1);
    printf("%d\n", f(&arg));
}

With the latest gcc (14.2.0) and clang (19.1.4), it prints 1 when
compiled with "-std=c23 -O0 -pedantic", and 123 when compiled with
-O1, -O2, and -O3.

For gcc, my previous results already indicated that bool has 254
non-value representations. For clang, the results do seem to
indicate the same thing (though I might argue that it could also
be a bug, unless the clang developers actually intended bool to
have 254 non-value representations).
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */