how cast works?

Thiago Adams

2024-08-07 11:28:09 UTC

How does a cast work?
Does it change the memory?
For instance, from "unsigned int" to "signed char".
Is it just like discarding bytes, or something else?

For instance, is any 4-byte type cast to a 2-byte type just the lower 2
bytes?

[A][B][C][D]
->
[A][B]

I would also like to understand signed and unsigned better.
There is no such thing as a "signed" or "unsigned" register, right?
How about floating point?

The motivating problem:
I have a union with unsigned int, unsigned char, etc. (all types).
I need to execute a cast at run time (as in a script).
The problem is that this causes an explosion of combinations that I am
trying to avoid.

---------------------------------------------------------

enum type {
    TYPE_UNSIGNED_CHAR,
    TYPE_UNSIGNED_INT,
};

struct number {
    enum type type;
    union U {
        unsigned int unsigned_int_value;
        unsigned char unsigned_char_value;
    } data;
};

struct number cast_to(enum type t, const struct number* n) {
    struct number r = {0};
    r.type = t;
    r.data = n->data;
    //maybe fill with zeros the extra bytes?
    return r;
}

// All combinations...
struct number cast_to2(enum type t, const struct number* n)
{
    struct number r = {0};
    r.type = t; // the result carries the target type

    switch (t) // target type
    {
    case TYPE_UNSIGNED_CHAR:
        switch (n->type) // source type
        {
        case TYPE_UNSIGNED_CHAR:
            r.data.unsigned_char_value = n->data.unsigned_char_value;
            break;
        case TYPE_UNSIGNED_INT:
            r.data.unsigned_char_value = n->data.unsigned_int_value;
            break;
        }
        break;

    case TYPE_UNSIGNED_INT:
        switch (n->type) // source type
        {
        case TYPE_UNSIGNED_CHAR:
            r.data.unsigned_int_value = n->data.unsigned_char_value;
            break;

        case TYPE_UNSIGNED_INT:
            r.data.unsigned_int_value = n->data.unsigned_int_value;
            break;
        }
        break;
    }
    return r;
}
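One possible way to avoid the N x N nested switches above is to go through a single wide carrier type: one switch reads the source member into it, and a second switch narrows it into the target member. This is only a sketch under assumptions: the helper names number_to_ull and number_cast are hypothetical, and it covers only the two unsigned types from the post. Signed and floating-point members would presumably need their own carriers (for example long long and long double).

```c
#include <assert.h>

enum type { TYPE_UNSIGNED_CHAR, TYPE_UNSIGNED_INT };

struct number {
    enum type type;
    union U {
        unsigned int unsigned_int_value;
        unsigned char unsigned_char_value;
    } data;
};

/* unsigned long long can hold every unsigned int value, so nothing is lost here. */
static_assert(sizeof(unsigned long long) >= sizeof(unsigned int),
              "carrier type must be at least as wide as every source type");

/* Step 1 (hypothetical helper): read the active member into the carrier type. */
unsigned long long number_to_ull(const struct number *n)
{
    switch (n->type) {
    case TYPE_UNSIGNED_CHAR: return n->data.unsigned_char_value;
    case TYPE_UNSIGNED_INT:  return n->data.unsigned_int_value;
    }
    return 0; /* not reached for valid tags */
}

/* Step 2 (hypothetical helper): let the ordinary C conversion narrow the value. */
struct number number_cast(enum type t, const struct number *n)
{
    struct number r = {0};
    unsigned long long v = number_to_ull(n);

    r.type = t;
    switch (t) {
    case TYPE_UNSIGNED_CHAR: r.data.unsigned_char_value = (unsigned char)v; break;
    case TYPE_UNSIGNED_INT:  r.data.unsigned_int_value  = (unsigned int)v;  break;
    }
    return r;
}
```

With this shape, adding a type means adding one case to each switch (2N cases in total) rather than a new row and column of combinations.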

Thiago Adams

2024-08-07 11:33:38 UTC

I would also like to better understand why integer promotions were created.

My guess is that the values are used in registers and there is no
"char"-sized register, so the values are converted to a bigger type.
Is my guess correct?

Keith Thompson

2024-08-07 20:13:40 UTC

Post by Thiago Adams
I also would like to understand better why integer promotions were created.
My guess..it is because the values are used in registers and there is
no "char" size register so the values are converted in a bigger type.
Is my guess correct?

In the C abstract machine, there are no arithmetic operations on types
narrower than int and unsigned int. If x and y are of type char, x+y
has to promote both to int (or perhaps unsigned int on some weird
systems) before performing the addition, because there is no char+char
operation.

Some CPUs do have operations for narrower types, and a compiler can
generate them if the results are consistent with the standard-defined
semantics.

The C rules are motivated by the set of operations available on typical
hardware, but they've been set in stone for a long time.

And it's about available operations, not about registers.
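A small sketch of what the promotion rule means in practice. The function names are made up for illustration, and the comments assume a typical implementation where int is wider than char and CHAR_BIT is 8.

```c
#include <assert.h>

/* The sum of two chars is as wide as int, not as wide as char. */
static_assert(sizeof((char)1 + (char)1) == sizeof(int),
              "char + char is computed in a promoted type");

/* Returns 1 when x + y has type int, as it does where int is wider than char. */
int char_sum_is_int(void)
{
    char x = 1, y = 2;
    return _Generic(x + y, int: 1, default: 0);
}

/* Both operands are promoted to int first, so the addition does not wrap at 255. */
int uchar_sum(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Wrapping only happens when the int result is converted back to unsigned char. */
unsigned char uchar_sum_narrowed(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}
```

For example, uchar_sum(200, 100) yields 300, while uchar_sum_narrowed(200, 100) yields 44 once the result is converted back to an 8-bit unsigned char.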

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Tim Rentsch

2024-08-12 00:43:32 UTC

Post by Thiago Adams
I also would like to understand better why integer promotions were created.
My guess..it is because the values are used in registers and there is
no "char" size register so the values are converted in a bigger type.
Is my guess correct?

The motivations for integer promotion rules are purely historical.
You're looking for answers in the wrong places.

Also, it would be better for your understanding of C if you would
stop thinking about what is going on at the level of actual
hardware. Doing that serves to confuse a lot more than it helps.

Dan Purgert

2024-08-07 20:00:28 UTC

Post by Thiago Adams
How cast works?
Does it changes the memory?
For instance, from "unsigned int" to "signed char".
Is it just like discarding bytes or something else?
[...]

I don't know what happens when you're changing datatype lengths, but if
they're the same length, it's just telling the compiler what the
variable should be treated as (e.g. [8-bit] int to char)

Post by Thiago Adams
I also would like to understand better signed and unsigned.
There is no such think as "signed" or "unsigned" register, right?

"Signed" just means the first bit indicates negative.

So an "unsigned" 8 bit integer will have the 256 values ranging from

0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 = 0
(0b00000000)

TO

128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 255
(0b11111111)

Whereas a "signed" 8 bit integer will have the 256 values ranging from

(-128) + 0 + 0 + 0 + 0 + 0 + 0 + 0 = -128
(0b10000000)

TO

0 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 127
(0b01111111)

At least in two's complement (but that's the way it's done in C)
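The bit weights listed above can be written out as a short sketch. The function names are hypothetical; the code only evaluates an 8-bit pattern under the two readings and makes no claim about how any particular compiler or CPU stores values.

```c
/* Value of an 8-bit pattern read as unsigned: bit weights 1, 2, 4, ..., 128. */
int value_unsigned8(unsigned bits)
{
    int value = 0;
    for (int i = 0; i < 8; i++) {
        if (bits & (1u << i))
            value += 1 << i;
    }
    return value;
}

/* Same pattern read as two's complement: the top bit weighs -128 instead of +128. */
int value_twos_complement8(unsigned bits)
{
    int value = value_unsigned8(bits & 0x7Fu); /* low seven bits, weights 1..64 */
    if (bits & 0x80u)
        value -= 128;
    return value;
}
```

The pattern 0b11111111 reads as 255 under the first function and as -1 under the second; the bits are identical and only the weight of the top bit differs.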

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one from the IEEE in 1985
(uh, IEEE 754, I think?)

--
|_|O|_|
|_|_|O| Github: https://github.com/dpurgert
|O|O|O| PGP: DDAB 23FB 19FA 7D85 1CC1 E067 6D65 70E5 4CE7 2860

Keith Thompson

2024-08-07 20:26:12 UTC

Post by Dan Purgert

Post by Thiago Adams
How cast works?
Does it changes the memory?
For instance, from "unsigned int" to "signed char".
Is it just like discarding bytes or something else?
[...]

I don't know what happens when you're changing datatype lengths, but if
they're the same length, it's just telling the compiler what the
variable should be treated as (e.g. [8-bit] int to char)

Not necessarily. For example, for an implementation that uses sign and
magnitude or ones'-complement, converting a 32-bit signed value to a
32-bit unsigned type has a well defined result whose representation (bit
pattern) does not match the representation of the argument if it's
negative.

And of course the source and target types don't have to be the same
size. For integers, the conversion is defined in terms of values; it
might be implemented using truncation, zero-extension, sign-extension,
or something else.
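As an illustration of "defined in terms of values": converting to an unsigned type reduces the value modulo one more than the largest value the target can hold, whatever the source representation was. The helper to_uchar_by_rule below is a hypothetical hand-written version of that rule, included only for comparison with the built-in conversion.

```c
#include <assert.h>
#include <limits.h>

/* The results are fixed by the value rule, not by the bit pattern of the source. */
static_assert((unsigned int)-1 == UINT_MAX, "-1 converts to the largest unsigned int");
static_assert((unsigned char)-1 == UCHAR_MAX, "the same rule applies to a narrower target");

/* Hypothetical helper: apply the modulo rule by hand for unsigned char. */
unsigned char to_uchar_by_rule(long long v)
{
    const long long modulus = (long long)UCHAR_MAX + 1; /* 256 when CHAR_BIT is 8 */
    long long r = v % modulus; /* in C, % can yield a negative remainder */
    if (r < 0)
        r += modulus;
    return (unsigned char)r;
}
```

On a two's complement machine the compiler can implement this rule by truncation, which is why it often looks like "just dropping bytes", but the rule itself never mentions bytes.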

C23, not yet published, requires two's complement but still allows for
padding bits.

Post by Dan Purgert

Post by Thiago Adams
I also would like to understand better signed and unsigned.
There is no such think as "signed" or "unsigned" register, right?

"Signed" just means the first bit indicates negative.

It's more complicated than that.

[...]

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one from the IEEE on 1985
(uh, IEEE754, I think?)

Yes, but floating-point conversions (between different floating-point
types or between floating-point and integer) are defined in terms of
values. (double)1 == 1.0, regardless of how that 1.0 is represented.
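A short sketch of the same point for floating-point conversions. The function names are illustrative only.

```c
#include <assert.h>

/* A floating constant cast directly to an integer type is a constant expression. */
static_assert((int)3.9 == 3, "conversion to int discards the fractional part");

/* Integer to floating: the value is preserved whenever it is representable. */
double int_to_double(int i)
{
    return (double)i;
}

/* Floating to integer: truncation toward zero. The behaviour is undefined
   if the truncated value does not fit in int. */
int double_to_int(double d)
{
    return (int)d;
}
```

Neither function depends on how double is encoded; int_to_double(1) compares equal to 1.0 on any conforming implementation.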

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Lawrence D'Oliveiro

2024-08-07 23:00:17 UTC

Floating point is a huge mess ...

That mess has gone away with the essentially universal adoption of
IEEE754. There were a few hardware stragglers back in the 1990s, but even
they have come around by now.

Not sure if Java has caught up yet, though ...

Thiago Adams

2024-08-08 11:14:25 UTC

Post by Dan Purgert

Post by Thiago Adams
How cast works?
Does it changes the memory?
For instance, from "unsigned int" to "signed char".
Is it just like discarding bytes or something else?
[...]

I don't know what happens when you're changing datatype lengths, but if
they're the same length, it's just telling the compiler what the
variable should be treated as (e.g. [8-bit] int to char)

Post by Thiago Adams
I also would like to understand better signed and unsigned.
There is no such think as "signed" or "unsigned" register, right?

"Signed" just means the first bit indicates negative.
So an "unsigned" 8 bit integer will have the 256 values ranging from
0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 = 0
(0b00000000)
TO
128 + 64 + 32 + 16 + 8 +4 + 2 + 1 = 255
(0b11111111)
Whereas a "signed" 8 bit integer will have the 256 values ranging from
(-128) + 0 + 0 + 0 + 0 + 0 + 0 + 0 = -128
(0b10000000)
TO
0 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 127
(0b01111111)
At least in two's compliment (but that's the way it's done in C)

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one from the IEEE on 1985
(uh, IEEE754, I think?)

I didn't specify it properly, but my question was more about floating
point registers. I think in this case they have specialized registers.

Bart

2024-08-08 13:23:44 UTC

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How cast works?
Does it changes the memory?
For instance, from "unsigned int" to "signed char".
Is it just like discarding bytes or something else?
[...]

I don't know what happens when you're changing datatype lengths, but if
they're the same length, it's just telling the compiler what the
variable should be treated as (e.g. [8-bit] int to char)

Post by Thiago Adams
I also would like to understand better signed and unsigned.
There is no such think as "signed" or "unsigned" register, right?

"Signed" just means the first bit indicates negative.
So an "unsigned" 8 bit integer will have the 256 values ranging from
0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 = 0
       (0b00000000)
               TO
128 + 64 + 32 + 16 + 8 +4 + 2 + 1 = 255
       (0b11111111)
Whereas a "signed" 8 bit integer will have the 256 values ranging from
   (-128) + 0 + 0 + 0 + 0 + 0 + 0 + 0 = -128
       (0b10000000)
               TO
   0 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 127
       (0b01111111)
At least in two's compliment (but that's the way it's done in C)

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one from the IEEE on 1985
(uh, IEEE754, I think?)

I didn't specify properly , but my question was more about floating
point registers. I think in this case they have specialized registers.

Try godbolt.org. Type in a fragment of code that does different kinds of
casts (it needs to be well-formed, so inside a function), and see what
code is produced with different C compilers.

Use -O0 so that the code isn't optimised out of existence, and so that
you can more easily match it to the C source.

Michael S

2024-08-08 16:32:03 UTC

On Thu, 8 Aug 2024 14:23:44 +0100

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How cast works?
Does it changes the memory?
For instance, from "unsigned int" to "signed char".
Is it just like discarding bytes or something else?
[...]

I don't know what happens when you're changing datatype lengths,
but if they're the same length, it's just telling the compiler
what the variable should be treated as (e.g. [8-bit] int to char)

Post by Thiago Adams
I also would like to understand better signed and unsigned.
There is no such think as "signed" or "unsigned" register, right?

"Signed" just means the first bit indicates negative.
So an "unsigned" 8 bit integer will have the 256 values ranging from
0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 = 0
       (0b00000000)
               TO
128 + 64 + 32 + 16 + 8 +4 + 2 + 1 = 255
       (0b11111111)
Whereas a "signed" 8 bit integer will have the 256 values ranging from
   (-128) + 0 + 0 + 0 + 0 + 0 + 0 + 0 = -128
       (0b10000000)
               TO
   0 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 127
       (0b01111111)
At least in two's compliment (but that's the way it's done in C)

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for
encoding; though I think most C implementations use the one from
the IEEE on 1985 (uh, IEEE754, I think?)

I didn't specify properly , but my question was more about floating
point registers. I think in this case they have specialized
registers.

Try godbolt.org. Type in a fragment of code that does different kinds
of casts (it needs to be well-formed, so inside a function), and see
what code is produced with different C compilers.
Use -O0 so that the code isn't optimised out of existence, and so
that you can more easily match it to the C source.

I'd recommend the opposite: use -O2 so that a cast that does nothing is
optimized away.

int foo_i2i(int x) { return (int)x; }
int foo_u2i(unsigned x) { return (int)x; }
int foo_b2i(_Bool x) { return (int)x; }
int foo_d2i(double x) { return (int)x; }

etc
https://godbolt.org/z/GWjbcG4GT

Thiago Adams

2024-08-08 17:11:36 UTC

Post by Michael S
On Thu, 8 Aug 2024 14:23:44 +0100

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How cast works?
Does it changes the memory?
For instance, from "unsigned int" to "signed char".
Is it just like discarding bytes or something else?
[...]

I don't know what happens when you're changing datatype lengths,
but if they're the same length, it's just telling the compiler
what the variable should be treated as (e.g. [8-bit] int to char)

Post by Thiago Adams
I also would like to understand better signed and unsigned.
There is no such think as "signed" or "unsigned" register, right?

"Signed" just means the first bit indicates negative.
So an "unsigned" 8 bit integer will have the 256 values ranging from
0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 = 0
       (0b00000000)
               TO
128 + 64 + 32 + 16 + 8 +4 + 2 + 1 = 255
       (0b11111111)
Whereas a "signed" 8 bit integer will have the 256 values ranging from
   (-128) + 0 + 0 + 0 + 0 + 0 + 0 + 0 = -128
       (0b10000000)
               TO
   0 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 127
       (0b01111111)
At least in two's compliment (but that's the way it's done in C)

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for
encoding; though I think most C implementations use the one from
the IEEE on 1985 (uh, IEEE754, I think?)

I didn't specify properly , but my question was more about floating
point registers. I think in this case they have specialized
registers.

Try godbolt.org. Type in a fragment of code that does different kinds
of casts (it needs to be well-formed, so inside a function), and see
what code is produced with different C compilers.
Use -O0 so that the code isn't optimised out of existence, and so
that you can more easily match it to the C source.

I'd recommend an opposite - use -O2 so the cast that does nothing
optimized away.
int foo_i2i(int x) { return (int)x; }
int foo_u2i(unsigned x) { return (int)x; }
int foo_b2i(_Bool x) { return (int)x; }
int foo_d2i(double x) { return (int)x; }
etc
https://godbolt.org/z/GWjbcG4GT

To see what the expected behavior is, I am doing, for instance:

static_assert((unsigned char)1234 == 210);
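To spell out why the result is 210: 1234 is 0x4D2, and keeping only the low 8 bits leaves 0xD2, which is 210. For an unsigned source, the cast and an explicit mask give the same answer. This is a sketch that assumes CHAR_BIT is 8 for the constant check; the function names are made up.

```c
#include <assert.h>
#include <limits.h>

/* 1234 == 4 * 256 + 210, so the low 8 bits of 1234 are 210. */
static_assert((1234 & 0xFF) == 210, "low 8 bits of 1234");

/* Narrowing by conversion. */
unsigned char narrow_by_cast(unsigned int x)
{
    return (unsigned char)x;
}

/* Narrowing by masking off everything above the width of unsigned char. */
unsigned char narrow_by_mask(unsigned int x)
{
    return (unsigned char)(x & UCHAR_MAX);
}
```

The two functions agree for every unsigned input. This is about the low-order bits of the value, not about which bytes come first in memory, so it holds regardless of endianness.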

Bart

2024-08-08 17:29:40 UTC

Post by Michael S
On Thu, 8 Aug 2024 14:23:44 +0100

Post by Bart
Try godbolt.org. Type in a fragment of code that does different kinds
of casts (it needs to be well-formed, so inside a function), and see
what code is produced with different C compilers.
Use -O0 so that the code isn't optimised out of existence, and so
that you can more easily match it to the C source.

I'd recommend an opposite - use -O2 so the cast that does nothing
optimized away.
int foo_i2i(int x) { return (int)x; }
int foo_u2i(unsigned x) { return (int)x; }
int foo_b2i(_Bool x) { return (int)x; }
int foo_d2i(double x) { return (int)x; }

The OP is curious as to what's involved when a conversion is done.
Hiding or eliminating code isn't helpful in that case; the results can
also be misleading:

Take this example:

void fred(void) {
_Bool b;
int i;
i=b;
}

Unoptimised, it generates this code:

push rbp
mov rbp, rsp

mov al, byte ptr [rbp - 1]
and al, 1
movzx eax, al
mov dword ptr [rbp - 8], eax

pop rbp
ret

You can see from this that a Bool occupies one byte; it is masked to 0/1
(so it doesn't trust it to contain only 0/1), then it is widened to an
int size.

With optimisation turned on, even at -O1, it produces this:

ret

That strikes me as rather less enlightening!

Meanwhile your foo_b2i function contains this optimised code:

mov eax, edi
ret

The masking and widening is not present. Presumably, it is taking
advantage of the fact that a _Bool argument will be converted and
widened to `int` at the callsite even though the parameter type is also
_Bool. So the conversion has already been done.

You will see this if you also write a call to foo_b2i() and look at the
/non-elided/ code.

The unoptimised code for foo_b2i is pretty awful (like masking twice,
with a pointless write to memory between them). But sometimes with gcc
there is no sensible middle ground between terrible code, and having
most of it eliminated.

The unoptimised code from my C compiler for foo_b2i, excluding
entry/exit code, is:

movsx eax, byte [rbp + foo_b2i.x]

My compiler assumes that a _Bool type already contains 0 or 1.

Thiago Adams

2024-08-08 17:50:01 UTC

Post by Michael S
On Thu, 8 Aug 2024 14:23:44 +0100

Post by Bart
Try godbolt.org. Type in a fragment of code that does different kinds
of casts (it needs to be well-formed, so inside a function), and see
what code is produced with different C compilers.
Use -O0 so that the code isn't optimised out of existence, and so
that you can more easily match it to the C source.

I'd recommend an opposite - use -O2 so the cast that does nothing
optimized away.
int foo_i2i(int x) { return (int)x; }
int foo_u2i(unsigned x) { return (int)x; }
int foo_b2i(_Bool x) { return (int)x; }
int foo_d2i(double x) { return (int)x; }

The OP is curious as to what's involved when a conversion is done.
Hiding or eliminating code isn't helpful in that case; the results can
void fred(void) {
   _Bool b;
     int i;
     i=b;
}
        push    rbp
        mov     rbp, rsp
        mov     al, byte ptr [rbp - 1]
        and     al, 1
        movzx   eax, al
        mov     dword ptr [rbp - 8], eax
        pop     rbp
        ret
You can see from this that a Bool occupies one byte; it is masked to 0/1
(so it doesn't trust it to contain only 0/1), then it is widened to an
int size.
        ret
That strikes me as rather less enlightening!
        mov     eax, edi
        ret
The masking and widening is not present. Presumably, it is taking
advantage of the fact that a _Bool argument will be converted and
widened to `int` at the callsite even though the parameter type is also
_Bool. So the conversion has already been done.
You will see this if writing also a call to foo_b2i() and looking at the
/non-elided/ code.
The unoptimised code for foo_b2i is pretty awful (like masking twice,
with a pointless write to memory between them). But sometimes with gcc
there is no sensible middle ground between terrible code, and having
most of it eliminated.
The unoptimised code from my C compiler for foo_b2i, excluding
    movsx   eax, byte [rbp + foo_b2i.x]
My compiler assumes that a _Bool type already contains 0 or 1.

If you are evaluating constant expressions in your compiler, then you have
the same problem (casts) that I am solving in cake.

For instance,
static_assert((unsigned char)1234 == 210);

already works in cake. I had to simulate this cast.

Previously, I was doing all computations for constant expressions with
bigger types. Then I realized that compile time must work like run time.

For constexpr, the compiler does not accept initialization with invalid
types, for instance:

constexpr char s = 12345;

<source>:6:21: error: constexpr initializer evaluates to 12345 which is
not exactly representable in type 'const char'
6 | constexpr char s = 12345;

I am also checking for wraparound and overflow in constant expressions.
I emit a warning when the computed value differs from the mathematical value.

Thiago Adams

2024-08-08 17:57:48 UTC

Post by Thiago Adams

> On Thu, 8 Aug 2024 14:23:44 +0100
>> Try godbolt.org. Type in a fragment of code that does different kinds
>> of casts (it needs to be well-formed, so inside a function), and see
>> what code is produced with different C compilers.
>>
>> Use -O0 so that the code isn't optimised out of existence, and so
>> that you can more easily match it to the C source.
>>
>>
>
>
> I'd recommend an opposite - use -O2 so the cast that does nothing
> optimized away.
>
> int foo_i2i(int x) { return (int)x; }
> int foo_u2i(unsigned x) { return (int)x; }
> int foo_b2i(_Bool x) { return (int)x; }
> int foo_d2i(double x) { return (int)x; }
The OP is curious as to what's involved when a conversion is done.
Hiding or eliminating code isn't helpful in that case; the results can
   void fred(void) {
    _Bool b;
      int i;
      i=b;
   }
         push    rbp
         mov     rbp, rsp
         mov     al, byte ptr [rbp - 1]
         and     al, 1
         movzx   eax, al
         mov     dword ptr [rbp - 8], eax
         pop     rbp
         ret
You can see from this that a Bool occupies one byte; it is masked to
0/1 (so it doesn't trust it to contain only 0/1), then it is widened
to an int size.
         ret
That strikes me as rather less enlightening!
         mov     eax, edi
         ret
The masking and widening is not present. Presumably, it is taking
advantage of the fact that a _Bool argument will be converted and
widened to `int` at the callsite even though the parameter type is
also _Bool. So the conversion has already been done.
You will see this if writing also a call to foo_b2i() and looking at
the /non-elided/ code.
The unoptimised code for foo_b2i is pretty awful (like masking twice,
with a pointless write to memory between them). But sometimes with gcc
there is no sensible middle ground between terrible code, and having
most of it eliminated.
The unoptimised code from my C compiler for foo_b2i, excluding
     movsx   eax, byte [rbp + foo_b2i.x]
My compiler assumes that a _Bool type already contains 0 or 1.

If you are doing constant expression in your compiler, then you have the
same problem (casts) I am solving in cake.
For instance
static_assert((unsigned char)1234 == 210);
is already working in my cake. I had to simulate this cast.
Previously, I was doing all computations with bigger types for constant
expressions. Then I realize compile time must work as the runtime.
For constexpr the compiler does not accept initialization invalid types.
for instance.
constexpr char s = 12345;
<source>:6:21: error: constexpr initializer evaluates to 12345 which is
not exactly representable in type 'const char'
6 | constexpr char s = 12345;
I am also checking all wraparound and overflow in constant expressions.
I have a warning when the computed value is different from the math value.

Cake also transpiles C99 to C89.
Currently I am changing _Bool to unsigned char, but the values are not
converted.
For instance:

//C99
_Bool b = 123;

//C89
unsigned char b = !!(123);

The !! part is missing from the current implementation.
I am thinking about how it could be done.
I think this also explains why bool was not in the first versions of C.
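A minimal sketch of why the !! matters when _Bool is lowered to unsigned char. The function names are hypothetical and this is not taken from cake's source; the 256 example assumes CHAR_BIT is 8.

```c
/* C99 rule: conversion to _Bool yields 0 if the value compares equal to 0, else 1. */
int as_c99_bool(int v)
{
    return (_Bool)v;
}

/* C89 emulation with unsigned char storage: normalise with !! before storing. */
unsigned char as_c89_bool(int v)
{
    return (unsigned char)!!v;
}

/* A plain narrowing conversion is not equivalent: it keeps the low bits of the
   value, so 123 stays 123 and 256 becomes 0 when CHAR_BIT is 8. */
unsigned char plain_narrowing(int v)
{
    return (unsigned char)v;
}
```

For the value 256, as_c99_bool and as_c89_bool both give 1, while plain_narrowing gives 0, so dropping the !! would change which values count as true.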

Bart

2024-08-08 18:01:56 UTC

Post by Thiago Adams

> On Thu, 8 Aug 2024 14:23:44 +0100
>> Try godbolt.org. Type in a fragment of code that does different kinds
>> of casts (it needs to be well-formed, so inside a function), and see
>> what code is produced with different C compilers.
>>
>> Use -O0 so that the code isn't optimised out of existence, and so
>> that you can more easily match it to the C source.
>>
>>
>
>
> I'd recommend an opposite - use -O2 so the cast that does nothing
> optimized away.
>
> int foo_i2i(int x) { return (int)x; }
> int foo_u2i(unsigned x) { return (int)x; }
> int foo_b2i(_Bool x) { return (int)x; }
> int foo_d2i(double x) { return (int)x; }
The OP is curious as to what's involved when a conversion is done.
Hiding or eliminating code isn't helpful in that case; the results can
   void fred(void) {
    _Bool b;
      int i;
      i=b;
   }
         push    rbp
         mov     rbp, rsp
         mov     al, byte ptr [rbp - 1]
         and     al, 1
         movzx   eax, al
         mov     dword ptr [rbp - 8], eax
         pop     rbp
         ret
You can see from this that a Bool occupies one byte; it is masked to
0/1 (so it doesn't trust it to contain only 0/1), then it is widened
to an int size.
         ret
That strikes me as rather less enlightening!
         mov     eax, edi
         ret
The masking and widening is not present. Presumably, it is taking
advantage of the fact that a _Bool argument will be converted and
widened to `int` at the callsite even though the parameter type is
also _Bool. So the conversion has already been done.
You will see this if writing also a call to foo_b2i() and looking at
the /non-elided/ code.
The unoptimised code for foo_b2i is pretty awful (like masking twice,
with a pointless write to memory between them). But sometimes with gcc
there is no sensible middle ground between terrible code, and having
most of it eliminated.
The unoptimised code from my C compiler for foo_b2i, excluding
     movsx   eax, byte [rbp + foo_b2i.x]
My compiler assumes that a _Bool type already contains 0 or 1.

If you are doing constant expression in your compiler, then you have the
same problem (casts) I am solving in cake.
For instance
static_assert((unsigned char)1234 == 210);
is already working in my cake. I had to simulate this cast.
Previously, I was doing all computations with bigger types for constant
expressions. Then I realize compile time must work as the runtime.
For constexpr the compiler does not accept initialization invalid types.
for instance.
constexpr char s = 12345;
<source>:6:21: error: constexpr initializer evaluates to 12345 which is
not exactly representable in type 'const char'
6 | constexpr char s = 12345;
I am also checking all wraparound and overflow in constant expressions.
I have a warning when the computed value is different from the math value.

In my C compiler I have no constexpr (don't know why you got that
impression). And I don't check whether initialisers for integer types
overflow their destination.

This is because within the language in general:

char c; int i;

c = i;

Such an assignment is not checked at runtime (and I don't know if this
can be warned against, or if a runtime check can be added).

This is just how C works: too-large values are silently truncated (there
are worse aspects of the language, like being able to do `int i;
(&i)[12345];`).

But you are presumably superimposing a new, stricter language on top. In
that case, if my `c = i` assignment was not allowed, how do I get around
that; by a cast?

Thiago Adams

2024-08-08 18:13:06 UTC

Post by Thiago Adams

> On Thu, 8 Aug 2024 14:23:44 +0100
>> Try godbolt.org. Type in a fragment of code that does different kinds
>> of casts (it needs to be well-formed, so inside a function), and see
>> what code is produced with different C compilers.
>>
>> Use -O0 so that the code isn't optimised out of existence, and so
>> that you can more easily match it to the C source.
>>
>>
>
>
> I'd recommend an opposite - use -O2 so the cast that does nothing
> optimized away.
>
> int foo_i2i(int x) { return (int)x; }
> int foo_u2i(unsigned x) { return (int)x; }
> int foo_b2i(_Bool x) { return (int)x; }
> int foo_d2i(double x) { return (int)x; }
The OP is curious as to what's involved when a conversion is done.
Hiding or eliminating code isn't helpful in that case; the results
   void fred(void) {
    _Bool b;
      int i;
      i=b;
   }
         push    rbp
         mov     rbp, rsp
         mov     al, byte ptr [rbp - 1]
         and     al, 1
         movzx   eax, al
         mov     dword ptr [rbp - 8], eax
         pop     rbp
         ret
You can see from this that a Bool occupies one byte; it is masked to
0/1 (so it doesn't trust it to contain only 0/1), then it is widened
to an int size.
         ret
That strikes me as rather less enlightening!
         mov     eax, edi
         ret
The masking and widening is not present. Presumably, it is taking
advantage of the fact that a _Bool argument will be converted and
widened to `int` at the callsite even though the parameter type is
also _Bool. So the conversion has already been done.
You will see this if writing also a call to foo_b2i() and looking at
the /non-elided/ code.
The unoptimised code for foo_b2i is pretty awful (like masking twice,
with a pointless write to memory between them). But sometimes with
gcc there is no sensible middle ground between terrible code, and
having most of it eliminated.
The unoptimised code from my C compiler for foo_b2i, excluding
     movsx   eax, byte [rbp + foo_b2i.x]
My compiler assumes that a _Bool type already contains 0 or 1.

If you are doing constant expression in your compiler, then you have
the same problem (casts) I am solving in cake.
For instance
static_assert((unsigned char)1234 == 210);
is already working in my cake. I had to simulate this cast.
Previously, I was doing all computations with bigger types for
constant expressions. Then I realize compile time must work as the
runtime.
For constexpr the compiler does not accept initialization invalid types.
for instance.
constexpr char s = 12345;
<source>:6:21: error: constexpr initializer evaluates to 12345 which
is not exactly representable in type 'const char'
6 | constexpr char s = 12345;
I am also checking all wraparound and overflow in constant expressions.
I have a warning when the computed value is different from the math value.

In my C compiler I have no constexpr (don't know why you got that
impression). And I don't check that initialisers for integer types
overflow their destination.

It does not need to be "constexpr", just constant expressions such as enum
values or the case labels of a switch. (Casts at compile time are not new
in C23.)

Post by Bart
char c; int i;
c = i;
Such an assignment is not checked at runtime (and I don't know if this
can be warned against, or if a runtime check can be added).
This is just how C works: too-large values are silently truncated (there
are worse aspects of the language, like being able to do `int i;
(&i)[12345];`).

But this is something well defined, not UB (I guess).

Post by Bart
But you are presumably superimposing a new stricter language on top. In
the case, if my `c = i` assignment was not allowed, how do I get around
that; by a cast?

I am just following the C standard, adding some extra warnings for wraparound.

Keith Thompson

2024-08-08 19:29:02 UTC

Thiago Adams <***@gmail.com> writes:
[...]

Post by Thiago Adams
For constexpr the compiler does not accept initialization invalid types.
for instance.
constexpr char s = 12345;
<source>:6:21: error: constexpr initializer evaluates to 12345 which
is not exactly representable in type 'const char'
6 | constexpr char s = 12345;

The problem there is the value, not the type. This:

constexpr char s = 10;

is perfectly valid. The initializer is of type int, not char, but it's
implicitly converted.

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

David Brown

2024-08-08 17:58:34 UTC

Post by Michael S
On Thu, 8 Aug 2024 14:23:44 +0100

Post by Bart
Try godbolt.org. Type in a fragment of code that does different kinds
of casts (it needs to be well-formed, so inside a function), and see
what code is produced with different C compilers.
Use -O0 so that the code isn't optimised out of existence, and so
that you can more easily match it to the C source.

I'd recommend an opposite - use -O2 so the cast that does nothing
optimized away.
int foo_i2i(int x) { return (int)x; }
int foo_u2i(unsigned x) { return (int)x; }
int foo_b2i(_Bool x) { return (int)x; }
int foo_d2i(double x) { return (int)x; }

The OP is curious as to what's involved when a conversion is done.
Hiding or eliminating code isn't helpful in that case; the results can

Michael is correct - the OP should enable optimisation, precisely to
avoid the issue you are concerned about. Without optimisation, the
results are misleading - they will only show things that are /not/
involved in the conversion, swamping the useful results with code that
messes about putting data on and off the stack. When optimised
compilation shows that no code is generated, it is a very clear
indication that no operations are needed for the conversions in question
- unoptimized code hides that.

Post by Bart
void fred(void) {
   _Bool b;
     int i;
     i=b;
}
        push    rbp
        mov     rbp, rsp
        mov     al, byte ptr [rbp - 1]
        and     al, 1
        movzx   eax, al
        mov     dword ptr [rbp - 8], eax
        pop     rbp
        ret
You can see from this that a Bool occupies one byte; it is masked to 0/1
(so it doesn't trust it to contain only 0/1), then it is widened to an
int size.

No, you can't see that. All you can see is garbage in, garbage out.
You have to start with a function that has some meaning!

Post by Bart
ret

Try again with:

int foo(bool x) { return x; }

bool bar(int x) { return x; }

Try it with -O0 and -O1, and then tell us which you think gives a
clearer indication of the operations needed.
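
Independently of what any compiler emits, the two functions above have fixed semantics under the C standard. A small sketch of those semantics follows; `low_byte` is a hypothetical helper added only for contrast, and its result for 256 assumes CHAR_BIT is 8.

```c
#include <stdbool.h>

/* bool -> int: the result is exactly 0 or 1. */
int foo(bool x) { return x; }

/* int -> bool: any nonzero value becomes true (1). This is a comparison
   against zero, not a truncation to the low byte. */
bool bar(int x) { return x; }

/* Contrast: conversion to unsigned char reduces the value modulo
   UCHAR_MAX + 1, so 256 becomes 0 when CHAR_BIT is 8. */
unsigned char low_byte(int x) { return (unsigned char)x; }
```

So `bar(256)` yields 1 while `low_byte(256)` yields 0: converting to bool is not "keep the lower bytes", which is one of the cases the original question asked about.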

Bart

2024-08-08 19:09:56 UTC

Post by David Brown

> On Thu, 8 Aug 2024 14:23:44 +0100
>> Try godbolt.org. Type in a fragment of code that does different kinds
>> of casts (it needs to be well-formed, so inside a function), and see
>> what code is produced with different C compilers.
>>
>> Use -O0 so that the code isn't optimised out of existence, and so
>> that you can more easily match it to the C source.
>>
>>
>
>
> I'd recommend the opposite - use -O2 so that a cast that does nothing is
> optimized away.
>
> int foo_i2i(int x) { return (int)x; }
> int foo_u2i(unsigned x) { return (int)x; }
> int foo_b2i(_Bool x) { return (int)x; }
> int foo_d2i(double x) { return (int)x; }
The OP is curious as to what's involved when a conversion is done.
Hiding or eliminating code isn't helpful in that case; the results can

Michael is correct - the OP should enable optimisation, precisely to
avoid the issue you are concerned about. Without optimisation, the
results are misleading - they will only show things that are /not/
involved in the conversion, swamping the useful results with code that
messes about putting data on and off the stack. When optimised
compilation shows that no code is generated, it is a very clear
indication that no operations are needed for the conversions in question
- unoptimized code hides that.

   void fred(void) {
    _Bool b;
      int i;
      i=b;
   }
         push    rbp
         mov     rbp, rsp
         mov     al, byte ptr [rbp - 1]
         and     al, 1
         movzx   eax, al
         mov     dword ptr [rbp - 8], eax
         pop     rbp
         ret
You can see from this that a Bool occupies one byte; it is masked to
0/1 (so it doesn't trust it to contain only 0/1), then it is widened
to an int size.

No, you can't see that. All you can see is garbage in, garbage out. You
have to start with a function that has some meaning!

Sorry but my function is perfectly valid. It's taking a Bool value and
converting it to an int.

Perhaps you don't understand x86 code? I'll tell you: it loads that
/byte-sized/ value, masks it, and widens it to an int. I'm surprised you
can't see that.

But I suspect a long gaslighting session coming on, where you refute the
evidence that everyone else can see!

Post by David Brown

ret

int foo(bool x) { return x; }
bool bar(int x) { return x; }
Try it with -O0 and -O1, and then tell us which you think gives a
clearer indication of the operations needed.

Michael is wrong and so are you.

If you want to know what casting from bool to int entails, then testing
it via a function call like this is the wrong way to do it, since half
of it depends on what happens when evaluating arguments at the call site.

Especially if you let the compiler do what it likes, like using its
knowledge of that call process, which is not displayed here in the
optimised code of the function body.

So I have some questions of you:

* How exactly is a _Bool value (which occupies one byte) translated to a
32-bit signed integer? What is involved?

This is machine independent other than the sizes mentioned.

Given your answer, how does it correlate with either:

mov eax, edi ; from your test; both optimised code

<nothing> ; from my test

The advantage of unoptimised code is that it will contain everything
that is normally involved; it doesn't throw anything away.

It doesn't require convincing the compiler that you're doing something
useful to avoid it eliminating most or all your code, or turning it
into something that is just plain misleading.

That might be useful when compiling a huge production version of an app,
but it is useless when trying to shed light on an isolated fragment of code.

Look, just forget it, I'm not in the mood for another marathon subthread.

So, what's involved in turning Bool to int? According to your examples
with -O1: nothing. You just copy 32 bits unchanged from one to the
other. Mildly surprising, but you are of course right, right?

However, now *I* have a problem, figuring out why on earth a C compiler
does the conversion like this:

movsx eax, byte [source]

Because this must be wrong, right?

David Brown

2024-08-08 22:32:00 UTC

Post by David Brown

> On Thu, 8 Aug 2024 14:23:44 +0100
>> Try godbolt.org. Type in a fragment of code that does different kinds
>> of casts (it needs to be well-formed, so inside a function), and see
>> what code is produced with different C compilers.
>>
>> Use -O0 so that the code isn't optimised out of existence, and so
>> that you can more easily match it to the C source.
>>
>>
>
>
> I'd recommend the opposite - use -O2 so that a cast that does nothing is
> optimized away.
>
> int foo_i2i(int x) { return (int)x; }
> int foo_u2i(unsigned x) { return (int)x; }
> int foo_b2i(_Bool x) { return (int)x; }
> int foo_d2i(double x) { return (int)x; }
The OP is curious as to what's involved when a conversion is done.
Hiding or eliminating code isn't helpful in that case; the results

Michael is correct - the OP should enable optimisation, precisely to
avoid the issue you are concerned about. Without optimisation, the
results are misleading - they will only show things that are /not/
involved in the conversion, swamping the useful results with code that
messes about putting data on and off the stack. When optimised
compilation shows that no code is generated, it is a very clear
indication that no operations are needed for the conversions in
question - unoptimized code hides that.

   void fred(void) {
    _Bool b;
      int i;
      i=b;
   }
         push    rbp
         mov     rbp, rsp
         mov     al, byte ptr [rbp - 1]
         and     al, 1
         movzx   eax, al
         mov     dword ptr [rbp - 8], eax
         pop     rbp
         ret
You can see from this that a Bool occupies one byte; it is masked to
0/1 (so it doesn't trust it to contain only 0/1), then it is widened
to an int size.

No, you can't see that. All you can see is garbage in, garbage out.
You have to start with a function that has some meaning!

Sorry but my function is perfectly valid. It's taking a Bool value and
converting it to an int.

No, it is not.

Attempting to use the value of a non-static local variable that has not
been initialised or assigned is undefined behaviour. Your function is
garbage. No one can draw any conclusions about how meaningless code is
compiled.

Post by Bart
Perhaps you don't understand x86 code? I'll tell you: it loads that
/byte-sized/ value, masks it, and widens it to an int. I'm surprised you
can't see that.

I understand x86 well enough - perhaps not as well as you, but well
enough. I do, however, understand C better than you, it seems. The x86
code is irrelevant to the fact that your code has undefined behaviour.
(And even if it were defined, it would still do nothing relevant.)

Post by Bart
But I suspect a long gaslighting session coming on, where you refute the
evidence that everyone else can see!

You are projecting.

Look, it is extraordinarily simple to write functions that /actually/ do
the conversions under discussion. Why waste time writing nonsense
functions that don't?

Post by David Brown

ret

int foo(bool x) { return x; }
bool bar(int x) { return x; }
Try it with -O0 and -O1, and then tell us which you think gives a
clearer indication of the operations needed.

Michael is wrong and so are you.
If you want to know what casting from bool to int entails, then testing
it via a function call like this is the wrong way to do it, since half
of it depends on what happens when evaluating arguments at the call site.

Nope.

But if you prefer, just use external variables:

int i;
bool b;

void to_int_0(void) { i = b; }
void to_bool_0(void) { b = i; }

<https://www.godbolt.org/z/eT9Y84Gx4>

Again, look at the two functions with -O0 and -O1, and tell me which is
clearer.

Post by Bart
Especially if you let the compiler do what it likes, like using its
knowledge of that call process, which is not displayed here in the
optimised code of the function body.
* How exactly is a _Bool value (which occupies one byte) translated to a
32-bit signed integer? What is involved?

A _Bool is always either 0 or 1. The conversion is whatever the
compiler needs to give an int of value 0 or 1.

The implementation details depend entirely on the target. Typically, if
the _Bool is in memory, then a single byte is read and zero-extended to
the width of a register. If the _Bool is passed to a function as a
parameter, it is usually already extended - but that will depend on the
calling conventions.

Post by Bart
This is machine independent other than the sizes mentioned.

The specification is independent - it is given by the C standards. The
implementation is most certainly not machine independent. It is also
not necessarily consistent - compilers can and do pick different code
depending on the circumstances (where the _Bool came from, and how the
int is going to be used). You are even wrong in stating that a _Bool
occupies one byte, since that is not a requirement for a C
implementation (though I don't know of any real-world C implementations
with larger _Bool's).

Post by Bart
mov eax, edi ; from your test; both optimised code

Looks fine.

Post by Bart
<nothing> ; from my test

That's fine for the nonsense function you wrote.

Post by Bart
The advantage of unoptimised code is that it will contain everything
that is normally involved; it doesn't throw anything away.

No, it does not - because normal code generated by C compilers used by C
programmers who want sensible results will be the result of compiling
with optimisation, and will look very different.

Post by Bart
It doesn't require convincing the compiler that you're doing something
useful to avoid it eliminating most or all your code, or turning it
into something that is just plain misleading.

You have /never/ understood how to look at generated code, have you?

Post by Bart
That might be useful when compiling a huge production version of an app,
but it is useless when trying to shed light on an isolated fragment of code.
Look, just forget it, I'm not in the mood for another marathon subthread.

OK. I expect the OP to understand these things better.

Post by Bart
So, what's involved in turning Bool to int? According to your examples
with -O1: nothing. You just copy 32 bits unchanged from one to the
other. Mildly surprising, but you are of course right, right?

Yes, I am right - and I don't see that as even mildly surprising here.
That's how you convert a _Bool in a register to an int in a register.
You'll see equally little code when converting between signed and
unsigned types, or various other integer types.
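
For the signed/unsigned case, a minimal sketch of what the standard guarantees is shown here; `to_unsigned` and `to_signed` are hypothetical names used for illustration only.

```c
#include <limits.h>

/* int -> unsigned: defined by the standard as reduction modulo
   UINT_MAX + 1. On two's complement targets the bits are unchanged,
   which is consistent with seeing no conversion instruction. */
unsigned to_unsigned(int x) { return (unsigned)x; }

/* unsigned -> int: value-preserving when the value fits in int.
   Out-of-range values give an implementation-defined result (or raise
   an implementation-defined signal), so only in-range inputs are
   portable. */
int to_signed(unsigned x) { return (int)x; }
```

For example, `to_unsigned(-1)` equals `UINT_MAX` on every conforming implementation, whereas `to_signed(UINT_MAX)` depends on the implementation.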

Post by Bart
However, now *I* have a problem, figuring out why on earth a C compiler
movsx eax, byte [source]
Because this must be wrong, right?

No. You are asking it to do something else - you are asking it to load
a _Bool from memory, not just convert it.

Keith Thompson

2024-08-08 23:14:09 UTC

David Brown <***@hesbynett.no> writes:
[...]

Post by David Brown
A _Bool is always either 0 or 1. The conversion is whatever the
compiler needs to give an int of value 0 or 1.

The value of a _Bool object is always either 0 or 1 *unless* the program
does something weird.

C23 is a bit clearer about the representation of bool (still also called
_Bool) than previous editions. It states (draft N3220) that:

The type bool shall have one value bit and (sizeof(bool)*CHAR_BIT)-1
padding bits.

There are several ways to force a representation other than 00000000 or
00000001 into a _Bool object, including a union, memset(), or type
punning via a pointer cast.
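
As a sketch of the memset() route (the function name `poke_bool_storage` is hypothetical): the storage of a bool is overwritten with an arbitrary byte and then inspected only through unsigned char, which is always permitted. The object is deliberately never read back as a bool, since that is the step whose behaviour may be undefined.

```c
#include <stdbool.h>
#include <string.h>

/* Overwrites every byte of a bool object with `byte` and returns the
   first byte of the resulting object representation. The object is only
   inspected through unsigned char; it is never read as a bool. */
unsigned char poke_bool_storage(unsigned char byte)
{
    bool b = false;
    unsigned char first;

    memset(&b, byte, sizeof b);   /* write via character type: allowed */
    memcpy(&first, &b, 1);        /* read back the raw first byte */
    return first;
}
```

Calling `poke_bool_storage(2)` returns 2, showing that the storage really can hold a bit pattern other than 00000000 or 00000001; what the implementation does if `b` were then used as a bool is the open question discussed here.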

C23 dropped the term "trap representation", replacing it with "non-value
representation" -- a reasonable change, since accessing a trap
representation is not guaranteed to cause the program to "perform a
trap" (defined as "interrupt execution of the program such that no
further operations are performed").

It doesn't specify whether setting the padding bits to 1 results in a
non-value representation.

If non-zero padding bits create a non-value representation, then
accessing a bool object holding such a representation has undefined
behavior. It could, among other things, yield the same value implied by
the representation as if it were an ordinary integer of the same size.

If there are no non-value representations, then only the
single value bit determines the value, which is either false or true.

As you mentioned, I expect that sizeof (bool) will normally be 1, but an
implementation could make it wider, e.g. with 1 value bit and 31 padding
bits.
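
The arithmetic in the C23 wording quoted above can be written out directly. This is only an illustrative sketch with hypothetical function names; it reports what the implementation in use chose and does not say anything about how padding bits are treated.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Total number of bits in the object representation of bool. */
size_t bool_storage_bits(void) { return sizeof(bool) * CHAR_BIT; }

/* Per the C23 wording: one value bit, everything else is padding. */
size_t bool_padding_bits(void) { return bool_storage_bits() - 1; }
```

On an implementation where sizeof(bool) is 1 and CHAR_BIT is 8, these give 8 storage bits and 7 padding bits; an implementation with a 32-bit bool would report 31 padding bits.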

[...]


Lawrence D'Oliveiro

2024-08-09 02:47:08 UTC

Post by Keith Thompson
The value of a _Bool object is always either 0 or 1 *unless* the program
does something weird.

If a variable is declared to be of a particular type, does that mean that
any possible value of that variable is, by definition, interpreted as some
instance of that type?

Keith Thompson

2024-08-09 05:55:52 UTC

Post by Lawrence D'Oliveiro

Post by Keith Thompson
The value of a _Bool object is always either 0 or 1 *unless* the program
does something weird.

If a variable is declared to be of a particular type, does that mean that
any possible value of that variable is, by definition, interpreted as some
instance of that type?

Of course, given what "value" means. How could that not be the case?

If you mean any possible *representation* (for example, if _Bool is 8
bits there are 256 possible representations), I refer you to my previous
discussion of non-value representations (formerly called trap
representations) in the text that you snipped.


James Kuyper

2024-08-09 06:08:04 UTC

Post by Lawrence D'Oliveiro

Post by Keith Thompson
The value of a _Bool object is always either 0 or 1 *unless* the program
does something weird.

If a variable is declared to be of a particular type, does that mean that
any possible value of that variable is, by definition, interpreted as some
instance of that type?

No, modulo some uncertainties about what precisely you mean by "possible
value". I'm assuming that you mean object representations that can be
created by your code.

Every object type has a range of representable values. For each of those
representable values there's one or more object representations that
represent that value. Writing a representable value to an object using
code with defined behavior and an lvalue of a given type always results
in a valid object representation according to that type.

However, there can be, and often are, object representations that do not
represent a value when interpreted according to the type of the lvalue
used to read them. These are called non-value representations. This can
only happen as a result of type-punning or code that has undefined behavior.

Here's what the standard says about such situations:
"If such a representation is read by an lvalue expression that does not
have character type, the behavior is undefined. If such a representation
is produced by a side effect that modifies all or any part of the object
by an lvalue expression that does not have character type, the behavior
is undefined.55)" (6.2.6.1p5)

So, no, not all object representations represent valid values of the
type used to access them, and it's a very bad thing when you let that
happen.

David Brown

2024-08-09 16:16:19 UTC

Post by Keith Thompson
[...]

Post by David Brown
A _Bool is always either 0 or 1. The conversion is whatever the
compiler needs to give an int of value 0 or 1.

The value of a _Bool object is always either 0 or 1 *unless* the program
does something weird.

True. But attempting to use a _Bool object (as a _Bool) that does not
contain either 0 or 1 is going to be undefined behaviour (at least it
was on the platform where I saw this happen as a code bug).

Post by Keith Thompson
C23 is a bit clearer about the representation of bool (still also called
The type bool shall have one value bit and (sizeof(bool)*CHAR_BIT)-1
padding bits.
There are several ways to force a representation other than 00000000 or
00000001 into a _Bool object, including a union, memset(), or type
punning via a pointer cast.
C23 dropped the term "trap representation", replacing it with "non-value
representation" -- a reasonable change, since accessing a trap
representation is not guaranteed to cause the program to "perform a
trap" (defined as "interrupt execution of the program such that no
further operations are performed").
It doesn't specify whether setting the padding bits to 1 results in a
non-value representation.

That's probably an implementation-defined issue, is it not?

Post by Keith Thompson
If non-zero padding bits create a non-value representation, then
accessing a bool object holding such a representation has undefined
behavior. It could, among other things, yield the same value implied by
the representation as if it were an ordinary integer of the same size.
If there are no non-value representations, then only the
single value bit determines the value, which is either false or true.
As you mentioned, I expect that sizeof (bool) will normally be 1, but an
implementation could make it wider, e.g. with 1 value bit and 31 padding
bits.

Keith Thompson

2024-08-09 19:18:29 UTC

Post by David Brown

Post by Keith Thompson
[...]

Post by David Brown
A _Bool is always either 0 or 1. The conversion is whatever the
compiler needs to give an int of value 0 or 1.

The value of a _Bool object is always either 0 or 1 *unless* the
program does something weird.

True. But attempting to use a _Bool object (as a _Bool) that does not
contain either 0 or 1 is going to be undefined behaviour (at least it
was on the platform where I saw this happen as a code bug).

It depends on whether representations with non-zero padding bits are
treated as trap representations (non-value representations in C23) or
not.

[...]

Post by David Brown

Post by Keith Thompson
It doesn't specify whether setting the padding bits to 1 results in a
non-value representation.

That's probably an implementation-defined issue, is it not?

I'm not sure whether it's implementation-defined or unspecified.
I don't see any mention of trap/non-value representations in Annex J.

[...]


Tim Rentsch

2024-08-12 00:07:56 UTC

Post by Keith Thompson

Post by David Brown

Post by Keith Thompson
[...]

Post by David Brown
A _Bool is always either 0 or 1. The conversion is whatever the
compiler needs to give an int of value 0 or 1.

The value of a _Bool object is always either 0 or 1 *unless* the
program does something weird.

True. But attempting to use a _Bool object (as a _Bool) that does not
contain either 0 or 1 is going to be undefined behaviour (at least it
was on the platform where I saw this happen as a code bug).

It depends on whether representations with non-zero padding bits are
treated as trap representations (non-value representations in C23) or
not.

In C99 and C11, iirc, the width of _Bool may be any value between
1 and CHAR_BIT. If the width of _Bool is greater than 1, a _Bool
may have a well-defined value that is neither 0 nor 1. My guess is
most implementations define the width of _Bool as 1, but they don't
have to (again, iirc, in C99 and C11).

Post by Keith Thompson

Post by David Brown

Post by Keith Thompson
It doesn't specify whether setting the padding bits to 1 results in a
non-value representation.

That's probably an implementation-defined issue, is it not?

I'm not sure whether it's implementation-defined or unspecified.
I don't see any mention of trap/non-value representations in Annex J.
[...]

6.2.6.1 p 2;

J.3.13 p 1, third subpoint.

Keith Thompson

2024-08-12 03:14:01 UTC

Post by Tim Rentsch

Post by Keith Thompson

Post by David Brown

Post by Keith Thompson
[...]

Post by David Brown
A _Bool is always either 0 or 1. The conversion is whatever the
compiler needs to give an int of value 0 or 1.

The value of a _Bool object is always either 0 or 1 *unless* the
program does something weird.

True. But attempting to use a _Bool object (as a _Bool) that does not
contain either 0 or 1 is going to be undefined behaviour (at least it
was on the platform where I saw this happen as a code bug).

It depends on whether representations with non-zero padding bits are
treated as trap representations (non-value representations in C23) or
not.

In C99 and C11, iirc, the width of _Bool may be any value between
1 and CHAR_BIT. If the width of _Bool is greater than 1, a _Bool
may have a well-defined value that is neither 0 nor 1. My guess is
most implementations define the width of _Bool as 1, but they don't
have to (again, iirc, in C99 and C11).

C11 (N1570) isn't 100% clear, but I think you're right. The conversion
rank of _Bool is less than the rank of the char types. I don't see an
explicit statement that this implies that _Bool has less precision than
unsigned char, so conceivably a conforming implementation could give
_Bool a precision of 2*CHAR_BIT, but C23 has cleared this up so I'm not
going to worry about it (and it's possible I've missed something).

Post by Tim Rentsch

Post by Keith Thompson

Post by David Brown

Post by Keith Thompson
It doesn't specify whether setting the padding bits to 1 results in a
non-value representation.

That's probably an implementation-defined issue, is it not?

I'm not sure whether it's implementation-defined or unspecified.
I don't see any mention of trap/non-value representations in Annex J.
[...]

6.2.6.1 p 2;

So it's implementation-defined.

N1570:

Except for bit-fields, objects are composed of contiguous sequences of
one or more bytes, the number, order, and encoding of which are either
explicitly specified or implementation-defined.

Post by Tim Rentsch
J.3.13 p 1, third subpoint.

The number, order, and encoding of bytes in any object (when not
explicitly specified in this International Standard) (6.2.6.1).

(listed under Implementation-defined behavior).

Quoting the standard so that everyone else doesn't have to go look it up
(and guess which edition you're referring to). You might consider doing
that yourself.


Bart

2024-08-09 00:56:15 UTC

Post by David Brown

Post by Bart
Sorry but my function is perfectly valid. It's taking a Bool value and
converting it to an int.

No, it is not.
Attempting to use the value of a non-static local variable that has not
been initialised or assigned is undefined behaviour. Your function is
garbage. No one can draw any conclusions about how meaningless code is
compiled.

FFS. You really think that makes a blind bit of difference? A variable
is not initialised, so the bool->int code shown must be just random
rubbish generated by the compiler? You think I wouldn't spot if there
was something amiss?

OK, let's initialise it and see what difference it actually makes. My
code is now this:

#include <stdbool.h>

void BC(void) {
_Bool b;
int i;
i=b;
}

void DB(void) {
_Bool b=false;
int i;
i=b;
}

The output from gcc -O1 is this:

BC:
ret
DB:
ret

There is no difference. So, even initialised, it tells me nothing about
what might be involved in bool->int conversion. It is useless.

Now I'll try it with -O0 (line breaks added):

BC:
push rbp
mov rbp, rsp

movzx eax, BYTE PTR [rbp-1]
mov DWORD PTR [rbp-8], eax

nop
pop rbp
ret
DB:
push rbp
mov rbp, rsp
mov BYTE PTR [rbp-1], 0

movzx eax, BYTE PTR [rbp-1]
mov DWORD PTR [rbp-8], eax

nop
pop rbp
ret

Exactly the same code, except DB has an extra line to initialise that value.

Are you surprised it is the same? I am 99% sure that you already knew
this, but were pretending that the code was meaningless, for reasons
that escape me.

One more thing: the ASM code I posted earlier was from Clang 18.1, above
it's from gcc 14.1.

The Clang code masks bit 0 of the bool value; gcc doesn't.

However, you can only know that by using -O0 in both cases. Using the
-O1 or higher that you recommend, you only see this:

BC:
ret

DB:
ret

for both compilers. That is utterly useless. YMMV.

Post by David Brown

Post by Bart
If you want to know what casting from bool to int entails, then
testing it via a function call like this is the wrong way to do it,
since half
of it depends on what happens when evaluating arguments at the call site.

Nope.

So in:

mov eax, ecx

inside foo_b2i(), at what point did the 8-bit _Bool get turned into the
32-bit value in ecx?

Post by David Brown
That's fine for the nonsense function you wrote.

Actually, MS (the poster) wrote those foo_* functions.

Post by David Brown

Post by Bart
The advantage of unoptimised code is that it will contain everything
that is normally involved; it doesn't throw anything away.

No, it does not - because normal code generated by C compilers used by C
programmers who want sensible results will be the result of compiling
with optimisation, and will look very different.

As I said, this isn't production code of a real program. It is a single
line. Elsewhere you had to resort to using statics (and linkage outside
of a function) to stop code being eliminated out of existence.

With -O0 you don't need such tricks, and don't need to think about what
might have been removed that you really need to see.

Post by David Brown

Post by Bart
So, what's involved in turning Bool to int? According to your examples
with -O1: nothing. You just copy 32 bits unchanged from one to the
other. Mildly surprising, but you are of course right, right?

Yes, I am right - and I don't see that as even mildly surprising here.
That's how you convert a _Bool in a register to an int in a register.

We don't know where the bool passed to foo_b2i came from; maybe it was
an element of an array or struct, so it would have been widened at some
point before calling the function. So that example would give a
misleading picture of what's involved.

Post by David Brown
No. You are asking it to do something else - you are asking it to load
a _Bool from memory, not just convert it.

It loads from memory and converts it in one instruction; isn't that
something? But this is pretty much what your godbolt link does, where
you have to resort to using static variables, since for -O1 and above,
locals are kept in registers:

movzx eax, BYTE PTR b[rip]

It seems there /is/ something I overlooked: the -S output of gcc/clang
is less readable, since variable names disappear: they either are kept
in registers (where you don't know which is which); or they are
addressed by numeric offsets, and again you don't know what is what.

David Brown

2024-08-09 17:08:42 UTC

Post by David Brown

Post by Bart
Sorry but my function is perfectly valid. It's taking a Bool value
and converting it to an int.

No, it is not.
Attempting to use the value of a non-static local variable that has
not been initialised or assigned is undefined behaviour. Your
function is garbage. No one can draw any conclusions about how
meaningless code is compiled.

FFS. You really think that makes a blind bit of difference?

Yes, I do.

Post by Bart
A variable
is not initialised, so the bool->int code shown must be just random
rubbish generated by the compiler? You think I wouldn't spot if there
was something amiss?

You wrote C code that had something amiss!

How often are you going to do this? Write some piece of meaningless
crap with undefined behaviour or - at best - no observable behaviour at
all, compile with silly choices of flags, and then make nonsensical
claims about what compilers do or how C is defined? How hard is it for
you to actually write a C function that has defined behaviour that does
what you want it to do? Get step 1 of these tests right, and we won't
have to keep repeating this stuff.

Post by Bart
OK, let's initialise it and see what difference it actually makes. My
#include <stdbool.h>
void BC(void) {
      _Bool b;
      int i;
      i=b;
}
void DB(void) {
      _Bool b=false;
      int i;
      i=b;
}
        ret
        ret
There is no difference.

There is a difference in the code. One (BC) has no defined behaviour.
The other (DB) has defined behaviour with no observable behaviour. It's
not a surprise that a compiler generates no code for either of them.

Post by Bart
So, even initialised, it tells me nothing about
what might be involved in bool->int conversion. It is useless.

Agreed. Nobody suggested your code above as a good idea.

Post by Bart
        push    rbp
        mov     rbp, rsp
        movzx   eax, BYTE PTR [rbp-1]
        mov     DWORD PTR [rbp-8], eax
        nop
        pop     rbp
        ret
        push    rbp
        mov     rbp, rsp
        mov     BYTE PTR [rbp-1], 0
        movzx   eax, BYTE PTR [rbp-1]
        mov     DWORD PTR [rbp-8], eax
        nop
        pop     rbp
        ret
Exactly the same code, except DB has an extra line to initialise that value.
Are you surprised it is the same? I am 99% sure that you already knew
this, but were pretending that the code was meaningless, for reasons
that escape me.

Instead of trolling with what you know, without doubt, are pointless
straw men, why not apply a little thought and write functions that make
sense? Or look at the functions that I wrote?

The stuff you write is either meaningless, or pointless, or perhaps
both. I have no doubt that you know this fine.

So the big question is, why are you writing it? Is it just to provoke
me? Is it because you want to confuse the OP and other readers? Do you
like pretending to be a fool?

I'm giving up trying to help you - at least until you show some hint of
trying to learn. I will still make posts pointing out when you write
nonsense that might confuse or mislead others, but I'll stop trying to
explain things unless you specifically ask.

Bart

2024-08-10 10:03:02 UTC

Post by David Brown

Post by David Brown

Post by Bart
Sorry but my function is perfectly valid. It's taking a Bool value
and converting it to an int.

No, it is not.
Attempting to use the value of a non-static local variable that has
not been initialised or assigned is undefined behaviour. Your
function is garbage. No one can draw any conclusions about how
meaningless code is compiled.

FFS. You really think that makes a blind bit of difference?

Yes, I do.

Post by Bart
A variable is not initialised, so the bool->int code shown must be
just random rubbish generated by the compiler? You think I wouldn't
spot if there was something amiss?

You wrote C code that had something amiss!
How often are you going to do this? Write some piece of meaningless
crap with undefined behaviour or - at best - no observable behaviour at
all, compile with silly choices of flags, and then make nonsensical
claims about what compilers do or how C is defined?

What did I get wrong about that particular conversion?

I write such fragments of code for my own compilers a hundred times a day.

It only seems to cause a hoo-hah with your favourite compiler.

Post by David Brown

Post by Bart
   void DB(void) {
       _Bool b=false;
       int i;
       i=b;
   }

There is a difference in the code. One (BC) has no defined behaviour.
The other (DB) has defined behaviour with no observable behaviour.

Ah. No observable behaviour.

This is indeed a problem with an optimising compiler, because it is in
that mode that a compiler will detect that your code has no observable
effects (other than in half a dozen ways I could mention, such as
looking at the generated code!), and decides it's not going to bother
producing it.

This is why I recommend -O0 with such compilers. Or better, use a
simpler compiler.

(Mine for example have absolutely no problem with meaningless fragments,
so long as they satisfy language rules. They will even produce comment
lines like these:

;------------------------------
...
;------------------------------

to more easily isolate a function body from entry/exit code.

For something available to everyone, Tiny C is available on
godbolt.org.)

Post by David Brown
not a surprise that a compiler generates no code for either of them.

Post by Bart
So, even initialised, it tells me nothing about what might be involved
in bool->int conversion. It is useless.

Agreed. Nobody suggested your code above as a good idea.

Post by Bart
         push    rbp
         mov     rbp, rsp
         movzx   eax, BYTE PTR [rbp-1]
         mov     DWORD PTR [rbp-8], eax
         nop
         pop     rbp
         ret
         push    rbp
         mov     rbp, rsp
         mov     BYTE PTR [rbp-1], 0
         movzx   eax, BYTE PTR [rbp-1]
         mov     DWORD PTR [rbp-8], eax
         nop
         pop     rbp
         ret
Exactly the same code, except DB has an extra line to initialise that value.
Are you surprised it is the same? I am 99% sure that you already knew
this, but were pretending that the code was meaningless, for reasons
that escape me.

Instead of trolling with what you know, without doubt, are pointless
straw men, why not apply a little thought and write functions that make
sense?

When you are testing code fragments, they DON'T make sense. Hence -O0 to
avoid having to soft-soap or cajole the compiler into producing relevant
code.

It's to make life easier not harder. The only thing with -O0 is to learn
to disregard the irrelevant bits. But at least YOU get to say what is
irrelevant.

Post by David Brown
I'm giving up trying to help you - at least until you show some hint of
trying to learn.

What the fuck are you on about?

I'm suggesting using -O0 because it is easier: you write code fragments
without having to write a complete, meaningful program that has observable
behaviour.

Which is quite hard to do; benchmarking code from an optimising compiler
can be challenging, since it is easy for it to circumvent the very task
you're trying to measure. Apparently, program runtime does not count as
'observable behaviour', so no attempt is made to preserve it.
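
For comparison, one way to keep a fragment alive under optimisation is to make its input a parameter and its result a return value. A minimal sketch, assuming the function is compiled on its own with external linkage (the name `bool_to_int` is made up for illustration):

```c
#include <stdbool.h>

/* The argument is unknown when this function is compiled on its own,
   and the result goes back to the caller, so the compiler still has
   to emit code for the conversion. */
int bool_to_int(bool b)
{
    return b;   /* implicit conversion: false -> 0, true -> 1 */
}
```

With a function like this, the generated code can be inspected at any optimisation level without relying on uninitialised locals.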

Post by David Brown
I will still make posts pointing out when you write
nonsense that might confuse or mislead others, but I'll stop trying to
explain things unless you specifically ask.

So will I, to other people.

The set of tests posted by MS, compiled with optimising options with
Clang, shed no light on bool to int conversion.

Lawrence D'Oliveiro

2024-08-09 02:45:43 UTC

Post by Bart
But I suspect a long gaslighting session coming on, where you refute the
evidence that everyone else can see!

A denial is not a refutation.

Keith Thompson

2024-08-08 19:42:16 UTC

[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for
encoding;
though I think most C implementations use the one from the IEEE in 1985
(uh, IEEE754, I think?)

I didn't specify properly, but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?

Some CPUs have floating-point registers, some don't. C says nothing
about registers.

What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Thiago Adams

2024-08-08 20:34:04 UTC

Post by Keith Thompson
[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for
encoding;
though I think most C implementations use the one from the IEEE in 1985
(uh, IEEE754, I think?)

I didn't specify properly, but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?
Some CPUs have floating-point registers, some don't. C says nothing
about registers.
What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

This part is related to the previous question about the origins of
integer promotions.

We don't have "char" register or signed/unsigned register. But I believe
we may have double and float registers. So float does not need to be
converted to double.

There is no specific question here, just trying to understand the
rationale behind the conversion rules.
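
As a rough illustration of the asymmetry being asked about: operands narrower than int are promoted before arithmetic, while float operands keep type float (an implementation may still evaluate with extra precision, see FLT_EVAL_METHOD). A small sketch using C11 `_Generic`; the macro and function names are hypothetical, and it assumes a typical implementation where int can represent every unsigned char value:

```c
/* Maps the type of an expression to a small code (hypothetical helper). */
#define TYPE_CODE(e) _Generic((e), \
    int: 1,                        \
    float: 2,                      \
    double: 3,                     \
    default: 0)

/* unsigned char + unsigned char: both operands are promoted to int. */
int code_of_char_sum(void)
{
    unsigned char a = 1, b = 2;
    return TYPE_CODE(a + b);
}

/* float + float: the result type stays float, not double. */
int code_of_float_sum(void)
{
    float x = 1.0f, y = 2.0f;
    return TYPE_CODE(x + y);
}
```

Under those assumptions the first function yields the code for int and the second the code for float.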

Bart

2024-08-08 21:41:47 UTC

Post by Thiago Adams

Post by Keith Thompson
[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for
encoding;
though I think most C implementations use the one from the IEEE in 1985
(uh, IEEE754, I think?)

I didn't specify properly, but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?
Some CPUs have floating-point registers, some don't. C says nothing
about registers.
What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

This part is related with the previous question about the origins of
integer promotions.
We don't have "char" register or signed/unsigned register. But I believe
we may have double and float registers. So float does not need to be
converted to double.
There is no specific question here, just trying to understand the
rationale behind the conversion rules.

The rules have little to do with concrete machines with registers.

Your initial post showed some confusion about how conversions work. They
are not performed 'in-place', any more than writing `a + 1` changes the
value of `a`.

Take:

int a; double x;

x = (double)a;

The cast is implicit here but I've written it out to make it clear. My C
compiler produces intermediate code like this before converting it to
native code:

push x r64 # r64 means float64
fix r64 -> i32
pop a i32

I could choose to interpret this code just as it is. Then, in this
execution model, there are no registers at all, only a stack that can
hold data of any type.

The 'fix' instruction pops the double value from the stack, converts it
to int (which involves changing both the bit-pattern, and the
bit-width), and pushes it back onto the stack.
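
The same double-to-int step can be sketched in plain C. This is only an illustration: the function names are made up, and the point is that the conversion yields a new value while the operand keeps its own.

```c
#include <stdbool.h>

/* Returns true if converting x to int left x itself unchanged. */
bool operand_unchanged(double x)
{
    double before = x;
    int a = (int)x;      /* a new int value; x is not modified */
    (void)a;
    return x == before;
}

/* The conversion discards the fractional part (truncation toward zero). */
int fix_to_int(double x)
{
    return (int)x;
}
```

So `fix_to_int(3.75)` gives 3 and `fix_to_int(-3.75)` gives -3, and in both cases the double that was passed in is untouched.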

Registers come into it when running it directly on a real machine. But
you seem more concerned with safety and correctness than performance, so
there's probably no real need to look at actual generated native code.

That'll just be confusing (especially if you follow the advice to
generate only optimised code).

Keith Thompson

2024-08-08 23:17:55 UTC

Bart <***@freeuk.com> writes:
[...]

Post by Bart
int a; double x;
x = (double)a;
The cast is implicit here but I've written it out to make it clear.

[...]

The *conversion* could be done implicitly, but you've used a cast (i.e.,
an explicit conversion) to make it clear.

There is no such thing as an "implicit cast" in C.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Bart

2024-08-09 10:04:35 UTC

Post by Keith Thompson
[...]

Post by Bart
int a; double x;
x = (double)a;
The cast is implicit here but I've written it out to make it clear.

[...]
The *conversion* could be done implicitly, but you've used a cast (i.e.,
an explicit conversion) to make it clear.
There is no such thing as an "implicit cast" in C.

Suppose I write this code:

x = a; // implicit 'conversion'
x = (double)a; // explicit 'conversion'

My compiler produces these two bits of AST for the RHS of both expressions:

1 00009 r64---|---2 convert: sfloat_c i32 => r64
1 00009 i32---|---|---1 name: t.main.a.1

1 00010 r64---|---2 convert: sfloat_c i32 => r64
1 00010 i32---|---|---1 name: t.main.a.1

So whatever you call that `(double)` part of the second line, which is
written explicitly, exactly the same thing is done internally (ie
'implicitly') to the first line. (The 09/10 are line numbers.)

Since C likes to use the term 'cast' for such conversions, I don't see a
problem with talking about implicit and explicit versions.

It just seems to irk the pedants here.

David Brown

2024-08-09 17:12:47 UTC

Post by Keith Thompson
[...]

Post by Bart
int a; double x;
x = (double)a;
The cast is implicit here but I've written it out to make it clear.

[...]
The *conversion* could be done implicitly, but you've used a cast (i.e.,
an explicit conversion) to make it clear.
There is no such thing as an "implicit cast" in C.

x = a; // implicit 'conversion'
x = (double)a; // explicit 'conversion'
1 00009 r64---|---2 convert: sfloat_c i32 => r64
1 00009 i32---|---|---1 name: t.main.a.1
1 00010 r64---|---2 convert: sfloat_c i32 => r64
1 00010 i32---|---|---1 name: t.main.a.1
So whatever you call that `(double)` part of the second line, which is
written explicitly, exactly the same thing is done internally (ie
'implicitly') to the first line. (The 09/10 are line numbers.)

You've written it yourself. Both are conversions - one is implicit, the
other is explicit.

Since C likes to use the term 'cast' for such conversions, I don't see a
problem with talking about implicit and explicit versions.

C does not "like" to use the term "cast" for anything other than cast
operations, as defined by the C standards. Implicit conversions are not
casts.

/You/ might like to call implicit conversions "casts", but you'd be
wrong to do so.

It just seems to irk the pedants here.

You mean, people who know what they are talking about rather than those
that make up stuff as they go along?

James Kuyper

2024-08-09 17:57:59 UTC

...

Post by Keith Thompson
There is no such thing as an "implicit cast" in C.

x = a; // implicit 'conversion'
x = (double)a; // explicit 'conversion'
1 00009 r64---|---2 convert: sfloat_c i32 => r64
1 00009 i32---|---|---1 name: t.main.a.1
1 00010 r64---|---2 convert: sfloat_c i32 => r64
1 00010 i32---|---|---1 name: t.main.a.1

Of course - an implicit conversion has exactly the same effect as an
explicit conversion, if the source and destination types are the same.
That doesn't make it correct to use the term "cast" to describe anything
other than an explicit conversion.
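
A minimal sketch of that equivalence, with hypothetical function names: both functions perform the same int-to-double conversion, and only the second one contains a cast.

```c
/* Implicit conversion: no cast appears in the source. */
double convert_implicit(int a)
{
    double x = a;
    return x;
}

/* Explicit conversion: the cast operator requests it. */
double convert_explicit(int a)
{
    double x = (double)a;
    return x;
}
```

The results are identical for every int argument; the difference is purely in what is written in the source.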

So whatever you call that `(double)` part of the second line, which is
written explicitly, exactly the same thing is done internally (ie
'implicitly') to the first line. (The 09/10 are line numbers.)
Since C likes to use the term 'cast' for such conversions, ...

No, C only uses the term "cast" to describe the following:

"6.5.4 Cast operators
1 cast-expression:
unary-expression
( type-name ) cast-expression" (6.5.4p1)

A cast is a piece of syntax that is used to explicitly request that a
conversion be performed. Conversions that are explicitly requested in C
code are referred to as casts only by people who don't understand what
they're saying - the standard never refers to them as such.

... I don't see a
problem with talking about implicit and explicit versions.

There's nothing wrong with talking about implicit conversions versus
explicit conversions. Explicit conversions are also called casts.

Bart

2024-08-09 20:59:44 UTC

Post by James Kuyper
A cast is a piece of syntax that is used to explicitly request that a
conversion be performed. Conversions that are explicitly requested in C
code are referred to as casts only by people who don't understand what
they're saying - the standard never refers to them as such.

Are you sure? What else would they be known as?

Post by James Kuyper

... I don't see a
problem with talking about implicit and explicit versions.

There's nothing wrong with talking about implicit conversions versus
explicit conversions.

Of course! Because those terms happen to be used in a couple of places
in the standard. Usage of any terms not appearing in the standard is
apparently outlawed in this newsgroup.

Post by James Kuyper
Explicit conversions are also called casts.

"6.5.4p3

Conversions that involve pointers, other than where permitted by the
constraints of 6.5.16.1, shall be specified by means of an explicit cast."

Here it uses the term 'explicit cast'. Why is that; isn't the term
'cast' unambiguous without needing to say 'explicit'?

Also, what exactly is the difference between 'explicit conversion'
and 'explicit cast'?

Why can't there also be a similar correlation between 'implicit
conversion' and 'implicit cast'? The only reason I can see is that out
of these four terms, only 3 of them happen to appear in the standard.

I don't see that as a compelling reason why that term should be
considered absolutely wrong; 'implicit cast' just never came up.

It's not as though the standard provides an official glossary, but even
if it did, surely people ought to be allowed to use alternate terms for
an informal discussion? This is not a committee meeting.

I remember people here getting castigated for using the term 'type
cast'. And yet, in H.2.4p1:

"The LIA−1 type conversions are the following type casts:"

Look also at H.2.4p4 (also p5):

"C’s conversions (casts) from floating-point to floating-point can meet
LIA−1 requirements if an implementation uses round-to-nearest (IEC 60559
default)."

Here it clearly indicates that 'conversions' (presumably both implicit
and explicit) are also known as 'casts'.

This seems to imply (no pun) that 'implicit casts' are a thing; the
opportunity to use that term just never came up.

I don't wish to be rude to you or KT but .....

Keith Thompson

2024-08-09 21:47:58 UTC

Post by James Kuyper
A cast is a piece of syntax that is used to explicitly request that a
conversion be performed. Conversions that are explicitly requested in C
code are referred to as casts only by people who don't understand what
they're saying - the standard never refers to them as such.

Are you sure? What else would they be known as?

I believe James made a small error here, either omitting a "not" or
accidentally writing "explicitly" rather than "implicitly".

With that correction, I presume what James wrote is clear enough.
I ask you not to pretend that you still don't understand it.

[...]

Post by Bart
Here it uses the term 'explicit cast'. Why is that; isn't the term
'cast' unambiguous without needing to say 'explicit'?

It's redundant.

Post by Bart
Also, what exactly is the difference between 'explicit conversion'
and 'explicit cast'?

They both mean the same thing in C, but "explicit cast" is redundant.

Post by Bart
Why can't there also be a similar correlation between 'implicit
conversion' and 'implicit cast'? The only reason I can see is that out
of these four terms, only 3 of them happen to appear in the standard.

Because a cast is an explicit operator.

Post by Bart
I don't see that as a compelling reason why that term should be
considered absolutely wrong; 'implicit cast' just never came up.
It's not as though the standard provides an official glossary,

In fact it does. Some terms are defined in section 3, and others are
defined elsewhere in the standard (definitions are denoted by italics).
I don't see a definition for the term "cast", but it's clearly described
in N3220 6.5.4 "Cast operators".

Array indexing is defined in terms of pointer arithmetic. The
evaluation of `a[i]` involves an implicit addition operation. It does
not involve an implicit "+" symbol.
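
To make the analogy concrete, here is a small sketch (the function names are invented for illustration). All three functions compute the same thing, because `a[i]` is defined as `*((a) + (i))`:

```c
/* Subscript form: the addition is implicit. */
int index_subscript(const int *a, int i) { return a[i]; }

/* Pointer-arithmetic form: the addition is written out. */
int index_pointer(const int *a, int i)   { return *(a + i); }

/* Swapped operands: still valid, since the underlying + commutes. */
int index_swapped(const int *a, int i)   { return i[a]; }
```

Only the second function contains a "+" token, yet all three perform an addition.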

Post by Bart
even if it did, surely people ought to be allowed to use alternate
terms for an informal discussion? This is not a committee meeting.

You're allowed to write whatever nonsense you like, and the rest of us
are allowed to tell you that you're wrong.

Explain why using the word "cast" incorrectly is better than using it
correctly. Explain why you can't just refer to explicit and implicit
conversions. Do you have a motivation other than being annoying?

Post by Bart
I remember people here getting castigated for using the term 'type
cast'. And yet, in H.2.4p1:
"The LIA−1 type conversions are the following type casts:"

That wording is no longer there in more recent editions. Annex H refers
to LIA-1 (Language Independent Arithmetic), ISO/IEC 10967–1; perhaps
that standard uses the term "type casts". (I'm not going to pay CHF 216
to find out.)

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Bart

2024-08-09 23:32:39 UTC

Post by Keith Thompson

Post by James Kuyper
A cast is a piece of syntax that is used to explicitly request that a
conversion be performed. Conversions that are explicitly requested in C
code are referred to as casts only by people who don't understand what
they're saying - the standard never refers to them as such.

Are you sure? What else would they be known as?

I believe James made a small error here, either omitting a "not" or
accidentally writing "explicitly" rather than "implicitly".
With that correction, I presume what James wrote is clear enough.
I ask you not to pretend that you still don't understand it.

I guessed it was some sort of mistake. That's why I politely asked if he
was sure.

Post by Keith Thompson
I ask you not to pretend that you still don't understand it.

I might also ask you not to pretend you don't know what is meant by
'implicit cast'. Or worse, pretending that other people might be confused.

I'm sure most of those won't have gone anywhere near the standard, so
they won't be aware that in the 700 pages of N1570.PDF, 'explicit cast'
occurs all of once, while 'implicit cast' occurs one time less often,
which appears to be the sole reason for having a go at anyone who commits
the sin of using that expression.

Post by Keith Thompson
[...]

Post by Bart
Here it uses the term 'explicit cast'. Why is that; isn't the term
'cast' unambiguous without needing to say 'explicit'?

It's redundant.

OK. Does that mean we're allowed to use redundant terms too?

Post by Keith Thompson

Post by Bart
Also, what exactly is the difference between 'explicit conversion'
and 'explicit cast'?

They both mean the same thing in C, but "explicit cast" is redundant.

Post by Bart
Why can't there also be a similar correlation between 'implicit
conversion' and 'implicit cast'? The only reason I can see is that out
of these four terms, only 3 of them happen to appear in the standard.

Because a cast is an explicit operator.

It's something that is done in the code to alter the evaluation of some
expression. An 'explicit cast', which has to be requested in the source
code, necessarily has some syntax associated with it.

An 'implicit cast' (by which I mean, for your benefit as it puzzles
nobody else, implicit type conversion), obviously /doesn't/ have any
relevant syntax! If it had, then it would be explicit...

Post by Keith Thompson

Post by Bart
I don't see that as a compelling reason why that term should be
considered absolutely wrong; 'implicit cast' just never came up.
It's not as though the standard provides an official glossary,

In fact it does. Some terms are defined in section 3, and others are
defined elsewhere in the standard (definitions are denoted by italics).
I don't see a definition for the term "cast", but it's clearly described
in N3220 6.5.4 "Cast operators".
Array indexing is defined in terms of pointer arithmetic. The
evaluation of `a[i]` involves an implicit addition operation. It does
not involve an implicit "+" symbol.

You've snipped my quote from H.2.4p5 where it says:

... conversions (casts) ...

Here they are also 'mixing up' operations and syntax.

Here's the signature of a function from my C compiler which deals with
conversions:

func docast(unit p, int t, hard=1)unit =

(A 'unit' is an AST node; 't' is a type code.) 'hard' is 1 for an
explicit conversion, and 0 for an implicit one.

(Explicit or 'hard' casts enable some conversions that would
otherwise be invalid. And in that last sentence, I used both 'cast' and
'conversion' just to avoid using the same term in quick succession. I'm
sure many of the word choices in the standard are for similar reasons.)

Post by Keith Thompson
You're allowed to write whatever nonsense you like, and the rest of us
are allowed to tell you that you're wrong.

And some of us are allowed to think or say that you're being hopelessly
pedantic.

Post by Keith Thompson
Explain why using the word "cast" incorrectly is better than using it
correctly. Explain why you can't just refer to explicit and implicit
conversions. Do you have a motivation other than being annoying?

'Cast' is shorter, and is more directly associated with conversions to
do with types.

In my compiler for example, '*conv*' is used in many different contexts
(eg. case conversion). '*cast*' is only used in two functions, both to
do with C type conversions.

But I also use both terms just to mix it up.

Post by Keith Thompson

Post by Bart
I remember people here getting castigated for using the term 'type
cast'. And yet, in H.2.4p1:
"The LIA−1 type conversions are the following type casts:"

That wording is no longer there in more recent editions. Annex H refers
to LIA-1 (Language Independent Arithmetic), ISO/IEC 10967–1; perhaps
that standard uses the term "type casts".

My point is that even the professionals who write such documents get it
'wrong', according to you.

Keith Thompson

2024-08-10 00:12:31 UTC

[...]

Post by Bart
I might also ask you not to pretend you don't know what is meant by
'implicit cast'. Or worse, pretending that other people might be confused.

Of course I understand exactly what you mean by "implicit cast". You
mean "implicit conversion". A phrase can be both clear and incorrect.
If you used the phrase "male cow" I would assume you mean "bull" and/or
"steer".

Post by Bart
I'm sure most of those won't have gone anywhere near the standard, so
they won't be aware that in the 700 pages of N1570.PDF, 'explicit
cast' occurs all of once, while 'implicit cast' occurs one time less
often, which appears to be the sole reason for having a go at anyone who
commits the sin of using that expression.

Nobody has accused you of any "sin". Don't exaggerate.

I have pointed out that you are misusing a word that has a clear
definition, and that using the correct term "implicit conversion" would
be at least as clear and would cost you nothing. (Compare the extra 6
letters to the volume of text you've emitted arguing about it.)

Post by Keith Thompson
[...]

Post by Bart
Here it uses the term 'explicit cast'. Why is that; isn't the term
'cast' unambiguous without needing to say 'explicit'?

It's redundant.

OK. Does that mean we're allowed to use redundant terms too?

Nobody said you aren't.

[...]

Post by Bart
... conversions (casts) ...

That wording does not appear in more recent drafts.

[...]

Post by Keith Thompson
Explain why using the word "cast" incorrectly is better than using it
correctly. Explain why you can't just refer to explicit and implicit
conversions. Do you have a motivation other than being annoying?

'Cast' is shorter, and is more directly associated with conversions to
do with types.
In my compiler for example, '*conv*' is used in many different
contexts (eg. case conversion). '*cast*' is only used in two
functions, both to do with C type conversions.
But I also use both terms just to mix it up.

Post by Keith Thompson

Post by Bart
I remember people here getting castigated for using the term 'type
cast'. And yet, in H.2.4p1:
"The LIA−1 type conversions are the following type casts:"

That wording is no longer there in more recent editions. Annex H
refers to LIA-1 (Language Independent Arithmetic), ISO/IEC 10967–1;
perhaps that standard uses the term "type casts".

My point is that even the professionals who write such documents get
it 'wrong', according to you.

Yes, and unlike you, they willingly correct their errors.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

James Kuyper

2024-08-09 22:29:02 UTC

...

Post by James Kuyper
A cast is a piece of syntax that is used to explicitly request that a
conversion be performed. Conversions that are explicitly requested in C
code are referred to as casts only by people who don't understand what
they're saying - the standard never refers to them as such.

Are you sure? What else would they be known as?

As Keith said, that's a typo - the second "explicitly" should have been
"implicitly".

[...]

Post by Bart
Here it uses the term 'explicit cast'. Why is that; isn't the term
'cast' unambiguous without needing to say 'explicit'?

It's redundant, and occurs only once in the entire standard. The purpose
of that redundancy was to emphasize that the conversion it describes
never happens implicitly (unlike many of the other conversions).
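
A short sketch of the contrast, with made-up function names: a conversion between unrelated object pointer types has to be written as a cast, whereas conversion to `void *` happens implicitly.

```c
/* Converting const int * to const unsigned char * is not among the
   implicit conversions allowed by simple assignment, so the cast
   is required here. */
unsigned char first_byte(const int *p)
{
    const unsigned char *bytes = (const unsigned char *)p;
    return bytes[0];
}

/* By contrast, conversion to a pointer to void needs no cast. */
const void *as_void(const int *p)
{
    return p;   /* implicit conversion */
}
```

Dropping the cast in `first_byte` is a constraint violation, so a conforming compiler must issue a diagnostic.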

Post by Bart
Also, what exactly is the difference between 'explicit conversion'
and 'explicit cast'?

None

Post by Bart
Why can't there also be a similar correlation between 'implicit
conversion' and 'implicit cast'?

The C standard defines "implicit conversion" and "explicit conversion"
in 6.3p1, and the definition it provides for "explicit conversion" is
"those [conversions] that result from a cast operation". It provides a
grammar production for a cast expression, and none for an implicit cast
expression.

Post by Bart
even if it did, surely people ought to be allowed to use alternate
terms for an informal discussion? This is not a committee meeting.

Every time you use a term with a standard-defined meaning in a way that
doesn't match the meaning defined for it by the standard, you create
potential confusion. If that's what you want to do, go ahead, but it
seems an odd thing to do.

Keith Thompson

2024-08-09 21:29:14 UTC

James Kuyper <***@alumni.caltech.edu> writes:
[...]

Post by James Kuyper
A cast is a piece of syntax that is used to explicitly request that a
conversion be performed. Conversions that are explicitly requested in C
code are referred to as casts only by people who don't understand what
they're saying - the standard never refers to them as such.

I think you omitted a "not" in the above, or meant to write "implicitly"
rather than "explicitly".

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

James Kuyper

2024-08-09 22:35:12 UTC

Post by Keith Thompson
[...]

Post by James Kuyper
A cast is a piece of syntax that is used to explicitly request that a
conversion be performed. Conversions that are explicitly requested in C
code are referred to as casts only by people who don't understand what
they're saying - the standard never refers to them as such.

I think you omitted a "not" in the above, or meant to write "implicitly"
rather than "explicitly".

The latter - sorry for the confusion!

Kaz Kylheku

2024-08-09 21:30:33 UTC

Post by James Kuyper
...

Post by Keith Thompson
There is no such thing as an "implicit cast" in C.

x = a; // implicit 'conversion'
x = (double)a; // explicit 'conversion'
1 00009 r64---|---2 convert: sfloat_c i32 => r64
1 00009 i32---|---|---1 name: t.main.a.1
1 00010 r64---|---2 convert: sfloat_c i32 => r64
1 00010 i32---|---|---1 name: t.main.a.1

Of course - an implicit conversion has exactly the same effect as an
explicit conversion, if the source and destination types are the same.
That doesn't make it correct to use the term "cast" to describe anything
other than an explicit conversion.

That's all very neat and clean. However, the problem is that in C,
some of the implicit conversions are unsafe.

Implicit conversions can:

- truncate an integer value to fit a narrower type.

- convert between floating point and integer in a way that the value
is out of range, with undefined behavior ensuing.

- change the value: e.g -1 becomes UINT_MAX.

- subvert the type system, e.g. (foo *) -> (void *) -> (bar *).
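
A sketch of three of these, none of which needs a cast. The names are hypothetical; the out-of-range floating-point case is left out because it cannot be demonstrated without invoking undefined behavior.

```c
#include <limits.h>

/* -1 converted to unsigned int wraps modulo UINT_MAX + 1. */
unsigned int to_unsigned(int v)
{
    return v;   /* implicit conversion; the value changes for negative v */
}

/* Narrowing to unsigned char keeps the value modulo UCHAR_MAX + 1. */
unsigned char to_uchar(unsigned int v)
{
    return v;   /* implicit conversion; the high bits are discarded */
}

struct foo { int x; };
struct bar { float y; };

/* foo * -> void * -> bar * with no cast anywhere. */
struct bar *foo_as_bar(struct foo *f)
{
    void *p = f;
    return p;   /* accepted silently; accessing the object through the
                   result would generally be undefined behavior */
}
```

A compiler may warn about the first two with extra warning options enabled, but the language itself accepts all three as written.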

In computer science, we refer to unsafe conversion as coercion.
The cast notation is C's coercion operation.

In some languages, some of what C allows to be an implicit conversion
would be classified as requiring a coercion.

It almost makes sense to speak of "implicit cast" (i.e. coercion) in C,
because of what happens implicitly being so unsafe.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca

Keith Thompson

2024-08-09 21:57:06 UTC

Kaz Kylheku <643-408-***@kylheku.com> writes:
[...]

Post by Kaz Kylheku
It almost makes sense to speak of "implicit cast" (i.e. coercion) in C,
because of what happens implicitly being so unsafe.

I disagree, because that's not what "cast" means.

Certainly unsafe implicit conversions are worth discussing, but let's
not misuse existing terms.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Kaz Kylheku

2024-08-09 23:14:48 UTC

Post by Keith Thompson
[...]

Post by Kaz Kylheku
It almost makes sense to speak of "implicit cast" (i.e. coercion) in C,
because of what happens implicitly being so unsafe.

I disagree, because that's not what "cast" means.

"cast" means to (try to) project a value into another type.

In C though, the nuance is something like "conversion that is mediated
by the presence of the cast notation", where "mediated" includes the
possibility that the cast notation has no effect at all
(e.g. 2 + (int) 3).

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca

Keith Thompson

2024-08-09 23:58:00 UTC

Post by Kaz Kylheku

Post by Keith Thompson
[...]

Post by Kaz Kylheku
It almost makes sense to speak of "implicit cast" (i.e. coercion) in C,
because of what happens implicitly being so unsafe.

I disagree, because that's not what "cast" means.

"cast" means to (try to) project a value into another type.

*Looks around* sorry, are we still in comp.lang.c?

"Cast" has a number of meanings in contexts outside C, applicable to
dice, eyes, fishing lines, ballots, magic spells, actors, liquid metal,
and broken limbs, among other things. In C, it means what the standard
says it means, even if some people misuse it to refer to implicit
conversions.

Post by Kaz Kylheku
In C though, the nuance is something like "conversion that is mediated
by the presence of the cast notation", where "mediated" includes the
possibility that the cast notation has no effect at all
(e.g. 2 + (int) 3).

I hadn't noticed before that the standard does have a formal definition
of the term "cast", as well as "explicit conversion" and "implicit
conversion".

Quoting N3220, and using *...* to denote italics (definitions) (earlier
editions are identical as far as I can tell) :

6.3p1 :
Several operators convert operand values from one type to another
automatically. This subclause specifies the result required from
such an *implicit conversion*, as well as those that result from a
cast operation (an *explicit conversion*).
p2 :
Unless explicitly stated otherwise, conversion of an operand value
to a compatible type causes no change to the value or the
representation.

(I don't know of anything that explicitly states otherwise.)

6.5.5p6 :
Preceding an expression by a parenthesized type name converts the
value of the expression to the unqualified, non-atomic version of
the named type. This construction is called a *cast*. A cast that
specifies no conversion has no effect on the type or value of an
expression.

The syntax for a cast-expression is :

cast-expression :
unary-expression
( type-name ) cast-expression

The term "cast" refers to a cast-expression that matches the second
alternative.

The "no conversion" wording is odd. Most likely the intent is that
casting an expression to its own type, like `(int)3`, "specifies no
conversion". This is supported by the statement in 6.3: "Several
operators convert operand values from one type **to another**
automatically." (emphasis added).

My own preference would be to say that a conversion of an expression to
its own type is still a conversion, but a trivial one.

If it's really the case that not every cast specifies a conversion, then
defining a "cast" as an "explicit conversion" is not quite correct.
That would IMHO be unfortunate. It's also inconsistent with the statement
two sentences earlier in the same paragraph, which says unconditionally
that a cast converts the value.

In any case, while there may be some ambiguity about whether all casts
specify conversions, it is unambiguous that an implicit conversion is
not a cast.
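
To make the wording quoted above concrete, here is a minimal sketch (not from the thread) that uses C11 `_Generic` to look at the type of a few expressions. The macro `TYPE_CODE` and the function names are made up for illustration. It assumes a C11 or later compiler.

```c
#include <assert.h>

/* Hypothetical helper: 1 for int, 2 for float, 3 for double, 0 otherwise. */
#define TYPE_CODE(e) _Generic((e), int: 1, float: 2, double: 3, default: 0)

/* The constant 3 already has type int. */
int code_of_plain(void)      { return TYPE_CODE(3); }

/* (int)3 is a cast, but it names the operand's own type:
   the type and the value are the same as without it. */
int code_of_same_cast(void)  { return TYPE_CODE((int)3); }

/* (float)3 is a cast that does change the type. */
int code_of_float_cast(void) { return TYPE_CODE((float)3); }

/* Kaz's example: the cast here has no observable effect. */
int sum_with_noop_cast(void) { return 2 + (int)3; }
```

Calling `code_of_plain()` and `code_of_same_cast()` gives the same code, while `code_of_float_cast()` gives a different one. Whether `(int)3` "specifies a conversion" is a wording question; the observable result is the same either way.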

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Kaz Kylheku

2024-08-10 00:06:10 UTC

Post by Keith Thompson

Post by Kaz Kylheku

Post by Keith Thompson
[...]

Post by Kaz Kylheku
It almost makes sense to speak of "implicit cast" (i.e. coercion) in C,
because of what happens implicitly being so unsafe.

I disagree, because that's not what "cast" means.

"cast" means to (try to) project a value into another type.

*Looks around* sorry, are we still in comp.lang.c?
"Cast" has a number of meanings in contexts outside C, applicable to
dice, eyes, fishing lines, ballots, magic spells, actors, liquid metal,
and broken limbs, among other things. In C, it means what the standard
says it means, even if some people misuse it to refer to implicit
conversions.

Post by Kaz Kylheku
In C though, the nuance is something like "conversion that is mediated
by the presence of the cast notation", where "mediated" includes the
possibility that the cast notation has no effect at all
(e.g. 2 + (int) 3).

I hadn't noticed before that the standard does have a formal definition
of the term "cast", as well as "explicit conversion" and "implicit
conversion".

Based on surveying all you quoted, I basically nailed it above.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca

Keith Thompson

2024-08-10 00:27:19 UTC

Post by Kaz Kylheku

Post by Keith Thompson

Post by Kaz Kylheku

Post by Keith Thompson
[...]

Post by Kaz Kylheku
It almost makes sense to speak of "implicit cast" (i.e. coercion) in C,
because of what happens implicitly being so unsafe.

I disagree, because that's not what "cast" means.

"cast" means to (try to) project a value into another type.

*Looks around* sorry, are we still in comp.lang.c?
"Cast" has a number of meanings in contexts outside C, applicable to
dice, eyes, fishing lines, ballots, magic spells, actors, liquid metal,
and broken limbs, among other things. In C, it means what the standard
says it means, even if some people misuse it to refer to implicit
conversions.

Post by Kaz Kylheku
In C though, the nuance is something like "conversion that is mediated
by the presence of the cast notation", where "mediated" includes the
possibility that the cast notation has no effect at all
(e.g. 2 + (int) 3).

I hadn't noticed before that the standard does have a formal definition
of the term "cast", as well as "explicit conversion" and "implicit
conversion".

Based on surveying all you quoted, I basically nailed it above.

I honestly don't know what you mean by that. Feel free to explain if
you wish.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

James Kuyper

2024-08-10 00:31:44 UTC

...

Post by Kaz Kylheku

Post by Keith Thompson

Post by Kaz Kylheku
"cast" means to (try to) project a value into another type.

...

Post by Kaz Kylheku

Post by Keith Thompson
I hadn't noticed before that the standard does have a formal definition
of the term "cast", ...

Neither had I - noticing that earlier would have shortened this
discussion (I hope).

That definition is:
"Preceding an expression by a parenthesized type name converts the value
of the expression to the unqualified, non-atomic version of the named
type. This construction is called a cast." (6.5.4p6)

The word "cast" is italicized, an ISO convention indicating that the
sentence in which it occurs is the official definition of that term.

Post by Kaz Kylheku
Based on surveying all you quoted, I basically nailed it above.

Not quite. A cast is "this construction", namely "preceding an
expression by a parenthesized type name". It describes a portion of the
text of a program. What you described as a cast is the semantics
associated with that construct when the program is compiled, not the
construct itself. (float)3 is a cast. "Convert 3 to a float" is not.

Bart

2024-08-10 00:11:08 UTC

Post by Keith Thompson
In any case, while there may be some ambiguity about whether all casts
specify conversions, it is unambiguous that an implicit conversion is
not a cast.

Here are some comments from the Tiny C source code:

/* XXX: implicit cast ? */
/* compute bigger type and do implicit casts */

This is from some gcc code:

..and some compilers cast it to int implicitly ...

Come on, everybody's at it. Unless we're trying to do a reference
document, then /it really doesn't matter/.

Tim Rentsch

2024-08-12 00:32:54 UTC

Post by Kaz Kylheku

Post by Keith Thompson
[...]

Post by Kaz Kylheku
It almost makes sense to speak of "implicit cast" (i.e. coercion) in C,
because of what happens implicitly being so unsafe.

I disagree, because that's not what "cast" means.

"cast" means to (try to) project a value into another type.

In programming, "cast" means to force a value to be expressed
in a particular type, regardless of whether the output type
differs from the input type.

James Kuyper

2024-08-09 22:35:52 UTC

...

Post by Kaz Kylheku

Post by James Kuyper
Of course - an implicit conversion has exactly the same effect as an
explicit conversion, if the source and destination types are the same.
That doesn't make it correct to use the term "cast" to describe anything
other than an explicit conversion.

That's all very neat and clean. However, the problem is that in C,
some of the implicit conversions are unsafe.
- truncate an integer value to fit a narrower type.
- convert between floating point and integer in a way that the value
is out of range, with undefined behavior ensuing.
- change the value: e.g -1 becomes UINT_MAX.
- subvert the type system, e.g. (foo *) -> (void *) -> (bar *).

The implicit conversions are implicit precisely because they tend to be
safer than the ones that cannot be done implicitly. That doesn't mean
that they're safe.
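
A small sketch of the implicit conversions listed in the quoted text, none of which needs a cast. The function names are invented for illustration. The out-of-range floating-point case is left out on purpose, since its behavior is undefined.

```c
#include <assert.h>
#include <limits.h>

/* -1 converted to unsigned int wraps modulo UINT_MAX + 1,
   so the result is UINT_MAX. No cast is written anywhere. */
unsigned int minus_one_as_unsigned(void)
{
    int i = -1;
    unsigned int u = i;
    return u;
}

/* Conversion to an unsigned type that cannot hold the value is
   reduced modulo UCHAR_MAX + 1 (300 becomes 44 when unsigned char
   has 8 bits). */
unsigned char narrow_to_uchar(int value)
{
    unsigned char c = value;
    return c;
}

/* Floating point to integer discards the fractional part
   (truncation toward zero) when the result is in range. */
int truncate_double(double d)
{
    int n = d;
    return n;
}
```

All three compile without a diagnostic being required by the standard, although many compilers can warn about them when asked to.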

Post by Kaz Kylheku
In computer science, we refer to unsafe conversion as coercion.
The cast notation is C's coercion operation.

Yes, and in the C standard, conversions are simply called conversions,
regardless of how unsafe they are - for instance, you can't get much
less safe than (void (*))printf; but it's still a conversion. It's
confusing to discuss C using terms that have a standard-defined meaning,
if you insist on using them with a conflicting meaning.

Post by Kaz Kylheku
It almost makes sense to speak of "implicit cast" (i.e. coercion) in C,
because of what happens implicitly being so unsafe.

No, it doesn't. If lack of safety is what makes them implicit, then
casts should be even more implicit than implicit conversions.

Tim Rentsch

2024-08-12 00:27:29 UTC

Post by Kaz Kylheku
In computer science, we refer to unsafe conversion as coercion.

In most programming languages, "coercion" refers to any implicit
conversion (perhaps requiring a change in type), regardless of
whether the conversion is safe or is not.

Post by Kaz Kylheku
The cast notation is C's coercion operation.

Absolutely not. "Coercion" usually means an implicitly caused
conversion, following the term being introduced in Algol 68.
A cast in C is exactly the opposite, always explicit.

Disclaimer: the above comments are based on cursory research to
reinforce my memory of literature read years ago, and supported
by Wikipedia.

Keith Thompson

2024-08-09 19:23:34 UTC

Bart <***@freeuk.com> writes:
[...]

Post by Bart
Since C likes to use the term 'cast' for such conversions,

C never uses the term "cast" to refer to implicit conversions.

Post by Bart
I don't see
a problem with talking about implicit and explicit versions.
It just seems to irk the pedantics here.

That does seem to be your goal.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Bart

2024-08-09 20:31:23 UTC

Post by Keith Thompson
[...]

Post by Bart
Since C likes to use the term 'cast' for such conversions,

C never uses the term "cast" to refer to implicit conversions.

Post by Bart
I don't see
a problem with talking about implicit and explicit versions.
It just seems to irk the pedantics here.

That does seem to be your goal.
x = (double)a;
The cast is implicit here but I've written it out to make it clear.

I was replying to Thiago Adams.

I didn't choose to muddy the waters. Dragging C standard legalese really
doesn't help here.

(90% of the discussion here could be shut down by referring people to
the C standard. What else is needed to answer nearly all questions asked?

Could it possibly be a more human, informal element?)

Keith Thompson

2024-08-09 20:49:10 UTC

Bart <***@freeuk.com> writes:
[...]

Post by Bart
I didn't choose to muddy the waters.

Yes you did.

Post by Bart
Dragging C standard legalese
really doesn't help here.

Yes it does.

You seem to believe, or you're pretending to believe, that using
terminology incorrectly is better than using it correctly, even when
using it correctly is just as easy.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Bart

2024-08-09 21:01:37 UTC

Post by Keith Thompson
[...]

Post by Bart
I didn't choose to muddy the waters.

Yes you did.

Post by Bart
Dragging C standard legalese
really doesn't help here.

Yes it does.
You seem to believe, or you're pretending to believe, that using
terminology incorrectly is better than using it correctly, even when
using it correctly is just as easy.

See my reply to JK.

I don't think it's me who's trying to confuse by banning terms that
everyone understands perfectly well.

Tim Rentsch

2024-08-12 07:33:38 UTC

[...]

Post by Bart
int a; double x;
x = (double)a;
The cast is implicit here but I've written it out to make it
clear.

[...]

Since C likes to use the term 'cast' for such conversions, [...]

The C standard uses the term 'cast' only for explicit conversions,
and never uses the term 'cast' for implicit conversions. You
should do the same.

It just seems to irk the pedantics here.

What bothers people is not you using the wrong terminology. What
bothers people is you being a self-centered jerk, and deliberately
using incorrect terminology just to annoy people.

Incidentally, the word "pedantic" is an adjective. The noun form
is "pedant". People who complain about you using terminology
incorrectly are not being pedants. They simply are offended by
your never-ending efforts to be a pest and an asshole.

Tim Rentsch

2024-08-12 00:46:58 UTC

Post by Keith Thompson
[...]

Post by Bart
int a; double x;
x = (double)a;
The cast is implicit here but I've written it out to make it clear.

[...]
The *conversion* could be done implicitly, but you've used a cast (i.e.,
an explicit conversion) to make it clear.

The statement assigning to x performs two conversions: an explicit
one caused by the cast, and an implicit one caused by the assignment
operation.

Bart

2024-08-12 01:00:15 UTC

Post by Tim Rentsch

Post by Keith Thompson
[...]

Post by Bart
int a; double x;
x = (double)a;
The cast is implicit here but I've written it out to make it clear.

[...]
The *conversion* could be done implicitly, but you've used a cast (i.e.,
an explicit conversion) to make it clear.

The statement assigning to x performs two conversions: an explicit
one caused by the cast, and an implicit one caused by the assignment
operation.

The 'x' term is the other side of the cast from the 'a' term.

So after '(double)a' has been evaluated, both sides of '=' have the type
'double', so no further conversion is needed.

Keith Thompson

2024-08-12 03:23:20 UTC

Post by Tim Rentsch

Post by Keith Thompson
[...]

Post by Bart
int a; double x;
x = (double)a;
The cast is implicit here but I've written it out to make it clear.

[...]
The *conversion* could be done implicitly, but you've used a cast (i.e.,
an explicit conversion) to make it clear.

The statement assigning to x performs two conversions: an explicit
one caused by the cast, and an implicit one caused by the assignment
operation.

The 'x' term is the other side of the cast from the 'a' term.
So after '(double)a' has been evaluated, both sides of '=' have the
type 'double', so no further conversion is needed.

Bart, I'm sure you don't care about this, but others might. Please do
us all a favor and don't argue about it.

The standard says that "In *simple assignment* (=), the value of the
right operand is converted to the type of the assignment expression and
replaces the value stored in the object designated by the left operand."

This is unambiguous, and for your example it means that the result of
evaluating the RHS is *converted* from double to double.

"Conversion of an operand value to a compatible type causes no change to
the value or the representation." double is compatible with itself, so
you presumably won't find any evidence of a conversion if you examine
the generated code.

The standard could have described this differently, perhaps by saying
that no conversion is performed if the LHS and RHS have the same type,
but it says what it says.
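
As an illustration of the assignment rule quoted above, a minimal sketch with made-up function names. It shows that the cast in `x = (double)a` is optional, and that the value of an assignment expression is the already converted value.

```c
#include <assert.h>

/* Simple assignment converts the right operand to the type of the
   left operand; no cast is needed for int -> double. */
double assign_without_cast(int a)
{
    double x;
    x = a;
    return x;
}

/* With the cast written out: the cast converts int -> double, and the
   assignment then converts double -> double, which changes nothing. */
double assign_with_cast(int a)
{
    double x;
    x = (double)a;
    return x;
}

/* The assignment expression itself has the left operand's type,
   so (n = d) yields the converted int value, not the double. */
int value_of_assignment(double d)
{
    int n;
    return (n = d);
}
```

Both assignment forms store the same value in `x`, which is consistent with the point that the trivial double-to-double conversion leaves no trace in the result.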

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Tim Rentsch

2024-08-12 03:37:08 UTC

Post by Tim Rentsch

Post by Keith Thompson
[...]

Post by Bart
int a; double x;
x = (double)a;
The cast is implicit here but I've written it out to make it clear.

[...]
The *conversion* could be done implicitly, but you've used a cast (i.e.,
an explicit conversion) to make it clear.

The statement assigning to x performs two conversions: an explicit
one caused by the cast, and an implicit one caused by the assignment
operation.

The 'x' term is the other side of the cast from the 'a' term.
So after '(double)a' has been evaluated, both sides of '=' have the
type 'double', so no further conversion is needed.

The C standard requires that a conversion take place as part of
the assignment, even when the types are the same.

Furthermore, there are cases where having to do a conversion from
one type to the same type has semantic consequences, even though
the types are the same.

Keith Thompson

2024-08-12 04:33:23 UTC

Tim Rentsch <***@z991.linuxsc.com> writes:
[...]

Post by Tim Rentsch
Furthermore, there are cases where having to do a conversion from
one type to the same type has semantic consequences, even though
the types are the same.

What are these cases?

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Thiago Adams

2024-08-09 10:57:17 UTC

Post by Thiago Adams

Post by Keith Thompson
[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one from the IEEE in 1985
(uh, IEEE754, I think?)

I didn't specify properly , but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?
Some CPUs have floating-point registers, some don't. C says nothing
about registers.
What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

This part is related with the previous question about the origins of
integer promotions.
We don't have "char" register or signed/unsigned register. But I
believe we may have double and float registers. So float does not need
to be converted to double.
There is no specific question here, just trying to understand the
rationale behind the conversion rules.

The rules have little to do with concrete machines with registers.
Your initial post showed some confusion about how conversions work. They
are not performed 'in-place', any more than writing `a + 1` changes the
value of `a`.
    int a; double x;
    x = (double)a;
The cast is implicit here but I've written it out to make it clear. My C
compiler produces intermediate code like this before converting it to
    push x   r64                   # r64 means float64
    fix      r64 -> i32
    pop a   i32
I could choose to interpret this code just as it is. Then, in this
execution model, there are no registers at all, only a stack that can
hold data of any type.
The 'fix' instruction pops the double value from the stack, converts it
to int (which involves changing both the bit-pattern, and the
bit-width), and pushes it back onto the stack.
Registers come into it when running it directly on a real machine. But
you seem more concerned with safety and correctness than performance, so
there's probably no real need to look at actual generated native code.
That'll just be confusing (especially if you follow the advice to
generate only optimised code).

This part was always clear to me:

"They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`."

Let's take double to int.

In this case the bits of double need to be reinterpreted (copied to) int.

So the answer "how it works" can be

always/generally the machine has an instruction to do this

or.. this is defined by the IEEE ... standard as ...

Bart

2024-08-09 15:25:41 UTC

Post by Thiago Adams

Post by Keith Thompson
[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one from the IEEE in 1985
(uh, IEEE754, I think?)

I didn't specify properly , but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?
Some CPUs have floating-point registers, some don't. C says nothing
about registers.
What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

This part is related with the previous question about the origins of
integer promotions.
We don't have "char" register or signed/unsigned register. But I
believe we may have double and float registers. So float does not
need to be converted to double.
There is no specific question here, just trying to understand the
rationale behind the conversion rules.

The rules have little to do with concrete machines with registers.
Your initial post showed some confusion about how conversions work.
They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`.
     int a; double x;
     x = (double)a;
The cast is implicit here but I've written it out to make it clear. My
C compiler produces intermediate code like this before converting it
     push x   r64                   # r64 means float64
     fix      r64 -> i32
     pop a   i32
I could choose to interpret this code just as it is. Then, in this
execution model, there are no registers at all, only a stack that can
hold data of any type.
The 'fix' instruction pops the double value from the stack, converts
it to int (which involves changing both the bit-pattern, and the
bit-width), and pushes it back onto the stack.
Registers come into it when running it directly on a real machine. But
you seem more concerned with safety and correctness than performance,
so there's probably no real need to look at actual generated native code.
That'll just be confusing (especially if you follow the advice to
generate only optimised code).

"They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`."
Let's take double to int.
In this case the bits of double need to be reinterpreted (copied to) int.
So the answer "how it works" can be
always/generally the machine has an instruction to do this

If it supports those types in hardware, then it will probably have
conversion instructions.

But I've also used machines without FP hardware, so it had to be
emulated in software. Then such a conversion would be done by a function
in the runtime library.

(Typically, also, such a machine had a FP type which might have been
twice the width of a machine word, eg. `int` was `i16`, and `float` was
`f32`. So those functions had to deal with that too.

Currently we're in a golden age where nearly everything is 64 bits, and
both 64-bit ints and floats suffice for most purposes. So that aspect is
no longer an issue.

Other than you now have to juggle 4 integer widths instead of just 2 or 3.)

Keith Thompson

2024-08-09 19:06:48 UTC

Bart <***@freeuk.com> writes:
[...]

Post by Bart
Currently we're in a golden age where nearly everything is 64 bits,
and both 64-bit ints and floats suffice for most purposes. So that
aspect is no longer an issue.

Most modern hosted implementations have 32-bit int and 32-bit float.
They do have 64-bit integer types and 64-bit floating-point types. As
you know, "int" and "float" are specific types, not just abbreviations
for "integer type" and "floating-point type".

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

David Brown

2024-08-09 17:20:27 UTC

Post by Thiago Adams

Post by Keith Thompson
[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one from the IEEE in 1985
(uh, IEEE754, I think?)

I didn't specify properly , but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?
Some CPUs have floating-point registers, some don't. C says nothing
about registers.
What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

This part is related with the previous question about the origins of
integer promotions.
We don't have "char" register or signed/unsigned register. But I
believe we may have double and float registers. So float does not
need to be converted to double.
There is no specific question here, just trying to understand the
rationale behind the conversion rules.

The rules have little to do with concrete machines with registers.
Your initial post showed some confusion about how conversions work.
They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`.
     int a; double x;
     x = (double)a;
The cast is implicit here but I've written it out to make it clear. My
C compiler produces intermediate code like this before converting it
     push x   r64                   # r64 means float64
     fix      r64 -> i32
     pop a   i32
I could choose to interpret this code just as it is. Then, in this
execution model, there are no registers at all, only a stack that can
hold data of any type.
The 'fix' instruction pops the double value from the stack, converts
it to int (which involves changing both the bit-pattern, and the
bit-width), and pushes it back onto the stack.
Registers come into it when running it directly on a real machine. But
you seem more concerned with safety and correctness than performance,
so there's probably no real need to look at actual generated native code.
That'll just be confusing (especially if you follow the advice to
generate only optimised code).

"They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`."
Let's take double to int.
In this case the bits of double need to be reinterpreted (copied to) int.
So the answer "how it works" can be
always/generally the machine has an instruction to do this
or.. this is defined by the IEEE ... standard as ...

It would be helpful if you made more of an effort to write clearly here.
(We know you can do so when you want to.) It is very difficult to
follow what you are referring to here - what is "this case" here? A
conversion from a double to an int certainly does not re-interpret or
copy bits - like other conversions, it copies the /value/ to the best
possible extent given the limitations of the types.
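
The difference between converting the value and reinterpreting the bits can be sketched as follows. This is an illustration only: the function names are invented, and `copy_bits` assumes that `double` is 64 bits wide, as it is with the common IEEE 754 binary64 format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Conversion: the result is the value of d with the fractional part
   discarded. The bit pattern of the result is unrelated to that of d. */
int convert_value(double d)
{
    return (int)d;
}

/* Reinterpretation: copy the object representation of d into an
   integer. Assumption: sizeof(double) == sizeof(uint64_t). */
uint64_t copy_bits(double d)
{
    uint64_t bits = 0;
    memcpy(&bits, &d, sizeof bits < sizeof d ? sizeof bits : sizeof d);
    return bits;
}
```

On an implementation using IEEE 754 binary64, `convert_value(1.0)` is 1 while `copy_bits(1.0)` is 0x3FF0000000000000, which shows that a cast from double to int is not a copy of bits.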

Thiago Adams

2024-08-09 18:54:19 UTC

Post by David Brown

Post by Thiago Adams

Post by Keith Thompson
[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one from the IEEE in 1985
(uh, IEEE754, I think?)

I didn't specify properly , but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?
Some CPUs have floating-point registers, some don't. C says nothing
about registers.
What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

This part is related with the previous question about the origins of
integer promotions.
We don't have "char" register or signed/unsigned register. But I
believe we may have double and float registers. So float does not
need to be converted to double.
There is no specific question here, just trying to understand the
rationale behind the conversion rules.

The rules have little to do with concrete machines with registers.
Your initial post showed some confusion about how conversions work.
They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`.
     int a; double x;
     x = (double)a;
The cast is implicit here but I've written it out to make it clear.
My C compiler produces intermediate code like this before converting
     push x   r64                   # r64 means float64
     fix      r64 -> i32
     pop a   i32
I could choose to interpret this code just as it is. Then, in this
execution model, there are no registers at all, only a stack that can
hold data of any type.
The 'fix' instruction pops the double value from the stack, converts
it to int (which involves changing both the bit-pattern, and the
bit-width), and pushes it back onto the stack.
Registers come into it when running it directly on a real machine.
But you seem more concerned with safety and correctness than
performance, so there's probably no real need to look at actual
generated native code.
That'll just be confusing (especially if you follow the advice to
generate only optimised code).

"They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`."
Let's take double to int.
In this case the bits of double need to be reinterpreted (copied to) int.
So the answer "how it works" can be
always/generally the machine has an instruction to do this
or.. this is defined by the IEEE ... standard as ...

It would be helpful if you made more of an effort to write clearly here.
(We know you can do so when you want to.) It is very difficult to
follow what you are referring to here - what is "this case" here? A
conversion from a double to an int certainly does not re-interpret or
copy bits - like other conversions, it copies the /value/ to the best
possible extent given the limitations of the types.

Everything is a bit mixed up, but I'll try to explain the part about
registers that I have in mind.

In C, when you have an expression like char + char, each char is
promoted to int. The computation then occurs as int + int.

On the other hand, when you have float + float, it remains as float + float.

My guess for this design is that computations involving char are done
using registers that are the size of an int.

But, float + float is not promoted to double, so I assume that the
computer has specific float registers or similar operation instructions
for float.
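
The promotion behavior described here can be checked at the language level, without looking at registers, with a short C11 sketch. The macros `IS_INT` and `IS_FLOAT` and the function names are invented for illustration. The `char` case assumes the usual situation where `int` can represent every `char` value.

```c
#include <assert.h>

/* Hypothetical helpers: 1 if the expression has the named type. */
#define IS_INT(e)   _Generic((e), int: 1, default: 0)
#define IS_FLOAT(e) _Generic((e), float: 1, default: 0)

/* char + char: both operands are promoted, the result type is int. */
int char_sum_is_int(void)
{
    char a = 1, b = 2;
    return IS_INT(a + b);
}

/* float + float: no promotion to double, the result type is float. */
int float_sum_is_float(void)
{
    float x = 1.0f, y = 2.0f;
    return IS_FLOAT(x + y);
}

/* Because the addition is done in int, 200 + 100 gives 300 even though
   300 does not fit in an 8-bit unsigned char. */
int promoted_sum(void)
{
    unsigned char a = 200, b = 100;
    return a + b;
}
```

This only shows what type the language assigns to each expression. Which registers or instructions a given compiler uses is a separate matter.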

Regarding the part about signed/unsigned registers and operations, I
must admit that I'm not sure. I was planning to check on Compiler
Explorer, but I haven't done that yet.

I can frame the question like this: Does the computer make a distinction
when adding signed versus unsigned integers? Are there specific assembly
instructions for signed versus unsigned operations, covering all
possible combinations?
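
One way to explore this question from the C side is to compare operations on the same bit pattern under signed and unsigned types. The sketch below uses invented function names. On two's complement machines addition typically gives the same bits either way, while comparison and division give different results, so they need to be distinguished somehow.

```c
#include <assert.h>
#include <limits.h>

/* Unsigned addition wraps modulo UINT_MAX + 1, so adding the pattern
   for -3 to 5 still gives 2. */
unsigned int add_unsigned(unsigned int a, unsigned int b) { return a + b; }

/* Comparison depends on signedness: -1 < 1 is true, but the same bit
   pattern viewed as unsigned is the largest value. */
int less_signed(int a, int b)                     { return a < b; }
int less_unsigned(unsigned int a, unsigned int b) { return a < b; }

/* Division also depends on signedness. */
int          div_signed(int a, int b)                     { return a / b; }
unsigned int div_unsigned(unsigned int a, unsigned int b) { return a / b; }
```

For example, `less_signed(-1, 1)` is 1 while `less_unsigned((unsigned)-1, 1u)` is 0. Looking at the generated code on Compiler Explorer for these functions is a reasonable next step to see which instructions differ on a particular target.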

Thiago Adams

2024-08-09 19:05:21 UTC

Post by Thiago Adams

Post by David Brown

Post by Thiago Adams

Post by Keith Thompson
[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one from the IEEE in 1985
(uh, IEEE754, I think?)

I didn't specify properly , but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?
Some CPUs have floating-point registers, some don't. C says nothing
about registers.
What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

This part is related with the previous question about the origins
of integer promotions.
We don't have "char" register or signed/unsigned register. But I
believe we may have double and float registers. So float does not
need to be converted to double.
There is no specific question here, just trying to understand the
rationale behind the conversion rules.

The rules have little to do with concrete machines with registers.
Your initial post showed some confusion about how conversions work.
They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`.
     int a; double x;
     x = (double)a;
The cast is implicit here but I've written it out to make it clear.
My C compiler produces intermediate code like this before converting
     push x   r64                   # r64 means float64
     fix      r64 -> i32
     pop a   i32
I could choose to interpret this code just as it is. Then, in this
execution model, there are no registers at all, only a stack that
can hold data of any type.
The 'fix' instruction pops the double value from the stack, converts
it to int (which involves changing both the bit-pattern, and the
bit-width), and pushes it back onto the stack.
Registers come into it when running it directly on a real machine.
But you seem more concerned with safety and correctness than
performance, so there's probably no real need to look at actual
generated native code.
That'll just be confusing (especially if you follow the advice to
generate only optimised code).

"They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`."
Let's take double to int.
In this case the bits of the double need to be reinterpreted (copied to) int.
So the answer to "how it works" can be:
the machine always/generally has an instruction to do this,
or.. this is defined by the IEEE ... standard as ...

It would be helpful if you made more of an effort to write clearly
here. (We know you can do so when you want to.) It is very
difficult to follow what you are referring to here - what is "this
case" here? A conversion from a double to an int certainly does not
re-interpret or copy bits - like other conversions, it copies the
/value/ to the best possible extent given the limitations of the types.

Everything is a bit mixed up, but I'll try to explain the part about
registers that I have in mind.
In C, when you have an expression like char + char, each char is
promoted to int. The computation then occurs as int + int.
On the other hand, when you have float + float, it remains as float + float.
My guess for this design is that computations involving char are done
using registers that are the size of an int.
But, float + float is not promoted to double, so I assume that the
computer has specific float registers or similar operation instructions
for float.
Regarding the part about signed/unsigned registers and operations, I
must admit that I'm not sure. I was planning to check on Compiler
Explorer, but I haven't done that yet.
I can frame the question like this: Does the computer make a distinction
when adding signed versus unsigned integers? Are there specific assembly
instructions for signed versus unsigned operations, covering all
possible combinations?

and I am still curious if _Bool/bool makes the programs slower (more
instructions) compared with
"typedef int bool" because the generated code has to convert bool->int
int->bool all the time.

David Brown

2024-08-09 19:43:44 UTC

Post by Thiago Adams
and I am still curious if _Bool/bool makes the programs slower (more
instructions) compared with
"typedef int bool" because the generated code has to convert bool->int
int->bool all the time.

I'll answer this one first - the answer is, it depends. Some things
will be faster, some slower, some stay much the same. Overall, expect
proper booleans to be somewhat more efficient.

Keith Thompson

2024-08-09 20:28:15 UTC

Thiago Adams <***@gmail.com> writes:
[...]

Post by Thiago Adams
and I am still curious if _Bool/bool makes the programs slower (more
instructions) compared with
"typedef int bool" because the generated code has to convert bool->int
int->bool all the time.

Maybe. The only way to answer that would be perform measurements.
And you'll have to be careful to ensure that the operations aren't
optimized away. Writing useful benchmarks can be tricky.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

David Brown

2024-08-09 20:01:22 UTC

Post by Thiago Adams
Everything is a bit mixed up, but I'll try to explain the part about
registers that I have in mind.
In C, when you have an expression like char + char, each char is
promoted to int. The computation then occurs as int + int.

Yes. In C, there are no arithmetic operations on types smaller than "int".

Post by Thiago Adams
On the other hand, when you have float + float, it remains as float + float.

Yes.

The rules for promotions are quite clear in the standards. You can also
read about them here: <https://en.cppreference.com/w/c/language/conversion>

Post by Thiago Adams
My guess for this design is that computations involving char are done
using registers that are the size of an int.

It is quite possible that this was the original motivation.

But you keep referring to "the computer". There is no "the computer" in
C. There are processors with 128 64-bit integer registers, and
processors with a single 8-bit register. Some have no floating point
hardware at all, some have 128-bit floating point hardware. C is
defined in a manner that is mostly independent of the hardware, with
only a few points that are dependent (such as the width of integer
types). But design decisions in C can certainly have been inspired by
real existing hardware.

Post by Thiago Adams
But, float + float is not promoted to double, so I assume that the
computer has specific float registers or similar operation instructions
for float.
Regarding the part about signed/unsigned registers and operations, I
must admit that I'm not sure. I was planning to check on Compiler
Explorer, but I haven't done that yet.

I don't know what you are referring to here. But if you are using
compiler explorer, I encourage you to look at the generated output for a
wide range of targets, including 8-bit AVR, 16-bit MSP430, 32-bit ARM,
and 64-bit x86. Use gcc -O1 or -O2 in every case. (Ignore Bart's
ignorant blatherings about optimisation.)

Post by Thiago Adams
I can frame the question like this: Does the computer make a distinction
when adding signed versus unsigned integers? Are there specific assembly
instructions for signed versus unsigned operations, covering all
possible combinations?

Without specifying "the computer", the question is not particularly
meaningful. However, it's fair to say that on most processors most
arithmetic operations are the same for signed and unsigned types as long
as the operation is done at a size that the target supports (otherwise
it may need sign or zero extensions if it only supports larger sizes).

Bart

2024-08-10 10:17:39 UTC

I don't know what you are referring to here. But if you are using
compiler explorer, I encourage you to look at the generated output for a
wide range of targets, including 8-bit AVR, 16-bit MSP430, 32-bit ARM,
and 64-bit x86. Use gcc -O1 or -O2 in every case.

When would you choose -O2 over -O1 or vice versa? Could a similar
circumstance cause you to choose -O0? Why not -O3?

In fact, why is there a -O0 option at all?

(Ignore Bart's

ignorant blatherings about optimisation.)

[To TA:]

Yes do. But don't complain to me when your test code results in
meaningless or misleading output, or no output at all.

Actually, I would recommend looking at both (eg. -O0 and -O1) so that
you can see if the compiler's optimiser has been over-zealous in
eliminating code, or has chopped out key bits, so that you might modify
your test code.

I would recommend also looking at the Tiny C option on godbolt when
comparing x86 code.

Thiago Adams

2024-08-10 13:15:28 UTC

I don't know what you are referring to here. But if you are using
compiler explorer, I encourage you to look at the generated output for
a wide range of targets, including 8-bit AVR, 16-bit MSP430, 32-bit
ARM, and 64-bit x86. Use gcc -O1 or -O2 in every case.

When would you choose -O2 over -O1 or vice versa? Could a similar
circumstance cause you to choose -O0? Why not -O3?
In fact, why is there a -O0 option at all?
(Ignore Bart's

ignorant blatherings about optimisation.)

[To TA:]
Yes do. But don't complain to me when your test code results in
meaningless or misleading output, or no output at all.
Actually, I would recommend looking at both (eg. -O0 and -O1) so that
you can see if the compiler's optimiser has been over-zealous in
eliminating code, or has chopped out key bits, so that you might modify
your test code.
I would recommend also looking at the Tiny C option on godbolt when
comparing x86 code.

Bart, Does your compiler support the `bool` type, where the value is
always either 1 or 0?

Bart

2024-08-10 16:14:19 UTC

Post by Thiago Adams

I don't know what you are referring to here. But if you are using
compiler explorer, I encourage you to look at the generated output
for a wide range of targets, including 8-bit AVR, 16-bit MSP430,
32-bit ARM, and 64-bit x86. Use gcc -O1 or -O2 in every case.

When would you choose -O2 over -O1 or vice versa? Could a similar
circumstance cause you to choose -O0? Why not -O3?
In fact, why is there a -O0 option at all?
(Ignore Bart's

ignorant blatherings about optimisation.)

[To TA:]
Yes do. But don't complain to me when your test code results in
meaningless or misleading output, or no output at all.
Actually, I would recommend looking at both (eg. -O0 and -O1) so that
you can see if the compiler's optimiser has been over-zealous in
eliminating code, or has chopped out key bits, so that you might
modify your test code.
I would recommend also looking at the Tiny C option on godbolt when
comparing x86 code.

Bart, Does your compiler support the `bool` type, where the value is
always either 1 or 0?

There is a bool type, but it is treated like unsigned char, so is
non-conforming.

Thiago Adams

2024-08-10 23:01:36 UTC

Post by Thiago Adams
Bart, Does your compiler support the `bool` type, where the value is
always either 1 or 0?

There is a bool type, but it is treated like unsigned char, so is
non-conforming.

I do the same in my compiler when I transpile from C99 to C89.
I was thinking about how to make it conforming,
for instance by normalizing on each write:

bool b = 123; -> unsigned char b = !!(123);
The problem is that this does not fix unions (writing to an int member
and reading from a char member).

Keith Thompson

2024-08-11 00:10:42 UTC

Post by Thiago Adams

Post by Thiago Adams
Bart, Does your compiler support the `bool` type, where the value
is always either 1 or 0?

There is a bool type, but it is treated like unsigned char, so is
non-conforming.

I do the same in my compiler when I transpile from C99 to C89.
I was thinking about how to make it conforming,
for instance by normalizing on each write:
bool b = 123; -> unsigned char b = !!(123);
The problem is that this does not fix unions (writing to an int member
and reading from a char member).

I don't think you need to fix that.

In the following, I'll refer to _Bool. The same type is also called
bool if you have `#include <stdbool.h>` *or* if you have a C23 compiler.

It's always going to be possible to use type punning (memcpy, pointer
casting, union) to force a representation other than 00000000 or
00000001 into a _Bool object.

The standard doesn't have a rule that says a _Bool object can only have
the value 0 or 1. It says that *conversion* to _Bool yields a result of
0 or 1. And yes, you have to deal with that if you're translating C99
or later to C90, for both explicit and implicit conversions.

Suppose you do something like this:

_Bool b;
*(unsigned char*)&b = 0xff; // assume sizeof (_Bool) == 1
int i = b;

What is the value of b?

Under C23 rules, _Bool has 1 value bit and N-1 (typically 7) padding
bits. Any non-zero padding bits *either* create a trap representation
(C23 calls it a non-value representation) *or* are ignored when
determining the value of the object.

(_Bool can have either 254 trap representations or none. It's possible
that it might have some different number of trap representations, but
that's unlikely.)

If 11111111 is a trap/non-value representation, the behavior of
`int i = b;` is undefined; setting i to 255 or -1 are two of many
possible behaviors. If the padding bits are ignored, it must set i to 1.

Experiment shows that gcc sets i to 255 (implying that it's a trap
representation) while clang sets i to 1 (which could imply that it's not
a trap representation, but that's still a possible result of UB).

Summary:

Conversion from any scalar type to _Bool is well defined, and must yield
0 or 1.

It's possible to force a representation other than 0 or 1 into a _Bool
object, bypassing any value conversion.

Conversion from _Bool to any scalar type is well defined if the
operand is a _Bool object holding a representation of 0 or 1.

Conversion from _Bool to any scalar type for an object holding some
representation other than 0 or 1 either yields 0 or 1 (depending
on the low-order bit) or has undefined behavior.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Thiago Adams

2024-08-11 12:23:19 UTC

Post by Keith Thompson

Post by Thiago Adams

Post by Thiago Adams
Bart, Does your compiler support the `bool` type, where the value
is always either 1 or 0?

There is a bool type, but it is treated like unsigned char, so is
non-conforming.

I do the same in my compiler when I transpile from C99 to C89.
I was thinking about how to make it conforming,
for instance by normalizing on each write:
bool b = 123; -> unsigned char b = !!(123);
The problem is that this does not fix unions (writing to an int member
and reading from a char member).

I don't think you need to fix that.

[....]

Post by Keith Thompson
Conversion from any scalar type to _Bool is well defined, and must yield
0 or 1.

I will fix it in terms of expression types:

- In this case cast to bool
- Assignment to bool

Post by Keith Thompson
It's possible to force a representation other than 0 or 1 into a _Bool
object, bypassing any value conversion.
Conversion from _Bool to any scalar type is well defined if the
operand is a _Bool object holding a representation of 0 or 1.
Conversion from _Bool to any scalar type for an object holding some
representation other than 0 or 1 either yields 0 or 1 (depending
on the low-order bit) or has undefined behavior.

I did a sample now..

#include <stdio.h>

int main() {
union {
int i;
_Bool b;
} data;
data.i = 123;
printf("%d", data.b);
}

it printed 123 not 1.
So I think the assignment and cast covers all/most cases.
(From some previous tests I thought this was printing 1)

The motivation for C89 in cake was not to support old compilers, but to
generate code that is compatible with C++98. In this respect, bool was
already there in C++98. (This just gave me the idea to add a C++98 target.)

Bart

2024-08-11 12:30:08 UTC

Post by Thiago Adams

Post by Keith Thompson

Post by Thiago Adams

Post by Thiago Adams
Bart, Does your compiler support the `bool` type, where the value
is always either 1 or 0?

There is a bool type, but it is treated like unsigned char, so is
non-conforming.

I do the same in my compiler when I transpile from C99 to C89.
I was thinking about how to make it conforming,
for instance by normalizing on each write:
bool b = 123; -> unsigned char b = !!(123);
The problem is that this does not fix unions (writing to an int member
and reading from a char member).

I don't think you need to fix that.

[....]

Post by Keith Thompson
Conversion from any scalar type to _Bool is well defined, and must yield
0 or 1.

I will fix it in terms of expression types:
- In this case cast to bool
- Assignment to bool

Post by Keith Thompson
It's possible to force a representation other than 0 or 1 into a _Bool
object, bypassing any value conversion.
Conversion from _Bool to any scalar type is well defined if the
operand is a _Bool object holding a representation of 0 or 1.
Conversion from _Bool to any scalar type for an object holding some
representation other than 0 or 1 either yields 0 or 1 (depending
on the low-order bit) or has undefined behavior.

I did a sample now..
#include <stdio.h>
int main() {
    union {
        int i;
        _Bool b;
    } data;
    data.i = 123;
    printf("%d", data.b);
}
it printed 123 not 1.
So I think the assignment and cast covers all/most cases.
(From some previous tests I thought this was printing 1)

That's little different from this example:

#include <stdio.h>

int main() {
union {
int i;
float b;
} data;
data.i = 123;
printf("%e", data.b);
}

I get some arbitrary float value printed. You're supposed to know what
you are doing with unions.

It's not something I'd worry about. If you're trying to make a safer C,
then you'd have to ban unions, or ban bools inside unions that could be
read out as a different type, or introduce tagged unions so that runtime
checking can be done.

Thiago Adams

2024-08-11 17:16:00 UTC

Post by Thiago Adams

Post by Keith Thompson

Post by Thiago Adams

Post by Thiago Adams
Bart, Does your compiler support the `bool` type, where the value
is always either 1 or 0?

There is a bool type, but it is treated like unsigned char, so is
non-conforming.

I do the same in my compiler when I transpile from C99 to C89.
I was thinking about how to make it conforming,
for instance by normalizing on each write:
bool b = 123; -> unsigned char b = !!(123);
The problem is that this does not fix unions (writing to an int member
and reading from a char member).

I don't think you need to fix that.

[....]

Post by Keith Thompson
Conversion from any scalar type to _Bool is well defined, and must yield
0 or 1.

I will fix it in terms of expression types:
- In this case cast to bool
- Assignment to bool

Post by Keith Thompson
It's possible to force a representation other than 0 or 1 into a _Bool
object, bypassing any value conversion.
Conversion from _Bool to any scalar type is well defined if the
operand is a _Bool object holding a representation of 0 or 1.
Conversion from _Bool to any scalar type for an object holding some
representation other than 0 or 1 either yields 0 or 1 (depending
on the low-order bit) or has undefined behavior.

I did a sample now..
#include <stdio.h>
int main() {
     union {
         int i;
         _Bool b;
     } data;
     data.i = 123;
     printf("%d", data.b);
}
it printed 123 not 1.
So I think the assignment and cast covers all/most cases.
(From some previous tests I thought this was printing 1)

#include <stdio.h>
int main() {
     union {
         int i;
         float b;
     } data;
     data.i = 123;
     printf("%e", data.b);
}
I get some arbitrary float value printed. You're supposed to know what
you are doing with unions.

One of my tests led me to the wrong conclusion that reading a boolean
value would cause the compiler to add a conversion on read.
I did something wrong, but I don't remember what. I will try to keep all
tests next time.

But now, everything is back to normal.

union {
int i;
_Bool b;
} data;
data.b = 123;
printf("%d", data.b); //prints 1 as expected

union {
int i;
_Bool b;
} data;
data.i = 123;
printf("%d", data.b); //prints 123 as expected

It's not something I'd worry about. If you're trying to make a safer C,
then you'd have to ban unions, or ban bools inside unions that could be
read out as a different type, or introduce tagged unions so that runtime
checking can be done.

Something that could be done is to check, in a local context, whether the
type of the last write is the same as the type of the last read. Then we
can have a warning if they are different.

Keith Thompson

2024-08-11 20:38:23 UTC

Post by Thiago Adams

Post by Keith Thompson

Post by Thiago Adams

Post by Thiago Adams
Bart, Does your compiler support the `bool` type, where the value
is always either 1 or 0?

There is a bool type, but it is treated like unsigned char, so is
non-conforming.

I do the same in my compiler when I transpile from C99 to C89.
I was thinking about how to make it conforming,
for instance by normalizing on each write:
bool b = 123; -> unsigned char b = !!(123);
The problem is that this does not fix unions (writing to an int member
and reading from a char member).

I don't think you need to fix that.

[....]

Post by Keith Thompson
Conversion from any scalar type to _Bool is well defined, and must yield
0 or 1.

I will fix it in terms of expression types:
- In this case cast to bool
- Assignment to bool

You need to cover all cases where a scalar value is converted to _Bool.
That includes (explicit) casts, assignment, initialization, argument
passing, and returning from a _Bool function.

Ideally you'd have one place in your code that handles conversions, but
you'll want to test all those cases and more.

Post by Thiago Adams

Post by Keith Thompson
It's possible to force a representation other than 0 or 1 into a _Bool
object, bypassing any value conversion.
Conversion from _Bool to any scalar type is well defined if the
operand is a _Bool object holding a representation of 0 or 1.
Conversion from _Bool to any scalar type for an object holding some
representation other than 0 or 1 either yields 0 or 1 (depending
on the low-order bit) or has undefined behavior.

I did a sample now..
#include <stdio.h>
int main() {
union {
int i;
_Bool b;
} data;
data.i = 123;
printf("%d", data.b);
}
it printed 123 not 1.

I get 123 if I compile with gcc, 1 if I compile with clang.
Both results are valid.

Post by Thiago Adams
So I think the assignment and cast covers all/most cases.
(From some previous tests I thought this was printing 1)
The motivation for C89 in cake was not to support old compilers, but to
generate code that is compatible with C++98. In this respect, bool was
already there in C++98. (This just gave me the idea to add a C++98 target.)

C++ may have different rules, which you can discuss in comp.lang.c++.
A lot of your articles have multiple blank lines at the end. Can you
try to avoid that?

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Keith Thompson

2024-08-09 20:26:39 UTC

Post by Thiago Adams

Post by David Brown

Post by Thiago Adams

Post by Keith Thompson
[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one the IEEE published in 1985
(uh, IEEE 754, I think?)

I didn't specify properly, but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?
Some CPUs have floating-point registers, some don't. C says nothing
about registers.
What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

This part is related with the previous question about the origins
of integer promotions.
We don't have "char" register or signed/unsigned register. But I
believe we may have double and float registers. So float does not
need to be converted to double.
There is no specific question here, just trying to understand the
rationale behind the conversion rules.

The rules have little to do with concrete machines with registers.
Your initial post showed some confusion about how conversions
work. They are not performed 'in-place', any more than writing `a
+ 1` changes the value of `a`.
     int a; double x;
     a = (int)x;
The cast is implicit here but I've written it out to make it
clear. My C compiler produces intermediate code like this before
     push x   r64                   # r64 means float64
     fix      r64 -> i32
     pop a   i32
I could choose to interpret this code just as it is. Then, in
this execution model, there are no registers at all, only a stack
that can hold data of any type.
The 'fix' instruction pops the double value from the stack,
converts it to int (which involves changing both the bit-pattern,
and the bit-width), and pushes it back onto the stack.
Registers come into it when running it directly on a real
machine. But you seem more concerned with safety and correctness
than performance, so there's probably no real need to look at
actual generated native code.
That'll just be confusing (especially if you follow the advice to
generate only optimised code).

"They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`."
Let's take double to int.
In this case the bits of the double need to be reinterpreted (copied to) int.
So the answer to "how it works" can be:
the machine always/generally has an instruction to do this,
or.. this is defined by the IEEE ... standard as ...

It would be helpful if you made more of an effort to write clearly
here. (We know you can do so when you want to.) It is very
difficult to follow what you are referring to here - what is "this
case" here? A conversion from a double to an int certainly does not
re-interpret or copy bits - like other conversions, it copies the
/value/ to the best possible extent given the limitations of the
types.

Everything is a bit mixed up, but I'll try to explain the part about
registers that I have in mind.

Why do you keep talking about registers? The C standard does not talk
about registers, and the promotion rules are based on which *operations*
are available, not what kinds of registers a given CPU happens to have.

Post by Thiago Adams
In C, when you have an expression like char + char, each char is
promoted to int. The computation then occurs as int + int.

Right. And the most important thing to keep in mind is that it works
that way *because the C standard says so*.

If you're looking for a rationale for this, it's probably because some
early C compilers were for target systems that didn't provide operations
on narrow types. (The PDP-11 does have instructions that operate on
8-bit bytes, but the arithmetic instructions operate only on 16-bit
words.) But the C standard rules apply to all conforming C
implementations. If a target CPU happens to provide a 1-byte add
instruction, a C compiler can't use it unless it can prove that the
result is consistent with C semantics.

Knowing why the C standard says what it says can be interesting, but
it's not necessarily directly useful.

Post by Thiago Adams
On the other hand, when you have float + float, it remains as float + float.

That's implementation-defined. Read the "Characteristics of floating
types <float.h>" section of the C standard (it's 5.2.5.3.3 in the N3220
draft) and look for FLT_EVAL_METHOD. If FLT_EVAL_METHOD (defined in
<float.h>) is 0, float operations are done in type float. If it's 1,
float operations are done in type double. If it's 2, all floating-point
operations are done in type long double.

Post by Thiago Adams
My guess for this design is that computations involving char are done
using registers that are the size of an int.

Who says the target CPU even has registers, or that computations can't
be done directly on values in memory?

Post by Thiago Adams
But, float + float is not promoted to double, so I assume that the
computer has specific float registers or similar operation
instructions for float.

Again, some CPUs have floating-point registers and some do not. Those
that don't might store floating-point values in the same registers used
for integer values, applying different instructions to operate on them.
Of those that do have floating-point registers, some have registers that
can hold a double value (typically 64 bits); they may or may not be able
to treat a half-register as a 32-bit float value.

The rules in the C standard are, for the most part, based on the
capabilities of CPUs that existed when the standard was written, not
necessarily on modern CPUs (though there have been tweaks in later
editions).

Post by Thiago Adams
Regarding the part about signed/unsigned registers and operations, I
must admit that I'm not sure. I was planning to check on Compiler
Explorer, but I haven't done that yet.
I can frame the question like this: Does the computer make a
distinction when adding signed versus unsigned integers? Are there
specific assembly instructions for signed versus unsigned operations,
covering all possible combinations?

Maybe.

What is your goal here? Are you trying to understand the history behind
the rules given in the C standard, or trying to understand the rules
themselves, or both?

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Thiago Adams

2024-08-09 21:01:09 UTC

Post by Keith Thompson

Post by Thiago Adams

Post by David Brown

Post by Thiago Adams

Post by Keith Thompson
[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one the IEEE published in 1985
(uh, IEEE 754, I think?)

I didn't specify properly, but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?
Some CPUs have floating-point registers, some don't. C says nothing
about registers.
What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

This part is related with the previous question about the origins
of integer promotions.
We don't have "char" register or signed/unsigned register. But I
believe we may have double and float registers. So float does not
need to be converted to double.
There is no specific question here, just trying to understand the
rationale behind the conversion rules.

The rules have little to do with concrete machines with registers.
Your initial post showed some confusion about how conversions
work. They are not performed 'in-place', any more than writing `a
+ 1` changes the value of `a`.
     int a; double x;
     a = (int)x;
The cast is implicit here but I've written it out to make it
clear. My C compiler produces intermediate code like this before
     push x   r64                   # r64 means float64
     fix      r64 -> i32
     pop a   i32
I could choose to interpret this code just as it is. Then, in
this execution model, there are no registers at all, only a stack
that can hold data of any type.
The 'fix' instruction pops the double value from the stack,
converts it to int (which involves changing both the bit-pattern,
and the bit-width), and pushes it back onto the stack.
Registers come into it when running it directly on a real
machine. But you seem more concerned with safety and correctness
than performance, so there's probably no real need to look at
actual generated native code.
That'll just be confusing (especially if you follow the advice to
generate only optimised code).

"They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`."
Let's take double to int.
In this case the bits of the double need to be reinterpreted (copied to) int.
So the answer to "how it works" can be:
the machine always/generally has an instruction to do this,
or.. this is defined by the IEEE ... standard as ...

It would be helpful if you made more of an effort to write clearly
here. (We know you can do so when you want to.) It is very
difficult to follow what you are referring to here - what is "this
case" here? A conversion from a double to an int certainly does not
re-interpret or copy bits - like other conversions, it copies the
/value/ to the best possible extent given the limitations of the
types.

Everything is a bit mixed up, but I'll try to explain the part about
registers that I have in mind.

Why do you keep talking about registers? The C standard does not talk
about registers, and the promotion rules are based on which *operations*
are available, not what kinds of registers a given CPU happens to have.

I use "register" to talk about how hardware works.

Post by Keith Thompson

Post by Thiago Adams
In C, when you have an expression like char + char, each char is
promoted to int. The computation then occurs as int + int.

Right. And the most important thing to keep in mind is that it works
that way *because the C standard says so*.
If you're looking for a rationale for this, it's probably because some
early C compilers were for target systems that didn't provide operations
on narrow types. (The PDP-11 does have instructions that operate on
8-bit bytes, but the arithmetic instructions operate only on 16-bit
words.) But the C standard rules apply to all conforming C
implementations. If a target CPU happens to provide a 1-byte add
instruction, a C compiler can't use it unless it can prove that the
result is consistent with C semantics.
Knowing why the C standard says what it says can be interesting, but
it's not necessarily directly useful.

Post by Thiago Adams
On the other hand, when you have float + float, it remains as float + float.

That's implementation-defined. Read the "Characteristics of floating
types <float.h>" section of the C standard (it's 5.2.5.3.3 in the N3220
draft) and look for FLT_EVAL_METHOD. If FLT_EVAL_METHOD (defined in
<float.h>) is 0, float operations are done in type float. If it's 1,
float operations are done in type double. If it's 2, all floating-point
operations are done in type long double.

Post by Thiago Adams
My guess for this design is that computations involving char are done
using registers that are the size of an int.

Who says the target CPU even has registers, or that computations can't
be done directly on values in memory?

Post by Thiago Adams
But, float + float is not promoted to double, so I assume that the
computer has specific float registers or similar operation
instructions for float.

Again, some CPUs have floating-point registers and some do not. Those
that don't might store floating-point values in the same registers used
for integer values, applying different instructions to operate on them.
Of those that do have floating-point registers, some have registers that
can hold a double value (typically 64 bits); they may or may not be able
to treat a half-register as a 32-bit float value.
The rules in the C standard are, for the most part, based on the
capabilities of CPUs that existed when the standard was written, not
necessarily on modern CPUs (though there have been tweaks in later
editions).

Post by Thiago Adams
Regarding the part about signed/unsigned registers and operations, I
must admit that I'm not sure. I was planning to check on Compiler
Explorer, but I haven't done that yet.
I can frame the question like this: Does the computer make a
distinction when adding signed versus unsigned integers? Are there
specific assembly instructions for signed versus unsigned operations,
covering all possible combinations?

Maybe.
What is your goal here? Are you trying to understand the history behind
the rules given in the C standard, or trying to understand the rules
themselves, or both?

both.

I have fixed the cake constant expressions, which were not doing integer
promotion correctly. So I implemented the conversion rules (6.3.1.8
Usual arithmetic conversions).

To implement a constant expression such as (char)1234, I had to
implement the cast rule "as script", creating the combinations of all
types x types.

I was trying to implement casts without all the type x type combinations,
so I started this topic to understand how casts work and to try to emulate
what the hardware does using bits. In the end I did all the combinations,
so the C compiler that compiles my compiler will do what is necessary.

In the future I am also planning to create a backend, so I want to
understand the low level better.
Apart from that, I like to understand how and why things work.

Keith Thompson

2024-08-09 21:53:01 UTC

Post by Thiago Adams

Post by Keith Thompson

Post by Thiago Adams

Post by David Brown

Post by Thiago Adams

Post by Keith Thompson
[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for encoding;
though I think most C implementations use the one from the IEEE on 1985
(uh, IEEE754, I think?)

I didn't specify properly , but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?
Some CPUs have floating-point registers, some don't. C says nothing
about registers.
What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

This part is related with the previous question about the origins
of integer promotions.
We don't have "char" register or signed/unsigned register. But I
believe we may have double and float registers. So float does not
need to be converted to double.
There is no specific question here, just trying to understand the
rationale behind the conversion rules.

The rules have little to do with concrete machines with registers.
Your initial post showed some confusion about how conversions
work. They are not performed 'in-place', any more than writing `a
+ 1` changes the value of `a`.
     int a; double x;
     a = (int)x;
The cast is implicit here but I've written it out to make it
clear. My C compiler produces intermediate code like this before
     push x   r64                   # r64 means float64
     fix      r64 -> i32
     pop a   i32
I could choose to interpret this code just as it is. Then, in
this execution model, there are no registers at all, only a stack
that can hold data of any type.
The 'fix' instruction pops the double value from the stack,
converts it to int (which involves changing both the bit-pattern,
and the bit-width), and pushes it back onto the stack.
Registers come into it when running it directly on a real
machine. But you seem more concerned with safety and correctness
than performance, so there's probably no real need to look at
actual generated native code.
That'll just be confusing (especially if you follow the advice to
generate only optimised code).

"They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`."
Lets take double to int.
In this case the bits of double needs to be reinterpreted (copied to) int.
So the answer "how it works" can be
always/generally machine has a instruction to do this
or.. this is defined by the IIE ... standard as ...

It would be helpful if you made more of an effort to write clearly
here. (We know you can do so when you want to.) It is very
difficult to follow what you are referring to here - what is "this
case" here? A conversion from a double to an int certainly does not
re-interpret or copy bits - like other conversions, it copies the
/value/ to the best possible extent given the limitations of the
types.

Everything is a bit mixed up, but I'll try to explain the part about
registers that I have in mind.

Why do you keep talking about registers? The C standard does not talk
about registers, and the promotion rules are based on which *operations*
are available, not what kinds of registers a given CPU happens to have.

I use "register" to talk about how hardware works.

I don't see how that makes sense. Not all CPUs have registers. For
CPUs that do have registers, some support operations directly on memory.
Register size does not constrain the set of operations that are
available; for example a CPU might have 64-bit registers and support
both 64-bit and 32-bit operations.

Registers are largely irrelevant.

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Keith Thompson

2024-08-09 19:03:56 UTC

Thiago Adams <***@gmail.com> writes:
[...]

Post by Thiago Adams
"They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`."
Lets take double to int.
In this case the bits of double needs to be reinterpreted (copied to) int.

The word "reinterpreted" usually refers to taking a given representation
(sequence of bits) and treating it as a value of a specified type
*without changing the bits*. (<OT>C++ even has "reinterpret_cast" as a
keyword.</OT>)

Converting from int to double does not reinterpret the bits. It creates
a new value that is mathematically equal (or nearly so) to the operand
value.

Post by Thiago Adams
So the answer "how it works" can be
always/generally machine has a instruction to do this
or.. this is defined by the IIE ... standard as ...

Do you mean the IEEE standard? As always, the C standard defines the
semantics of conversions. For an implementation that claims to conform
to IEEE floating-point standard (ISO/IEC 60559), the semantics are
specified more precisely.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Thiago Adams

2024-08-09 19:22:30 UTC

Post by Keith Thompson
[...]

Post by Thiago Adams
"They are not performed 'in-place', any more than writing `a + 1`
changes the value of `a`."
Lets take double to int.
In this case the bits of double needs to be reinterpreted (copied to) int.

The word "reinterpreted" usually refers to taking a given representation
(sequence of bits) and treating it as a value of a specified type
*without changing the bits*. (<OT>C++ even has "reinterpret_cast" as a
keyword.</OT>)
Converting from int to double does not reinterpret the bits. It creates
a new value that is mathematically equal (or nearly so) to the operand
value.

Post by Thiago Adams
So the answer "how it works" can be
always/generally machine has a instruction to do this
or.. this is defined by the IIE ... standard as ...

Do you mean the IEEE standard? As always, the C standard defines the
semantics of conversions. For an implementation that claims to conform
to IEEE floating-point standard (ISO/IEC 60559), the semantics are
specified more precisely.

Yes. Maybe the conversion from double to int, for instance, is related to
truncation, and this is specified by IEEE.
The inverse, int to double, is similar. For instance, ULLONG_MAX to double.

I guess all computer languages work the same way today because the
hardware also follows the floating-point standards.

--

We already have many topics in one...but I am adding one more. :D
_Decimal32 , _Decimal64 etc..

C23 has these new types. I think they have to be emulated because they are
not something implemented in hardware.

C started as something more hardware-related, but now we have
complex types, bool, _Decimal... I think all these types are "emulated".

David Brown

2024-08-08 22:36:56 UTC

Post by Thiago Adams

Post by Keith Thompson
[...]

Post by Thiago Adams

Post by Dan Purgert

Post by Thiago Adams
How about floating point?

Floating point is a huge mess, and has a few variations for
encoding;
though I think most C implementations use the one from the IEEE on 1985
(uh, IEEE754, I think?)

I didn't specify properly , but my question was more about floating
point registers. I think in this case they have specialized registers.

Who is "they"?
Some CPUs have floating-point registers, some don't. C says nothing
about registers.
What exactly is your question? Is it not already answered by reading
the "Conversions" section of the C standard?

This part is related with the previous question about the origins of
integer promotions.
We don't have "char" register or signed/unsigned register. But I believe
we may have double and float registers. So float does not need to be
converted to double.
There is no specific question here, just trying to understand the
rationale behind the conversion rules.

Stop trying to think about registers or implementations until you have
understood the meaning of the conversions, as detailed in the C
standards. Implementation details are just that - implementation
details, which vary from target to target, compiler to compiler, and
according to the circumstances and the rest of the surrounding code.

In general, conversions try to preserve values (with the exceptions and
limitations detailed in the standards). If that involves changing the
underlying bit representations, then those changes are made. If no
changes are needed, no changes will be made.

Stefan Ram

2024-08-08 12:44:46 UTC

Post by Dan Purgert
I don't know what happens when you're changing datatype lengths, but if
they're the same length, it's just telling the compiler what the
variable should be treated as (e.g. [8-bit] int to char)

Casting doesn't tweak a value right where it sits, so you don't
have to stress about resizing memory. (It hands you an rvalue,
not an lvalue.)

Casting isn't just about variables; it's all about expressions.

The whole casting concept hails from Algol.

Basically, casting is like flipping a value from one type
to another (as specified by the cast).

But you got to tackle each pair of data types on its own,
and that's way more than we can dive into here!

Dan Purgert

2024-08-08 14:08:14 UTC

Post by Stefan Ram

Post by Dan Purgert
I don't know what happens when you're changing datatype lengths, but if
they're the same length, it's just telling the compiler what the
variable should be treated as (e.g. [8-bit] int to char)

Casting doesn't tweak a value right where it sits, so you don't
have to stress about resizing memory. (It hands you an rvalue,
not an lvalue.)

Yeah, I have to admit I've been driving myself mad trying to learn
AVR-Assembly lately, so ... types don't exist :|

--
|_|O|_|
|_|_|O| Github: https://github.com/dpurgert
|O|O|O| PGP: DDAB 23FB 19FA 7D85 1CC1 E067 6D65 70E5 4CE7 2860

Lawrence D'Oliveiro

2024-08-09 02:42:55 UTC

Post by Stefan Ram
The whole casting concept hails from Algol.

Type conversions are as old as the concept of types themselves.

Algol 68 introduced the term “coercion” for explicit type conversions.

Keith Thompson

2024-08-07 20:08:42 UTC

Post by Thiago Adams
How cast works?
Does it changes the memory?
For instance, from "unsigned int" to "signed char".
Is it just like discarding bytes or something else?

It depends on the source and target types.

A cast is an explicit conversion. It takes a value of some type and
yields a value of some (other) type -- or of the same type. The
semantics of a conversion are not affected by whether it's done
explicitly or implicitly.

The rules depend on the source and target types, and are specified in
the standard, section 6.3 "Conversions". You'll need to read the whole
thing.

Post by Thiago Adams
For instance, any 4 bytes type, cast to 2 bytes type is just the lower
2 bytes?
[A][B][C][D]
->
[A][B]

For some combinations of type, a conversion *might* be implemented that
way. Converting a 4-byte float to a 2-byte short definitely does not.

Conversions between pointer types, and between a pointer and an integer
of the same size, are *typically* implemented by copying or
reinterpreting the bits, but that's not universal. An implementation
could have different representations, even different sizes, for pointers
of different types.

Post by Thiago Adams
I also would like to understand better signed and unsigned.
There is no such think as "signed" or "unsigned" register, right?
How about floating point?

The C standard has no concept of registers (the "register" keyword
notwithstanding). All conversions are defined in terms of values.
(Some CPUs have distinct floating-point registers, but that doesn't affect
the semantics of floating-point operations.) An implementation will use
registers and operations on them to implement the semantics defined in
the standard (assuming the target CPU has registers).

Post by Thiago Adams
The motivation problem.
I have a union, with unsigned int, unsigned char etc.(all types)
I need to execute a cast in runtime (like script).
The problem is that this causes an explosion of combinations that I am
trying to avoid.

I don't think you're going to be able to avoid it.

Again, read section 6.3 "Conversions" of the standard. It sounds like
you're going to have to explicitly cover all the cases in that section
(except for null pointer constants, which exist only in source code).

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Thiago Adams

2024-08-08 11:35:06 UTC

Post by Thiago Adams
How cast works?
Does it changes the memory?
For instance, from "unsigned int" to "signed char".
Is it just like discarding bytes or something else?

I'm also curious about how bool works.

Values converted to bool become 0 or 1.
When does this conversion happen, at read or at write? Both?

How much does it cost?

I also learned that we cannot cast a pointer to nullptr_t.

But we can read a nullptr_t member of a union. In this case the value is 0,
similar to what happens with bool.

Sample

----------------

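#include <stdio.h>

struct value {
    int type;
    union {
        unsigned int ui;
        _Bool b;
        typeof(nullptr) p;   /* nullptr_t; requires C23 */
    };
};

int main(void) {
    struct value v;
    v.ui = 123;
    printf("%u / ", v.ui);        // 123
    // Reading b or p after writing ui is type punning, not a conversion;
    // the values noted here are what was observed, not guaranteed results.
    printf("%d / ", v.b);         // 1
    printf("%p", (void *)v.p);    // (nil)
}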

--------

Using compiler explorer to understand bool.

void f(int i)
{
bool b = i;
printf("%d / ", b);
}

cmp dword ptr [rbp - 4], 0 //i think it checks 0 here
setne al
and al, 1
mov byte ptr [rbp - 5], al

I think it converts at write in this case, but for the union it converts
when we read. This is why
printf("%d / ", v.b); //1
prints 1

void f()
{
union {
unsigned int ui;
_Bool b;
} v;
v.ui = 123;
printf("%d / ", v.b);
}

Where is v.b converted to 1?

f:
push rbp
mov rbp, rsp
sub rsp, 16
mov dword ptr [rbp - 4], 123
mov al, byte ptr [rbp - 4]
and al, 1
movzx esi, al
lea rdi, [rip + .L.str]
mov al, 0
call ***@PLT
add rsp, 16
pop rbp
ret

I think it is at
and al, 1
so the compiler uses 123 & 1 to cast the number to bool.

Keith Thompson

2024-08-08 19:39:14 UTC

Post by Thiago Adams

Post by Thiago Adams
How cast works?
Does it changes the memory?
For instance, from "unsigned int" to "signed char".
Is it just like discarding bytes or something else?

I'm also curious about how bool works.
Values converted to bool become 0 or 1.
When does this conversion happen, at read or at write? Both?

I don't know what you mean by that.

A conversion takes a value of some type and yields a value of some
(other) type. If the target type is bool, the result is false (0) if
the operand is equal to zero, true (1) otherwise.

A cast is simply an explicit conversion. Explicit and implicit
conversions do the same thing. (Some conversions can only be done
explicitly.)

A converted value does not "become" 0 or 1; the conversion *yields*
0 or 1. The value of (bool)42 is 1, but 42 is still 42.

Post by Thiago Adams
How much does it cost?

It depends on what code the compiler generates; the language doesn't
address that. It could be free if the result isn't used and the
compiler optimizes it away.

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

James Kuyper

2024-08-08 22:40:15 UTC

Thiago Adams <***@gmail.com> writes:
...

Post by Thiago Adams
I'm also curious about how bool works.
Values converted to bool become 0 or 1.
When does this conversion happen, at read or at write? Both?

You can take a value obtained by reading an object, or a value produced
by evaluating an expression, and convert that value to a different type.
That value can later be stored in an object, or it could be used as one
of the operands for an expression. The conversion isn't associated with
either the read or the write. Many conversions occur implicitly, a cast
is used to explicitly make a conversion occur.

int x = 3;
bool b = x;

In the above code, an implicit conversion from int to bool occurs after
reading the value of 3 from x, and occurs before writing to bool.

b = !(bool)(x-3);

In this code, the conversion occurs after the value of 3 is retrieved
from x, and after 3 is subtracted from it. That result of 0 is then
converted to bool, and then the ! operator is applied to it. Finally,
the result is written to b. So you see, it doesn't make sense to connect
the conversion with either the read or the write.

Kenny McCormack

2024-08-08 16:19:44 UTC

Post by Keith Thompson

Post by Thiago Adams
How cast works?
Does it changes the memory?
For instance, from "unsigned int" to "signed char".
Is it just like discarding bytes or something else?

It depends on the source and target types.
A cast is an explicit conversion. It takes a value of some type and
yields a value of some (other) type -- or of the same type. The
semantics of a conversion are not affected by whether it's done
explicitly or implicitly.
The rules depend on the source and target types, and are specified in
the standard, section 6.3 "Conversions". You'll need to read the whole
thing.

What if he is a Trumper and can't read? What then?

--
Modern Conservative: Someone who can take time out from demanding more
flag burning laws, more abortion laws, more drug laws, more obscenity
laws, and more police authority to make warrantless arrests to remind
us that we need to "get the government off our backs".

Lawrence D'Oliveiro

2024-08-07 23:03:31 UTC

Post by Thiago Adams
The problem is that this causes an explosion of combinations that I am
trying to avoid.

The only way I can think of to minimize the combinatorial explosion is to
look at the precise types you are dealing with, and do some grouping of
them, doing the conversion in two steps via some intermediate “universal”
type, which is different for each group. E.g.

* Is either the source or destination type a floating-point type? Then use
the highest-available-precision float type as the intermediate type.
* Are both source and destination types integer types? Then use some
largest-available integer type as the intermediate type.
