So You Think You Can Const?

Julio Di Egidio

2025-01-07 19:32:50 UTC

Hi everybody,

I am back to programming in C after many years:
indeed I have forgotten so many things, including
how much I love this language. :)

In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).

To the question, I was reading this, but I am not
sure what the quoted passage means:

Matt Stancliff, "So You Think You Can Const?",
<https://matt.sh/sytycc>
<< Your compiler, at its discretion, may also choose
to place any const declarations in read-only storage,
so if you attempt to hack around the const blocks,
you could get undefined behavior. >>

I do not understand if just declaring that a pointer
is to constant data may incur that problem even
if the pointed data was in fact allocated with malloc.
I would say of course not, but I am not sure.

E.g. consider this little internal helper of mine
(which implements an interface that is public to
do an internal thing...), where I am casting to
pointer to non-constant data in order to free the
pointed data (i.e. without warning):

```c
static int MyStruct_free_(MyStruct_t const *pT) {
    assert(pT);

    free((MyStruct_t *)pT);

    return 0;
}
```

Assuming, as said, that the data was originally
allocated with malloc, is that code safe, or
can something go wrong even in that case?

Thanks in advance for any help/insight,

Julio

Kaz Kylheku

2025-01-07 22:11:42 UTC

Post by Julio Di Egidio
Hi everybody,
indeed I have forgotten so many things, including
how much I love this language. :)
In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).
To the question, I was reading this, but I am not
Matt Stancliff, "So You Think You Can Const?",
<https://matt.sh/sytycc>
<< Your compiler, at its discretion, may also choose
to place any const declarations in read-only storage,
so if you attempt to hack around the const blocks,
you could get undefined behavior. >>

An object defined with a type that is const-qualified
could be put into write-protected storage.

Post by Julio Di Egidio
I do not understand if just declaring that a pointer
is to constant data may incur in that problem even
if the pointed data was in fact allocated with malloc.
I would say of course not, but I am not sure.

A pointer whose referenced type is const does not define an object of
that type. A pointer to a const-qualified type may point to data which is
not const-qualified. It may be converted to a pointer type from which the
qualifier is removed, and then the converted pointer can be used to
modify the data.
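
A minimal sketch of that distinction, assuming two hypothetical helpers (`make_counter` and `bump_through_const`) that do not appear anywhere in this thread. The storage comes from malloc, so it is not an object defined as const, and writing to it after casting the qualifier away should be well defined:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates an int and hands it out through a
   pointer-to-const. The allocated object has no declared type, so it
   is not a const object; only the pointer's referenced type is const. */
static int const *make_counter(int start) {
    int *p = malloc(sizeof *p);

    if (p) {
        *p = start;
    }
    return p; /* int * -> int const * is implicit, no cast needed */
}

/* Hypothetical helper: removes the qualifier and modifies the object.
   This relies on the caller's guarantee that pc points to storage
   that was NOT defined with a const-qualified type (here: malloc). */
static void bump_through_const(int const *pc) {
    int *p = (int *)pc; /* dropping const needs an explicit cast */

    assert(p != NULL);
    *p += 1;
}
```

The same `bump_through_const` call on the address of a `const int` variable would be undefined behaviour, which is the case the quoted article warns about.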

Post by Julio Di Egidio
E.g. consider this little internal helper of mine
(which implements an interface that is public to
do an internal thing...), where I am casting to
pointer to non-constant data in order to free the
```c
static int MyStruct_free_(MyStruct_t const *pT) {
    assert(pT);
    free((MyStruct_t *)pT);
```

The prototype of free is

void free(void *ptr);

When it comes to pointers, the C language permits implicit conversions
from "pointer to T" to "pointer to const T". Implicit meaning that
no cast is required: you simply pass the "T *" value as a "const T *"
function argument, or assign it to a "const T *" variable, etc.

If you have some malloced storage which you are referencing with a
"const T *" type, then passing that pointer to free is a constraint
violation, because the implicit conversion to "void *" would discard the
qualifier; hence the cast is required.

Objects coming from malloc are not defined by a declaration.

ISO C defines the term /effective type/ (// indicates italics)
for the purposes of stating some rules regarding expressions accessing
objects. "The /effective type/ of an object that is not a byte array,
for an access to its stored value, is the declared type of the object"
says the N3301 draft of C23 in section 6.5.1 Expressions/General.
A footnote to this sentence clarifies that "allocated objects have no
declared type", almost certainly meaning dynamically allocated by
the malloc family.

A chunk of memory from malloc is a kind of byte array, so the
subsequent words apply to it:

"If a value is stored into a byte array through an lvalue having a type
that is not a byte type, then the type of the lvalue becomes the
effective type of the object for that access and for subsequent accesses
that do not modify the stored value."

When we write values into the bytes of a malloced object, it takes on
that type for subsequent reads.

In the same section, rules are given regarding what type an expression
may have which is accessing an object, in relation to that object's
effective type.

Indeed, the rules prohibit an object whose effective type is some "const
T" from being accessed as a plain "T".

However: it is not possible for a dynamically allocated object to
have an effective type of "const T"!!!

The reason is simple: the effective type of an allocated object is established
when an object is written, and then holds for subsequent reads. An
object cannot be written through a "const T" lvalue.

How you got the "const MyStruct_t *" pointer is that you first
treated the object as "MyStruct_t *", and filled in its members.
Then you cast the pointer to "const MyStruct_t *".

Casting a pointer doesn't do anything to the referenced object's
effective type; it is not a write operation on the object.

Post by Julio Di Egidio
Assuming, as said, that the data was originally
allocated with malloc, is that code safe or
something can go wrong even in that case?

So yes, it is safe to treat malloced objects as const and then remove
the const qualifier (as inescapably required by the API) when freeing.
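
A minimal sketch of the whole life cycle, assuming a made-up two-member `MyStruct_t` and hypothetical `MyStruct_create` and `MyStruct_id` helpers (only `MyStruct_free_` comes from the question). The members are stored through a non-const lvalue, read through a const one, and the qualifier is removed only at the call to free:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the poster's MyStruct_t. */
typedef struct MyStruct {
    int id;
    double weight;
} MyStruct_t;

/* The stores below go through a non-const lvalue; that is what gives
   the allocated object its effective type (MyStruct_t, not const). */
static MyStruct_t const *MyStruct_create(int id, double weight) {
    MyStruct_t *pT = malloc(sizeof *pT);

    if (pT) {
        pT->id = id;
        pT->weight = weight;
    }
    return pT; /* adding const is an implicit conversion */
}

/* Reading through a const-qualified lvalue needs no cast. */
static int MyStruct_id(MyStruct_t const *pT) {
    assert(pT != NULL);
    return pT->id;
}

/* Same shape as the helper in the question: the cast changes only the
   pointer's type, so free receives the value malloc returned. */
static int MyStruct_free_(MyStruct_t const *pT) {
    assert(pT != NULL);
    free((MyStruct_t *)pT);
    return 0;
}
```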

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca

Julio Di Egidio

2025-01-08 14:02:23 UTC

<snipped>

Post by Kaz Kylheku

Post by Julio Di Egidio
In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).
To the question, I was reading this, but I am not
Matt Stancliff, "So You Think You Can Const?",
<https://matt.sh/sytycc>
<< Your compiler, at its discretion, may also choose
to place any const declarations in read-only storage,
so if you attempt to hack around the const blocks,
you could get undefined behavior. >>

An object defined with a type that is const-qualified
could be put into write-protected storage.

What do you/we mean by "object" in this context? (Sorry, I do have
forgotten, the glossary to begin with.)

Overall, I am surmising this and only this might go write-protected:

MyStruct_t const T = {...};

While this one allocates a "byte-array", i.e. irrespective of how the
pointer we are assigning it is declared:

MyStruct_t const *pT = malloc(...);

Is my understanding (to that point) correct?

Post by Kaz Kylheku
Objects coming from malloc are not defined by a declaration.

I vaguely remember the use of "declaration" vs "definition" I used to
find confusing at the time, but some three decades have passed, maybe
now I can do better...

Post by Kaz Kylheku
ISO C defines the term /effective type/ (// indicates italics)
for the purposes of stating some rules regarding expressions accessing
objects. "The /effective type/ of an object that is not a byte array,
for an access to its stored value, is the declared type of the object"
says the N3301 draft of C23 in section 6.5.1 Expressions/General.
A footnote to this sentence clarifies that "allocated objects have no
declared type", almost certainly meaning dynamically allocated by
the malloc family.

Thank you so much overall, for the explanations and the final
assessment: I couldn't hope for a better answer.

Post by Kaz Kylheku
How you got the "const MyStruct_t *" pointer is that you first
treated the object as "MyStruct_t *", and filled in its members.
Then you cast the pointer to "const MyStruct_t *".

I was actually going out of my way trying to const-ify everything while
trying not to cast anywhere: I totally misremembered what these things
actually mean in C, and adding const everywhere possible is one of the
habits I have meanwhile acquired from other languages. But it's coming
back... I *will* do as you say. :)

Thanks again very much,

Julio

Julio Di Egidio

2025-01-08 14:05:43 UTC

Sorry, should read: how the pointer we are assigning it to is...

-Julio

Ben Bacarisse

2025-01-08 15:16:03 UTC

Post by Julio Di Egidio
<snipped>

Post by Kaz Kylheku

Post by Julio Di Egidio
In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).
To the question, I was reading this, but I am not
Matt Stancliff, "So You Think You Can Const?",
<https://matt.sh/sytycc>
<< Your compiler, at its discretion, may also choose
to place any const declarations in read-only storage,
so if you attempt to hack around the const blocks,
you could get undefined behavior. >>

An object defined with a type that is const-qualified
could be put into write-protected storage.

What do you/we mean by "object" in this context? (Sorry, I do have
forgotten, the glossary to begin with.)

An object (in C) is a contiguous region of storage, the contents of
which can represent values.

Post by Julio Di Egidio
MyStruct_t const T = {...};

Yes, though you should extend your concern beyond what might be
write-protected. Modifying an object whose type is const qualified is
undefined, even if the object is in writable storage. A compiler may
assume that such an object has not changed because in a program that has
undefined behaviour, all bets are off. For example, under gcc with
almost any optimisation this program prints 42:

#include <stdio.h>

void f(const int *ip)
{
    *(int *)ip = 0;
}

int main(void)
{
    const int a = 42;
    f(&a);
    printf("%d\n", a);
}

Post by Julio Di Egidio
While this one allocates a "byte-array", i.e. irrespective of how the
MyStruct_t const *pT = malloc(...);
Is my understanding (to that point) correct?

Technically you get an object with no effective type. David's reply
included some references to find out more about the effective type of an
object, but it is safe to say that these only come into play if you are
messing about with the way you access the allocated storage (for example
accessing it as a MyStruct but then later as a floating point object).

More relevant to a discussion of const is to ask what you plan to do
with pT since you can't (without a cast) assign any useful value to the
allocated object.

It is generally better to use a non const-qualified pointer for the
allocation but, when using the pointer, to pass it to functions that use
the right type depending on whether they modify the pointed-to object or
not. For example:

MyStack *sp = malloc(sizeof *sp);
...
stack_push(sp, 99);
...
if (stack_empty(sp)) ...
...
stack_free(sp);

we would have

void stack_push(MyStack *sp, int v) { ... }
bool stack_empty(MyStack const *sp) { ... }
void stack_free(MyStack *sp) { ... }
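
A minimal, compilable sketch of that interface, under several assumptions that are not in the post: a fixed capacity (`STACK_CAP`), a hypothetical `stack_new` constructor, `int` instead of `bool` so that it stays within C90, and a status return from `stack_push`:

```c
#include <assert.h>
#include <stdlib.h>

#define STACK_CAP 16 /* arbitrary fixed capacity for this sketch */

typedef struct MyStack {
    int items[STACK_CAP];
    int count;
} MyStack;

/* Allocation uses a plain (non-const) pointer. */
static MyStack *stack_new(void) {
    MyStack *sp = malloc(sizeof *sp);

    if (sp) {
        sp->count = 0;
    }
    return sp;
}

/* Modifies the stack, so it takes MyStack *.
   Returns 0 on success, -1 if the stack is full. */
static int stack_push(MyStack *sp, int v) {
    assert(sp != NULL);
    if (sp->count == STACK_CAP) {
        return -1;
    }
    sp->items[sp->count++] = v;
    return 0;
}

/* Only inspects the stack, so it takes MyStack const *. */
static int stack_empty(MyStack const *sp) {
    assert(sp != NULL);
    return sp->count == 0;
}

/* Releases the stack; no cast is needed because sp is not const. */
static void stack_free(MyStack *sp) {
    free(sp);
}
```

With this split the only place const appears is in the parameter of the read-only query, and no cast is needed anywhere.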

--
Ben.

David Brown

2025-01-08 15:53:44 UTC

Post by Ben Bacarisse
Technically you get an object with no effective type. David's reply
included some references to find out more about the effective type of an
object, but it is safe to say that these only come into play if you are
messing about with the way you access the allocated storage (for example
accessing it as a MyStruct but then later as a floating point object).

My turn for the little correction - it was Kaz that gave the helpful
references, not me :-)

But I can give the OP a useful reference - the site cppreference.com has
a lot of accurate reference information about C (and C++, for those that
want it). It is usually a little easier to read than the C standards,
while still being very accurate.

<https://en.cppreference.com/w/c/language>
<https://en.cppreference.com/w/c/language/object>
<https://en.cppreference.com/w/c/language/const>
<https://en.cppreference.com/w/c/language/declarations>

Julio Di Egidio

2025-01-08 16:05:24 UTC

<snipped>

Post by Ben Bacarisse

Post by Julio Di Egidio

Post by Kaz Kylheku

Post by Julio Di Egidio
To the question, I was reading this, but I am not
Matt Stancliff, "So You Think You Can Const?",
<https://matt.sh/sytycc>
<< Your compiler, at its discretion, may also choose
to place any const declarations in read-only storage,
so if you attempt to hack around the const blocks,
you could get undefined behavior. >>

An object defined with a type that is const-qualified
could be put into write-protected storage.

What do you/we mean by "object" in this context? (Sorry, I do have
forgotten, the glossary to begin with.)

An object (in C) is a contiguous region of storage, the contents of
which can represent values.

Is that regardless of the stack/heap distinction, or is an "object"
about heap-allocated/dynamic memory only? (Anyway, I should in fact
re-acquaint myself with the language reference instead of asking this
question.)

Post by Ben Bacarisse

Post by Julio Di Egidio
MyStruct_t const T = {...};

Yes, though you should extend your concern beyond what might be
write-protected. Modifying an object whose type is const qualified is
undefined, even if the object is in writable storage.

Yes, I am being a bit quick, but I definitely agree with that and indeed
the priority of "defined behaviour" as a concern.

Post by Ben Bacarisse

Post by Julio Di Egidio
While this one allocates a "byte-array", i.e. irrespective of how the
MyStruct_t const *pT = malloc(...);
Is my understanding (to that point) correct?

Technically you get an object with no effective type.

OK.

Post by Ben Bacarisse
More relevant to a discussion of const is to ask what you plan to do
with pT since you can't (without a cast) assign any useful value to the
allocated object.

Say my program unit implements AVL trees, with (conceptually speaking)
constructors/destructors, navigation and retrieval, and of course
manipulation (inserting, deleting, etc.).

My idea (but I would think this is pretty "canonical" and, if it isn't,
I am missing the mark) is: my public functions take/give "sealed"
instances (with const members to const data), as the user is not
supposed to directly manipulate/edit the data, OTOH of course my
implementation is all about in-place editing...

-Julio

Julio Di Egidio

2025-01-08 16:24:05 UTC

<snip>

Post by Julio Di Egidio

Post by Ben Bacarisse
More relevant to a discussion of const is to ask what you plan to do
with pT since you can't (without a cast) assign any useful value to the
allocated object.

Say my program unit implements AVL trees, with (conceptually speaking)
constructors/destructors, navigation and retrieval, and of course
manipulation (inserting, deleting, etc.).
My idea (but I would think this is pretty "canonical" and, if it isn't,
I am missing the mark) is: my public functions take/give "sealed"
instances (with const members to const data), as the user is not
supposed to directly manipulate/edit the data, OTOH of course my
implementation is all about in-place editing...

P.S. To be clear, as I am still being a bit quick: I do not also mean
"public destructors" should take a const pointer in input, i.e. apply as
appropriate...

And here is what my construction/destruction code is looking like at the
moment, which should also make clear what I meant by "a private method
implementing a public interface" and why:

```c
static AvlTree_t const *AvlTree_node(
    void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
    AvlTree_t *pT;

    pT = malloc(sizeof(AvlTree_t));

    if (!pT) {
        return NULL;
    }

    pT->pk = pk;
    pT->pL = pL;
    pT->pR = pR;

    return pT;
}

static int AvlTree_free_(AvlTree_t const *pT) {
    assert(pT);

    free((AvlTree_t *)pT);

    return 0;
}

AvlTree_t const *AvlTree_create(void const *pk) {
    return AvlTree_node(pk, NULL, NULL);
}

void AvlTree_destroy(AvlTree_t *pT) {
    AvlTree_visitPost(AvlTree_free_, pT);
}
```
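
The post does not show `AvlTree_visitPost` or the struct itself, so the following is only a sketch of how a post-order visit could free a whole tree. The struct layout, the visitor signature and the node-count return value are all assumptions; `AvlTree_node` and `AvlTree_free_` are repeated from above so that the block compiles on its own:

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout: the struct definition is not shown in the thread. */
typedef struct AvlTree {
    void const *pk;
    struct AvlTree const *pL;
    struct AvlTree const *pR;
} AvlTree_t;

static AvlTree_t const *AvlTree_node(
    void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
    AvlTree_t *pT = malloc(sizeof *pT);

    if (!pT) {
        return NULL;
    }

    pT->pk = pk;
    pT->pL = pL;
    pT->pR = pR;

    return pT;
}

static int AvlTree_free_(AvlTree_t const *pT) {
    assert(pT != NULL);
    free((AvlTree_t *)pT);
    return 0;
}

/* Hypothetical traversal: both subtrees first, then the node itself.
   The child pointers are read before the node is visited, so a
   visitor that frees the node never causes a read of freed memory.
   Returns the number of nodes visited (an assumption of this sketch). */
static int AvlTree_visitPost(
    int (*visit)(AvlTree_t const *), AvlTree_t const *pT
) {
    int n;

    if (!pT) {
        return 0;
    }

    n = AvlTree_visitPost(visit, pT->pL);
    n += AvlTree_visitPost(visit, pT->pR);
    visit(pT);

    return n + 1;
}
```

Post-order matters here: visiting a parent before its children would free the node that still holds the only pointers to the subtrees.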

-Julio

Chris M. Thomasson

2025-01-08 22:55:33 UTC

Post by Julio Di Egidio
<snip>

Post by Julio Di Egidio

Post by Ben Bacarisse
More relevant to a discussion of const is to ask what you plan to do
with pT since you can't (without a cast) assign any useful value to the
allocated object.

Say my program unit implements AVL trees, with (conceptually speaking)
constructors/destructors, navigation and retrieval, and of course
manipulation (inserting, deleting, etc.).
My idea (but I would think this is pretty "canonical" and, if it
isn't, I am missing the mark) is: my public functions take/give
"sealed" instances (with const members to const data), as the user is
not supposed to directly manipulate/edit the data, OTOH of course my
implementation is all about in-place editing...

P.S. To be clear, as I am still being a bit quick: I do not also mean
"public destructors" should take a const pointer in input, i.e. apply as
appropriate...
And here is what my construction/destruction code is looking like at the
moment, which should also make clear what I meant by "a private method

Sometimes I think the following can be kind of useful:

struct foo
{
    [...]
};

void foo_bar(struct foo const* const self, [...])
{
    // self cannot change without a warning
    // self is const pointer to a const struct foo

    self = NULL; // zap!
}

;^)

Post by Julio Di Egidio
```c
static AvlTree_t const *AvlTree_node(
    void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
    AvlTree_t *pT;
    pT = malloc(sizeof(AvlTree_t));
    if (!pT) {
        return NULL;
    }
    pT->pk = pk;
    pT->pL = pL;
    pT->pR = pR;
    return pT;
}
static int AvlTree_free_(AvlTree_t const *pT) {
    assert(pT);
    free((AvlTree_t *)pT);
    return 0;
}
AvlTree_t const *AvlTree_create(void const *pk) {
    return AvlTree_node(pk, NULL, NULL);
}
void AvlTree_destroy(AvlTree_t *pT) {
    AvlTree_visitPost(AvlTree_free_, pT);
}
```
-Julio

Ben Bacarisse

2025-01-09 01:09:20 UTC

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT;
pT = malloc(sizeof(AvlTree_t));
if (!pT) {
return NULL;
}
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
return pT;
}

Just on a side issue, I prefer to make tests like this positive so I'd
write:

static AvlTree_t const *AvlTree_node(
    void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
    AvlTree_t *pT = malloc(sizeof *pT);

    if (pT) {
        pT->pk = pk;
        pT->pL = pL;
        pT->pR = pR;
    }
    return pT;
}

I'm not going to "make a case" for this (though I will if you want!) --
I just think it helps to see lots of different styles.

--
Ben.

Kaz Kylheku

2025-01-09 04:24:56 UTC

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT;
pT = malloc(sizeof(AvlTree_t));
if (!pT) {
return NULL;
}
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
return pT;
}

Just on a side issue, I prefer to make tests like this positive so I'd
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT = malloc(sizeof *pT);
if (pT) {
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
}
return pT;
}

More generally:

foo_handle *foo = foo_create();
bar_handle *bar = foo ? bar_create(foo) : 0; // doesn't like null
xyzzy_handle *xyz = xyzzy_create(42, bar, arg);
container *con = malloc(sizeof *con);

if (foo && bar && xyz && con) {
    // happy case: we have all three resources

    con->foo = foo;
    con->bar = bar;
    con->xyz = xyz;

    return con;
}

free(con);
xyzzy_destroy(xyz);
bar_destroy(bar);
if (foo)
    foo_destroy(foo); // stupidly doesn't like null

return 0;

Post by Ben Bacarisse
I'm not going to "make a case" for this (though I will if you want!) --
I just think it helps to see lots of different styles.

I might just have made the case. When more resources need to be
acquired that might fail, it consolidates the happy case under one
conjunctive test, and consolidates the cleanup in the unhappy case.
Effectively it's almost as if we have only two cases.

A minor disadvantage is that in the unhappy flow, we may allocate
resources past the point where it is obvious they are not going to be
needed: if foo_create() failed, we are pointlessly calling
xyzzy_create() and malloc for the container. It's possible that these
succeed, and we are just going to turn around and free them.

It's a form of consolidated error checking, like when we make
several system calls and check them for errors as a batch;
e.g. call fprintf several times and check for disk full (etc)
just once.
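
A minimal sketch of the fprintf case, assuming a hypothetical `write_report` helper and report format that are not from the thread. It relies on the stream's error indicator staying set once any write has failed, so a single check at the end covers all the calls:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes a three-line report to fp and checks
   for failure once, at the end. The error indicator of a stream is
   sticky, so a failure in any of the fprintf calls is still visible
   to ferror afterwards. Returns 0 on success, -1 on error. */
static int write_report(FILE *fp, char const *name, int count) {
    assert(fp != NULL && name != NULL);

    fprintf(fp, "name: %s\n", name);
    fprintf(fp, "count: %d\n", count);
    fprintf(fp, "done\n");

    /* Flush first, so that errors on buffered data are reported too. */
    if (fflush(fp) == EOF || ferror(fp)) {
        return -1;
    }
    return 0;
}
```

The trade-off is the same as in the allocation example: after an early failure the later calls are wasted work, in exchange for a single error path.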

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca

David Brown

2025-01-09 09:35:52 UTC

Post by Kaz Kylheku

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT;
pT = malloc(sizeof(AvlTree_t));
if (!pT) {
return NULL;
}
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
return pT;
}

Just on a side issue, I prefer to make tests like this positive so I'd
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT = malloc(sizeof *pT);
if (pT) {
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
}
return pT;
}

foo_handle *foo = foo_create();
bar_handle *bar = foo ? bar_create(foo) : 0; // doesn't like null
xyzzy_handle *xyz = xyzzy_create(42, bar, arg);
container *con = malloc(sizeof *con);
if (foo && bar && xyz && con) {
// happy case: we have all three resources
con->foo = foo;
con->bar = bar;
con->xyz = xyz;
return con;
}
free(con);
xyzzy_destroy(xyz);
bar_destroy(bar);
if (foo)
foo_destroy(foo); // stupidly doesn't like null
return 0;

Post by Ben Bacarisse
I'm not going to "make a case" for this (though I will if you want!) --
I just think it helps to see lots of different styles.

I might just have made the case. When more resources need to be
acquired that might fail, it consolidates the happy case under one
conjunctive test, and consolidates the cleanup in the unhappy case.
Effectively it's almost as if we have only two cases.
A minor disadvantage is that in the unhappy flow, we may allocate
resources past the point where it is obvious they are not going to be
needed: if foo_create() failed, we are pointlessly calling
xyzzy_create() and malloc for the container. It's possible that these
succeed, and we are just going to turn around and free them.

How about taking the idea slightly further and making the later
allocations conditional too?

foo_handle *foo = foo_create();
bar_handle *bar = foo ? bar_create(foo) : 0; // doesn't like null
xyzzy_handle *xyz = bar ? xyzzy_create(42, bar, arg) : 0;
container *con = xyz ? malloc(sizeof *con) : 0;

if (con) {
// happy case: we have all three resources

...

If you are going to use that style (and I am not arguing for or against
it), go all in!
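
A minimal, compilable sketch of the fully chained version. Everything about the resources is made up for illustration: the struct contents, the `container_create` and `container_destroy` names, a simulated failure of `xyzzy_create` when its argument is negative, and destroy functions that all accept a null pointer:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical resource types standing in for foo/bar/xyzzy. */
typedef struct { int unused; } foo_handle;
typedef struct { foo_handle *owner; } bar_handle;
typedef struct { bar_handle *src; int arg; } xyzzy_handle;
typedef struct {
    foo_handle *foo;
    bar_handle *bar;
    xyzzy_handle *xyz;
} container;

static foo_handle *foo_create(void) {
    foo_handle *foo = malloc(sizeof *foo);
    if (foo) foo->unused = 0;
    return foo;
}

static bar_handle *bar_create(foo_handle *foo) {
    bar_handle *bar = malloc(sizeof *bar);
    if (bar) bar->owner = foo;
    return bar;
}

/* A negative arg simulates an acquisition failure. */
static xyzzy_handle *xyzzy_create(int arg, bar_handle *bar) {
    xyzzy_handle *xyz = arg < 0 ? NULL : malloc(sizeof *xyz);
    if (xyz) { xyz->src = bar; xyz->arg = arg; }
    return xyz;
}

/* In this sketch every destroy function tolerates a null pointer. */
static void foo_destroy(foo_handle *foo) { free(foo); }
static void bar_destroy(bar_handle *bar) { free(bar); }
static void xyzzy_destroy(xyzzy_handle *xyz) { free(xyz); }

static container *container_create(int arg) {
    /* Each step runs only if the previous one succeeded, so a
       non-null con implies that foo, bar and xyz are all non-null. */
    foo_handle *foo = foo_create();
    bar_handle *bar = foo ? bar_create(foo) : NULL;
    xyzzy_handle *xyz = bar ? xyzzy_create(arg, bar) : NULL;
    container *con = xyz ? malloc(sizeof *con) : NULL;

    if (con) {
        con->foo = foo;
        con->bar = bar;
        con->xyz = xyz;
        return con;
    }

    /* Unhappy case: release whatever was acquired, in reverse order. */
    xyzzy_destroy(xyz);
    bar_destroy(bar);
    foo_destroy(foo);
    return NULL;
}

static void container_destroy(container *con) {
    if (con) {
        xyzzy_destroy(con->xyz);
        bar_destroy(con->bar);
        foo_destroy(con->foo);
        free(con);
    }
}
```

Because `con` is the last link in the chain, it is never allocated on a failing path and needs no cleanup there.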

Post by Kaz Kylheku
It's a form of consolidated error checking, like when we make
several system calls and check them for errors as a batch;
e.g. call fprintf several times and check for disk full (etc)
just once.

Chris M. Thomasson

2025-01-09 21:37:09 UTC

Post by David Brown

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
     void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
     AvlTree_t *pT;
     pT = malloc(sizeof(AvlTree_t));
     if (!pT) {
         return NULL;
     }
     pT->pk = pk;
     pT->pL = pL;
     pT->pR = pR;
     return pT;
}

Just on a side issue, I prefer to make tests like this positive so I'd
static AvlTree_t const *AvlTree_node(
      void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
      AvlTree_t *pT = malloc(sizeof *pT);
      if (pT) {
          pT->pk = pk;
          pT->pL = pL;
          pT->pR = pR;
      }
      return pT;
}

    foo_handle *foo = foo_create();
    bar_handle *bar = foo ? bar_create(foo) : 0; // doesn't like null
    xyzzy_handle *xyz = xyzzy_create(42, bar, arg);
    container *con = malloc(sizeof *con);
    if (foo && bar && xyz && con) {
      // happy case: we have all three resources
      con->foo = foo;
      con->bar = bar;
      con->xyz = xyz;
      return con;
    }
    free(con);
    xyzzy_destroy(xyz);
    bar_destroy(bar);
    if (foo)
       foo_destroy(foo); // stupidly doesn't like null
    return 0;

Post by Ben Bacarisse
I'm not going to "make a case" for this (though I will if you want!) --
I just think it helps to see lots of different styles.

I might just have made the case. When more resources need to be
acquired that might fail, it consolidates the happy case under one
conjunctive test, and consolidates the cleanup in the unhappy case.
Effectively it's almost as if we have only two cases.
A minor disadvantage is that in the unhappy flow, we may allocate
resources past the point where it is obvious they are not going to be
needed: if foo_create() failed, we are pointlessly calling
xyzzy_create() and malloc for the container. It's possible that these
succeed, and we are just going to turn around and free them.

How about taking the idea slightly further and making the later
allocations conditional too?
   foo_handle *foo = foo_create();
   bar_handle *bar = foo ? bar_create(foo) : 0; // doesn't like null
   xyzzy_handle *xyz = bar ? xyzzy_create(42, bar, arg) : 0;
   container *con = xyz ? malloc(sizeof *con) : 0;
   if (con) {
     // happy case: we have all three resources
   ...
If you are going to use that style (and I am not arguing for or against
it), go all in!

Indeed! :^D

Post by David Brown

It's a form of consolidated error checking, like when we make
several system calls and check them for errors as a batch;
e.g. call fprintf several times and check for disk full (etc)
just once.

Julio Di Egidio

2025-01-09 06:49:47 UTC

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT;
pT = malloc(sizeof(AvlTree_t));
if (!pT) {
return NULL;
}
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
return pT;
}

Just on a side issue, I prefer to make tests like this positive so I'd
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT = malloc(sizeof *pT);
if (pT) {
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
}
return pT;
}
I'm not going to "make a case" for this (though I will if you want!) --
I just think it helps to see lots of different styles.

That is *more* error prone, all the more so if it's not a 5 liner...

-Julio

Ben Bacarisse

2025-01-09 23:23:33 UTC

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT;
pT = malloc(sizeof(AvlTree_t));
if (!pT) {
return NULL;
}
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
return pT;
}

Just on a side issue, I prefer to make tests like this positive so I'd
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT = malloc(sizeof *pT);
if (pT) {
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
}
return pT;
}
I'm not going to "make a case" for this (though I will if you want!) --
I just think it helps to see lots of different styles.

That is *more* error prone,

I would be happy for you to expand on why you say that.

Post by Julio Di Egidio
all the more so if it's not a 5 liner...

--
Ben.

Julio Di Egidio

2025-01-09 23:37:56 UTC

Post by Ben Bacarisse

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT;
pT = malloc(sizeof(AvlTree_t));
if (!pT) {
return NULL;
}
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
return pT;
}

Just on a side issue, I prefer to make tests like this positive so I'd
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT = malloc(sizeof *pT);
if (pT) {
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
}
return pT;
}
I'm not going to "make a case" for this (though I will if you want!) --
I just think it helps to see lots of different styles.

That is *more* error prone,

I would be happy for you to expand on why you say that.

Post by Julio Di Egidio
all the more so if it's not a 5 liner...

There is no such thing as expanding 40 years of professional experience
in software engineering and programming and doing it properly since day
one: just think about that code and what I said for what it's worth, in
particular I haven't mentioned 5 liners by chance, things are quite more
complicated not in vitro.

And please do not hold a grudge about that: it's not me who was trying
to say how to write code... ;)

HTH,

-Julio

Julio Di Egidio

2025-01-09 23:45:44 UTC

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
      void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
      AvlTree_t *pT;
      pT = malloc(sizeof(AvlTree_t));
      if (!pT) {
          return NULL;
      }
      pT->pk = pk;
      pT->pL = pL;
      pT->pR = pR;
      return pT;
}

Just on a side issue, I prefer to make tests like this positive so I'd
   static AvlTree_t const *AvlTree_node(
       void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
   ) {
       AvlTree_t *pT = malloc(sizeof *pT);
         if (pT) {
           pT->pk = pk;
           pT->pL = pL;
           pT->pR = pR;
       }
       return pT;
   }
I'm not going to "make a case" for this (though I will if you want!) --
I just think it helps to see lots of different styles.

That is *more* error prone,

I would be happy for you to expand on why you say that.

Post by Julio Di Egidio
all the more so if it's not a 5 liner...

There is no such thing as expanding 40 years of professional experience
in software engineering and programming and doing it properly since day
one: just think about that code and what I said for what it's worth, in
particular I haven't mentioned 5 liners by chance, things are quite more
complicated not in vitro.
And please do not hold a grudge about that: it's not me who was trying
to say how to write code... ;)
HTH,

BTW, I hadn't mentioned it, but have you noticed the second one is
misindented? Between me and you, I can tell how long a piece of code
will take to break when in production by just looking at it... A lot of
fun. :)

-Julio

Tim Rentsch

2025-01-10 01:43:50 UTC

Post by Julio Di Egidio

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT;
pT = malloc(sizeof(AvlTree_t));
if (!pT) {
return NULL;
}
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
return pT;
}

Just on a side issue, I prefer to make tests like this positive
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT = malloc(sizeof *pT);
if (pT) {
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
}
return pT;
}
I'm not going to "make a case" for this (though I will if you
want!) -- I just think it helps to see lots of different styles.

That is *more* error prone,

I would be happy for you to expand on why you say that.

Post by Julio Di Egidio
all the more so if it's not a 5 liner...

There is no such thing as expanding 40 years of professional
experience in software engineering and programming and doing it
properly since day one: just think about that code and what I said
for what it's worth, in particular I haven't mentioned 5 liners by
chance, things are quite more complicated not in vitro.
And please do not hold a grudge about that: it's not me who was
trying to say how to write code... ;)

BTW, I hadn't mention it, but have you noticed the second one is
misindented? Between me and you, I can tell how long a piece of
code will take to break when in production by just looking at
it... A lot of fun. :)

The indentation was correct in Ben's original posting.

The misindentation first appeared in your followup to that
posting, where the quoted portion had been changed to remove a
blank line and over-indent the if().

Julio Di Egidio

2025-01-10 02:14:13 UTC

Post by Tim Rentsch

Post by Julio Di Egidio

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT;
pT = malloc(sizeof(AvlTree_t));
if (!pT) {
return NULL;
}
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
return pT;
}

Just on a side issue, I prefer to make tests like this positive
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT = malloc(sizeof *pT);
if (pT) {
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
}
return pT;
}
I'm not going to "make a case" for this (though I will if you
want!) -- I just think it helps to see lots of different styles.

That is *more* error prone,

I would be happy for you to expand on why you say that.

Post by Julio Di Egidio
all the more so if it's not a 5 liner...

There is no such thing as expanding 40 years of professional
experience in software engineering and programming and doing it
properly since day one: just think about that code and what I said
for what it's worth, in particular I haven't mentioned 5 liners by
chance, things are quite more complicated not in vitro.
And please do not hold a grudge about that: it's not me who was
trying to say how to write code... ;)

BTW, I hadn't mention it, but have you noticed the second one is
misindented? Between me and you, I can tell how long a piece of
code will take to break when in production by just looking at
it... A lot of fun. :)

The indentation was correct in Ben's original posting.
The misindentation first appeared in your followup to that
posting, where the quoted portion had been changed to remove a
blank line and over-indent the if().

But indeed the point is what happens in the long run: if you look above,
mine is still better indented... :) But of course indentation per se is
not the problem: for example, checking the return value as soon as the
function returns a possibly null pointer or an error value is certainly
more widely applicable, and quite a bit less error prone, especially if
it's not a 5 liner... Anyway, I also truly believe there is no point in
belabouring the point: it's the overall picture that one must get to see.
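
For readers who want to compare the two styles on something slightly larger than a 5 liner, here is a minimal sketch (not code from this thread) of a constructor that needs two further allocations, written once with immediate checks and early returns and once with positive tests. The `Pair_t` type and all function names are hypothetical, made up only for this comparison.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type, for illustration only. */
typedef struct Pair {
    int *first;
    int *second;
} Pair_t;

/* Style 1: check each result immediately and bail out on failure. */
Pair_t *Pair_new_early(int a, int b)
{
    Pair_t *p = malloc(sizeof *p);
    if (!p) {
        return NULL;
    }
    p->first = malloc(sizeof *p->first);
    if (!p->first) {
        free(p);
        return NULL;
    }
    p->second = malloc(sizeof *p->second);
    if (!p->second) {
        free(p->first);
        free(p);
        return NULL;
    }
    *p->first = a;
    *p->second = b;
    return p;
}

/* Style 2: positive tests, with a single return at the end. */
Pair_t *Pair_new_positive(int a, int b)
{
    Pair_t *p = malloc(sizeof *p);
    if (p) {
        p->first = malloc(sizeof *p->first);
        p->second = malloc(sizeof *p->second);
        if (p->first && p->second) {
            *p->first = a;
            *p->second = b;
        } else {
            free(p->first);   /* free(NULL) is a no-op */
            free(p->second);
            free(p);
            p = NULL;
        }
    }
    return p;
}

/* Read-only helper: pointer to const, nothing is modified. */
int Pair_sum(Pair_t const *p)
{
    assert(p);
    return *p->first + *p->second;
}

void Pair_free(Pair_t *p)
{
    if (p) {
        free(p->first);
        free(p->second);
        free(p);
    }
}
```

Both constructors return either a fully initialised object or NULL; which one reads better as the function grows is exactly the judgment call being argued here.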

-Julio

Julio Di Egidio

2025-01-10 02:51:16 UTC

Post by Julio Di Egidio

Post by Tim Rentsch

Post by Julio Di Egidio

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT;
pT = malloc(sizeof(AvlTree_t));
if (!pT) {
return NULL;
}
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
return pT;
}

Just on a side issue, I prefer to make tests like this positive
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT = malloc(sizeof *pT);
if (pT) {
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
}
return pT;
}
I'm not going to "make a case" for this (though I will if you
want!) -- I just think it helps to see lots of different styles.

That is *more* error prone,

I would be happy for you to expand on why you say that.

Post by Julio Di Egidio
all the more so if it's not a 5 liner...

There is no such thing as expanding 40 years of professional
experience in software engineering and programming and doing it
properly since day one: just think about that code and what I said
for what it's worth, in particular I haven't mentioned 5 liners by
chance, things are quite more complicated not in vitro.
And please do not hold a grudge about that: it's not me who was
trying to say how to write code... ;)

BTW, I hadn't mention it, but have you noticed the second one is
misindented? Between me and you, I can tell how long a piece of
code will take to break when in production by just looking at
it... A lot of fun. :)

The indentation was correct in Ben's original posting.
The misindentation first appeared in your followup to that
posting, where the quoted portion had been changed to remove a
blank line and over-indent the if().

But indeed the point is what happens in the long run: if you look above
mine is still better indented... :) But of course it is not indentation
per se the problem: for example check the return value as soon as the
function returns a possibly null pointer or an error value is certainly
more widely applicable, and quite less error prone, especially if it's

I meant: immediately check the return value and bail out if needed. The
other approach does not even simplify on the clean-up, by the way...

Post by Julio Di Egidio
not a 5 liner... Anyway, I also truly believe there is no point in
belabouring the point: it's the overall picture that one must get to see.

-Julio

Julio Di Egidio

2025-01-10 03:05:05 UTC

Post by Julio Di Egidio

Post by Tim Rentsch

Post by Julio Di Egidio

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT;
pT = malloc(sizeof(AvlTree_t));
if (!pT) {
return NULL;
}
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
return pT;
}

Just on a side issue, I prefer to make tests like this positive
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT = malloc(sizeof *pT);
if (pT) {
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
}
return pT;
}
I'm not going to "make a case" for this (though I will if you
want!) -- I just think it helps to see lots of different styles.

That is *more* error prone,

I would be happy for you to expand on why you say that.

Post by Julio Di Egidio
all the more so if it's not a 5 liner...

There is no such thing as expanding 40 years of professional
experience in software engineering and programming and doing it
properly since day one: just think about that code and what I said
for what it's worth, in particular I haven't mentioned 5 liners by
chance, things are quite more complicated not in vitro.
And please do not hold a grudge about that: it's not me who was
trying to say how to write code... ;)

BTW, I hadn't mention it, but have you noticed the second one is
misindented? Between me and you, I can tell how long a piece of
code will take to break when in production by just looking at
it... A lot of fun. :)

The indentation was correct in Ben's original posting.
The misindentation first appeared in your followup to that
posting, where the quoted portion had been changed to remove a
blank line and over-indent the if().

But indeed the point is what happens in the long run: if you look
above mine is still better indented... :) But of course it is not
indentation per se the problem: for example check the return value as
soon as the function returns a possibly null pointer or an error value
is certainly more widely applicable, and quite less error prone,
especially if it's

I meant: immediately check the return value and bail out if needed. The
other approach does not even simplify on the clean-up, by the way...

Post by Julio Di Egidio
not a 5 liner... Anyway, I also truly believe there is no point in
belabouring the point: it's the overall picture that one must get to see.

A last one for the more theoretically inclined: every "if" is a code
path, but not all code paths are "if"s...

Cheers,

-Julio

Ben Bacarisse

2025-01-10 01:04:01 UTC

Post by Ben Bacarisse

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT;
pT = malloc(sizeof(AvlTree_t));
if (!pT) {
return NULL;
}
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
return pT;
}

Just on a side issue, I prefer to make tests like this positive so I'd
write:
static AvlTree_t const *AvlTree_node(
void const *pk, AvlTree_t const *pL, AvlTree_t const *pR
) {
AvlTree_t *pT = malloc(sizeof *pT);
if (pT) {
pT->pk = pk;
pT->pL = pL;
pT->pR = pR;
}
return pT;
}
I'm not going to "make a case" for this (though I will if you want!) --
I just think it helps to see lots of different styles.

That is *more* error prone,

I would be happy for you to expand on why you say that.

Post by Julio Di Egidio
all the more so if it's not a 5 liner...

There is no such thing as expanding 40 years of professional experience in
software engineering and programming and doing it properly since day one:
just think about that code and what I said for what it's worth, in
particular I haven't mentioned 5 liners by chance, things are quite more
complicated not in vitro.
And please do not hold a grudge about that: it's not me who was trying to
say how to write code... ;)

I was not trying to tell anyone how to write code. You made a claim --
not even a qualified one, a categorical one -- the justification for
which I would have liked to know more about.

HTH,

Not really, but no one is under any obligation here. It's just a
discussion.

--
Ben.

James Kuyper

2025-01-08 19:10:54 UTC

...

Post by Julio Di Egidio

Post by Ben Bacarisse

Post by Julio Di Egidio

Post by Kaz Kylheku
An object defined with a type that is const-qualified
could be put into write-protected storage.

What do you/we mean by "object" in this context? (Sorry, I do have
forgotten, the glossary to begin with.)

An object (in C) is a contiguous region of storage, the contents of
which can represent values.

Is that regardless of the stack/heap distinction, or is an "object"
about heap-allocated/dynamic memory only? (Anyway, I should in fact
re-acquaint myself with the language reference instead of asking this
question.)

The standard makes no distinction between heap and stack memory. Those
are merely implementation details, outside the scope of the standard. C
can be implemented on hardware that provides no support for either a
heap or a stack.
Objects can have any one of four different storage durations: static,
thread, automatic, and allocated. On implementations with a stack,
objects with static or automatic storage can be implemented using the
stack. On implementations with a heap, objects with allocated storage
duration can be implemented using the heap. I'm not sufficiently
familiar with multi-threaded programming to comment on how objects with
thread storage duration may be implemented.

The key issue that's connected to your questions is the fact that the
relevant rule is about objects that are defined as being 'const'.
Objects with automatic storage duration are never defined, so they
cannot be defined to be 'const'. You can only obtain a void* value that
points at such objects by calling one of the memory allocation functions
(aligned_alloc, malloc, calloc, or realloc). You can convert that value
into a pointer to non-const object type and then use that pointer to
write an object of that type into that memory. Unless the object type is
a character type, doing so will give that memory that type. You can also
convert that value into a pointer to a const-qualified type, but that
doesn't make the object it points at const.
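
A minimal sketch of those distinctions, using only what C90 offers (so no thread storage duration). The function name `storage_demo` and the variable names are made up for illustration; the point is that the memory obtained from malloc has allocated storage duration, is not introduced by any definition, and so is not a const object even when viewed through a pointer to const.

```c
#include <assert.h>
#include <stdlib.h>

static int calls = 0;            /* static storage duration */

int storage_demo(void)
{
    int local = 1;               /* automatic storage duration */
    int *heap;                   /* the pointer variable itself is automatic */
    int const *view;
    int result;

    heap = malloc(sizeof *heap); /* the pointed-to memory is allocated */
    if (!heap) {
        return -1;
    }
    *heap = 2;
    view = heap;                 /* pointer to const: a read-only view */
    *heap = 40;                  /* fine: the object itself is not const */
    assert(*view == 40);         /* the view sees the new value */

    calls++;
    result = *view + local + calls;
    free(heap);
    return result;
}
```

Assuming the allocation succeeds, the first call returns 42 and the second 43, since only `calls` persists between calls.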

James Kuyper

2025-01-08 19:20:00 UTC

On 1/8/25 14:10, James Kuyper wrote:
...

Post by James Kuyper
Objects with automatic storage duration are never defined, so they
cannot be defined to be 'const'.

That was supposed to be "allocated", not "automatic".
Objects with automatic storage duration can be defined as const:

void func(void) {
const int the_answer = 42;
}

On some implementations, they may even be stored in read-only memory.

Scott Lurndal

2025-01-08 20:14:40 UTC

<snip>

Post by James Kuyper
I'm not sufficiently
familiar with multi-threaded programming to comment on how objects with
thread storage duration may be implemented.

Generally by reserving a register to point to the base of a
per-thread region of memory allocated by the runtime. In x86,
one of the otherwise useless segment registers (%gs) was used
as the base address of the per-thread region (%fs is used by the kernel
as a 'per-cpu' reg base address pointer). I believe Aarch64 uses
x18.

Tim Rentsch

2025-01-08 20:12:37 UTC

Julio Di Egidio <***@diegidio.name> writes:

[...]

Post by Julio Di Egidio
Say my program unit implements AVL trees, with (conceptually
speaking) constructors/destructors, navigation and retrieval, and
of course manipulation (inserting, deleting, etc.).
My idea (but I would think this is pretty "canonical" and, if it
isn't, I am missing the mark) is: my public functions take/give
"sealed" instances (with const members to const data), as the user
is not supposed to directly manipulate/edit the data, OTOH of
course my implementation is all about in-place editing...

A better choice is to put the AVL code in a separate .c file,
and give out only opaque types to clients. For example (disclaimer:
not compiled):

// in "avl.h"
typedef struct avl_node_s *AVLTree;
// note that the struct contents are not defined in the .h file

... declare interfaces that accept and return AVLTree values ...

// in "avl.c"
#include "avl.h"
struct avl_node_s {
    // whatever members are needed
};

... implementation of public interfaces and any supporting
... functions needed

I might mention that some people don't like declaring a type name
that includes the pointerness ('*') as part of the type. I think
doing that is okay (and in fact more than just okay; better) in the
specific case where the type name is being offered as an opaque
type.

Of course you could also make the opaque type be a pointer to a
'const' struct type, if you wanted to, but the extra "protection" of
const-ness doesn't add much, and might actually cost more than it
buys you because of the additional casting that would be needed.
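
To make the outline above a little more concrete, here is a small single-file sketch of the same idea. It is only an illustration under assumptions: the functions `avl_leaf`, `avl_key` and `avl_free` and the `int key` member are hypothetical, no balancing is implemented, and in a real project the two marked parts would live in "avl.h" and "avl.c" respectively.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what would go in "avl.h" ---- */
typedef struct avl_node_s *AVLTree;   /* struct contents hidden from clients */

AVLTree avl_leaf(int key);            /* hypothetical constructor */
int avl_key(AVLTree t);               /* hypothetical accessor */
void avl_free(AVLTree t);             /* hypothetical destructor */

/* ---- what would go in "avl.c" ---- */
struct avl_node_s {
    int key;
    struct avl_node_s *left;
    struct avl_node_s *right;
};

AVLTree avl_leaf(int key)
{
    AVLTree t = malloc(sizeof *t);
    if (t) {
        t->key = key;
        t->left = NULL;
        t->right = NULL;
    }
    return t;
}

int avl_key(AVLTree t)
{
    assert(t);
    return t->key;
}

void avl_free(AVLTree t)
{
    if (t) {
        avl_free(t->left);
        avl_free(t->right);
        free(t);
    }
}
```

Client code that sees only the header part can pass `AVLTree` values around but cannot dereference them, because the struct type is incomplete there. That is what gives the "sealed" effect without any const.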

Julio Di Egidio

2025-01-09 07:12:58 UTC

Post by Tim Rentsch
[...]

Post by Julio Di Egidio
Say my program unit implements AVL trees, with (conceptually
speaking) constructors/destructors, navigation and retrieval, and
of course manipulation (inserting, deleting, etc.).
My idea (but I would think this is pretty "canonical" and, if it
isn't, I am missing the mark) is: my public functions take/give
"sealed" instances (with const members to const data), as the user
is not supposed to directly manipulate/edit the data, OTOH of
course my implementation is all about in-place editing...

A better choice is to put the AVL code in a separate .c file,
and give out only opaque types to clients. For example (disclaimer:
not compiled):
// in "avl.h"
typedef struct avl_node_s *AVLTree;
// note that the struct contents are not defined in the .h file
... declare interfaces that accept and return AVLTree values ...
// in "avl.c"
#include "avl.h"
struct avl_node_s {
// whatever members are needed
};
... implementation of public interfaces and any supporting
... functions needed
I might mention that some people don't like declaring a type name
that includes the pointerness ('*') as part of the type. I think
doing that is okay (and in fact more than just okay; better) in the
specific case where the type name is being offered as an opaque
type.
Of course you could also make the opaque type be a pointer to a
'const' struct type, if you wanted to, but the extra "protection" of
const-ness doesn't add much, and might actually cost more than it
buys you because of the additional casting that would be needed.

Thank you, I like that...

-Julio

Chris M. Thomasson

2025-01-08 22:48:06 UTC

Post by Julio Di Egidio
<snipped>

Post by Ben Bacarisse

Post by Kaz Kylheku

Post by Julio Di Egidio
To the question, I was reading this, but I am not
sure what the quoted passage means:
Matt Stancliff, "So You Think You Can Const?",
<https://matt.sh/sytycc>
<< Your compiler, at its discretion, may also choose
      to place any const declarations in read-only storage,
      so if you attempt to hack around the const blocks,
      you could get undefined behavior. >>

An object defined with a type that is const-qualified
could be put into write-protected storage.

What do you/we mean by "object" in this context? (Sorry, I do have
forgotten, the glossary to begin with.)

An object (in C) is a contiguous region of storage, the contents of
which can represent values.

Is that regardless of the stack/heap distinction, or is an "object"
about heap-allocated/dynamic memory only? (Anyway, I should in fact
re-acquaint myself with the language reference instead of asking this
question.)

Post by Ben Bacarisse

MyStruct_t const T = {...};

Yes, though you should extend your concern beyond what might be
write-protected. Modifying an object whose type is const qualified is
undefined, even if the object is in writable storage.

Yes, I am being a bit quick, but I definitely agree with that and indeed
the priority of "defined behaviour" as a concern.

Post by Ben Bacarisse

While this one allocates a "byte-array", i.e. irrespective of how the
MyStruct_t const *pT = malloc(...);
Is my understanding (to that point) correct?

Technically you get an object with no effective type.

OK.

Post by Ben Bacarisse
More relevant to a discussion of const is to ask what you plan to do
with pT since you can't (without a cast) assign any useful value to the
allocated object.

Say my program unit implements AVL trees, with (conceptually speaking)
constructors/destructors, navigation and retrieval, and of course
manipulation (inserting, deleting, etc.).
My idea (but I would think this is pretty "canonical" and, if it isn't,
I am missing the mark) is: my public functions take/give "sealed"
instances (with const members to const data), as the user is not
supposed to directly manipulate/edit the data, OTOH of course my
implementation is all about in-place editing...

Off topic, but for some reason you are making me think of the mutable
keyword in C++:

https://en.cppreference.com/w/cpp/language/cv

;^)

Ben Bacarisse

2025-01-09 01:04:27 UTC

Post by Julio Di Egidio
<snipped>

Post by Ben Bacarisse

Post by Julio Di Egidio

Post by Kaz Kylheku

Post by Julio Di Egidio
To the question, I was reading this, but I am not
sure what the quoted passage means:
Matt Stancliff, "So You Think You Can Const?",
<https://matt.sh/sytycc>
<< Your compiler, at its discretion, may also choose
to place any const declarations in read-only storage,
so if you attempt to hack around the const blocks,
you could get undefined behavior. >>

An object defined with a type that is const-qualified
could be put into write-protected storage.

What do you/we mean by "object" in this context? (Sorry, I do have
forgotten, the glossary to begin with.)

An object (in C) is a contiguous region of storage, the contents of
which can represent values.

Is that regardless of the stack/heap distinction, or is an "object" about
heap-allocated/dynamic memory only?

That's what an object is in all cases.

Post by Julio Di Egidio
-- Anyway, I should in fact
re-acquaint myself with the language reference instead of asking this
question.)

Post by Ben Bacarisse

Post by Julio Di Egidio
MyStruct_t const T = {...};

Yes, though you should extend your concern beyond what might be
write-protected. Modifying an object whose type is const qualified is
undefined, even if the object is in writable storage.

Yes, I am being a bit quick, but I definitely agree with that and indeed
the priority of "defined behaviour" as a concern.

Post by Ben Bacarisse

Post by Julio Di Egidio
While this one allocates a "byte-array", i.e. irrespective of how the
MyStruct_t const *pT = malloc(...);
Is my understanding (to that point) correct?

Technically you get an object with no effective type.

OK.

Post by Ben Bacarisse
More relevant to a discussion of const is to ask what you plan to do
with pT since you can't (without a cast) assign any useful value to the
allocated object.

Say my program unit implements AVL trees, with (conceptually speaking)
constructors/destructors, navigation and retrieval, and of course
manipulation (inserting, deleting, etc.).
My idea (but I would think this is pretty "canonical" and, if it isn't, I
am missing the mark) is: my public functions take/give "sealed" instances
(with const members to const data), as the user is not supposed to directly
manipulate/edit the data, OTOH of course my implementation is all about
in-place editing...

See Tim's reply -- the best way to implement "sealed" instances is to
use an opaque type where the "user code" simply can't see anything but a
pointer to an otherwise unknown struct.

A slight variation to what Tim was suggesting would be to take the
pointer out of the typedef because that can allow you to define an
interface with pointers to const and to non-const AVLtree objects:

typedef struct AVLtree AVLtree;

AVLtree *avl_create(void);
void avl_add(AVLtree *tree, ...);
void *avl_lookup(const AVLtree *tree, ...);

and so on. Of course, if you use a more functional style as Tim was
suggesting there is no value in having this distinction.
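
A rough sketch of how that const/non-const split might look once filled in. Everything beyond the typedef and `avl_create` is an assumption: the parameter lists elided above are replaced by a hypothetical `int key`, `avl_add` returns a status here, `avl_contains` and `avl_destroy` are invented names, and the tree is a plain unbalanced binary search tree (no AVL rotations) to keep it short.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- interface: only an incomplete type is visible to clients ---- */
typedef struct AVLtree AVLtree;

AVLtree *avl_create(void);
int avl_add(AVLtree *tree, int key);             /* modifies: non-const */
int avl_contains(const AVLtree *tree, int key);  /* only reads: const */
void avl_destroy(AVLtree *tree);

/* ---- implementation ---- */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

struct AVLtree {
    struct node *root;
};

AVLtree *avl_create(void)
{
    AVLtree *tree = malloc(sizeof *tree);
    if (tree) {
        tree->root = NULL;
    }
    return tree;
}

/* Returns 0 on success (also if key is already present), -1 if out of memory. */
int avl_add(AVLtree *tree, int key)
{
    struct node **link;

    assert(tree);
    link = &tree->root;
    while (*link) {
        if (key == (*link)->key) {
            return 0;
        }
        link = key < (*link)->key ? &(*link)->left : &(*link)->right;
    }
    *link = malloc(sizeof **link);
    if (!*link) {
        return -1;
    }
    (*link)->key = key;
    (*link)->left = NULL;
    (*link)->right = NULL;
    return 0;
}

/* Returns 1 if key is in the tree, 0 otherwise. */
int avl_contains(const AVLtree *tree, int key)
{
    const struct node *n;

    assert(tree);
    n = tree->root;
    while (n) {
        if (key == n->key) {
            return 1;
        }
        n = key < n->key ? n->left : n->right;
    }
    return 0;
}

static void free_nodes(struct node *n)
{
    if (n) {
        free_nodes(n->left);
        free_nodes(n->right);
        free(n);
    }
}

void avl_destroy(AVLtree *tree)
{
    if (tree) {
        free_nodes(tree->root);
        free(tree);
    }
}
```

Inside `avl_contains` the compiler rejects any accidental write through `tree`, while `avl_add` and `avl_destroy` announce in their signatures that they modify the tree, so no casts are needed anywhere.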

[I must say it's great to have a discussion about C programming for a
change instead of endless threads about how awful C is and how this or
that feature should be added to make it more like Rust/C++/Whatever.]

--
Ben.

Julio Di Egidio

2025-01-09 06:56:42 UTC

<snip>

Post by Ben Bacarisse

Post by Julio Di Egidio
My idea (but I would think this is pretty "canonical" and, if it isn't, I
am missing the mark) is: my public functions take/give "sealed" instances
(with const members to const data), as the user is not supposed to directly
manipulate/edit the data, OTOH of course my implementation is all about
in-place editing...

See Tim's reply -- the best way to implement "sealed" instances is to
use an opaque type where the "user code" simply can't see anything but a
pointer to an otherwise unknown struct.

You say "best", I'd prefer "canonical" meaning what the people do who
know what they are doing. :) But OK, that sounds indeed reasonable:
I'll see what I get...

-Julio

Tim Rentsch

2025-01-08 20:43:52 UTC

Post by Ben Bacarisse

Post by Julio Di Egidio
<snipped>

Post by Kaz Kylheku

Post by Julio Di Egidio
In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).
To the question, I was reading this, but I am not
sure what the quoted passage means:
Matt Stancliff, "So You Think You Can Const?",
<https://matt.sh/sytycc>
<< Your compiler, at its discretion, may also choose
to place any const declarations in read-only storage,
so if you attempt to hack around the const blocks,
you could get undefined behavior. >>

An object defined with a type that is const-qualified
could be put into write-protected storage.

What do you/we mean by "object" in this context? (Sorry, I do have
forgotten, the glossary to begin with.)

An object (in C) is a contiguous region of storage, the contents of
which can represent values.

Post by Julio Di Egidio
MyStruct_t const T = {...};

Yes, though you should extend your concern beyond what might be
write-protected. Modifying an object whose type is const qualified
is undefined, even if the object is in writable storage. A compiler
may assume that such an object has not changed because in a program
that has undefined behaviour, all bets are off. [...]

We need to be careful about what is being asserted here. There
are cases where a compiler may not assume that a const object
has not changed, despite the rule that assigning to a const
object is undefined behavior:

#include <stdio.h>
typedef union { const int foo; int bas; } Foobas;

int
main(){
    Foobas fb = { 0 };

    printf( " fb.foo is %d\n", fb.foo );
    fb.bas = 7;
    printf( " fb.foo is %d\n", fb.foo );
    return 0;
}

The object fb.foo is indeed a const object, but an access of
fb.foo must not assume that it retains its original value after
the assignment to fb.bas.

Ben Bacarisse

2025-01-09 00:49:05 UTC

Post by Tim Rentsch

... Modifying an object whose type is const qualified
is undefined, even if the object is in writable storage. A compiler
may assume that such an object has not changed because in a program
that has undefined behaviour, all bets are off. [...]

We need to be careful about what is being asserted here. There
are cases where a compiler may not assume that a const object
has not changed, despite the rule that assigning to a const
object is undefined behavior:
#include <stdio.h>
typedef union { const int foo; int bas; } Foobas;
int
main(){
Foobas fb = { 0 };
printf( " fb.foo is %d\n", fb.foo );
fb.bas = 7;
printf( " fb.foo is %d\n", fb.foo );
return 0;
}
The object fb.foo is indeed a const object, but an access of
fb.foo must not assume that it retains its original value after
the assignment to fb.bas.

Yes, good point. There is also the case of

volatile const int x;

which the compiler can't assume won't change even though the program
can't change it directly.

--
Ben.

David Brown

2025-01-08 08:46:46 UTC

Post by Julio Di Egidio
Hi everybody,
indeed I have forgotten so many things, including
how much I love this language. :)
In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).

What devices do not have at least C99 compilers available - and yet /do/
have standard C90 compilers available? What sort of code are you
writing that should ideally run on an AVR Tiny with 2K of flash and 64
bytes of ram, a DSP with 24-bit chars, and a TOP100 supercomputer? Have
you thought about this in more detail?

People who say they want their code to run on anything are invariably
wildly exaggerating. People who say they want to write strictly
standards-conforming code, especially C90, so that it will run
everywhere, misunderstand the relationship between the C standards and
real-world tools.

I would say that the most portable language standard to use would be a
subset of C99. Avoid complex numbers, VLAs, and wide/multibyte
characters, and it will be compilable on all but the most obscure
compilers. The use of <stdint.h> types make it far easier to write
clear portable code while keeping good efficiency, and many C99 features
let you write clearer, safer, and more efficient code. C90 was probably
a good choice for highly portable code 15-20 years ago, but not now.
(Your use of "malloc" eliminates far more potential devices for the code
than choosing C99 ever could.)

Post by Julio Di Egidio
To the question, I was reading this, but I am not
sure what the quoted passage means:
Matt Stancliff, "So You Think You Can Const?",
<https://matt.sh/sytycc>
<< Your compiler, at its discretion, may also choose
   to place any const declarations in read-only storage,
   so if you attempt to hack around the const blocks,
   you could get undefined behavior. >>
I do not understand if just declaring that a pointer
is to constant data may incur in that problem even
if the pointed data was in fact allocated with malloc.
I would say of course not, but I am not sure.

You are mixing up declarations and definitions. It's not surprising -
the article you reference is full of mistakes and bad advice.

If you write "const uint32_t hello = 3;", you are /defining/ the object
"hello", so it is a const object. Its value may not be changed in any
way during its lifetime - attempting to do so is undefined behaviour,
and the compiler will complain if it sees a direct attempt to change it.
If the const object has program lifetime - it is a file-scope
variable, or a static variable - the compiler can put it in read-only
memory of some sort. On a microcontroller, that might mean flash memory
- on a PC, it might mean a write-protected read-only memory page. For a
local variable, it's much more likely that it will go in a register, on
the stack, or be eliminated by optimisation - but if the initialiser is
always the same, it could hypothetically also be placed in read-only memory.

Suppose you have :

int v = 123; // Non-const object definition
const int * cp = &v; // Pointer to const, aimed at non-const data
int * p = (int *) cp; // Cast to non-const pointer
*p = 456; // Change the target data

This is allowed, because the original object definition was not a const
definition.

However, with this:

int v = 123; // Const object definition
const int * cp = &v; // Pointer to const, aimed at const data
int * p = (int *) cp; // Cast to non-const pointer
*p = 456; // Undefined behaviour

You can make the pointer to non-const, but trying to change an object
that was /defined/ as const is undefined behaviour (even if it was not
placed in read-only memory).

When you use dynamic memory, however, you are not defining an object in
the same way. If you write :

const int * cp = malloc(sizeof(int));

you are defining the object "cp" as a pointer to type "const int" - but
you are not defining a const int. You can cast "cp" to "int *" and use
that new pointer to change the value.
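
Spelled out as a complete function, that might look like the following sketch. The name `write_through_cast` is made up for illustration. Note that the value is written before it is read, since freshly malloc'd memory holds no determinate value.

```c
#include <assert.h>
#include <stdlib.h>

int write_through_cast(void)
{
    const int *cp = malloc(sizeof(int));
    int *p;
    int result;

    if (!cp) {
        return -1;
    }
    p = (int *)cp;        /* cast away const: the object was never defined const */
    *p = 456;             /* defined behaviour: writes to allocated storage */
    assert(*cp == 456);   /* the pointer-to-const view sees the same object */
    result = *cp;
    free(p);
    return result;
}
```

Assuming malloc succeeds, the function returns 456. The cast is legal here only because no const object was ever defined; the same cast applied to the address of a `const int` variable would lead to undefined behaviour on the write.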

Post by Julio Di Egidio
E.g. consider this little internal helper of mine
(which implements an interface that is public to
do an internal thing...), where I am casting to
pointer to non-constant data in order to free the
pointed data (i.e. without warning):
```c
static int MyStruct_free_(MyStruct_t const *pT) {
    assert(pT);
    free((MyStruct_t *)pT);
    return 0;
}
```
Assuming, as said, that the data was originally
allocated with malloc, is that code safe or
something can go wrong even in that case?

The code is safe in that it is not undefined behaviour - data allocated
with malloc is never defined const. However, it is /unsafe/ in that it
is doing something completely unexpected, given the function signature.

When you have a function with a parameter of type "const T * p", this
tells people reading it that the function will only read data via "p",
and will never use "p" to change the data. The compiler will enforce
this unless you specifically use casts to tell the compiler "I'm doing
something that looks wrong - but I know it is right".

Don't lie to your compiler. Don't lie to your fellow programmers (or
yourself). Use const for things that you won't change - it is extremely
rare that it is appropriate to cast away constness.
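
As a sketch of what signatures that don't lie could look like for the helper in question: the read-only accessor takes a pointer to const, and the function that releases the object takes a pointer to non-const, so no cast is needed. The struct layout and the names `MyStruct_new` and `MyStruct_value` are assumptions for illustration only; the original post does not show them.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout; the real MyStruct_t is not shown in the thread. */
typedef struct MyStruct {
    int value;
} MyStruct_t;

MyStruct_t *MyStruct_new(int value)
{
    MyStruct_t *pT = malloc(sizeof *pT);
    if (pT) {
        pT->value = value;
    }
    return pT;
}

/* Only reads through pT, so the parameter is a pointer to const. */
int MyStruct_value(MyStruct_t const *pT)
{
    assert(pT);
    return pT->value;
}

/* Ends the object's lifetime, so the parameter is a pointer to non-const
   and free() can be called without casting. */
void MyStruct_free(MyStruct_t *pT)
{
    free(pT);
}
```

A caller holding only a `MyStruct_t const *` then cannot free the object by accident, which is arguably what the const in the signature was promising in the first place.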

Ben Bacarisse

2025-01-08 11:25:53 UTC

Post by David Brown
int v = 123; // Non-const object definition
const int * cp = &v; // Const pointer to non-const data
int * p = (int *) cp; // Cast to non-const pointer
*p = 456; // Change the target data
This is allowed, because the original object definition was not a const
definition.
However, with this:
int v = 123; // Const object definition
const int * cp = &v; // Const pointer to const data
int * p = (int *) cp; // Cast to non-const pointer
*p = 456; // Undefined behaviour

I think you missed out the crucial "const" on the first line of the second
example! It's always the way.

Post by David Brown
You can make the pointer to non-const, but trying to change an object that
was /defined/ as const is undefined behaviour (even if it was not placed in
read-only memory).
When you use dynamic memory, however, you are not defining an object in the
same way. If you write :
const int * cp = malloc(sizeof(int));

I prefer

const int *cp = malloc(sizeof *cp);

Post by David Brown
you are defining the object "cp" as a pointer to type "const int" - but you
are not defining a const int. You can cast "cp" to "int *" and use that
new pointer to change the value.

--
Ben.

David Brown

2025-01-08 12:25:12 UTC

Post by Ben Bacarisse

Post by David Brown
int v = 123; // Non-const object definition
const int * cp = &v; // Const pointer to non-const data
int * p = (int *) cp; // Cast to non-const pointer
*p = 456; // Change the target data
This is allowed, because the original object definition was not a const
definition.
However, with this:
int v = 123; // Const object definition

Correction:
const int v = 123;

Post by Ben Bacarisse

Post by David Brown
const int * cp = &v; // Const pointer to const data
int * p = (int *) cp; // Cast to non-const pointer
*p = 456; // Undefined behaviour

I think you missed out the crucial "const" on the first line of the second
example! It's always the way.

Fortunately, Usenet is a self-correcting medium :-) Thanks for pointing
out that mistake, and I hope the OP sees your correction before getting
confused by my copy-pasta error.
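
For reference, here is a minimal sketch of the two cases with the correction applied. The function names are made up for illustration, and the undefined write in the second case is deliberately left commented out so that the sketch stays well defined:

```c
/* Case 1: the object is NOT defined const. Casting away the qualifier
   and writing through the resulting pointer is well defined. */
int write_to_non_const(void)
{
    int v = 123;            /* non-const object definition */
    const int *cp = &v;     /* pointer to const, pointing at non-const data */
    int *p = (int *)cp;     /* cast to pointer to non-const */

    *p = 456;               /* allowed: v itself was never const */
    return v;
}

/* Case 2: the object IS defined const. The cast still compiles, but
   writing through p would be undefined behaviour. */
int read_const_only(void)
{
    const int v = 123;      /* const object definition */
    const int *cp = &v;     /* pointer to const, pointing at const data */
    int *p = (int *)cp;     /* the cast by itself is fine */

    /* *p = 456; */         /* undefined behaviour if enabled */
    return *p;              /* reading through p is fine */
}
```

The only difference between the two functions is the "const" on the definition of v, which is exactly what decides whether the write is allowed.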

Post by Ben Bacarisse

Post by David Brown
You can make the pointer to non-const, but trying to change an object that
was /defined/ as const is undefined behaviour (even if it was not placed in
read-only memory).
When you use dynamic memory, however, you are not defining an object in the
const int * cp = malloc(sizeof(int));

I prefer
const int *cp = malloc(sizeof *cp);

That's a common preference. Personally, I prefer the former - I think
it makes it clearer that we are allocating space for an int. Hopefully
the OP will hang around this group and we'll get a chance to give advice
and suggestions on many different aspects of C programming.

Post by Ben Bacarisse

Post by David Brown
you are defining the object "cp" as a pointer to type "const int" - but you
are not defining a const int. You can cast "cp" to "int *" and use that
new pointer to change the value.
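
As a sketch of the dynamic-memory case described above (and of the OP's free-through-a-cast question), assuming nothing beyond standard malloc/free; the function name is hypothetical:

```c
#include <stdlib.h>

/* The malloc'ed storage is not a const-defined object; only the pointer
   type says "const int". Casting to int * and writing is therefore
   well defined, and so is passing the cast pointer to free(). */
int malloc_then_write_through_cast(void)
{
    const int *cp = malloc(sizeof *cp);
    int *p;
    int result;

    if (cp == NULL) {
        return -1;          /* allocation failed */
    }

    p = (int *)cp;          /* cast away the qualifier */
    *p = 42;                /* fine: no const object was ever defined */
    result = *cp;           /* read back through the const pointer */

    free(p);                /* free() takes void *, so use the non-const pointer */
    return result;
}
```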

Julio Di Egidio

2025-01-08 14:42:56 UTC

Reply

<snipped>

Post by David Brown

Post by Julio Di Egidio
In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).

People who say they want their code to run on anything are invariably
wildly exaggerating.

:) I do have embedded, and FPGAs, and even transpiling to e.g. Wasm,
etc. in mind, my overall idea for now simply being: as long as the
device comes with a C compiler that is not too broken. (I am also
planning to distribute source files only: it also makes my life and
coding so much easier, at the cost of not being able to
"micro-optimize": where I am rather hoping that optimization can still
come down the line if needed as an added pre or post processing step.)

So, you might very well be right that "C90" isn't the best possible
choice even for my requirement; anyway, I am at a pre-alpha stage, and
I am sure I will be tightening it up.

Post by David Brown
People who say they want to write strictly
standards-conforming code, especially C90, so that it will run
everywhere, misunderstand the relationship between the C standards and
real-world tools.

So, now that I have qualified it with "any device coming with a C
compiler (that is not too broken)", would you think coding it in "ANSI
C" makes some sense?

Post by David Brown
I would say that the most portable language standard to use would be a
subset of C99. Avoid complex numbers, VLAs, and wide/multibyte
characters, and it will be compilable on all but the most obscure
compilers. The use of <stdint.h> types make it far easier to write
clear portable code while keeping good efficiency, and many C99 features
let you write clearer, safer, and more efficient code. C90 was probably
a good choice for highly portable code 15-20 years ago, but not now.
(Your use of "malloc" eliminates far more potential devices for the code
than choosing C99 ever could.)

Assuming I don't in fact care if and how well a compiler does its job
(in fact my policy for now is: as long as it compiles with GCC with
those flags), what is wrong with "malloc"?

Post by David Brown
When you have a function with a parameter of type "const T * p", this
tells people reading it that the function will only read data via "p",

Never mind, it's a private (static) method, so I am not "lying" to
anybody: rather const and cast and almost everything in C is altogether
something else...

-Julio

David Brown

2025-01-08 16:18:21 UTC

Reply

Post by Julio Di Egidio
<snipped>

Post by David Brown

Post by Julio Di Egidio
In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).

People who say they want their code to run on anything are invariably
wildly exaggerating.

:) I do have embedded, and FPGAs, and even transpiling to e.g. Wasm,
etc. in mind, my overall idea for now simply being: as long as the
device comes with a C compiler that is not too broken. (I am also
planning to distribute source files only: it also makes my life and
coding so much easier, at the cost of not being able to
"micro-optimize": where I am rather hoping that optimization can still
come down the line if needed as an added pre or post processing step.)

Do you have experience with embedded programming (if so, what kind of
devices)?

Post by Julio Di Egidio
So, you might very well be right that "C90" isn't the best possible
choice even for my requirement; anyway, I am at a pre-alpha stage, and
I am sure I will be tightening it up.

Post by David Brown
People who say they want to write strictly standards-conforming code,
especially C90, so that it will run everywhere, misunderstand the
relationship between the C standards and real-world tools.

So, now that I have qualified it with "any device coming with a C
compiler (that is not too broken)", would you think coding it in "ANSI
C" makes some sense?

No.

There are basically two classes of small embedded devices - those that
are usually programmed with gcc (and sometimes clang, and occasionally
vastly expensive commercial tools), and those that are programmed using
non-standard, limited and often expensive sort-of-C compilers. For
people using gcc, clang, Green Hills, Code Warrior, or other quality
tools on a 16-bit or 32-bit microcontroller, C99 is not a problem. C23
is not a problem for the most popular toolchain for the most popular
microcontroller core.

For people using 8051, COP8, 68HC05, PIC16 or other long outdated
brain-dead microcontrollers, you don't get standard C support at all.
You program these in a device-specific variant of C full of extensions
and extra restrictions - and the support is as close to the subset of
C99 that I described as it is to standard C90.

Those kinds of microcontrollers are now pretty much only used in legacy
hardware or where companies have too much code investment that cannot
reasonably be ported to something modern. (There are also a small
number of niche use-cases.) So you can be confident that almost anyone
using your software in embedded systems will be using a 32-bit core -
most likely an ARM Cortex-M, but possibly RISC-V. And they will
probably be using a toolchain that supports at least C17 (some people
are still on older toolchains), whether it is gcc, clang, or commercial.
Certainly solid C99 support is guaranteed. Everything else is niche,
and no one will be using your software on niche systems.

Post by Julio Di Egidio

Post by David Brown
I would say that the most portable language standard to use would be a
subset of C99. Avoid complex numbers, VLAs, and wide/multibyte
characters, and it will be compilable on all but the most obscure
compilers. The use of <stdint.h> types make it far easier to write
clear portable code while keeping good efficiency, and many C99
features let you write clearer, safer, and more efficient code. C90
was probably a good choice for highly portable code 15-20 years ago,
but not now. (Your use of "malloc" eliminates far more potential
devices for the code than choosing C99 ever could.)

Assuming I don't in fact care if and how well a compiler does its job
(in fact my policy for now is: as long as it compiles with GCC with
those flags), what is wrong with "malloc"?

If that's your starting assumption, then can we also assume that you
don't expect anyone to use your code - certainly not on embedded
systems? People often place too much emphasis on code efficiency, but
in small embedded systems, efficient code means smaller, cheaper and
lower power microcontrollers which is almost always relevant.

The problem with malloc, however, is nothing to do with code efficiency.
Dynamic memory is generally banned, or at least highly restricted, in
serious embedded programming as it is a huge reliability risk. Most
code on PC's can tolerate leaks - programs run for a bit, then stop and
any leaked memory is recovered by the OS. Embedded programs usually
never stop. PC programs can usually assume unlimited memory - embedded
systems have very limited memory. PC OS's have memory managers to
re-arrange memory and see few issues with fragmentation - embedded
systems are easily killed by heap fragmentation. PC programs expect to
have wildly varying timings - embedded systems are often real-time and
do not do well with the non-deterministic timing you usually see with
malloc/free implementations.

Sometimes dynamic memory usage is unavoidable, but good code for
embedded systems uses alternative solutions where possible, and usually
uses specialised pools rather than a generic heap with malloc.
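
To illustrate what a "specialised pool" might look like, here is a minimal sketch of a fixed-size block pool. The Msg type, the pool size and all the function names are assumptions made up for this example, not anything from a particular project or library:

```c
#include <stddef.h>

#define POOL_BLOCKS 4               /* hypothetical capacity, fixed at build time */

typedef struct Msg {                /* hypothetical payload type */
    int id;
    char payload[16];
} Msg;

/* A slot holds either a live Msg or a link to the next free slot. */
typedef union PoolSlot {
    Msg msg;
    union PoolSlot *next;
} PoolSlot;

static PoolSlot pool_storage[POOL_BLOCKS];  /* all memory reserved statically */
static PoolSlot *pool_free_list = NULL;

/* Thread every slot onto the free list. */
void pool_init(void)
{
    int i;

    pool_free_list = NULL;
    for (i = 0; i < POOL_BLOCKS; ++i) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Constant-time allocation; returns NULL when the pool is exhausted. */
Msg *pool_alloc(void)
{
    PoolSlot *slot = pool_free_list;

    if (slot == NULL) {
        return NULL;
    }
    pool_free_list = slot->next;
    return &slot->msg;
}

/* Constant-time release; m must have come from pool_alloc(). */
void pool_release(Msg *m)
{
    PoolSlot *slot = (PoolSlot *)m; /* a union member pointer converts to the union */

    slot->next = pool_free_list;
    pool_free_list = slot;
}
```

Because every block has the same size, such a pool cannot fragment, both operations take constant time, and the worst-case memory use is known when the program is built.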

Post by Julio Di Egidio

Post by David Brown
When you have a function with a parameter of type "const T * p", this
tells people reading it that the function will only read data via "p",

Never mind, it's a private (static) method, so I am not "lying" to
anybody: rather const and cast and almost everything in C is altogether
something else...

You are lying to yourself - and that is not a good start.

Don't use const everywhere - use it where it is /helpful/. Don't use
casts unless you have very good reason for it - get the types and
qualifiers correct from the start. The article you referenced is rubbish.

Julio Di Egidio

2025-01-08 16:35:42 UTC

Reply

Post by David Brown

Post by Julio Di Egidio
<snipped>

Post by David Brown

Post by Julio Di Egidio
In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).

People who say they want their code to run on anything are invariably
wildly exaggerating.

:) I do have embedded, and FPGAs, and even transpiling to e.g. Wasm,
etc. in mind, my overall idea for now simply being: as long as the
device comes with a C compiler that is not too broken.

<snip>

Post by David Brown
Do you have experience with embedded programming (if so, what kind of
devices)?

TL;DR nearly zero. Siemens PLCs for small industrial automation, my own
experimenting with Intel FPGAs (mainly for coprocessors), plus coding
against device drivers for systems integration.

Post by David Brown

Post by Julio Di Egidio
So, now that I have qualified it with "any device coming with a C
compiler (that is not too broken)", would you think coding it in "ANSI
C" makes some sense?

No.

Cool. :) Please give me a few hours, maybe less: I will be reading your
reply with great interest.

-Julio

David Brown

2025-01-08 18:39:20 UTC

Reply

Post by Julio Di Egidio

Post by David Brown

Post by Julio Di Egidio
<snipped>

Post by David Brown

Post by Julio Di Egidio
In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).

People who say they want their code to run on anything are
invariably wildly exaggerating.

:) I do have embedded, and FPGAs, and even transpiling to e.g. Wasm,
etc. in mind, my overall idea for now simply being: as long as the
device comes with a C compiler that is not too broken.

<snip>

Post by David Brown
Do you have experience with embedded programming (if so, what kind of
devices)?

TL;DR nearly zero. Siemens PLCs for small industrial automation, my own
experimenting with Intel FPGAs (mainly for coprocessors), plus coding
against device drivers for systems integration.

OK. Small-systems embedded programming is quite a bit different from
"big" systems programming.

Post by Julio Di Egidio

Post by David Brown

Post by Julio Di Egidio
So, now that I have qualified it with "any device coming with a C
compiler (that is not too broken)", would you think coding it in
"ANSI C" makes some sense?

No.

Cool. :) Please give me a few hours, maybe less: I will be reading your
reply with great interest.

No problem. I'm glad you think it looks like it is worth reading!

Phillip

2025-01-08 16:45:01 UTC

Reply

For people using 8051, COP8, 68HC05, PIC16 or other long outdated brain-
dead microcontrollers, you don't get standard C support at all. You
program these in a device-specific variant of C full of extensions and
extra restrictions - and the support is as close to the subset of C99
that I described as it is to standard C90.

Just a point of reference, there are still several "brain-dead" systems
in modern use today that aren't old, some being invested as late as
2019. That being said, your comment isn't completely accurate, in that
there are some modern uses of things like the 6502 that can use
standards-based C. In fact, you can use ANSI C89 and C90 with the 6502.
I've done this for several modern pacemakers as well as a smart
prosthetic. So your statement is correct in 90% of cases but not all cases.

(Also most car manufacturers use the 6502 and other variants for their
digital input analog gauges and warning light controls on their dashboards.)

C89 and C90 are better for 8-bit systems than C99 and newer. Not that
you can't do 8-bit on C99 but it's just not designed as well for it
since C99 assumes you've moved on to at least 16-bit.

But this is all based on the OP's specific use case for their
application. I just wanted to chime in since I do primarily work on
modern embedded systems that don't use "modern" microcontrollers and
CPU's since they are still used in a wide range of modern devices that
people don't even realize.

--
Phillip Frabott
----------
- Adam: Is a void really a void if it returns?
- Jack: No, it's just nullspace at that point.
----------

Tim Rentsch

2025-01-08 19:52:12 UTC

Reply

C89 and C90 are better for 8-bit systems than C99 and newer. Not
that you can't do 8-bit on C99 but it's just not designed as well
for it since C99 assumes you've moved on to at least 16-bit.

Which parts of the C99 standard support this assertion?

Keith Thompson

2025-01-08 20:20:05 UTC

Reply

Phillip <***@fulltermprivacy.com> writes:
[...]

Post by Phillip
C89 and C90 are better for 8-bit systems than C99 and newer. Not that
you can't do 8-bit on C99 but it's just not designed as well for it
since C99 assumes you've moved on to at least 16-bit.

There were no changes in the sizes of the integer types from C89/C90 to
C99, aside from the addition of long long. (And an implementation with
8-bit int would be non-conforming under any edition of the standard,
though it might be useful.)

Perhaps some C89/C90 implementations are better for 8-bit systems than
some C99 implementations?
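
The point about integer sizes can be checked at build time. A minimal sketch, assuming only <limits.h>; the function name is made up. Every edition of the standard requires INT_MAX to be at least 32767, so the #error below should never fire on a conforming implementation:

```c
#include <limits.h>

/* Reject an implementation whose int cannot hold the range that the
   standard guarantees (at least -32767 .. 32767). */
#if INT_MAX < 32767
#error "int is narrower than the minimum range the C standard requires"
#endif

/* Number of bits of storage occupied by an int on this implementation
   (may include padding bits on unusual targets). */
int int_storage_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}
```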

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Phillip

2025-01-08 20:27:18 UTC

Reply

Post by Keith Thompson
[...]

Post by Phillip
C89 and C90 are better for 8-bit systems than C99 and newer. Not that
you can't do 8-bit on C99 but it's just not designed as well for it
since C99 assumes you've moved on to at least 16-bit.

There were no changes in the sizes of the integer types from C89/C90 to
C99, aside from the addition of long long. (And an implementation with
8-bit int would be non-conforming under any edition of the standard,
though it might be useful.)
Perhaps some C89/C90 implementations are better for 8-bit systems than
some C99 implementations?

Yes, this is what I was saying.

--
Phillip Frabott
----------
- Adam: Is a void really a void if it returns?
- Jack: No, it's just nullspace at that point.
----------

Keith Thompson

2025-01-08 21:41:43 UTC

Reply

Post by Phillip

Post by Keith Thompson
[...]

Post by Phillip
C89 and C90 are better for 8-bit systems than C99 and newer. Not that
you can't do 8-bit on C99 but it's just not designed as well for it
since C99 assumes you've moved on to at least 16-bit.

There were no changes in the sizes of the integer types from C89/C90 to
C99, aside from the addition of long long. (And an implementation with
8-bit int would be non-conforming under any edition of the standard,
though it might be useful.)
Perhaps some C89/C90 implementations are better for 8-bit systems than
some C99 implementations?

Yes, this is what I was saying.

I'm curious about the details. What C89/C90 implementation are
you using, and what features make it more suitable for 8-bit
systems? (Any useful extensions could be applied to a C99 or
later implementation. It sounds like the implementer just hasn't
done that.)

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Phillip

2025-01-09 05:09:31 UTC

Reply

Post by Keith Thompson

Post by Phillip

Post by Keith Thompson
[...]

Post by Phillip
C89 and C90 are better for 8-bit systems than C99 and newer. Not that
you can't do 8-bit on C99 but it's just not designed as well for it
since C99 assumes you've moved on to at least 16-bit.

There were no changes in the sizes of the integer types from C89/C90 to
C99, aside from the addition of long long. (And an implementation with
8-bit int would be non-conforming under any edition of the standard,
though it might be useful.)
Perhaps some C89/C90 implementations are better for 8-bit systems than
some C99 implementations?

Yes, this is what I was saying.

I'm curious about the details. What C89/C90 implementation are
you using, and what features make it more suitable for 8-bit
systems? (Any useful extensions could be applied to a C99 or
later implementation. It sounds like the implementer just hasn't
done that.)

Generally this only applies to use cases where specific instructions
generated by the compiler are different between c90 and c99 where TOE's
matter (timing of execution). For example, there are cases (sorry I
don't have examples because it's been a long time since I've gone
through this) where c99, in order to be more efficient, will output a
different set of instructions, but in certain cases, those instructions,
while more efficient, take longer to process on the CPU or
microcontroller. Whereas C89 and C90 may be more inefficient but the
instructions execute faster. It might only be that C99 adds an extra 1-3
clock cycles, and in most cases this isn't a problem or noticeable. But
when you are dealing with devices that are responsible for keeping a
human alive (such as a pacemaker) the extra cycles can add up over time
and will cause trouble down the road. So the purpose of my point of
reference earlier was just to say that there are niche cases where the
statement that was made wouldn't be accurate.

For pacemakers the GNU GCC implementation was used and for the smart
prosthetic the CLANG implementation was used. GCC was using C90 and
CLANG was using C89 (ANSI).

Although above I couldn't provide a specific example (again sorry about
that) I do have the result report from back when I was testing out
pacemakers with C99 over C90 (2007) and the process found that with C99 the
processor would be behind by around 500 cycles within 19h 41m 19s from
program start. This had a +/- of 12 cycles differential with repeat
testing. That would mean the heart would miss a beat every 17 days.

--
Phillip Frabott
----------
- Adam: Is a void really a void if it returns?
- Jack: No, it's just nullspace at that point.
----------

Keith Thompson

2025-01-09 05:34:37 UTC

Reply

Post by Phillip

Post by Keith Thompson

Post by Phillip

Post by Keith Thompson
[...]

Post by Phillip
C89 and C90 are better for 8-bit systems than C99 and newer. Not that
you can't do 8-bit on C99 but it's just not designed as well for it
since C99 assumes you've moved on to at least 16-bit.

There were no changes in the sizes of the integer types from C89/C90 to
C99, aside from the addition of long long. (And an implementation with
8-bit int would be non-conforming under any edition of the standard,
though it might be useful.)
Perhaps some C89/C90 implementations are better for 8-bit systems than
some C99 implementations?

Yes, this is what I was saying.

I'm curious about the details. What C89/C90 implementation are
you using, and what features make it more suitable for 8-bit
systems? (Any useful extensions could be applied to a C99 or
later implementation. It sounds like the implementer just hasn't
done that.)

Generally this only applies to use cases where specific instructions
generated by the compiler are different between c90 and c99 where
TOE's matter (timing of execution). For example, there are cases
(sorry I don't have examples because it's been a long time since I've
gone through this) where c99, in order to be more efficient, will
output a different set of instructions, but in certain cases, those
instructions, while more efficient, take longer to process on the CPU
or microcontroller. Whereas C89 and C90 may be more inefficient but
the instructions execute faster. It might only be that C99 adds an
extra 1-3 clock cycles, and in most cases this isn't a problem or
noticeable. But when you are dealing with devices that are responsible
for keeping a human alive (such as a pace maker) the extra cycles can
add up over time and will cause trouble down the road. So this was the
purpose behind my point of reference earlier was just to say, that
there are niche cases where the statement that was made, wouldn't be
accurate.

Are you saying that, for example, "gcc -std=c90" and "gcc -std=c99"
are generating different instruction sequences for the same code,
with the same version of gcc in both cases?

Hmm. I can't think of anything in the changes from C90 to C99 that
would necessarily cause that kind of thing. Unless I'm missing
something, it's not C99 that results in the "more efficient"
instructions, it's the behavior of the compiler in C99 mode.
It could as easily have been the other way around.
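
One way to confirm which language mode a translation unit was really compiled in is the standard __STDC_VERSION__ macro. This is only a sketch with a made-up function name; the macro is not defined at all in plain C90 mode, is 199409L with the 1995 amendment, and is 199901L for C99:

```c
/* Returns the value of __STDC_VERSION__ for this translation unit, or
   0L when the macro is not defined (as in plain C90 mode). */
long c_standard_version(void)
{
#if defined(__STDC_VERSION__)
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}
```

Building the same source twice, once with "gcc -std=c90 -S" and once with "gcc -std=c99 -S", and comparing the two assembly listings is a direct way to see whether the mode changes the generated instructions for a given compiler version.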

Post by Phillip
For pace makers the GNU GCC implementation was used and for the smart
prosthetic the CLANG implementation was used. GCC was using C90 and
CLANG was using C89 (ANSI).

Note that C89 and C90 are exactly the same language. The 1990 ISO C
standard is identical to the C89 standard, except for some introductory
sections introduced by ISO. (I've heard vague rumors of some other
differences, but as far as I know there's nothing significant.)

Post by Phillip
Although above I couldn't provide a specific example (again sorry
about that) I do have the result report from back when I was testing
out pace makers with C99 over C90 (2007) and the process found that
with C99 the processor would be behind by around 500 cycles within 19h
41m 19s from program start. This had a +/- of 12 cycles differential
with repeat testing. That would mean the heart would miss a beat every
17 days.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Phillip

2025-01-09 15:30:35 UTC

Reply

Post by Keith Thompson

Post by Phillip

Post by Keith Thompson

Post by Phillip

Post by Keith Thompson
[...]

Post by Phillip
C89 and C90 are better for 8-bit systems than C99 and newer. Not that
you can't do 8-bit on C99 but it's just not designed as well for it
since C99 assumes you've moved on to at least 16-bit.

There were no changes in the sizes of the integer types from C89/C90 to
C99, aside from the addition of long long. (And an implementation with
8-bit int would be non-conforming under any edition of the standard,
though it might be useful.)
Perhaps some C89/C90 implementations are better for 8-bit systems than
some C99 implementations?

Yes, this is what I was saying.

I'm curious about the details. What C89/C90 implementation are
you using, and what features make it more suitable for 8-bit
systems? (Any useful extensions could be applied to a C99 or
later implementation. It sounds like the implementer just hasn't
done that.)

Generally this only applies to use cases where specific instructions
generated by the compiler are different between c90 and c99 where
TOE's matter (timing of execution). For example, there are cases
(sorry I don't have examples because it's been a long time since I've
gone through this) where c99, in order to be more efficient, will
output a different set of instructions, but in certain cases, those
instructions, while more efficient, take longer to process on the CPU
or microcontroller. Whereas C89 and C90 may be more inefficient but
the instructions execute faster. It might only be that C99 adds an
extra 1-3 clock cycles, and in most cases this isn't a problem or
noticeable. But when you are dealing with devices that are responsible
for keeping a human alive (such as a pace maker) the extra cycles can
add up over time and will cause trouble down the road. So this was the
purpose behind my point of reference earlier was just to say, that
there are niche cases where the statement that was made, wouldn't be
accurate.

Are you saying that, for example, "gcc -std=c90" and "gcc -std=c99"
are generating different instruction sequences for the same code,
with the same version of gcc in both cases?

Yes. And it did surprise me as well. But it got rejected by the medical
association at the time because the cycle counts would get behind their
requirements. When they sent it back I spent a month thinking it was my
code before someone suggested trying C90 (which I did), and it turned
out that it actually did change the sequence just enough to make a
difference. Without any code changes I resubmitted the C90 version (and
asked them to do a second run of the C99 as well) and it turned out that
it passed and the second round of tests on the C99 failed. After getting
back both test binaries and disassembling them, I did find there were
two different sequences between the C99 and C90. It was subtle but
present. Enough to fail the test. Ever since then I've been using C90
for pacemakers.

Post by Keith Thompson
Hmm. I can't think of anything in the changes from C90 to C99 that
would necessarily cause that kind of thing. Unless I'm missing
something, it's not C99 that results in the "more efficient"
instructions, it's the behavior of the compiler in C99 mode.
It could as easily have been the other way around.

The above includes a response to this.

Post by Keith Thompson

Post by Phillip
For pace makers the GNU GCC implementation was used and for the smart
prosthetic the CLANG implementation was used. GCC was using C90 and
CLANG was using C89 (ANSI).

Note that C89 and C90 are exactly the same language. The 1990 ISO C
standard is identical to the C89 standard, except for some introductory
sections introduced by ISO. (I've heard vague rumors of some other
differences, but as far as I know there's nothing significant.)

I'm aware, I only stated it because you asked the question and I was
being thorough.

Post by Keith Thompson

Post by Phillip
Although above I couldn't provide a specific example (again sorry
about that) I do have the result report from back when I was testing
out pace makers with C99 over C90 (2007) and the process found that
with C99 the processor would be behind by around 500 cycles within 19h
41m 19s from program start. This had a +/- of 12 cycles differential
with repeat testing. That would mean the heart would miss a beat every
17 days.

--
Phillip Frabott
----------
- Adam: Is a void really a void if it returns?
- Jack: No, it's just nullspace at that point.
----------

Michael S

2025-01-09 16:12:44 UTC

Reply

On Wed, 08 Jan 2025 21:34:37 -0800

Post by Keith Thompson

Post by Phillip

Post by Keith Thompson

Post by Phillip

Post by Keith Thompson
[...]

Post by Phillip
C89 and C90 are better for 8-bit systems than C99 and newer.
Not that you can't do 8-bit on C99 but it's just not designed
as well for it since C99 assumes you've moved on to at least
16-bit.

There were no changes in the sizes of the integer types from C89/C90 to
C99, aside from the addition of long long. (And an
implementation with 8-bit int would be non-conforming under any
edition of the standard, though it might be useful.)
Perhaps some C89/C90 implementations are better for 8-bit
systems than some C99 implementations?

Yes, this is what I was saying.

I'm curious about the details. What C89/C90 implementation are
you using, and what features make it more suitable for 8-bit
systems? (Any useful extensions could be applied to a C99 or
later implementation. It sounds like the implementer just hasn't
done that.)

Generally this only applies to use cases where specific instructions
generated by the compiler are different between c90 and c99 where
TOE's matter (timing of execution). For example, there are cases
(sorry I don't have examples because it's been a long time since
I've gone through this) where c99, in order to be more efficient,
will output a different set of instructions, but in certain cases,
those instructions, while more efficient, take longer to process on
the CPU or microcontroller. Whereas C89 and C90 may be more
inefficient but the instructions execute faster. It might only be
that C99 adds an extra 1-3 clock cycles, and in most cases this
isn't a problem or noticeable. But when you are dealing with
devices that are responsible for keeping a human alive (such as a
pace maker) the extra cycles can add up over time and will cause
trouble down the road. So this was the purpose behind my point of
reference earlier was just to say, that there are niche cases where
the statement that was made, wouldn't be accurate.

Are you saying that, for example, "gcc -std=c90" and "gcc -std=c99"
are generating different instruction sequences for the same code,
with the same version of gcc in both cases?
Hmm. I can't think of anything in the changes from C90 to C99 that
would necessarily cause that kind of thing. Unless I'm missing
something, it's not C99 that results in the "more efficient"
instructions, it's the behavior of the compiler in C99 mode.
It could as easily have been the other way around.

Post by Phillip
For pace makers the GNU GCC implementation was used and for the
smart prosthetic the CLANG implementation was used. GCC was using
C90 and CLANG was using C89 (ANSI).

Note that C89 and C90 are exactly the same language. The 1990 ISO C
standard is identical to the C89 standard, except for some
introductory sections introduced by ISO. (I've heard vague rumors of
some other differences, but as far as I know there's nothing
significant.)

Post by Phillip
Although above I couldn't provide a specific example (again sorry
about that) I do have the result report from back when I was testing
out pace makers with C99 over C90 (2007) and the process found that
with C99 the processor would be behind by around 500 cycles within
19h 41m 19s from program start. This had a +/- of 12 cycles
differential with repeat testing. That would mean the heart would
miss a beat every 17 days.

The most likely difference is mentioned by David Brown in the post
above.

int div8(int x) { return x/8; }

A C90 compiler can turn the division into an arithmetic right shift,
because C90 leaves the rounding direction for negative operands
implementation-defined. A C99 compiler cannot, since C99 requires
truncation toward zero. If the compiler wants to avoid a division, it
has to generate a more elaborate sequence.

In the case of gcc this does not apply, because gcc follows the C99 rules
regardless of the requested language standard. But it can be the case for
other compilers.
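
A small sketch of the difference, with a hypothetical div8_floor helper added for comparison. It computes what an arithmetic right shift by 3 yields on typical two's-complement targets (rounding toward minus infinity), but is written without shifting a negative value so that the helper itself is portable:

```c
/* The function from the post. C99 requires the quotient to be truncated
   toward zero, so div8(-9) is -1 there; in C90 the result for a
   negative operand is implementation-defined (-1 or -2). */
int div8(int x)
{
    return x / 8;
}

/* Quotient rounded toward minus infinity, i.e. what "x >> 3" gives on
   typical two's-complement targets with an arithmetic shift.
   Only non-negative values are ever divided, so the result does not
   depend on how the implementation rounds negative quotients. */
int div8_floor(int x)
{
    if (x >= 0) {
        return x / 8;
    }
    return -((-(x + 1)) / 8) - 1;   /* x + 1 avoids overflow for INT_MIN */
}
```

For non-negative x and for exact multiples of 8 the two functions agree; they differ only for negative values that are not multiples of 8, for example -9, where truncation gives -1 and the shift-style result is -2. That is the case a C99 compiler has to patch up with extra instructions if it wants to use a shift.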

Phillip

2025-01-09 17:40:46 UTC

Reply

Post by Michael S
On Wed, 08 Jan 2025 21:34:37 -0800

Post by Keith Thompson

Post by Phillip

Post by Keith Thompson

Post by Phillip

Post by Keith Thompson
[...]

Post by Phillip
C89 and C90 are better for 8-bit systems than C99 and newer.
Not that you can't do 8-bit on C99 but it's just not designed
as well for it since C99 assumes you've moved on to at least
16-bit.

There were no changes in the sizes of the integer types from C89/C90 to
C99, aside from the addition of long long. (And an
implementation with 8-bit int would be non-conforming under any
edition of the standard, though it might be useful.)
Perhaps some C89/C90 implementations are better for 8-bit
systems than some C99 implementations?

Yes, this is what I was saying.

I'm curious about the details. What C89/C90 implementation are
you using, and what features make it more suitable for 8-bit
systems? (Any useful extensions could be applied to a C99 or
later implementation. It sounds like the implementer just hasn't
done that.)

Generally this only applies to use cases where specific instructions
generated by the compiler are different between c90 and c99 where
TOE's matter (timing of execution). For example, there are cases
(sorry I don't have examples because it's been a long time since
I've gone through this) where c99, in order to be more efficient,
will output a different set of instructions, but in certain cases,
those instructions, while more efficient, take longer to process on
the CPU or microcontroller. Whereas C89 and C90 may be more
inefficient but the instructions execute faster. It might only be
that C99 adds an extra 1-3 clock cycles, and in most cases this
isn't a problem or noticeable. But when you are dealing with
devices that are responsible for keeping a human alive (such as a
pace maker) the extra cycles can add up over time and will cause
trouble down the road. So the purpose behind my point of
reference earlier was just to say that there are niche cases where
the statement that was made wouldn't be accurate.

Are you saying that, for example, "gcc -std=c90" and "gcc -std=c99"
are generating different instruction sequences for the same code,
with the same version of gcc in both cases?
Hmm. I can't think of anything in the changes from C90 to C99 that
would necessarily cause that kind of thing. Unless I'm missing
something, it's not C99 that results in the "more efficient"
instructions, it's the behavior of the compiler in C99 mode.
It could as easily have been the other way around.

Post by Phillip
For pace makers the GNU GCC implementation was used and for the
smart prosthetic the CLANG implementation was used. GCC was using
C90 and CLANG was using C89 (ANSI).

Note that C89 and C90 are exactly the same language. The 1990 ISO C
standard is identical to the C89 standard, except for some
introductory sections introduced by ISO. (I've heard vague rumors of
some other differences, but as far as I know there's nothing
significant.)

Post by Phillip
Although above I couldn't provide a specific example (again sorry
about that) I do have the result report from back when I was testing
out pace makers with C99 over C90 (2007) and the process found that
with C99 the processor would be behind by around 500 cycles within
19h 41m 19s from program start. This had a +/- of 12 cycles
differential with repeat testing. That would mean the heart would
miss a beat every 17 days.

The most likely difference is mentioned by David Brown in the post
above.
int div8(int x) { return x/8; }
A C90 compiler can turn this division into an arithmetic right shift. A
C99 compiler cannot. If the compiler wants to avoid a division
instruction, it has to generate a more elaborate sequence.
In the case of gcc this does not apply, because gcc follows the C99 rules
regardless of the requested language standard. But it can be the case for
other compilers.

I'd have to go back and look. There isn't that much division in
pacemakers, but there is some nuance there that likely applies to more
than just division.

--
Phillip Frabott
----------
- Adam: Is a void really a void if it returns?
- Jack: No, it's just nullspace at that point.
----------

David Brown

2025-01-09 13:53:56 UTC

Post by Phillip

Post by Phillip

Post by Keith Thompson
[...]

Post by Phillip
C89 and C90 are better for 8-bit systems than C99 and newer. Not that
you can't do 8-bit on C99 but it's just not designed as well for it
since C99 assumes you've moved on to at least 16-bit.

There were no changes in the sizes of the integer types from C89/C90 to
C99, aside from the addition of long long. (And an implementation with
8-bit int would be non-conforming under any edition of the standard,
though it might be useful.)
Perhaps some C89/C90 implementations are better for 8-bit systems than
some C99 implementations?

Yes, this is what I was saying.

I'm curious about the details. What C89/C90 implementation are
you using, and what features make it more suitable for 8-bit
systems? (Any useful extensions could be applied to a C99 or
later implementation. It sounds like the implementer just hasn't
done that.)

Generally this only applies to use cases where specific instructions
generated by the compiler are different between c90 and c99 where TOE's
matter (timing of execution). For example, there are cases (sorry I
don't have examples because it's been a long time since I've gone
through this) where c99, in order to be more efficient, will output a
different set of instructions, but in certain cases, those instructions,
while more efficient, take longer to process on the CPU or
microcontroller. Whereas C89 and C90 may be more inefficient but the
instructions execute faster. It might only be that C99 adds an extra 1-3
clock cycles, and in most cases this isn't a problem or noticeable. But
when you are dealing with devices that are responsible for keeping a
human alive (such as a pace maker) the extra cycles can add up over time
and will cause trouble down the road. So the purpose behind my
point of reference earlier was just to say that there are niche cases
where the statement that was made wouldn't be accurate.
For pace makers the GNU GCC implementation was used and for the smart
prosthetic the CLANG implementation was used. GCC was using C90 and
CLANG was using C89 (ANSI).
Although above I couldn't provide a specific example (again sorry about
that) I do have the result report from back when I was testing out pace
makers with C99 over C90 (2007) and the process found that with C99 the
processor would be behind by around 500 cycles within 19h 41m 19s from
program start. This had a +/- of 12 cycles differential with repeat
testing. That would mean the heart would miss a beat every 17 days.

I'm sorry, none of that makes /any/ sense at all to me.

Different compilers and different compiler versions may emit different
instructions with different timings - that is independent of the C
standard version. There is almost no code you can write that is valid
C89/C90 (C89 so-called "ANSI C" and C90 ISO C are identical apart from
the numbering of the chapters) and also valid C99, but that has
different semantics. There were changes in the types of certain integer
constants, and division in C99 with negative values is defined to be
"truncate towards zero", while in C90 an implementation could
alternatively choose "truncate towards negative infinity". However, it
is highly unlikely that a single compiler that supports both C90 and C99
modes would use different signed integer algorithms in the different modes.

It sounds more likely that you are simply using different compilers -
and then it is not surprising that there are differences in the
generated instructions. It's like comparing a green car with a red bus
and concluding that green things go faster than red things.

I also note that you earlier said you used a 6502 on these devices -
neither gcc nor clang have ever supported the 6502 as a target.

And if your pacemaker is relying on the timing of instructions generated
by a C compiler for the timing of heart beats, then you are doing the
whole thing /completely/ wrong. It doesn't matter what compiler and
what language you are using, that's not how you handle important timing
in embedded systems.
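
As a rough illustration of the usual alternative, the sketch below derives an interval from a hardware timer tick instead of from instruction counts. Everything in it is hypothetical: `TICK_HZ`, `timer_isr` and `interval_elapsed` are made-up names, and the interrupt wiring is assumed rather than shown.

```c
#define TICK_HZ 1000UL   /* assumed: the timer interrupt fires 1000 times/s */

/* Advanced only by the timer interrupt. On an 8-bit target, reading a
   multi-byte counter would also need interrupts masked; omitted here. */
static volatile unsigned long g_ticks = 0;

/* Assumed to be attached to a periodic hardware timer interrupt. */
void timer_isr(void)
{
    g_ticks++;
}

/* Returns 1 once at least interval_ms has passed since *last, then re-arms. */
int interval_elapsed(unsigned long *last, unsigned long interval_ms)
{
    unsigned long now = g_ticks;
    unsigned long need = (interval_ms * TICK_HZ) / 1000UL;

    if (now - *last >= need) {   /* unsigned subtraction survives wrap-around */
        *last += need;           /* step by the nominal period, so no drift */
        return 1;
    }
    return 0;
}
```

With this structure, the accuracy of the interval depends on the timer's clock source, and a few extra instructions in the main loop change nothing as long as the loop polls often enough.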

Then you talk about missing a heart beat every 17 days. I presume you
realise how absurd that is? Just considering the timing alone, that's an
accuracy of about 0.7 ppm (parts per million) - about a thousand times
more accurate than common oscillators used with small microcontrollers.
It takes a fairly sophisticated and expensive timing circuit to reach
those levels (though in a pacemaker you have the benefit of a stable
environment temperature). But this is not just measuring time for the
sake of it - you are talking about heart beats. If your system is
running 0.7 ppm slower than before, the heart will not skip beats - it
will beat at a rate 0.7 ppm slower, and that will make not the slightest
difference. I am not a doctor (though I have worked on heart rate
measurement devices), but I would not expect any medically significant
differences before you were at least a few percent out on the timing.
In a normal healthy heart, beat-to-beat variation can be 10% while still
considered "normal". In fact, if it is /not/ varying by a few percent,
it's an indication of serious health problems.

But I don't expect anyone will want to use AVL trees in a pacemaker
anyway :-)

Michael S

2025-01-09 15:27:31 UTC

On Thu, 9 Jan 2025 14:53:56 +0100

Post by David Brown

Post by Phillip

Post by Phillip

Post by Keith Thompson
[...]

Post by Phillip
C89 and C90 are better for 8-bit systems than C99 and newer.
Not that you can't do 8-bit on C99 but it's just not designed
as well for it since C99 assumes you've moved on to at least
16-bit.

There were no changes in the sizes of the integer types from C89/C90 to
C99, aside from the addition of long long. (And an
implementation with 8-bit int would be non-conforming under any
edition of the standard, though it might be useful.)
Perhaps some C89/C90 implementations are better for 8-bit
systems than some C99 implementations?

Yes, this is what I was saying.

I'm curious about the details. What C89/C90 implementation are
you using, and what features make it more suitable for 8-bit
systems? (Any useful extensions could be applied to a C99 or
later implementation. It sounds like the implementer just hasn't
done that.)

Generally this only applies to use cases where specific
instructions generated by the compiler are different between c90
and c99 where TOE's matter (timing of execution). For example,
there are cases (sorry I don't have examples because it's been a
long time since I've gone through this) where c99, in order to be
more efficient, will output a different set of instructions, but in
certain cases, those instructions, while more efficient, take
longer to process on the CPU or microcontroller. Whereas C89 and
C90 may be more inefficient but the instructions execute faster. It
might only be that C99 adds an extra 1-3 clock cycles, and in most
cases this isn't a problem or noticeable. But when you are dealing
with devices that are responsible for keeping a human alive (such
as a pace maker) the extra cycles can add up over time and will
cause trouble down the road. So the purpose behind my
point of reference earlier was just to say that there are niche
cases where the statement that was made wouldn't be accurate.
For pace makers the GNU GCC implementation was used and for the
smart prosthetic the CLANG implementation was used. GCC was using
C90 and CLANG was using C89 (ANSI).
Although above I couldn't provide a specific example (again sorry
about that) I do have the result report from back when I was
testing out pace makers with C99 over C90 (2007) and the process
found that with C99 the processor would be behind by around 500
cycles within 19h 41m 19s from program start. This had a +/- of 12
cycles differential with repeat testing. That would mean the heart
would miss a beat every 17 days.

I'm sorry, none of that makes /any/ sense at all to me.
Different compilers and different compiler versions may emit
different instructions with different timings - that is independent
of the C standard version. There is almost no code you can write
that is valid C89/C90 (C89 so-called "ANSI C" and C90 ISO C are
identical apart from the numbering of the chapters) and also valid
C99, but that has different semantics. There were changes in the
types of certain integer constants, and division in C99 with negative
values is defined to be "truncate towards zero", while in C90 an
implementation could alternatively choose "truncate towards negative
infinity". However, it is highly unlikely that a single compiler
that supports both C90 and C99 modes would use different signed
integer algorithms in the different modes.
It sounds more likely that you are simply using different compilers -
and then it is not surprising that there are differences in the
generated instructions. It's like comparing a green car with a red
bus and concluding that green things go faster than red things.
I also note that you earlier said you used a 6502 on these devices -
neither gcc nor clang have ever supported the 6502 as a target.
And if your pacemaker is relying on the timing of instructions
generated by a C compiler for the timing of heart beats, then you are
doing the whole thing /completely/ wrong. It doesn't matter what
compiler and what language you are using, that's not how you handle
important timing in embedded systems.
Then you talk about missing a heart beat every 17 days. I presume
you realise how absurd that is? Just considering the timing alone,
that's an accuracy of about 0.7 ppm (parts per million) - about a
thousand times more accurate than common oscillators used with small
microcontrollers.

Off topic nitpick:
0.7 ppm * 1000 = 700ppm.
Last time I looked, 50-70 ppm crystal oscillators were cheaper than less
precise parts.
Alternatively for cheap crystal-less clock source the expected
precision is closer to 7000 ppm than 700.

Post by David Brown
It takes a fairly sophisticated and expensive
timing circuit to reach those levels (though in a pacemaker you have
the benefit of a stable environment temperature). But this is not
just measuring time for the sake of it - you are talking about heart
beats. If your system is running 0.7 ppm slower than before, the
heart will not skip beats - it will beat at a rate 0.7 ppm slower,
and that will make not the slightest difference. I am not a doctor
(though I have worked on heart rate measurement devices), but I would
not expect any medically significant differences before you were at
least a few percent out on the timing. In a normal healthy heart,
beat-to-beat variation can be 10% while still considered "normal".
In fact, if it is /not/ varying by a few percent, it's an indication
of serious health problems.

I tried to use the provided numbers to estimate the clock rate of Phillip's
processor and got about 10 kHz. I am aware of ultra-low-power CPUs that run at
32 kHz, but I have never encountered anything slower than that.
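
The working behind that estimate is not shown in the thread. One way to reproduce both the 0.7 ppm figure and the roughly 10 kHz clock rate is sketched below. It assumes a heart rate of 60 beats per minute (one beat per second), which is not stated anywhere in the quoted posts, and the function names are made up.

```c
/* Fraction of beats lost, in ppm, if one beat is missed every `days` days.
   beat_period_s is an assumption: 1.0 means 60 beats per minute. */
double missed_beat_ppm(double beat_period_s, double days)
{
    double beats = days * 86400.0 / beat_period_s;
    return 1.0e6 / beats;
}

/* Clock rate at which falling `cycles_behind` cycles behind over
   `elapsed_s` seconds equals a relative slowdown of `ppm`. */
double implied_clock_hz(double cycles_behind, double elapsed_s, double ppm)
{
    return cycles_behind / (ppm * 1.0e-6 * elapsed_s);
}
```

With `missed_beat_ppm(1.0, 17.0)` the result is about 0.68 ppm. Feeding that into `implied_clock_hz(500.0, 70879.0, ppm)`, where 70879 s is 19h 41m 19s, gives a little over 10 kHz.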

Post by David Brown
But I don't expect anyone will want to use AVL trees in a pacemaker
anyway :-)

I like AVL trees. I'm not 100% sure how exactly I could apply them in
pacemakers, but I'd certainly try my best.

David Brown

2025-01-09 12:15:01 UTC

Post by Phillip

Post by David Brown
For people using 8051, COP8, 68HC05, PIC16 or other long outdated
brain-dead microcontrollers, you don't get standard C support at all.
You program these in a device-specific variant of C full of extensions
and extra restrictions - and the support is as close to the subset of
C99 that I described as it is to standard C90.

Just a point of reference, there are still several "brain-dead" systems
in modern use today that aren't old, some being invented as late as
2019.

There are very few new versions of 8-bit CISC microcontrollers being
developed - the rate has been dropping steadily since the Cortex M
family rose to dominance. The decay is probably approximately
exponential, in terms of development of new devices, production and sale
of existing devices, and development of new products using these
devices. They are not /all/ gone - but you need /very/ good reason for
considering them for anything new, whether it is software or hardware.

The only 8-bit core that can be viewed as "alive and well", is the AVR.
It is a RISC design and a good deal more C and general software friendly
than these other devices. The most common toolchain for the AVR is gcc,
and you can use normal standard C with it. You do have to consider some
of its idiosyncrasies if you want efficient results, but you /can/ write
normal C code for it.

Post by Phillip
That being said, your comment isn't completely accurate in that,
there are some modern uses of things like the 6502 that can use
standards-based C. In fact, you can use ANSI C89 and C90 with the 6502.

The 6502 is one of the better 8-bit cpu cores - you can get reasonable
code with mostly standard C (C99 if you like) using a compiler like
SDCC. You don't /need/ as many extra target-specific keywords and
extensions to deal with multiple memory banks, independent address
spaces for ram and constant data, long and short pointers, non-recursive
functions, and so on, as you need for useable results on an 8051 or PIC.
But you probably still use some of these to get efficient results,
such as choosing which variables go in the fast zero page area.

Also note that there is a huge difference between being able to write
significant parts of the code for a microcontroller using only standard
C, and being able to use a significant fraction of the standard C
language and library (at least the "freestanding" part) in your code.
For example, I've used a compiler for the PIC16 that supported structs
and arrays, but not arrays of structs or structs containing arrays. If
the code does not use these features, you can write it in standard C -
but plenty of perfectly ordinary standard C code would not work with
that compiler.

There is also a huge difference between being /able/ to write the code
in only standard C, and pure standard C being an appropriate way to
write the code for the microcontroller. Using the target-specific
features of these kinds of devices and their tools makes a massive
difference to code efficiency. Basically, if you are trying to stick to
pure standard C for an 8051 or 68HC05, you are doing it wrong.

Post by Phillip
I've done this for several modern pace makers as well as a smart
prosthetic. So your statement is correct in 90% of cases but not all cases.

I believe that goes under the category of "niche" :-) For some types of
application, you stick to what you have tested - no one wants to have
the first pacemaker with a new microcontroller!

Of course in this field there are always exceptions - no generalisation
is going to be correct in absolutely all cases. But I'd guess my
statement was accurate in 99.9% or more cases.

Post by Phillip
(Also most car manufacturers use the 6502 and other variants for their
digital input analog gauges and warning light controls on their dashboards.)

Have you a reference for that claim? I am very confident that it is not
the case. The 6502 was primarily developed as a microprocessor core,
not a microcontroller core - it was found in early microcomputers and
games consoles. It was also used in early embedded systems, but once
microcontrollers became common, they dominated quickly in numbers.
(Microprocessors were used for high-end embedded systems, like embedded
PC's with x86 cpus and network equipment with m68k processors.) I don't
know that the 6502 was ever common in the automotive industry - but I
can't believe it was ever used by "most" car manufacturers.

Post by Phillip
C89 and C90 are better for 8-bit systems than C99 and newer. Not that
you can't do 8-bit on C99 but it's just not designed as well for it
since C99 assumes you've moved on to at least 16-bit.

My point is that almost no general-purpose software written now will
ever be used on 8-bit devices, especially not 8-bit CISC cores. For
most software written on these cores, standard C is not a practical
option anyway. And the cores are used in very conservative situations,
such as when there is a lot of legacy code or when many years or decades
of field testing is important - new external software will not be used
by such developers. So it makes no sense to me to restrict the code to
an old standard because of the /tiny/ chance that there is someone who
might want to use it on such devices.

I also disagree completely that C90 is somehow a better fit for 8-bit
devices than C99. It is not. C has /always/ expected at least 16-bit -
it's not something new in C99. Apart from VLA's, there is nothing in
C99 that would result in code generation that is a poorer fit for these
devices than C90. There are some sort-of-C compilers for 8-bit
microcontrollers that have 8-bit ints or do not follow the C rules for
integer promotions - these are, of course, no more standard C90
compilers than they are standard C99 compilers.
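
A small sketch of what the integer promotion rules mean in practice. The function names are hypothetical, and the second function only models, with a cast, what a dialect with 8-bit `int` would compute; it assumes `CHAR_BIT` is 8.

```c
#include <limits.h>

/* Every conforming implementation, C90 or C99, must satisfy this. */
#if INT_MAX < 32767
#error "int is narrower than 16 bits: not a conforming C implementation"
#endif

/* Standard C: both operands are promoted to int before the addition. */
int sum_promoted(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Model of a sort-of-C dialect that adds in 8 bits (emulated by a cast). */
int sum_8bit_dialect(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}
```

For the inputs 200 and 100 the first function returns 300 under any edition of the standard, while the 8-bit model wraps around to 44.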

Post by Phillip
But this is all based on the OP's specific use case for their
application. I just wanted to chime in since I do primarily work on
modern embedded systems that don't use "modern" microcontrollers and
CPU's since they are still used in a wide range of modern devices that
people don't even realize.

Keith Thompson

2025-01-08 22:12:43 UTC

David Brown <***@hesbynett.no> writes:
[...]

Post by David Brown
There are basically two classes of small embedded devices - those that
are usually programmed with gcc (and sometimes clang, and occasionally
vastly expensive commercial tools), and those that are programmed
using non-standard, limited and often expensive sort-of-C compilers.
For people using gcc, clang, Green Hills, Code Warrior, or other
quality tools on a 16-bit or 32-bit microcontroller, C99 is not a
problem. C23 is not a problem for the most popular toolchain for the
most popular microcontroller core.

[...]

It's also worth mentioning that the standard specifies two kinds of
implementations, "hosted" and "freestanding".

A hosted implementation must provide the entire standard library (except
for parts that are explicitly optional). The program entry point is
"main". Larger embedded systems (for example, Linux-based systems)
often have "hosted" implementations.

In a freestanding implementation, most of the standard library need not
be provided. Library facilities are implementation-defined, as is the
program entry point. Freestanding implementations generally target
systems with no operating system. There might not be a malloc()
function.
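
As a hedged illustration, code can check which kind of implementation it is being compiled for with the `__STDC_HOSTED__` macro. That macro was introduced in C99, so a strict C90 compiler need not define it; the sketch below treats "undefined" as hosted. The allocator name `node_alloc`, the `align_unit` type and the arena size are made up for the example.

```c
#include <stddef.h>

#if defined(__STDC_HOSTED__) && (__STDC_HOSTED__ == 0)

/* Freestanding: assume there is no malloc() and carve memory out of a
   fixed static arena. The union is a guess at worst-case alignment. */
typedef union { long l; double d; void *p; } align_unit;

#define ARENA_UNITS 64u
static align_unit arena[ARENA_UNITS];
static size_t arena_used = 0;        /* counted in align_unit elements */

void *node_alloc(size_t n)
{
    size_t units = (n + sizeof(align_unit) - 1) / sizeof(align_unit);
    void *p;

    if (units > ARENA_UNITS - arena_used)
        return NULL;                 /* arena exhausted */
    p = &arena[arena_used];
    arena_used += units;
    return p;
}

#else

/* Hosted (or unknown): the full standard library is available. */
#include <stdlib.h>

void *node_alloc(size_t n)
{
    return malloc(n);
}

#endif
```

The freestanding branch never frees anything, which is often acceptable on small systems that allocate once at start-up.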

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Chris M. Thomasson

2025-01-08 23:00:07 UTC

Post by Keith Thompson
[...]

Post by David Brown
There are basically two classes of small embedded devices - those that
are usually programmed with gcc (and sometimes clang, and occasionally
vastly expensive commercial tools), and those that are programmed
using non-standard, limited and often expensive sort-of-C compilers.
For people using gcc, clang, Green Hills, Code Warrior, or other
quality tools on a 16-bit or 32-bit microcontroller, C99 is not a
problem. C23 is not a problem for the most popular toolchain for the
most popular microcontroller core.

[...]
It's also worth mentioning that the standard specifies two kinds of
implementations, "hosted" and "freestanding".
A hosted implementation must provide the entire standard library (except
for parts that are explicitly optional). The program entry point is
"main". Larger embedded systems (for example, Linux-based systems)
often have "hosted" implementations.
In a freestanding implementation, most of the standard library need not
be provided. Library facilities are implementation-defined, as is the
program entry point. Freestanding implementations generally target
systems with no operating system. There might not be a malloc()
function.

A long time ago (several decades) I was working with a system that had
no malloc. Iirc, it was a version of Quadros.

David Brown

2025-01-09 13:58:25 UTC

Post by Keith Thompson
[...]

Post by David Brown
There are basically two classes of small embedded devices - those that
are usually programmed with gcc (and sometimes clang, and occasionally
vastly expensive commercial tools), and those that are programmed
using non-standard, limited and often expensive sort-of-C compilers.
For people using gcc, clang, Green Hills, Code Warrior, or other
quality tools on a 16-bit or 32-bit microcontroller, C99 is not a
problem. C23 is not a problem for the most popular toolchain for the
most popular microcontroller core.

[...]
It's also worth mentioning that the standard specifies two kinds of
implementations, "hosted" and "freestanding".

Yes - I mentioned that in one of my posts. It is quite common for
compilers for embedded systems to be missing a few standard library
features (such as support for locales or wide characters), to have
limited features (such as printf / scanf implementations without
floating point, to reduce the code size), and to have somewhat odd
system-specific implementations of things like file and stream IO functions.

However, all embedded toolchains I have ever used implement a
substantial part of the standard library - pretty much everything that
can reasonably be implemented and used on the embedded targets.

Post by Keith Thompson
A hosted implementation must provide the entire standard library (except
for parts that are explicitly optional). The program entry point is
"main". Larger embedded systems (for example, Linux-based systems)
often have "hosted" implementations.
In a freestanding implementation, most of the standard library need not
be provided. Library facilities are implementation-defined, as is the
program entry point. Freestanding implementations generally target
systems with no operating system. There might not be a malloc()
function.

Julio Di Egidio

2025-01-09 08:07:18 UTC

<snip>

So you can be confident that almost anyone
using your software in embedded systems will be using a 32-bit core -
most likely an ARM Cortex-M, but possibly RISC-V. And they will
probably be using a toolchain that supports at least C17 (some people
are still on older toolchains), whether it is gcc, clang, or commercial.
Certainly solid C99 support is guaranteed. Everything else is niche,
and no one will be using your software on niche systems.

Even my fridge should be able to run it... I am writing a Prolog
compiler, but more generally I'd be mostly writing algorithms-data
structures things.

That said, one thing nobody has been explaining is why C99 is superior
to C89/C90, except for some coding conveniences as far as I have read
online: and I must say here that I do prefer the good old ways,
including style-wise, in most cases...

But, more concretely, what I do not understand of your reply is, OK the
plethora of architectures and compilers and languages, but I cannot even
begin to cope with that, can I, and why should I when I can e.g. just
have a "config" header (up to even a pre-preprocessor or whatever pre or
post-transformations are needed) where I re-define "malloc" or even
"int" as I like for a specific target?

The underlying idea being I won't care at all, I just pick a reasonable
and reasonably standard variant of the C language as base, and I just
distribute source code, the user must compile it.
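
A minimal sketch of such a "config" header, under some assumptions. All names (`MYLIB_MALLOC`, `MYLIB_FREE`, `MYLIB_INT`, `mylib_int`, `mylib_new_cell`) are hypothetical. The sketch routes through library-specific names instead of redefining `malloc` or `int` themselves, because defining macros with those names before including standard headers is not something the standard permits.

```c
/* mylib_config.h (hypothetical): defaults that a port may override,
   for example with -DMYLIB_INT=short on the compiler command line. */
#include <stdlib.h>

#ifndef MYLIB_MALLOC
#define MYLIB_MALLOC(n) malloc(n)
#endif

#ifndef MYLIB_FREE
#define MYLIB_FREE(p) free(p)
#endif

#ifndef MYLIB_INT
#define MYLIB_INT long
#endif

typedef MYLIB_INT mylib_int;

/* Library code uses only the configurable names. */
mylib_int *mylib_new_cell(mylib_int value)
{
    mylib_int *cell = (mylib_int *)MYLIB_MALLOC(sizeof *cell);

    if (cell != NULL)
        *cell = value;
    return cell;
}
```

A port to a target without `malloc` would then supply its own allocator and define the two macros before this header is seen, leaving the rest of the source untouched.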

-Julio

Julio Di Egidio

2025-01-09 09:07:11 UTC

Post by Julio Di Egidio
<snip>

So you can be confident that almost anyone using your software in
embedded systems will be using a 32-bit core - most likely an ARM
Cortex-M, but possibly RISC-V. And they will probably be using a
toolchain that supports at least C17 (some people are still on older
toolchains), whether it is gcc, clang, or commercial. Certainly
solid C99 support is guaranteed. Everything else is niche, and no one
will be using your software on niche systems.

Even my fridge should be able to run it... I am writing a Prolog
compiler, but more generally I'd be mostly writing algorithms-data
structures things.
That said, one thing nobody has been explaining is why C99 is superior
to C89/C90, except for some coding conveniences as far as I have read
online: and I must say here that I do prefer the good old ways,
including style-wise, in most cases...
But, more concretely, what I do not understand of your reply is, OK the
plethora of architectures and compilers and languages, but I cannot even
begin to cope with that, can I, and why should I when I can e.g. just
have a "config" header (up to even a pre-preprocessor or whatever pre or
post-transformations are needed) where I re-define "malloc" or even
"int" as I like for a specific target?
The underlying idea being I won't care at all, I just pick a reasonable
and reasonably standard variant of the C language as base, and I just
distribute source code, the user must compile it.

P.S. Of course, I do not expect my code to be super-optimized either, I
only strive for the best I can get by "generic programming", again the
idea being that at least micro-optimization (very local, very low-level)
remains possible by ad-hoc program transformations: or e.g. I'd code
against a "generic" math library that is a thin wrapper by default
around a standard library, but can then be "relinked/redirected" to
anything at compile time... Of course, assuming the code is written to
be conducive to such "parametricity", I do not expect it to come for free.

Does it make sense? Does it work?

-Julio

David Brown

2025-01-09 14:11:28 UTC

Post by Julio Di Egidio
<snip>

So you can be confident that almost anyone using your software in
embedded systems will be using a 32-bit core - most likely an ARM
Cortex-M, but possibly RISC-V. And they will probably be using a
toolchain that supports at least C17 (some people are still on older
toolchains), whether it is gcc, clang, or commercial. Certainly
solid C99 support is guaranteed. Everything else is niche, and no one
will be using your software on niche systems.

Even my fridge should be able to run it... I am writing a Prolog
compiler, but more generally I'd be mostly writing algorithms-data
structures things.
That said, one thing nobody has been explaining is why C99 is superior
to C89/C90, except for some coding conveniences as far as I have read
online: and I must say here that I do prefer the good old ways,
including style-wise, in most cases...

The C99 features that I consider to make code easier to write, clearer,
safer, more portable, more efficient, and generally better are:

long long int (via <stdint.h> types)
compound literals
designated initialisers
// comments
<stdint.h> types
mixing declaration and code
declaring index variables inside a "for" statement
variadic macros
inline functions
boolean type and <stdbool.h>

(The ordering here is from the changelog in the C standards.)

There are also a few cleanups, like removal of implicit int and implicit
function declaration, but those can be enforced in C90 mode too by
static analysis tools.
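
For readers who have not used them, the features in the list above can be seen together in one short, made-up example (the `point` type, `LOG` macro and `demo` function are purely illustrative). It needs a C99 or later compiler and will not build with `-ansi -pedantic`.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Variadic macro. */
#define LOG(fmt, ...) fprintf(stderr, fmt "\n", __VA_ARGS__)

typedef struct { int32_t x, y; bool visible; } point;   /* <stdint.h>, bool */

/* Inline function. */
static inline int32_t manhattan(point p)
{
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}

int32_t demo(void)
{
    point origin = { .visible = true };                  // designated initialiser
    int32_t total = manhattan((point){ .x = 3, .y = -4 });   // compound literal

    for (int i = 0; i < 3; i++) {                        // index declared in "for"
        total += i;
    }

    int64_t wide = (int64_t)total * 1000000000LL;        // declaration after code
    LOG("total=%d visible=%d", (int)total, (int)origin.visible);
    return wide > INT32_MAX ? total : -1;
}
```

Written for C90, the same function would need all declarations at the top of the block, a named temporary in place of the compound literal, and `long` or a hand-rolled typedef where `int32_t` and `int64_t` appear.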

bart

2025-01-09 15:28:25 UTC

Post by David Brown

Post by Julio Di Egidio
<snip>

So you can be confident that almost anyone using your software in
embedded systems will be using a 32-bit core - most likely an ARM
Cortex-M, but possibly RISC-V. And they will probably be using a
toolchain that supports at least C17 (some people are still on older
toolchains), whether it is gcc, clang, or commercial. Certainly
solid C99 support is guaranteed. Everything else is niche, and no
one will be using your software on niche systems.

Even my fridge should be able to run it... I am writing a Prolog
compiler, but more generally I'd be mostly writing algorithms-data
structures things.
That said, one thing nobody has been explaining is why C99 is superior
to C89/C90, except for some coding conveniences as far as I have read
online: and I must say here that I do prefer the good old ways,
including style-wise, in most cases...

The C99 features that I consider to make code easier to write, clearer,
    compound literals
    designated initialisers
    mixing declaration and code
    variadic macros

Funny, I usually find code using such features less clear!

They would also make programs a little less portable, since they now
rely on an implementation that includes support.

David Brown

2025-01-09 20:39:30 UTC

Post by David Brown

Post by Julio Di Egidio
<snip>

So you can be confident that almost anyone using your software in
embedded systems will be using a 32-bit core - most likely an ARM
Cortex-M, but possibly RISC-V. And they will probably be using a
toolchain that supports at least C17 (some people are still on older
toolchains), whether it is gcc, clang, or commercial. Certainly
solid C99 support is guaranteed. Everything else is niche, and no
one will be using your software on niche systems.

Even my fridge should be able to run it... I am writing a Prolog
compiler, but more generally I'd be mostly writing algorithms-data
structures things.
That said, one thing nobody has been explaining is why C99 is
superior to C89/C90, except for some coding conveniences as far as I
have read online: and I must say here that I do prefer the good old
ways, including style-wise, in most cases...

The C99 features that I consider to make code easier to write,
     compound literals
     designated initialisers
     mixing declaration and code
     variadic macros

Funny, I usually find code using such features less clear!

Opinions on clarity vary - I was giving /my/ opinion.

Post by bart
They would also make programs a little less portable, since they now
rely on an implementation that includes support.

I explained elsewhere how C99 is at least as portable as C90 in real
life - no compiler of relevance can be used with standard C90 and not
with standard C99 (perhaps excluding a few features not on my list -
VLAs, complex numbers and wide character support). And even if you use
nothing else from C99, use of <stdint.h> types improves portability
significantly.

Julio Di Egidio

2025-01-10 14:30:39 UTC

Reply

Post by David Brown

Post by David Brown

Post by Julio Di Egidio
<snip>

So you can be confident that almost anyone using your software in
embedded systems will be using a 32-bit core - most likely an ARM
Cortex-M, but possibly RISC-V. And they will probably be using a
toolchain that supports at least C17 (some people are still on
older toolchains), whether it is gcc, clang, or commercial.
Certainly solid C99 support is guaranteed. Everything else is
niche, and no one will be using your software on niche systems.

Even my fridge should be able to run it... I am writing a Prolog
compiler, but more generally I'd be mostly writing algorithms-data
structures things.
That said, one thing nobody has been explaining is why C99 is
superior to C89/C90, except for some coding conveniences as far as I
have read online: and I must say here that I do prefer the good old
ways, including style-wise, in most cases...

The C99 features that I consider to make code easier to write,
     compound literals
     designated initialisers
     mixing declaration and code
     variadic macros

Funny, I usually find code using such features less clear!

Opinions on clarity vary - I was giving /my/ opinion.

Post by bart
They would also make programs a little less portable, since they now
rely on an implementation that includes support.

I explained elsewhere how C99 is at least as portable as C90 in real
life - no compiler of relevance can be used with standard C90 and not
with standard C99 (perhaps excluding a few features not on my list -
VLAs, complex numbers and wide character support). And even if you use
nothing else from C99, use of <stdint.h> types improves portability
significantly.

OK, I start getting the picture: indeed, let it be C99... Thanks very
much for that, to you in particular but also to those who have chimed
in, greatly appreciated.

-Julio

James Kuyper

2025-01-08 19:10:27 UTC

Reply

...

Post by Julio Di Egidio

Post by David Brown
People who say they want to write strictly
standards-conforming code, especially C90, so that it will run
everywhere, misunderstand the relationship between the C standards and
real-world tools.

So, now that I have qualified it with "any device coming with a C
compiler (that is not too broken)", would you think coding it in "ANSI
C" makes some sense?

I would agree with what you wrote, but probably not with what you meant.
The first C standard, C89, was approved by ANSI. Later on, almost
exactly the same standard was approved as C90 by ISO. They had to add
three sections at the beginning to meet ISO requirements on how
standards are organized. The result is that every section number from
C89 corresponds to a section number 3 higher in C90.
Since that time, every new version of the C standard has first been
adopted by ISO, and then approved without changes by ANSI. Both
organizations have a policy that the new version of a standard replaces
the old one, which is no longer in effect. Therefore, ANSI C should,
properly, refer to the current latest version of C that has been adopted
by ANSI, which is C2023. I suspect that you were using "ANSI C" to refer
to C89.

Keith Thompson

2025-01-08 20:25:18 UTC

Reply

Post by James Kuyper
...

Post by Julio Di Egidio

Post by David Brown
People who say they want to write strictly
standards-conforming code, especially C90, so that it will run
everywhere, misunderstand the relationship between the C standards and
real-world tools.

So, now that I have qualified it with "any device coming with a C
compiler (that is not too broken)", would you think coding it in "ANSI
C" makes some sense?

I would agree with what you wrote, but probably not with what you meant.
The first C standard, C89, was approved by ANSI. Later on, almost
exactly the same standard was approved as C90 by ISO. They had to add
three sections at the beginning to meet ISO requirements on how
standards are organized. The result is that every section number from
C89 corresponds to a section number 3 higher in C90.
Since that time, every new version of the C standard has first been
adopted by ISO, and then approved without changes by ANSI. Both
organizations have a policy that the new version of a standard replaces
the old one, which is no longer in effect. Therefore, ANSI C should,
properly, refer to the current latest version of C that has been adopted
by ANSI, which is C2023. I suspect that you were using "ANSI C" to refer
to C89.

Agreed -- but the term "ANSI C", though it logically should refer to C23
(I presume ANSI has adopted it by now), is almost universally understood
to refer to C89/C90. See, for example, gcc's "-ansi" option. <OT>For
g++, "-ansi" means "-std=c++98" or "-std=c++03".</OT>

I find it's best to avoid any ambiguity and avoid the term "ANSI C", and
instead refer to ISO C90, C99, C11, etc.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Kaz Kylheku

2025-01-08 16:24:45 UTC

Reply

Post by David Brown

Post by Julio Di Egidio
Hi everybody,
indeed I have forgotten so many things, including
how much I love this language. :)
In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).

What devices do not have at least C99 compilers available - and yet /do/
have standard C90 compilers available? What sort of code are you
writing that should ideally run on an AVR Tiny with 2K of flash and 64
bytes of ram, a DSP with 24-bit chars, and a TOP100 supercomputer? Have
you thought about this in more detail?

I suspect that C90 is mainly related to retrocomputing now. Programs
written in C90 will compile in installations of old operating systems.
These may be actual old installations on the original hardware, or
historic installations re-created by retrocomputing enthusiasts on the
original hardware or simulated hardware.

There is probably a bit of legacy code out there requiring C90 support
in a compiler, but maintenance on that code doesn't have to continue
in C90.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca

Keith Thompson

2025-01-08 20:08:21 UTC

Reply

David Brown <***@hesbynett.no> writes:
[...]

Post by David Brown
int v = 123; // Non-const object definition
const int * cp = &v; // Const pointer to non-const data

cp isn't a const pointer, i.e., it's not a pointer object whose value
cannot be changed. It's a pointer to const, specifically a pointer to
const int. You could (and arguably should) make the pointer itself
const by defining:

const int *const cp = &v;

v, as you point out, is a non-const object. *cp provides access to that
object in a way that forbids changing the target via that pointer.

Post by David Brown
int * p = (int *) cp; // Cast to non-const pointer
*p = 456; // Change the target data

You could also write:

*(int*)cp = 456;

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Andrey Tarasevich

2025-01-08 16:48:53 UTC

Reply

Post by Julio Di Egidio
To the question, I was reading this, but I am not
Matt Stancliff, "So You Think You Can Const?",
<https://matt.sh/sytycc>
<< Your compiler, at its discretion, may also choose
   to place any const declarations in read-only storage,
   so if you attempt to hack around the const blocks,
   you could get undefined behavior. >>

Strictly speaking, the passage is misleading. It does not matter whether
the compiler decided to place const data into read-only storage. If you
"hack around" data constness (i.e. if you attempt to modify const data),
you _always_ get undefined behavior, regardless of where the data is
actually stored.

Post by Julio Di Egidio
I do not understand if just declaring that a pointer
is to constant data may incur in that problem

It can't. The original quoted passage is obviously meant to be about
constant data, i.e. about _top-level_ constness.

Meanwhile, a pointer declared as "pointer to constant data" is simply a
constant _access path_ to some data. Constness of an access path does
not imply constness of the actual data that path leads to. Forcefully
removing constness from the access path and subsequently modifying the
pointed data is perfectly legal (if inelegant) and causes no undefined
behavior. Of course, this is only valid when the data itself is not const.

Post by Julio Di Egidio
E.g. consider this little internal helper of mine
(which implements an interface that is public to
do an internal thing...), where I am casting to
pointer to non-constant data in order to free the
```c
static int MyStruct_free_(MyStruct_t const *pT) {
    assert(pT);
    free((MyStruct_t *)pT);
    return 0;
}
```
Assuming, as said, that the data was originally
allocated with malloc, is that code safe or
something can go wrong even in that case?

It is perfectly safe. One can even argue that the standard declaration of
`free` as `void free(void *)` is defective. It should have been `void
free(const void *)` from the very beginning.

--
Best regards,
Andrey

Tim Rentsch

2025-01-08 20:24:47 UTC

Reply

Post by Julio Di Egidio
Assuming, as said, that the data was originally
allocated with malloc, is [calling free on a pointer
to const something] safe or something can go wrong
even in that case?

Post by Andrey Tarasevich
It is perfectly safe. One can even argue that the standard declaration of
`free` as `void free(void *)` is defective. It should have been `void
free(const void *)` from the very beginning.

I think declaring the parameter as 'void *' rather than 'const void *'
is a better choice. There is a fair chance that calling free() on a
pointer to const anything is a programming error, and it would be good
to catch that. If it isn't an error, then fixing the diagnostic is
trivial. If it's a common pattern one could even define an inline
function

static inline void
cfree( const void *p ){ free( (void*)p ); }

and call that instead of free(). (Obviously the 'inline' should be
taken out if compiling as C90.)

Keith Thompson

2025-01-08 21:01:15 UTC

Reply

Post by Andrey Tarasevich

Post by Julio Di Egidio
To the question, I was reading this, but I am not
Matt Stancliff, "So You Think You Can Const?",
<https://matt.sh/sytycc>
<< Your compiler, at its discretion, may also choose
   to place any const declarations in read-only storage,
   so if you attempt to hack around the const blocks,
   you could get undefined behavior. >>

Strictly speaking, the passage is misleading. It does not matter
whether the compiler decided to place const data into read-only
storage. If you "hack around" data constness (i.e. if you attempt to
modify const data), you _always_ get undefined behavior, regardless of
where the data is actually stored.

And one possible result of undefined behavior (in some sense perhaps
the worst) is that the code behaves just as you expected it to.

The author of the article likely thought of "undefined behavior" as
"the program crashes" or "something goes terribly wrong". In fact
undefined behavior is simply behavior that is not defined; the C
standard says nothing about what happens.

And if the manifestation of that undefined behavior is that the
code quietly does what you thought it would do, it could mean that
you have a latent bug that's difficult to track down, and that will
come back and bite you later.

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Kenny McCormack

2025-01-08 21:32:00 UTC

Reply

In article <***@nosuchdomain.example.com>,
Keith Thompson <Keith.S.Thompson+***@gmail.com> wrote:
...

Post by Keith Thompson
The author of the article likely thought of "undefined behavior" as
"the program crashes" or "something goes terribly wrong". In fact
undefined behavior is simply behavior that is not defined; the C
standard says nothing about what happens.
And if the manifestation of that undefined behavior is that the
code quietly does what you thought it would do, it could mean that
you have a latent bug that's difficult to track down, and that will
come back and bite you later.

Or it could mean that you are covered by some higher, more powerful
standard, such as POSIX. Note that a lot of perfectly good,
POSIX-compliant code is UB if you take the view that the C standard is your
only coverage.

Similarly, lots of perfectly good Linux code is UB if viewed through the
lens of the POSIX standards. My point is that it is perfectly fine to rely
on higher/better standards, provided, of course, that you correctly label
your product (make it clear that you are covered by policies in addition to
and superior to the ordinary C standards).

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/BestCLCPostEver

Julio Di Egidio

2025-01-09 08:12:06 UTC

Reply

<snip>

Post by Andrey Tarasevich

Post by Julio Di Egidio
E.g. consider this little internal helper of mine
(which implements an interface that is public to
do an internal thing...), where I am casting to
pointer to non-constant data in order to free the
```c
static int MyStruct_free_(MyStruct_t const *pT) {
     assert(pT);
     free((MyStruct_t *)pT);
     return 0;
}
```
Assuming, as said, that the data was originally
allocated with malloc, is that code safe or
something can go wrong even in that case?

It is perfectly safe. One can even argue that the standard declaration of
`free` as `void free(void *)` is defective. It should have been `void
free(const void *)` from the very beginning.

I do not understand that: `free` is changing the pointed data, so how
can `const void *` even be "correct"?

-Julio

Keith Thompson

2025-01-09 11:21:23 UTC

Reply

[...]

Post by Julio Di Egidio

Post by Andrey Tarasevich
It is perfectly safe. One can even argue that the standard declaration
of `free` as `void free(void *)` is defective. It should have been
`void free(const void *)` from the very beginning.

I do not understand that: `free` is changing the pointed data, so how
can `const void *` even be "correct"?

No, `free` doesn't (necessarily) change the pointed-to data.
Any attempt to access the allocated data after free() has undefined
behavior, so it might be modified, but all free() needs to do is
make it available for further allocation. It might do so without
touching the data itself.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Julio Di Egidio

2025-01-09 11:26:24 UTC

Reply

Post by Keith Thompson
[...]

Post by Julio Di Egidio

Post by Andrey Tarasevich
It is perfectly safe. One can even argue that the standard declaration
of `free` as `void free(void *)` is defective. It should have been
`void free(const void *)` from the very beginning.

I do not understand that: `free` is changing the pointed data, so how
can `const void *` even be "correct"?

No, `free` doesn't (necessarily) change the pointed-to data.
Any attempt to access the allocated data after free() has undefined
behavior,

I would indeed call that a change!

Anyway I see the point, thanks for explaining.

-Julio

Post by Keith Thompson
so it might be modified, but all free() needs to do is
make it available for further allocation. It might do so without
touching the data itself.

Kaz Kylheku

2025-01-09 19:47:49 UTC

Reply

Post by Keith Thompson
[...]

Post by Julio Di Egidio

Post by Andrey Tarasevich
It is perfectly safe. One can even argue that the standard declaration
of `free` as `void free(void *)` is defective. It should have been
`void free(const void *)` from the very beginning.

I do not understand that: `free` is changing the pointed data, so how
can `const void *` even be "correct"?

No, `free` doesn't (necessarily) change the pointed-to data.
Any attempt to access the allocated data after free() has undefined
behavior, so it might be modified, but all free() needs to do is
make it available for further allocation. It might do so without
touching the data itself.

It doesn't matter because if free were to change the pointed-to
data, that would be only wrong if the effective type were const.

The only way an allocated block acquires an effective type is
when its value is stored; it then inherits the type of lvalue
expression.

An expression of const type isn't a modifiable lvalue that could
be used to store to the object. (If an implementation allows the
modification, in spite of emitting the required diagnostic at
translation time, the behavior is then no longer defined.)

Therefore, an effective type for an allocated block cannot ever be
const-qualified (in a program that has not already run into undefined
behavior prior to the impending call to free).

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca

Tim Rentsch

2025-01-10 01:48:08 UTC

Reply

Post by Kaz Kylheku

Post by Keith Thompson
[...]

It is perfectly safe. One can even argue that the standard declaration
of `free` as `void free(void *)` is defective. It should have been
`void free(const void *)` from the very beginning.

I do not understand that: `free` is changing the pointed data, so
how can `const void *` even be "correct"?

No, `free` doesn't (necessarily) change the pointed-to data.
Any attempt to access the allocated data after free() has undefined
behavior, so it might be modified, but all free() needs to do is
make it available for further allocation. It might do so without
touching the data itself.

It doesn't matter because if free were to change the pointed-to
data, that would be only wrong if the effective type were const.
[...]

Effective type is irrelevant. This particular undefined behavior
occurs only when an object _defined_ with a const-qualified type
is changed. There are no objects defined in a block of memory
allocated by malloc().

Andrey Tarasevich

2025-01-10 06:01:53 UTC

Reply

Post by Julio Di Egidio
I do not understand that: `free` is changing the pointed data, so how
can `const void *` even be "correct"?

`free` is destroying the pointed data.

Every object in the C object model has to be created (when its lifetime
begins) and has to be eventually destroyed (when its lifetime ends).
This applies to all objects, including `const` ones (!). The lifetime of a
`const` object also ends eventually, which means that a `const` object
has to be destroyable. No way around it.

So, destruction is not really a "modifying" operation. Destruction of an
object is a special case: it's a meta-operation, which needs special
treatment. We have to be able to destroy `const` objects as well,
regardless of how they were created/allocated. And in C, destruction of
dynamically created objects is essentially embodied in `free`. So,
`free` is a meta-operation, which has to be able to destroy `const` objects.

This, for example, is mirrored in C++ as well, where `delete` is
immediately applicable to pointers to `const` objects. No reason to make
`free` behave differently in C.

--
Best regards,
Andrey

Keith Thompson

2025-01-10 07:40:52 UTC

Reply

Post by Andrey Tarasevich

Post by Julio Di Egidio
I do not understand that: `free` is changing the pointed data, so
how can `const void *` even be "correct"?

`free` is destroying the pointed data.

Right. In other words, it causes the pointed-to data to reach the end
of its lifetime. "Changing" the data generally means modifying its
value (that's what "const" forbids).

Given:

int *ptr = malloc(sizeof *ptr);
*ptr = 42;
printf("*ptr = %d\n", *ptr);
free(ptr);

After the call to free(), the int object logically no longer exists.
Also, the value of the pointer object ptr becomes indeterminate.
Attempting to refer to the value of either ptr or *ptr has undefined
behavior.

Having said that, it's likely that such an attempt will not be
diagnosed, and that the values of ptr and *ptr will *appear* to be
the same before and after calling free(). (Though the memory management
system might update *ptr, depending on the implementation.) But this is
outside the scope of what C defines, and there are no guarantees of
*anything*.

Post by Andrey Tarasevich
Every object in C object model has to be created (when its lifetime
begins) and has to be eventually destroyed (when its lifetime
ends). This applies to all objects, including `const` ones
(!). Lifetime of a `const` object also ends eventually, which means
that `const` object has to be destroyable. No way around it.

An object with static storage duration (either defined with the "static"
keyword or defined at file scope) has a lifetime that ends when the
program terminates. In a typical implementation, the destruction of
such an object doesn't do anything other than deallocating its memory.

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Michael S

2025-01-10 10:23:53 UTC

Reply

On Thu, 09 Jan 2025 23:40:52 -0800

Post by Keith Thompson

Post by Andrey Tarasevich

Post by Julio Di Egidio
I do not understand that: `free` is changing the pointed data, so
how can `const void *` even be "correct"?

`free` is destroying the pointed data.

Right. In other words, it causes the pointed-to data to reach the end
of its lifetime. "Changing" the data generally means modifying its
value (that's what "const" forbids).
int *ptr = malloc(sizeof *ptr);
*ptr = 42;
printf("*ptr = %d\n", *ptr);
free(ptr);
After the call to free(), the int object logically no longer exists.
Also, the value of the pointer object ptr becomes indeterminate.
Attempting to refer to the value of either ptr or *ptr has undefined
behavior.

I believe that the Standard really says that, but find the part about
value of ptr variable ridiculous. It breaks natural hierarchy by which
standard library is somewhat special, but it is not above rules of core
language. free() is defined as function rather than macro. By rules of
core language, a function call can not modify the value of local
variable at caller's scope, unless a pointer to the variable was passed
to it explicitly.

Post by Keith Thompson
Having said that, it's likely that such an attempt will not be
diagnosed, and that the values of ptr and *ptr will *appear* to be
the same before and after calling free(). (Though the memory
management system might update *ptr, depending on the
implementation.) But this is outside the scope of what C defines,
and there are no guarantees of *anything*.

Post by Andrey Tarasevich
Every object in C object model has to be created (when its lifetime
begins) and has to be eventually destroyed (when its lifetime
ends). This applies to all objects, including `const` ones
(!). Lifetime of a `const` object also ends eventually, which means
that `const` object has to be destroyable. No way around it.

An object with static storage duration (either defined with the
"static" keyword or defined at file scope) has a lifetime that ends
when the program terminates. In a typical implementation, the
destruction of such an object doesn't do anything other than
deallocating its memory.
[...]

James Kuyper

2025-01-10 13:50:07 UTC

Reply

Post by Michael S
On Thu, 09 Jan 2025 23:40:52 -0800

...

Post by Michael S

Post by Keith Thompson
After the call to free(), the int object logically no longer exists.
Also, the value of the pointer object ptr becomes indeterminate.
Attempting to refer to the value of either ptr or *ptr has undefined
behavior.

I believe that the Standard really says that, but find the part about
value of ptr variable ridiculous. It breaks natural hierarchy by which
standard library is somewhat special, but it is not above rules of core
language. free() is defined as function rather than macro. By rules of
core language, a function call can not modify the value of local
variable at caller's scope, unless a pointer to the variable was passed
to it explicitly.

And that applies to free() just as much as any user-defined function.

Keep in mind that if p is a value returned by a call to a memory
management function (aligned_alloc(), malloc(), calloc() or realloc()),
then when free(p) is called, the values of all pointers anywhere in the
program that point at any location in the block of memory allocated by
that call become indeterminate at the same time. This doesn't
mean that free() has permission to change the bit pattern stored in any
of those pointer objects. It doesn't. There's no legal way to access the
values stored in those objects; the only thing you can do with those
objects as a whole is to store a new value in them. It is, however,
legal to access the individual bytes that make up the pointer's
representation. Those bytes are themselves objects, and changing the
values stored in those bytes would therefore violate 6.2.4p2: "An object
exists, has a constant address, and retains its last-stored value
throughout its lifetime."

What the indeterminate value of those pointers does mean is that
implementations have permission to remove that entire block of memory
from the list of valid pointer locations. On a system which uses memory
maps, that can be as simple as modifying one entry in the memory map table.

By the way, this argument was controversial when I made it in a
discussion on this newsgroup in 2003 in a thread titled "pointer after
free indeterminate (your example James)". Yes, I am the "James" in that
title. You can look up that thread using Google Groups, if you want to
examine the arguments for and against. I don't believe that there's been
any relevant changes in the standard, but it's been two decades and
several revisions of the standard, so I could be wrong about that.

Keith Thompson

2025-01-10 18:37:46 UTC

Reply

Post by Michael S
On Thu, 09 Jan 2025 23:40:52 -0800

Post by Keith Thompson

Post by Andrey Tarasevich

Post by Julio Di Egidio
I do not understand that: `free` is changing the pointed data, so
how can `const void *` even be "correct"?

`free` is destroying the pointed data.

Right. In other words, it causes the pointed-to data to reach the end
of its lifetime. "Changing" the data generally means modifying its
value (that's what "const" forbids).
int *ptr = malloc(sizeof *ptr);
*ptr = 42;
printf("*ptr = %d\n", *ptr);
free(ptr);
After the call to free(), the int object logically no longer exists.
Also, the value of the pointer object ptr becomes indeterminate.
Attempting to refer to the value of either ptr or *ptr has undefined
behavior.

I believe that the Standard really says that, but find the part about
value of ptr variable ridiculous. It breaks natural hierarchy by which
standard library is somewhat special, but it is not above rules of core
language. free() is defined as function rather than macro. By rules of
core language, a function call can not modify the value of local
variable at caller's scope, unless a pointer to the variable was passed
to it explicitly.

Right, the call to free() doesn't change the value of ptr.

But that value, which is a valid pointer value before the call, becomes
indeterminate after the call.

You can even do something like this:

int *ptr = malloc(sizeof *ptr); // assume malloc succeeded
int *ptr1 = ptr + 1;
free(ptr - 1);

In most or all actual implementations, referring to the value of ptr
after the free() for example with printf("%p", (void*)ptr), won't cause
any problems (unless the compiler performs some optimization based on the
assumption that its value won't be accessed). But one can imagine a
system that checks pointer values for validity every time they're
accessed; on such a system, reading the value of ptr might cause a trap.

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Scott Lurndal

2025-01-10 18:58:20 UTC

Reply

Post by Keith Thompson

Post by Michael S
On Thu, 09 Jan 2025 23:40:52 -0800
I believe that the Standard really says that, but find the part about
value of ptr variable ridiculous. It breaks natural hierarchy by which
standard library is somewhat special, but it is not above rules of core
language. free() is defined as function rather than macro. By rules of
core language, a function call can not modify the value of local
variable at caller's scope, unless a pointer to the variable was passed
to it explicitly.

Right, the call to free() doesn't change the value of ptr.
But that value, which is a valid pointer value before the call, becomes
indeterminate after the call.
int *ptr = malloc(sizeof *ptr); // assume malloc succeeded
int *ptr1 = ptr + 1;
free(ptr - 1);

Did you mean to write

free(ptr1 - 1);

Keith Thompson

2025-01-10 19:08:56 UTC

Reply

[...]

Post by Scott Lurndal

Post by Keith Thompson
int *ptr = malloc(sizeof *ptr); // assume malloc succeeded
int *ptr1 = ptr + 1;
free(ptr - 1);

Did you mean to write
free(ptr1 - 1);

Yes, thanks.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Kaz Kylheku

2025-01-10 19:11:05 UTC

Reply

Post by Michael S
On Thu, 09 Jan 2025 23:40:52 -0800

Post by Keith Thompson

Post by Andrey Tarasevich

Post by Julio Di Egidio
I do not understand that: `free` is changing the pointed data, so
how can `const void *` even be "correct"?

`free` is destroying the pointed data.

Right. In other words, it causes the pointed-to data to reach the end
of its lifetime. "Changing" the data generally means modifying its
value (that's what "const" forbids).
int *ptr = malloc(sizeof *ptr);
*ptr = 42;
printf("*ptr = %d\n", *ptr);
free(ptr);
After the call to free(), the int object logically no longer exists.
Also, the value of the pointer object ptr becomes indeterminate.
Attempting to refer to the value of either ptr or *ptr has undefined
behavior.

I believe that the Standard really says that, but find the part about
value of ptr variable ridiculous. It breaks natural hierarchy by which
standard library is somewhat special, but it is not above rules of core
language.

The library is above the rules because it is not required to be
written in strictly conforming ISO C.

The free function can communicate with the hardware (obviously in
a completely nonportable way that is outside of the C language) to put a
pointer value into a blacklist, such that any operation doing anything
with that pointer will trap.

Whether a value is valid or is a non-value* representation doesn't
necessarily depend in a static way on the value's bit pattern; the
machine could have a dynamically changing function which determines
whether a bit pattern is a value or non-value.

More practically speaking, suppose we have a situation like this:

free(p);
helper(p);

The helper(p) call can be diagnosed at translation time, because C
implementations can implement arbitrary diagnostics.

If p is an indeterminate value, then the call is erroneous regardless of
what the function does with p, so that adds legitimacy to diagnosing it,
and even terminating translation.

In other words, a conforming ISO C implementation can reject a program
which, if it were not for the use of p, would otherwise be strictly
conforming in every other regard, and which stays within all
implementation limits.

---
*. What the C standard formerly called "trap representations" are now
called non-value representations. Use of a non-value representation may
generate a trap.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca

James Kuyper

2025-01-11 01:58:55 UTC

Reply

Post by Kaz Kylheku

Post by Michael S
On Thu, 09 Jan 2025 23:40:52 -0800

...

Post by Kaz Kylheku

Post by Michael S

Post by Keith Thompson
int *ptr = malloc(sizeof *ptr);
*ptr = 42;
printf("*ptr = %d\n", *ptr);
free(ptr);
After the call to free(), the int object logically no longer exists.
Also, the value of the pointer object ptr becomes indeterminate.
Attempting to refer to the value of either ptr or *ptr has undefined
behavior.

I believe that the Standard really says that, but find the part about
value of ptr variable ridiculous. It breaks natural hierarchy by which
standard library is somewhat special, but it is not above rules of core
language.

The library is above the rules because it is not required to be
written in strictly conforming ISO C.

True, but not as relevant as you think. The key point is that
interactions between the library and user-written C code need to
behave according to the rules of C, whether or not the function is
written in C. In particular, since free() is defined as receiving a
pointer value, rather than a pointer to that pointer, it should have no
way of changing the original pointer. You've described how it can render
that pointer's value invalid, without any change to the pointer itself.
And that is true whether or not malloc() is written in strictly conforming C.

Chris M. Thomasson

2025-01-11 04:57:34 UTC

Reply

Post by Keith Thompson

Post by Andrey Tarasevich

Post by Julio Di Egidio
I do not understand that: `free` is changing the pointed data, so
how can `const void *` even be "correct"?

`free` is destroying the pointed data.

Right. In other words, it causes the pointed-to data to reach the end
of its lifetime. "Changing" the data generally means modifying its
value (that's what "const" forbids).
int *ptr = malloc(sizeof *ptr);
*ptr = 42;
printf("*ptr = %d\n", *ptr);
free(ptr);
After the call to free(), the int object logically no longer exists.
Also, the value of the pointer object ptr becomes indeterminate.
Attempting to refer to the value of either ptr or *ptr has undefined
behavior.

I must be missing something here. Humm... I thought it was okay to do
something like this:
_____________________________
#include <stdio.h>
#include <stdlib.h>

int main() {
    int* a = malloc(sizeof(*a));

    if (a)
    {
        *a = 42;

        printf("a = %p\n", (void*)a);
        printf("*a = %d\n", *a);

        free(a);

        printf("a = %p was just freed! do not deref\n", (void*)a);
    }

    return 0;
}
_____________________________

Is that okay?

[...]

Richard Damon

2025-01-11 14:16:16 UTC

Reply

Post by Chris M. Thomasson

Post by Andrey Tarasevich

Post by Julio Di Egidio
I do not understand that: `free` is changing the pointed data, so
how can `const void *` even be "correct"?

`free` is destroying the pointed data.

Right. In other words, it causes the pointed-to data to reach the end
of its lifetime. "Changing" the data generally means modifying its
value (that's what "const" forbids).
     int *ptr = malloc(sizeof *ptr);
     *ptr = 42;
     printf("*ptr = %d\n", *ptr);
     free(ptr);
After the call to free(), the int object logically no longer exists.
Also, the value of the pointer object ptr becomes indeterminate.
Attempting to refer to the value of either ptr or *ptr has undefined
behavior.

I must be missing something here. Humm... I thought it was okay to do
_____________________________
#include <stdio.h>
#include <stdlib.h>
int main() {
    int* a = malloc(sizeof(*a));
    if (a)
    {
        *a = 42;
        printf("a = %p\n", (void*)a);
        printf("*a = %d\n", *a);
        free(a);
        printf("a = %p was just freed! do not deref\n", (void*)a);
    }
    return 0;
}
_____________________________
Is that okay?
[...]

No, because the value of a has become indeterminate, and operating on
it, even to just look at its value, can trap.

You could save a representation of it either in a char array or as a
uintptr_t value, and work with that (but not try to recreate a pointer
with it, as that pointer "value" has become indeterminate).

This issue CAN occur if the implementation is using segment_tag + offset
pointers, and free invalidates the segment_tag that the pointer used,
and the implementation will perhaps validate the segment_tag when
looking at the pointer value. (perhaps pointers are loaded into
registers that automatically validate the segment_tag in them).
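
A minimal sketch of the "save a representation" idea above, assuming a
C90 compiler (hence an unsigned char array rather than uintptr_t, which
is C99 and optional). The name demo_save_then_free is hypothetical. The
bytes are copied while the pointer is still valid, so nothing reads the
pointer value itself after free():

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical demo: returns 0 on success, -1 if malloc fails. */
int demo_save_then_free(void)
{
    int *a = malloc(sizeof *a);
    unsigned char saved[sizeof(int *)];
    size_t i;

    if (a == NULL)
        return -1;
    *a = 42;

    /* Copy the pointer's object representation while it is still valid. */
    memcpy(saved, &a, sizeof a);

    free(a);
    a = NULL; /* overwrite the now-indeterminate value with a valid one */

    /* Only the saved bytes are examined, never the freed pointer. */
    printf("freed pointer bytes:");
    for (i = 0; i < sizeof saved; i++)
        printf(" %02x", (unsigned)saved[i]);
    printf("\n");
    return 0;
}
```

The saved bytes are only good for logging or comparison as raw data;
turning them back into a pointer would recreate the indeterminate value.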

Kaz Kylheku

2025-01-10 18:37:49 UTC

Reply

Post by Andrey Tarasevich

Post by Julio Di Egidio
I do not understand that: `free` is changing the pointed data, so how
can `const void *` even be "correct"?

`free` is destroying the pointed data.
Every object in C object model has to be created (when its lifetime
begins) and has to be eventually destroyed (when its lifetime ends).

That is not so. Literals can be put into a ROM image.

Well, sure, that is created in a factory, and destroyed when recycled.

The point is that the data's lifetime can span over countless
invocations of the program; the program can never observe a time which
is outside of the lifetime of those objects.

Post by Andrey Tarasevich
So, destruction is not really a "modifying" operation. Destruction of an

Destruction by malloc is modifying in any system that recycles the
memory for another allocation.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca

Keith Thompson

2025-01-10 19:08:03 UTC

Reply

Post by Kaz Kylheku

Post by Andrey Tarasevich

Post by Julio Di Egidio
I do not understand that: `free` is changing the pointed data, so how
can `const void *` even be "correct"?

`free` is destroying the pointed data.
Every object in C object model has to be created (when its lifetime
begins) and has to be eventually destroyed (when its lifetime ends).

That is not so. Literals can be put into a ROM image.
Well, sure, that is created in a factory, and destroyed when recycled.
The point is that the data's lifetime can span over countless
invocations of the program; the program can never observe a time which
is outside of the lifetime of those objects.

Any object with static storage duration has a lifetime that extends over
the entire execution of the program, so the program can never see such
an object outside its lifetime regardless of how it's stored.

If an object happens to be stored in a ROM image that the program
accesses directly, then sure, those bits can survive across multiple
executions. But the C "lifetime" extends only across a single
execution.

Nothing in particular *has* to happen when an object reaches the end of
its lifetime.

Post by Kaz Kylheku

Post by Andrey Tarasevich
So, destruction is not really a "modifying" operation. Destruction of an

Destruction by malloc is modifying in any system that recycles the
memory for another allocation.

A call to malloc or free can modify the bits that make up an object, but
they do so only before or after the object's C lifetime.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

Andrey Tarasevich

2025-01-11 16:38:40 UTC

Reply

Post by Kaz Kylheku
Destruction by malloc is modifying in any system that recycles the
memory for another allocation.

From such a radically "physical" point of view, nothing is and nothing
will ever be `const`... Sorry, this is not even close to what
"constness" in C is about.

In C and C++ (as well as in virtually all higher level languages)
"constness" is not a physical concept. It is a purely high-level
logic-level concept, designed, implemented and enforced entirely by the
author of the code (of the public interface of the module) in accordance
with their intent.

It has absolutely no relation to any physical modifications that might
occur anywhere in the execution environment. Nobody cares whether
something somewhere gets "modified". It is always a question of whether
_I_ want to recognize such modifications as part of the public interface
designed by me. I'm the one who says whether the operation is "constant"
or not, based purely on my idea of "logical constness".

That's the reason `const` exists in C (and C++).

However (returning to the more narrowly focused matter at hand), two
things - creation and deletion of objects - will always indisputably
stand apart as operations that transcend/defeat/ignore the idea of
"constness" with relation to the object itself. Creation/deletion might
logically be seen as "non-constant" wrt the surrounding environment
(e.g. memory manager), but wrt the object itself they shall not (and,
obviously, cannot) care about its "constness" at all.

An object begins being `const` only after the process of its creation
(construction) is complete. An object stops being `const` the moment the
process of its destruction begins. That's how it works in C and C++.
(Again, using C++ as an example, you all know that `const` objects are
not seen as `const` inside their constructors and destructors.) In C we
don't have constructors and destructors, but we still naturally follow
the same ideas in hand-written code. It would've been nice to have
`free` to play along with this.

--
Best regards,
Andrey

David Brown

2025-01-09 14:18:55 UTC

Reply

Post by Andrey Tarasevich
It is perfectly safe. One can even argue that standard declaration of
`free` as `void free(void *)` is defective. It should have been `void
free(const void *)` from the very beginning.

It is common in simple heap implementations for the allocated block to
contain data about the block, such as allocation sizes and pointers to
other blocks, in memory just below the address returned by malloc.
free() then uses its parameter to access that data, and may change it.
So "void free(const void *);" would be lying to the user.

Even without that, since you are now giving away the memory for re-use
by other code, it's reasonable to say that "free" might change the data
pointed to. (And a security-paranoid "free" might zero out the memory
before returning it to the heap for re-use.)
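
A rough sketch of the kind of layout described above, built on top of
malloc purely for illustration. The names header, toy_alloc, toy_size
and toy_free are hypothetical and this is not how any particular
library implements its heap; it only shows bookkeeping stored just
below the returned address and written to on release:

```c
#include <stddef.h>
#include <stdlib.h>

/* Bookkeeping placed immediately before the user's block. The extra
   union members are a C90-era guess at a suitably strict alignment. */
typedef union header {
    struct {
        size_t size;   /* bytes requested by the caller */
        int in_use;    /* 1 while allocated, 0 once released */
    } info;
    double align_d;
    long align_l;
    void *align_p;
} header;

void *toy_alloc(size_t n)
{
    header *h;

    if (n > (size_t)-1 - sizeof *h)
        return NULL;               /* size computation would overflow */
    h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->info.size = n;
    h->info.in_use = 1;
    return h + 1;                  /* user data starts after the header */
}

size_t toy_size(const void *p)
{
    const header *h = (const header *)p - 1;
    return h->info.size;
}

void toy_free(void *p)
{
    header *h;

    if (p == NULL)
        return;
    h = (header *)p - 1;           /* step back to the bookkeeping */
    h->info.in_use = 0;            /* a write reached through p */
    free(h);
}
```

Note that the write in toy_free lands in the header rather than in the
caller's object, which is part of why the two sides of this argument
disagree about whether a const-qualified parameter would be misleading.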

Andrey Tarasevich

2025-01-10 06:04:39 UTC

Reply

Post by David Brown
It is common in simple heap implementations for the allocated block to
contain data about the block, such as allocation sizes and pointers to
other blocks, in memory just below the address returned by malloc.
free() then uses its parameter to access that data, and may change it.
So "void free(const void *);" would be lying to the user.

No, it wouldn't be. Deallocated data no longer exists from the user's
point of view. There's nothing to lie about.

Post by David Brown
Even without that, since you are now giving away the memory for re-use
by other code, it's reasonable to say that "free" might change the data
pointed to. (And a security-paranoid "free" might zero out the memory
before returning it to the heap for re-use.)

Such internal implementation details are completely irrelevant to the
matter at hand, which is purely conceptual in essence. The rest I've
explained above.

--
Best regards,
Andrey

David Brown

2025-01-10 10:31:07 UTC

Reply

Post by Andrey Tarasevich

Post by David Brown
It is common in simple heap implementations for the allocated block to
contain data about the block, such as allocation sizes and pointers to
other blocks, in memory just below the address returned by malloc.
free() then uses its parameter to access that data, and may change it.
So "void free(const void *);" would be lying to the user.

No, it wouldn't be. Deallocated data no longer exists from the user's
point of view. There's nothing to lie about.

Post by David Brown
Even without that, since you are now giving away the memory for re-use
by other code, it's reasonable to say that "free" might change the
data pointed to. (And a security-paranoid "free" might zero out the
memory before returning it to the heap for re-use.)

Such internal implementation details are completely irrelevant to the
matter at hand, which is purely conceptual in essence. The rest I've
explained above.

I appreciate your point of view. And certainly there is no way (without
UB, or at least implementation-specific details) for the code that calls
"free" to see that "free" has changed anything via the pointer parameter
- thus to the calling code, it makes no difference if it is "void *" or
"const void *".

So perhaps it is wrong to say it would be /lying/ to the user. But it
is still, I think, misleading. A pointer-to-const parameter in C is
used primarily to say that the function will only read the pointed-to data.

Another factor here is symmetry - malloc returns a void * pointer, and
it would seem very strange that you should return that pointer to the
heap as a const void * pointer.

If you want a better signature for "free", then I would suggest "void
free(void ** p)" - that (to me) more naturally shows that the function
is freeing the pointer, while also greatly reducing the "use after free"
errors in C code by turning them into "dereferencing a null pointer"
errors which are more easily caught by many OS's.

Keith Thompson

2025-01-10 18:56:04 UTC

Reply

David Brown <***@hesbynett.no> writes:
[...]

Post by David Brown
If you want a better signature for "free", then I would suggest "void
free(void ** p)" - that (to me) more naturally shows that the function
is freeing the pointer, while also greatly reducing the "use after
free" errors in C code by turning them into "dereferencing a null
pointer" errors which are more easily caught by many OS's.

I'm not sure that would work. A void** argument means you need to pass
a pointer to a void* object. If you've assigned the converted result of
malloc() to, say, an int* object, you don't have a void* object. (int*
and void* might not even have the same representation).

Some kind of generic function that takes a pointer to an object of any
object pointer type could work, but the language doesn't support that.
(C++ addressed this by making `new` and `delete` built-in operators
rather than library functions.)

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
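
One workaround that does not need a void* object is a macro, since a
macro works on the caller's own lvalue whatever its pointer type. This
is only a sketch; FREE_AND_NULL and demo_free_and_null are hypothetical
names, not anything proposed in the thread or provided by the library:

```c
#include <stdlib.h>

/* Frees p, then stores a null pointer in it. p must be a modifiable
   lvalue of object pointer type; it is evaluated twice, so it must not
   have side effects. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)

/* Returns 1 if the pointer was nulled, 0 if malloc failed. */
int demo_free_and_null(void)
{
    int *p = malloc(sizeof *p);

    if (p == NULL)
        return 0;
    *p = 42;
    FREE_AND_NULL(p);   /* works for int *, no void * object needed */
    return p == NULL;
}
```

Other copies of the same pointer value are not touched by the macro and
still become indeterminate after the call.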

David Brown

2025-01-11 11:14:20 UTC

Reply

Post by Keith Thompson
[...]

Post by David Brown
If you want a better signature for "free", then I would suggest "void
free(void ** p)" - that (to me) more naturally shows that the function
is freeing the pointer, while also greatly reducing the "use after
free" errors in C code by turning them into "dereferencing a null
pointer" errors which are more easily caught by many OS's.

I'm not sure that would work. A void** argument means you need to pass
a pointer to a void* object. If you've assigned the converted result of
malloc() to, say, an int* object, you don't have a void* object. (int*
and void* might not even have the same representation).

Yes, you are right - while "free(void ** p)" might often be feasible in
practice (since on most implementations, pointers are the same size and
representation) it would at a minimum rely on compilers being somewhat
lax about accepting these conversions. Certainly it is not something
that could be part of the standard.

The idea was to place the emphasis on "free" changing the pointer,
rather than the data pointed to. But it could not be done as simply as
I had suggested.

Post by Keith Thompson
Some kind of generic function that takes a pointer to an object of any
object pointer type could work, but the language doesn't support that.
(C++ addressed this by making `new` and `delete` built-in operators
rather than library functions.)

Julio Di Egidio

2025-01-11 16:07:36 UTC

Reply

<snip>

Post by David Brown
The idea was to place the emphasis on "free" changing the pointer,
rather than the data pointed to.

I feel I am still altogether missing the point.

Is my understanding correct that when freeing a pointer: 1) the pointer
value, i.e. the address it holds, does not change; OTOH, 2) the
pointed-to object does change, in the sense that it is marked unusable
(and, supposedly, made available to re-allocation)?

Moreover, while the pointer value has not changed, it is in fact changed
in the sense that it has become invalid, namely the pointer cannot be
used (validly dereferenced) anymore. Not just that, but *every* pointer
to the same object, i.e. holding the same address, has become invalid.

All that considered, how isn't `void free(void *p)`, i.e. with no const
qualifiers anywhere, the only reasonable signature?

-Julio

Julio Di Egidio

2025-01-11 16:21:25 UTC

Reply

Post by Julio Di Egidio
<snip>

Post by David Brown
The idea was to place the emphasis on "free" changing the pointer,
rather than the data pointed to.

I feel I am still altogether missing the point.
Is my understanding correct that when freeing a pointer: 1) the pointer
value, i.e. the address it holds, does not change; OTOH, 2) the
pointed-to object does change, in the sense that it is marked unusable
(and, supposedly, made available to re-allocation)?
Moreover, while the pointer value has not changed, it is in fact changed
in the sense that it has become invalid, namely the pointer cannot be
used (validly dereferenced) anymore. Not just that, but *every* pointer
to the same object, i.e. holding the same address, has become invalid.
All that considered, how isn't `void free(void *p)`, i.e. with no const
qualifiers anywhere, the only reasonable signature?

In fact, along that line, I could see one might insist that "strictly
speaking, it should be `void free(void *const p)` because the pointer
value is not changed" (my considerations above are indeed more
"semantic"), OTOH, I just cannot see a case for `void free(void const
*p)`, not even strictly technically.

-Julio

Julio Di Egidio

2025-01-11 16:33:46 UTC

Reply

Post by Julio Di Egidio

Post by Julio Di Egidio
<snip>

Post by David Brown
The idea was to place the emphasis on "free" changing the pointer,
rather than the data pointed to.

I feel I am still altogether missing the point.
Is my understanding correct that when freeing a pointer: 1) the
pointer value, i.e. the address it holds, does not change; OTOH, 2)
the pointed-to object does change, in the sense that it is marked
unusable (and, supposedly, made available to re-allocation)?
Moreover, while the pointer value has not changed, it is in fact
changed in the sense that it has become invalid, namely the pointer
cannot be used (validly dereferenced) anymore. Not just that, but
*every* pointer to the same object, i.e. holding the same address, has
become invalid.
All that considered, how isn't `void free(void *p)`, i.e. with no
const qualifiers anywhere, the only reasonable signature?

In fact, along that line, I could see one might insist that "strictly
speaking, it should be `void free(void *const p)` because the pointer
value is not changed" (my considerations above are indeed more
"semantic"), OTOH, I just cannot see a case for `void free(void const
*p)`, not even strictly technically.

P.S. Sorry, I might have snipped too much: to be clear, David Brown is
actually referring to a `void free(void ** p)` in his statement above,
so his "changing the pointer" is to be read in that context: I am back
to the question of is there a problem with the standard signature of `free`.

-Julio

Keith Thompson

2025-01-11 22:50:38 UTC

Reply

Post by Julio Di Egidio
<snip>

Post by David Brown
The idea was to place the emphasis on "free" changing the pointer,
rather than the data pointed to.

I feel I am still altogether missing the point.
Is my understanding correct that when freeing a pointer: 1) the
pointer value, i.e. the address it holds, does not change; OTOH, 2)
the pointed-to object does change, in the sense that it is marked
unusable (and, supposedly, made available to re-allocation)?
Moreover, while the pointer value has not changed, it is in fact
changed in the sense that it has become invalid, namely the pointer
cannot be used (validly dereferenced) anymore. Not just that, but
*every* pointer to the same object, i.e. holding the same address, has
become invalid.
All that considered, how isn't `void free(void *p)`, i.e. with no
const qualifiers anywhere, the only reasonable signature?

It is.

The current declaration `void free(void *p)` implies that the free
function does not promise not to modify the data pointed to by p. In fact
it may or may not do so. (No program can tell whether the data is
modified without undefined behavior.)

`void free(const void *p)` would imply a promise by free that it will
not modify that data. Since it does so in some implementations and
since any attempt to access that data would have undefined behavior,
that would not be useful.

In `void free(void *const p)`, the "const" would be nearly meaningless.
p is a parameter, a local object in the implementation of free. The
"const", if it appears in the definition (assuming free() is implemented
as a C function), would be a promise not to modify that local object --
something that should be of no relevance to callers. In the
declaration, it means nothing.

The visible declaration of free does not, and cannot, communicate all
the relevant information about it. The allocated data reaches the end
of its lifetime, and the pointer value passed to free therefore becomes
indeterminate. The same thing happens with a pointer to a local object
when that object reaches the end of its lifetime:

{
    int *p;
    {
        int n = 42;
        p = &n;
    }
    printf("p = %p\n", (void*)p); // undefined behavior
}

C's declaration syntax can't express those effects, which is why they're
stated explicitly in the standard.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
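
A small sketch of the point about `void *const p`, using a hypothetical
wrapper named release. Top-level qualifiers on parameters are not part
of the function type, so a declaration without const and a definition
with it describe the same function:

```c
#include <stdlib.h>

/* Declaration as callers would see it. */
void release(void *p);

/* Definition: the const here only stops the body from assigning to its
   own local copy of the pointer. Callers cannot tell the difference. */
void release(void *const p)
{
    free(p);
    /* p = NULL;  would be a constraint violation here, and would not
       affect the caller's pointer anyway */
}

/* The function still converts to the unqualified pointer type. */
void (*release_fn)(void *) = release;
```

Either way, the pointer the caller passed in becomes indeterminate after
the call, which no qualifier in the prototype can express.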

James Kuyper

2025-01-11 15:58:44 UTC

Reply

[...]

Post by David Brown
If you want a better signature for "free", then I would suggest "void
free(void ** p)" - that (to me) more naturally shows that the function
is freeing the pointer, while also greatly reducing the "use after
free" errors in C code by turning them into "dereferencing a null
pointer" errors which are more easily caught by many OS's.

I'm not sure that would work. A void** argument means you need to pass
a pointer to a void* object. If you've assigned the converted result of
malloc() to, say, an int* object, you don't have a void* object. (int*
and void* might not even have the same representation).

Correct. As a result, that interface would, in principle, require
storing the pointer in a void* object so that its address could be
passed to free(). In many contexts, that would encourage saving the
original void* value returned by malloc() for passing to free(), while
creating a second pointer for actually using the allocated memory.

Kaz Kylheku

2025-01-10 19:28:22 UTC

Reply

Post by David Brown

Post by Andrey Tarasevich
It is perfectly safe. One can even argue that standard declaration of
`free` as `void free(void *)` is defective. It should have been `void
free(const void *)` from the very beginning.

It is common in simple heap implementations for the allocated block to
contain data about the block, such as allocation sizes and pointers to
other blocks, in memory just below the address returned by malloc.
free() then uses its parameter to access that data, and may change it.
So "void free(const void *);" would be lying to the user.

If the pointer came from the allocator, as required, there is no lie.
The object is writable, and so free may strip away the qualifier
and do whatever it wants, like cover the object with 0xFE bytes.

Quite the contrary, if the program calls free(p), yet still somehow cares
about what happens to the contents referenced by p, then the program is
lying to the implementation!

The prototype of free being: void free(const void *) would be helpful.
It would mean that programs which choose to initialize dynamically
allocated objects and then pass them around via pointers to const could
then free the objects without having to use a cast.

Code like this can be free of casts:

const obj *obj_create(int param)
{
    obj *o = malloc(sizeof *o); // null check omitted for topic focus
    o->param = param;
    return o; // after this, objs treated as immutable
}

void obj_destroy(const obj *o)
{
    free(o);
}

Free of casts is good!

Might it already be that the features of the current ISO C standard
allow for the possibility of free being generic between const and
unqualified? Though I don't suspect you can select on a pointer's
referenced type's qualification, regardless of type.

What are the backward compatibility considerations of just making
the prototype void free(const void *)?

One is this problem:

void (*pfree)(void *) = free;

but in fact, the type rules should be such that the assignment
is allowed even if free is void(const void *).

The type compatibility rule should be that a function pointer type with
more strictly qualified parameters should be implicitly convertible to a
function pointer type with less strictly qualified parameters.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
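
For comparison, this is roughly what the same pattern needs with the
prototype as it actually stands, `void free(void *)`. The obj type and
its param member are assumptions made up for the sketch; the cast in
obj_destroy is the one the hypothetical const prototype would remove:

```c
#include <stdlib.h>

/* Assumed object type for illustration only. */
typedef struct obj {
    int param;
} obj;

const obj *obj_create(int param)
{
    obj *o = malloc(sizeof *o);

    if (o != NULL)
        o->param = param;
    return o;   /* callers see the object as read-only from here on */
}

void obj_destroy(const obj *o)
{
    /* The storage came from malloc, so the object itself was never
       defined const; casting the qualifier away before free is fine. */
    free((void *)o);
}
```

The cast is safe only because the pointer is known to have come from
malloc; the same cast applied to a pointer to a genuinely const-defined
object would hand free() something it must not receive.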

Tim Rentsch

2025-01-08 19:44:14 UTC

Reply

Post by Julio Di Egidio
Hi everybody,
indeed I have forgotten so many things, including
how much I love this language. :)
In particular, I am using C90, and compiling with
`gcc ... -ansi -pedantic -Wall -Wextra` (as I have
the requirement to ideally support any device).

If it is absolutely necessary to have source that conforms to
C90, you should at the very least compile it (on some machine)
under both C90 rules and C99 rules. Frankly I doubt you will
ever have a real need to target a C90-only environment, but if
you do decide to go down that path then at least also compile the
program as C99, to be sure the brokennesses of C90 don't bite
you.
