Discussion:
Two questions on arrays with size defined by variables
Janis Papanagnou
2025-02-09 07:50:19 UTC
I need an array whose size depends on values that are dynamically
determined by the program. (I suppose those are the "VLAs" that I've
read about here occasionally? - Anyway, however they're named...)

I've found examples on the Net where the arrays have been defined in a
function context and the size passed as parameter

f(int n) {
char * arr[n];
...
}

That reminded me of other languages where you'd need at least a block
context for dynamically sized arrays, like

int n = 5;
{
char * arr[n];
...
}

Anyway. I tried it without function or block context

int n = 5;
char * arr[n];
...

and it seemed to work seamlessly like that (with GNU cc, -std=c99).

Q1: Is this a correct (portable) form?


Then, with above setting, I also tried

arr[99] = "foobar";

To my astonishment the compiler not only accepted that, but the program
also ran without a runtime error; I assume, though, that it's an error
that may severely corrupt memory. - Q2: Is my suspicion correct?

Janis
Andrey Tarasevich
2025-02-09 08:06:56 UTC
Post by Janis Papanagnou
I've found examples on the Net where the arrays have been defined in a
function context and the size passed as parameter
f(int n) {
char * arr[n];
...
}
Yes, that would be a VLA.
Post by Janis Papanagnou
That reminded me on other languages where you'd need at least a block
context for dynamically sized arrays, like
int n = 5;
{
char * arr[n];
...
}
But a function body is in itself a block. Inside a function body you are
already in "a block context".
Post by Janis Papanagnou
Anyway. I tried it without function or block context
int n = 5;
char * arr[n];
...
and it seemed to work seamlessly like that (with GNU cc, -std=C99).
You mean you did this at file scope? No, VLAs are illegal at file scope.
And I was unable to repeat this feat in GCC.
Post by Janis Papanagnou
Q1: Is this a correct (portable) form?
VLA objects have to be declared locally. However, keep in mind that
support for local declarations of VLA _objects_ is now optional (i.e.
not portable). Support for variably-modified _types_ themselves (VLA
types) is mandatory. But you are not guaranteed to be able to declare an
actual VLA variable.
Post by Janis Papanagnou
Then, with above setting, I also tried
arr[99] = "foobar";
To my astonishment the compiler did not only accept that but it also
operated without runtime error; but I assume it's an error that may
severely corrupt the memory. - Q2: Is my suspicion correct?
Yes. Since the beginning of time the language itself has not
checked or enforced array boundaries. When you violate the boundary, the
behavior is undefined, meaning that compilers can do anything (including
implementing array boundary checks, where possible). But for obvious
performance reasons C compilers do not normally enforce array boundaries
in "production" compilation modes.
--
Best regards,
Andrey
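To illustrate the point about unchecked array boundaries: because the language does not check indexes, a program that wants such a check has to perform it itself. The following is only a minimal sketch, assuming C99 or later with VLA support; the helper names `store_checked` and `try_store` are hypothetical and do not come from the thread.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: store `value` at `index` only if the index is in
   range. Returns 1 on success, 0 if the index is out of bounds. */
int store_checked(const char **arr, size_t n, size_t index, const char *value)
{
    if (index >= n)
        return 0;
    arr[index] = value;
    return 1;
}

/* Hypothetical demo: creates a VLA of n pointers and tries to store into
   it at `index`. Returns 1 if the store happened, 0 if it was rejected. */
int try_store(size_t n, size_t index)
{
    if (n == 0)
        return 0;                 /* a VLA must have a size greater than zero */

    const char *arr[n];           /* VLA: n is only known at run time */
    for (size_t i = 0; i < n; i++)
        arr[i] = NULL;

    if (!store_checked(arr, n, index, "foobar")) {
        fprintf(stderr, "index %zu is out of range for %zu elements\n",
                index, n);
        return 0;
    }
    return arr[index] != NULL;
}
```

With `n` equal to 5, `try_store(5, 4)` stores the pointer, while `try_store(5, 99)` is rejected instead of writing past the end of the array as `arr[99] = "foobar";` would.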
Janis Papanagnou
2025-02-09 09:54:36 UTC
Post by Andrey Tarasevich
Post by Janis Papanagnou
I've found examples on the Net where the arrays have been defined in a
function context and the size passed as parameter
f(int n) {
char * arr[n];
...
}
Yes, that would be a VLA.
Post by Janis Papanagnou
That reminded me on other languages where you'd need at least a block
context for dynamically sized arrays, like
int n = 5;
{
char * arr[n];
...
}
But a function body is in itself a block. Inside a function body you are
already in "a block context".
Post by Janis Papanagnou
Anyway. I tried it without function or block context
int n = 5;
char * arr[n];
...
and it seemed to work seamlessly like that (with GNU cc, -std=C99).
You mean you did this at file scope? No, VLAs are illegal at file scope.
And I was unable to repeat this feat in GCC.
Oh, sorry, no; above I had just written an excerpt. - Actually I had
those two examples above within a main() function. - Sorry again for
my inaccuracy.

What I meant was (with surrounding context) that I knew (from _other_
languages) a syntax like

main ()
{
int n = 5;

{
char * arr[n];
...
}
}

And in "C" (C99) I tried it *without* the _inner block_

main ()
{
int n = 5;
char * arr[n];
...
}

and it seemed to work that way. (In those other languages that wasn't
possible.)
Post by Andrey Tarasevich
Post by Janis Papanagnou
Q1: Is this a correct (portable) form?
VLA objects have to be declared locally. However, keep in mind that
support for local declarations of VLA _objects_ is now optional (i.e.
not portable). Support for variably-modified _types_ themselves (VLA
types) is mandatory. But you are not guaranteed to be able to declare an
actual VLA variable.
I fear I don't understand what you're saying here. - By "now" do you
mean newer versions of the C standards? That you can rely on it only,
say, with C99 but maybe not before, and not with compilers conforming
to later C standards?

For my purpose it would be okay to know whether with the C99 version
(that I used) it's okay, or whether that's some GNU specific extension
or some such.

Janis
Post by Andrey Tarasevich
Post by Janis Papanagnou
[...]
Keith Thompson
2025-02-09 10:25:13 UTC
Post by Janis Papanagnou
Post by Andrey Tarasevich
Post by Janis Papanagnou
I've found examples on the Net where the arrays have been defined in a
function context and the size passed as parameter
f(int n) {
char * arr[n];
...
}
Yes, that would be a VLA.
Post by Janis Papanagnou
That reminded me on other languages where you'd need at least a block
context for dynamically sized arrays, like
int n = 5;
{
char * arr[n];
...
}
But a function body is in itself a block. Inside a function body you are
already in "a block context".
Post by Janis Papanagnou
Anyway. I tried it without function or block context
int n = 5;
char * arr[n];
...
and it seemed to work seamlessly like that (with GNU cc, -std=C99).
You mean you did this at file scope? No, VLAs are illegal at file scope.
And I was unable to repeat this feat in GCC.
Oh, sorry, no; above I had just written an excerpt. - Actually I had
those two examples above within a main() function. - Sorry again for
my inaccuracy.
What I meant was (with surrounding context) that I knew (from _other_
languages) a syntax like
main ()
{
int n = 5;
{
char * arr[n];
...
}
}
And in "C" (C99) I tried it *without* the _inner block_
main ()
{
int n = 5;
char * arr[n];
...
}
The first line needs to be `int main(void)`. The "implicit int"
misfeature was removed in C99. Your compiler might let you get away
with it; many C compilers are quite lax by default. For gcc, use
"-std=cNN -pedantic" to enforce the language rules, where NN specifies
the language version.
Post by Janis Papanagnou
and it seemed to work that way. (In those other languages that wasn't
possible.)
VLAs were introduced in C99, so the above is invalid in C90 with or
without the inner block. In C99 and later, there's no requirement to
put the VLA object definition in an inner block (if the implementation
supports them). (C99 did add the ability to mix declarations and
statements, but that's not relevant to your example).
Post by Janis Papanagnou
Post by Andrey Tarasevich
Post by Janis Papanagnou
Q1: Is this a correct (portable) form?
VLA objects have to be declared locally. However, keep in mind that
support for local declarations of VLA _objects_ is now optional (i.e.
not portable). Support for variably-modified _types_ themselves (VLA
types) is mandatory. But you are not guaranteed to be able to declare an
actual VLA variable.
I fear I don't understand what you're saying here. - By "now" do you
mean newer versions of the C standards? That you can rely only, say,
rely on it with C99 but maybe not before and not in later C standards
conforming compilers?
C90 didn't have VLAs at all.

C99 introduced them and required all implementations to support them.

C11 made variably modified types optional.

C23 still makes variable length arrays with automatic storage duration
optional but "Parameters declared with variable length array types are
adjusted and then define objects of automatic storage duration with
pointer types. Thus, support for such declarations is mandatory."
(Support for C23 is still preliminary.)
Post by Janis Papanagnou
For my purpose it would be okay to know whether with the C99 version
(that I used) it's okay, or whether that's some GNU specific extension
or some such.
C99 requires support for local objects of variable length array types.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
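The version history above suggests one way to write code that still builds where VLAs are not provided: C11 and later define the macro `__STDC_NO_VLA__` when the implementation does not support them. The following is a minimal sketch under that assumption; the function `total_length` and its scratch array are hypothetical examples, not code from the thread.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: sums the lengths of `count` strings, keeping the
   individual lengths in a scratch array whose size is known only at run
   time. Returns 0 if count is 0 (or if the heap fallback cannot allocate). */
size_t total_length(size_t count, const char *strings[])
{
    if (count == 0)
        return 0;

#if defined(__STDC_NO_VLA__)
    /* The implementation announces that it has no VLAs: use the heap. */
    size_t *lengths = malloc(count * sizeof *lengths);
    if (lengths == NULL)
        return 0;
#else
    /* C99, or a later implementation that supports VLAs. */
    size_t lengths[count];
#endif

    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        lengths[i] = strlen(strings[i]);
        total += lengths[i];
    }

#if defined(__STDC_NO_VLA__)
    free(lengths);
#endif
    return total;
}
```

Note that a VLA gives no way to detect that the requested size was too large for the available stack, whereas `malloc` reports failure by returning a null pointer; for sizes that come from untrusted input the heap variant is the more cautious choice.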
Janis Papanagnou
2025-02-09 17:17:45 UTC
[...]
Post by Keith Thompson
Post by Janis Papanagnou
Post by Andrey Tarasevich
But a function body is in itself a block. Inside a function body you are
already in "a block context".
Post by Janis Papanagnou
Anyway. I tried it without function or block context
int n = 5;
char * arr[n];
...
and it seemed to work seamlessly like that (with GNU cc, -std=C99).
You mean you did this at file scope? No, VLAs are illegal at file scope.
And I was unable to repeat this feat in GCC.
Oh, sorry, no; above I had just written an excerpt. - Actually I had
those two examples above within a main() function. - Sorry again for
my inaccuracy.
What I meant was (with surrounding context) that I knew (from _other_
languages) a syntax like
main ()
{
int n = 5;
{
char * arr[n];
...
}
}
And in "C" (C99) I tried it *without* the _inner block_
main ()
{
int n = 5;
char * arr[n];
...
}
The first line needs to be `int main(void)`. The "implicit int"
misfeature was removed in C99. [...]
Thanks. (Again answering more/something different than I asked.) :-)

Please note that I structurally illustrated just the poster's question
about where the relevant code excerpt resides (file scope or else).

If I'd known the audience was picky I'd have posted the whole test
program; but then there would be even more picky comments to expect. ;-)

I hope to mollify the audience if I point out that my code actually
looks *like* this

...
int main (int argc, char * argv[])
{
...
return 0;
}

(And, yes, I know that the "..." is not correct, and argc is unused,
and I omitted 'const', etc.)
Post by Keith Thompson
[...] but that's not relevant to your example).
Right.
Post by Keith Thompson
Post by Janis Papanagnou
[...]
C90 didn't have VLAs at all.
C99 introduced them and required all implementations to support them.
C11 made variably modified types optional.
C23 still makes variable length arrays with automatic storage duration
optional but "Parameters declared with variable length array types are
adjusted and then define objects of automatic storage duration with
pointer types. Thus, support for such declarations is mandatory."
(Support for C23 is still preliminary.)
Post by Janis Papanagnou
For my purpose it would be okay to know whether with the C99 version
(that I used) it's okay, or whether that's some GNU specific extension
or some such.
C99 requires support for local objects of variable length array types.
Thanks!

Janis
Keith Thompson
2025-02-10 00:38:12 UTC
[...]
Post by Janis Papanagnou
Post by Keith Thompson
The first line needs to be `int main(void)`. The "implicit int"
misfeature was removed in C99. [...]
Thanks. (Again answering more/something different than I asked.) :-)
Please note that I structurally illustrated just the posters question
about where the relevant code excerpt resides (file scope or else).
If I'd knew the audience is picky I'd posted the whole test program;
but then there's even much more picky comments to expect. ;-)
Yeah, we're picky here.
Post by Janis Papanagnou
I hope to mollify the audience if I point out that my code actually
looks *like* this
...
int main (int argc, char * argv[])
{
...
return 0;
}
(And, yes, I know that the "..." is not correct, and argc is unused,
and I omitted 'const', etc.)
It's clear enough that the "..." is figurative. As picky as I am,
I wouldn't have commented on it. "// ..." or "/* ... */" is more
pedantically correct, but whatever.

(Incidentally, Perl has a "..." operator, nicknamed the Yada Yada
operator, intended to mark placeholder code that is not yet
implemented.)

The "return 0;" is unnecessary but harmless in C99 and later.

As for "const", there's nothing in the above snippet that requires it.
You can probably add a const or two to the argv declaration:
const char *const *argv
but the standard doesn't include "const" in the declaration of argv.
Strictly speaking, adding "const" probably makes the program's behavior
undefined.

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
Janis Papanagnou
2025-02-10 02:35:18 UTC
Post by Keith Thompson
Post by Janis Papanagnou
If I'd knew the audience is picky I'd posted the whole test program;
but then there's even much more picky comments to expect. ;-)
Yeah, we're picky here.
Post by Janis Papanagnou
I hope to mollify the audience if I point out that my code actually
looks *like* this
...
int main (int argc, char * argv[])
{
...
return 0;
}
(And, yes, I know that the "..." is not correct, and argc is unused,
and I omitted 'const', etc.)
It's clear enough that the "..." is figurative.
And my main() { ... } was to figuratively illustrate the structure.

I could have also used f() { ... } but it didn't occur to me that
using main would trigger yet another off-topic comment.
Post by Keith Thompson
As picky as I am, I wouldn't have commented on it.
Some are more picky some less, some on this detail others on that.

I think we'd all be well served if we were more equanimous, especially
when postings or clarifications are based on _excerpts_ or code
snippets.

I know that specifically in _this_ newsgroup that is difficult. :-)
Post by Keith Thompson
"// ..." or "/* ... */" is more
pedantically correct, but whatever.
But this is code. The meta '...' is clearly descriptive for some
[contextually uninteresting] things omitted.

(You see what I mean by trying to be more equanimous. Perception
of folks differs a lot. We cannot expect others to share our views
or habits.)
Post by Keith Thompson
(Incidentally, Perl has a "..." operator, nicknamed the Yada Yada
operator, intended to mark placeholder code that is not yet
implemented.)
The "return 0;" is unnecessary but harmless in C99 and later.
That - returning a value when a function is declared to return
one - is actually a [maybe picky] coding-habit of mine. :-)

Janis
James Kuyper
2025-02-10 04:03:52 UTC
...
Post by Janis Papanagnou
Post by Keith Thompson
Post by Janis Papanagnou
int main (int argc, char * argv[])
{
...
return 0;
}
...
Post by Janis Papanagnou
Post by Keith Thompson
The "return 0;" is unnecessary but harmless in C99 and later.
That - returning a value when a function is declared to return
one - is actually a [maybe picky] coding-habit of mine. :-)
It's a reasonable habit to acquire. However, the interface for main()
was created back when there was no way to declare that a function didn't
return any value, and all functions by default returned "int" unless
explicitly declared otherwise. It made sense, therefore, for main() to
be declared as returning an int value that would be passed to the host
environment to indicate the exit status of the program.
Many programmers had no need to report an exit status, and therefore
didn't bother returning a value from main(). As a result, many
implementations failed to require it. Eventually the committee decided
to make a special case for main(), to accommodate those practices.
Andrey Tarasevich
2025-02-10 04:14:01 UTC
Post by Janis Papanagnou
Post by Keith Thompson
The "return 0;" is unnecessary but harmless in C99 and later.
That - returning a value when a function is declared to return
one - is actually a [maybe picky] coding-habit of mine. :-)
This is, of course, a purely stylistic matter. But still... `main` is
special. And it kinda makes sense to acknowledge its special nature by
not doing explicit `return 0` from `main`. `main` looks cleaner without it.
--
Best regards,
Andrey
Janis Papanagnou
2025-02-10 06:43:26 UTC
Post by Andrey Tarasevich
Post by Janis Papanagnou
Post by Keith Thompson
The "return 0;" is unnecessary but harmless in C99 and later.
That - returning a value when a function is declared to return
one - is actually a [maybe picky] coding-habit of mine. :-)
This is, of course, a purely stylistic matter. But still... `main` is
special.
I know.
Post by Andrey Tarasevich
And it kinda makes sense to acknowledge it special nature by
not doing explicit `return 0` from `main`. `main` looks cleaner without it.
Now I'm astonished by that comment (to say the least).

I'm regularly returning status and error information to the calling
instance to act upon it. And you're saying that I should not return
any value? Or only values that are different from 0? Or only 0 if I
also return other values? - Whatever; that sounds all wrong to me.
Or is 'return' deprecated or depreciated and we should now rewrite
our source code and replace every 'return' by 'exit()', or use now
only 'exit()' in the first place instead?

I don't think you will convince me not to return 0 just because, in
the special case where nothing is specified, the current language
standard defines that by default it implicitly provides that value
for me.

Janis
Keith Thompson
2025-02-10 06:57:06 UTC
Post by Janis Papanagnou
Post by Andrey Tarasevich
Post by Janis Papanagnou
Post by Keith Thompson
The "return 0;" is unnecessary but harmless in C99 and later.
That - returning a value when a function is declared to return
one - is actually a [maybe picky] coding-habit of mine. :-)
This is, of course, a purely stylistic matter. But still... `main` is
special.
I know.
Post by Andrey Tarasevich
And it kinda makes sense to acknowledge it special nature by
not doing explicit `return 0` from `main`. `main` looks cleaner without it.
Now I'm astonished by that comment (to say the least).
I'm regularly returning status and error information to the calling
instance to act upon it. And you're saying that I should not return
any value? Or only values that are different from 0? Or only 0 if I
also return other values? - Whatever; that sounds all wrong to me.
Or is 'return' deprecated or depreciated and we should now rewrite
our source code and replace every 'return' by 'exit()', or use now
only 'exit()' in the first place instead?
I don't think you will convince me to not return 0 only because in
the special case that there's nothing specified the current language
standards defines that per default it implicitly provides that value
for me.
Having a "return 0;" at the end of main() is neither obsolescent nor
deprecated. It's still perfectly valid, with the expected semantics.

The only thing that changed in C99 is that reaching the closing "}" of
the main function does an implicit "return 0;". This change was
borrowed from C++.

I do tend to dislike the change; it feels like an unnecessary special
case, and I used to advise people to add an explicit "return 0;"
anyway. This was partly because C99 compilers were relatively rare,
and there were advantages to writing code that still worked with
C90-only compilers.

I usually omit the "return 0;", but I don't object to including it --
though I will occasionally mention in passing that it's "unnecessary
but harmless". If nothing else, the history is interesting (at
least to me).
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
Janis Papanagnou
2025-02-10 18:33:55 UTC
Post by Keith Thompson
[...]
Having a "return 0;" at the end of main() is neither obsolescent nor
deprecated. It's still perfectly valid, with the expected semantics.
The only that changed in C99 is that reaching the closing "}" of
the main function does an implicit "return 0;". This change was
borrowed from C++.
I do tend to dislike the change; it feels like an unnecessary special
case, and I used to advise people to add an explicit "return 0;"
anyway. This was partly because C99 compilers were relatively rare,
and there were advantages to writing code that still worked with
C90-only compilers.
I usually omit the "return 0;", but I don't object to including it --
though I will occasionally mention in passing that it's "unnecessary
but harmless". If nothing else, the history is interesting (at
least to me).
I'm with you.

Personally I think it's an uninteresting point; primarily because
it's a detail and, as you wrote, also a special case.

I would prefer - but for that reason of minor importance to me I
also don't mind as it is - if the language would allow all sorts
of interface variants, e.g.

void main (void) { ... } or also void main () { ... }
int main (void) { ... ; return rc; }

void main (int argc, ...) { ... }
int main (int argc, ...) { ... ; return rc; }

if they are _consistent_; i.e. no 'return' allowed in case of a
void declaration (with an implicit return 0 to the environment),
and requiring an explicit 'return' in case of an int declaration.

What I find bad is if some special [visibly inconsistent] form
is the norm and other [more consistent] forms are "UB" with the
"threat" of the proverbial missile launch as possible outcome.

I haven't checked what's actually valid in current or former
"C" compilers or standards. As it's of minor relevance (to me)
I just see whether the compiler accepts my main() declaration
or not. (There was never a problem here, so why should I care.)

Janis
Keith Thompson
2025-02-10 22:17:06 UTC
Janis Papanagnou <janis_papanagnou+***@hotmail.com> writes:
[...]
Post by Janis Papanagnou
I haven't checked what's actually valid in current or former
"C" compilers or standards. As it's of minor relevance (to me)
I just see whether the compiler accepts my main() declaration
or not. (There was never a problem here, so why should I care.)
That was probably meant to be a rhetorical question, but ...

Because it might not always work in the future, when you port your
code to another system, when a new release of the compiler you're
using is installed, or during the next full moon. If your code
has undefined behavior, it might appear to work perfectly until
it doesn't.

If your code is valid according to the C standard, you're less likely to
run into problems -- and if a compiler rejects your code you have a good
basis for a bug report. If "void main()" or "main()" starts behaving
differently, or being rejected, you don't have much recourse.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
Michael S
2025-02-09 10:39:18 UTC
On Sun, 9 Feb 2025 10:54:36 +0100
Post by Janis Papanagnou
Post by Andrey Tarasevich
Post by Janis Papanagnou
I've found examples on the Net where the arrays have been defined
in a function context and the size passed as parameter
f(int n) {
char * arr[n];
...
}
Yes, that would be a VLA.
Post by Janis Papanagnou
That reminded me on other languages where you'd need at least a
block context for dynamically sized arrays, like
int n = 5;
{
char * arr[n];
...
}
But a function body is in itself a block. Inside a function body
you are already in "a block context".
Post by Janis Papanagnou
Anyway. I tried it without function or block context
int n = 5;
char * arr[n];
...
and it seemed to work seamlessly like that (with GNU cc,
-std=C99).
You mean you did this at file scope? No, VLAs are illegal at file
scope. And I was unable to repeat this feat in GCC.
Oh, sorry, no; above I had just written an excerpt. - Actually I had
those two examples above within a main() function. - Sorry again for
my inaccuracy.
What I meant was (with surrounding context) that I knew (from _other_
languages) a syntax like
main ()
{
int n = 5;
{
char * arr[n];
...
}
}
And in "C" (C99) I tried it *without* the _inner block_
main ()
{
int n = 5;
char * arr[n];
...
}
and it seemed to work that way. (In those other languages that wasn't
possible.)
Post by Andrey Tarasevich
Post by Janis Papanagnou
Q1: Is this a correct (portable) form?
VLA objects have to be declared locally. However, keep in mind that
support for local declarations of VLA _objects_ is now optional
(i.e. not portable). Support for variably-modified _types_
themselves (VLA types) is mandatory. But you are not guaranteed to
be able to declare an actual VLA variable.
I fear I don't understand what you're saying here. - By "now" do you
mean newer versions of the C standards? That you can rely only, say,
rely on it with C99 but maybe not before and not in later C standards
conforming compilers?
Yes, theoretically.
In practice, I am not sure that there exists a fully conforming C17 or
especially C23 compiler that does not support VLAs. But there exists one
important almost-C17 compiler that does not support VLAs.

There is another problem in your code - it assigns a string literal to
a non-const char*. It is legal, as far as the C Standard is concerned,
but makes very little practical sense, because any attempt to write to
the string literal through the resulting pointer is UB. And not just a
theoretical UB, but a real-world UB.
Post by Janis Papanagnou
For my purpose it would be okay to know whether with the C99 version
(that I used) it's okay, or whether that's some GNU specific extension
or some such.
Janis
Post by Andrey Tarasevich
Post by Janis Papanagnou
[...]
Janis Papanagnou
2025-02-09 17:18:04 UTC
Post by Michael S
On Sun, 9 Feb 2025 10:54:36 +0100
[...]
There is another problem in your code - it assigns string literal to
non-const char*. It is legal, as far as 'C' Standard is concerned, but
makes very little practical sense, because any attempt to assign to
string literal through resulting pointer is UB. And not just a
theoretical UB, but a real-world UB.
This comment specifically drew my attention and made me nervous.

You know, I'm rarely programming in plain "C", and while in C++
I generally try to program in "const-correct" form I never make
use of 'const' in "C". - Unless the compiler complains about it,
but I don't recall it (ever?) did.

In my test application I actually never assign string literals
or strings to any other string object (modulo the buffer that I
filled with a 'fgets'). I operate solely with pointers to 'argv'
elements and to the 'char buf[]' buffer data.

Do you see any issue with that?

Janis
Michael S
2025-02-09 17:57:11 UTC
On Sun, 9 Feb 2025 18:18:04 +0100
Post by Janis Papanagnou
Post by Michael S
On Sun, 9 Feb 2025 10:54:36 +0100
[...]
There is another problem in your code - it assigns string literal to
non-const char*. It is legal, as far as 'C' Standard is concerned,
but makes very little practical sense, because any attempt to
assign to string literal through resulting pointer is UB. And not
just a theoretical UB, but a real-world UB.
This comment specifically draw my attention and made me nervous.
You know, I'm rarely programming in plain "C", and while in C++
I generally try to program in "const-correct" form
Which, I suppose, is not easy.
Post by Janis Papanagnou
I never make
use of 'const' in "C". - Unless the compiler complains about it,
but I don't recall it (ever?) did.
In my test application I actually never assign string literals
or strings to any other string object (modulo the buffer that I
filled with a 'fgets'). I operate solely with pointers to 'argv'
elements and to the 'char buf[]' buffer data.
Do you see any issue with that?
Janis
I see no issues.

Generally, due to the absence of user-defined polymorphism, C does not
have the type of ugly surprises with constness that make the life of C++
programmers miserable. Still, the behavior of string literals can be
surprising.
I would guess that if it had been feasible to make breaking changes, C89
would have defined the type of string literals as 'const char*' rather
than 'char*'. But breaking changes were not feasible.
Janis Papanagnou
2025-02-09 18:10:43 UTC
Post by Michael S
On Sun, 9 Feb 2025 18:18:04 +0100
Post by Janis Papanagnou
Post by Michael S
On Sun, 9 Feb 2025 10:54:36 +0100
[...]
There is another problem in your code - it assigns string literal to
non-const char*. It is legal, as far as 'C' Standard is concerned,
but makes very little practical sense, because any attempt to
assign to string literal through resulting pointer is UB. And not
just a theoretical UB, but a real-world UB.
This comment specifically draw my attention and made me nervous.
You know, I'm rarely programming in plain "C", and while in C++
I generally try to program in "const-correct" form
Which, I suppose, is not easy.
Well, if you start with it there seems to be no end. But once you
understand it and get used to it, it's like code patterns you apply
without thinking.
Post by Michael S
Post by Janis Papanagnou
I never make
use of 'const' in "C". - Unless the compiler complains about it,
but I don't recall it (ever?) did.
In my test application I actually never assign string literals
or strings to any other string object (modulo the buffer that I
filled with a 'fgets'). I operate solely with pointers to 'argv'
elements and to the 'char buf[]' buffer data.
Do you see any issue with that?
Janis
I see no issues.
Good to hear.
Post by Michael S
Generally, due to absence of user-defined polymorphism, C does not have
the type of ugly surprises with constness that make life of C++
programmers miserable. Still, behavior of string literals can be
surprising.
Actually I think there's a lot that can surprise the unwary in "C". :-)

As "simple" as it is, I wouldn't classify it as an "easy" language
(i.e. it's not one that prevents you from making mistakes).
Post by Michael S
I would guess that if it was feasible to make a breaking changes, C89
would define type of string literals as 'const char*' rather than
'char*'. But breaking changes were not feasible.
Janis
Keith Thompson
2025-02-10 00:46:15 UTC
Post by Michael S
On Sun, 9 Feb 2025 18:18:04 +0100
Post by Janis Papanagnou
Post by Michael S
On Sun, 9 Feb 2025 10:54:36 +0100
[...]
There is another problem in your code - it assigns string literal to
non-const char*. It is legal, as far as 'C' Standard is concerned,
but makes very little practical sense, because any attempt to
assign to string literal through resulting pointer is UB. And not
just a theoretical UB, but a real-world UB.
This comment specifically draw my attention and made me nervous.
You know, I'm rarely programming in plain "C", and while in C++
I generally try to program in "const-correct" form
Which, I suppose, is not easy.
Post by Janis Papanagnou
I never make
use of 'const' in "C". - Unless the compiler complains about it,
but I don't recall it (ever?) did.
In my test application I actually never assign string literals
or strings to any other string object (modulo the buffer that I
filled with a 'fgets'). I operate solely with pointers to 'argv'
elements and to the 'char buf[]' buffer data.
There's no such thing as a "string object" in C. See below.
Post by Michael S
Post by Janis Papanagnou
Do you see any issue with that?
[...]
Post by Michael S
I see no issues.
Generally, due to absence of user-defined polymorphism, C does not have
the type of ugly surprises with constness that make life of C++
programmers miserable. Still, behavior of string literals can be
surprising.
I would guess that if it was feasible to make a breaking changes, C89
would define type of string literals as 'const char*' rather than
'char*'. But breaking changes were not feasible.
The type of string literals would be const char[N], not const char*.
(C++ did exactly that.)

In C, a *string* is by definition "a contiguous sequence of characters
terminated by and including the first null character". A *pointer to a
string* is "a pointer to its initial (lowest addressed) character". A
string is not a data type; it's a data layout. An array of char may or
may not have a string *as its contents* (or part of its contents).

A string literal "foo" represents an anonymous array object, in this
case of type char[4]. C++ made string literals const, but C did not.
In C, any attempt to modify the contents of the array object
corresponding to a string literal has undefined behavior.

In C, you can legally write:
char *ptr = "hello";
but then something like `ptr[0] = 'H';` is legal but has undefined
behavior, and likely will not trigger a warning. The recommended
practice is that any pointer to a string should be defined with "const":
const char *ptr = "hello";
so that if you later try `ptr[0] = 'H';` it will be rejected.
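
A minimal sketch of the points above, not taken from the thread: the helper names `holds_string` and `literal_size` and the variable `greeting` are made up for illustration. It shows that the literal "foo" occupies four bytes, that an array of char may or may not contain a string, and the recommended const-qualified pointer declaration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the first n bytes of buf contain a null character,
   i.e. if buf holds a string in the C sense; 0 otherwise. */
int holds_string(const char *buf, size_t n)
{
    assert(buf != NULL);
    return memchr(buf, '\0', n) != NULL;
}

/* Size in bytes of the anonymous array behind the literal "foo":
   three letters plus the terminating null character. */
size_t literal_size(void)
{
    return sizeof "foo";
}

/* Recommended declaration: with const, a later write such as
   greeting[0] = 'H' is diagnosed by the compiler. */
const char *greeting = "hello";
```

With these definitions, `literal_size()` yields 4, and `holds_string` returns 0 for a three-element char array holding only 'f', 'o', 'o' (an array without a string as its contents).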

Sections 6 and 8 of the comp.lang.c FAQ, <https://www.c-faq.com/>,
cover "Arrays and Pointers" and "Characters and Strings",
respectively.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
Janis Papanagnou
2025-02-10 01:56:29 UTC
Post by Michael S
On Sun, 9 Feb 2025 18:18:04 +0100
There's no such thing as a "string object" in C. [...]
Since I didn't know what Michael was suspecting as a potential
problem I wanted to cover two aspects; assignation s = "literal"
and s = strcpy() . (Nothing more was intended to express by my
reply.)
[ "C" and C++ considerations of strings character arrays, pointers ]
(In C++ I'm not using "C" strings but string objects, plus literal
strings of course.)

Janis
Waldek Hebisch
2025-02-09 14:29:00 UTC
Post by Janis Papanagnou
And in "C" (C99) I tried it *without* the _inner block_
main ()
{
int n = 5;
char * arr[n];
...
}
and it seemed to work that way. (In those other languages that wasn't
possible.)
Hmm, IIRC both (Extended) Pascal and PL/I allow VLA within function
in places where they allow variable declarations, so really no
special _inner block_ requirement (just must be local to a function).
Post by Janis Papanagnou
Post by Andrey Tarasevich
Post by Janis Papanagnou
Q1: Is this a correct (portable) form?
VLA objects have to be declared locally. However, keep in mind that
support for local declarations of VLA _objects_ is now optional (i.e.
not portable). Support for variably-modified _types_ themselves (VLA
types) is mandatory. But you are not guaranteed to be able to declare an
actual VLA variable.
I fear I don't understand what you're saying here. - By "now" do you
mean newer versions of the C standards? That you can rely only, say,
rely on it with C99 but maybe not before and not in later C standards
conforming compilers?
For my purpose it would be okay to know whether with the C99 version
(that I used) it's okay, or whether that's some GNU specific extension
or some such.
This used to be a GCC extension, AFAIK present in all versions of
GCC. It was standardized in C99 and is now available in reasonable
C compilers like current tcc (Tiny C). But AFAIK it is not present
in Microsoft C. And it is probably not present in historic C compilers
like compilers for old proprietary Unices or historic tcc (TenDRA
C compiler). As already mentioned, a compiler can claim C11 or C23
compliance but not implement VLAs (VMTs are optional in C11
but mandatory in C23; VLAs are optional in both C11 and C23).
--
Waldek Hebisch
Janis Papanagnou
2025-02-09 17:19:05 UTC
Post by Waldek Hebisch
Post by Janis Papanagnou
And in "C" (C99) I tried it *without* the _inner block_
main ()
{
int n = 5;
char * arr[n];
...
}
and it seemed to work that way. (In those other languages that wasn't
possible.)
Hmm, IIRC both (Extended) Pascal and PL/I allow VLA within function
in places where they allow variable declarations, so really no
special _inner block_ requirement (just must be local to a function).
I had other languages in mind.[*]

Janis

[*] For example Simula:

begin
integer n;
n := inint;
begin
integer array arr (1:n);
arr(5) := 9;
end
end

(Which is also understandable, since in Simula declarations must appear
before statements in any block.)
Andrey Tarasevich
2025-02-09 17:29:12 UTC
Post by Janis Papanagnou
I had other languages in mind.[*]
Janis
begin
integer n;
n := inint;
begin
integer array arr (1:n);
arr(5) := 9;
end
end
(Which is also understandable, since in Simula declarations must appear
before statements in any block.)
Well, in that case your previous mentions of "block context" are not
related to VLAs at all. You apparently meant that in order to introduce
a new declaration, any new declaration "in the middle of the code" one
needs to open a new block - just because the language requires all
declarations to reside at the beginning of a block.

This was the case in C90, where we also sometimes had to open new blocks
to introduce new declarations. But starting from C99 this is no longer
necessary. In C99 one can simply place declarations in the middle of the
code "C++-style" (the underlying semantics is still different from C++,
but syntactically/superficially it looks pretty much the same).
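
For illustration only, here is a small sketch of the difference described above; the function names `scaled_c90` and `scaled_c99` are hypothetical. Both functions compute the same value, and only the placement of the declaration of `offset` differs.

```c
#include <assert.h>

/* C90 style: a declaration that follows a statement needs a new block. */
int scaled_c90(int x)
{
    int result;
    x = x * 2;
    {
        int offset = 3;     /* block opened just to declare offset */
        result = x + offset;
    }
    return result;
}

/* C99 style: the declaration can simply follow the statement. */
int scaled_c99(int x)
{
    x = x * 2;
    int offset = 3;
    return x + offset;
}
```

The first form is accepted in both C90 and C99 mode; the second needs C99 or later.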
--
Best regards,
Andrey
Janis Papanagnou
2025-02-09 17:46:44 UTC
Post by Andrey Tarasevich
Post by Janis Papanagnou
I had other languages in mind.[*]
Janis
begin
integer n;
n := inint;
begin
integer array arr (1:n);
arr(5) := 9;
end
end
(Which is also understandable, since in Simula declarations must appear
before statements in any block.)
Well, in that case your previous mentions of "block context" are not
related to VLAs at all. You apparently meant that in order to introduce
a new declaration, any new declaration "in the middle of the code" one
needs to open a new block - just because the language requires all
declarations to reside at the beginning of a block.
Yes, but that (declaration-statement-ordering) is language dependent;
in Simula it's [for this reason] a necessity, but not necessarily in
other languages. - Whether any language allows that or not is not a
given per se. - I just did not know how "C" handles VLA declarations;
and that's why I tried it whether it works or not. But a positive
outcome of my tries is still no proof; my test-wise - deliberately
wrong! - assignment to 'a[99]' produced also no compiler complaints,
so I want to get an affirmation whether you need an extra scope.
The sample from the Net that had the array size parameter passed as
function argument also didn't answer that.
Post by Andrey Tarasevich
This was the case in C90, where we also sometimes had to open new blocks
to introduce new declarations. But starting from C99 this is no longer
necessary. In C99 one can simply place declarations in the middle of the
code "C++-style" (the underlying semantics is still different from C++,
but syntactically/superficially it looks pretty much the same).
Janis
Janis Papanagnou
2025-02-09 17:53:33 UTC
[...] my test-wise - deliberately
wrong! - assignment to 'a[99]' produced also no compiler complaints,
BTW, it produced also no core dump or any other runtime error.
(But it is obviously severely wrong code anyway.)

Janis
James Kuyper
2025-02-09 20:55:41 UTC
Post by Janis Papanagnou
[...] my test-wise - deliberately
wrong! - assignment to 'a[99]' produced also no compiler complaints,
BTW, it produced also no core dump or any other runtime error.
(But it is obviously severely wrong code anyway.)
Accessing an array beyond its bounds has undefined behavior, whether or
not it is a VLA. "undefined behavior" means that the C standard imposes
no requirements on the behavior. In particular, it does not require a
diagnostic message, nor a core dump, nor any other kind of runtime error.
The standard says "undefined behavior" when there are some situations
where it can be arbitrarily difficult to identify violations of a rule
at compile time. That applies to this rule, because if pointers are
used, it can be quite difficult to confirm whether or not a given access
will violate this rule.
In this particular case, however, it would be trivial to detect the
violation, but none of the compilers I've tested do so.
When such code is accepted by an implementation, what is most likely to
happen is that the compiler will generate code that creates an array
containing "foobar", and which attempts to write a pointer to the first
element of that array to the location where a[99] should be, if a had at
least 100 elements. Depending upon what that piece of memory is being
used for, the results could be catastrophic, or completely innocuous, or
somewhere in-between.
Waldek Hebisch
2025-02-10 01:36:01 UTC
Post by James Kuyper
Post by Janis Papanagnou
[...] my test-wise - deliberately
wrong! - assignment to 'a[99]' produced also no compiler complaints,
BTW, it produced also no core dump or any other runtime error.
(But it is obviously severely wrong code anyway.)
Accessing an array beyond its bounds has undefined behavior, whether or
not it is a VLA. "undefined behavior" means that the C standard imposes
no requirements on the behavior. In particular, it does not require a
diagnostic message, nor a core dump, nor any other kind of runtime error.
The standard says "undefined behavior" when there are some situations
where it can be arbitrarily difficult to identify violations of a rule
at compile time. That applies to this rule, because if pointers are
used, it can be quite difficult to confirm whether or not a given access
will violate this rule.
In this particular case, however, it would be trivial to detect the
violation, but none of the compilers I've tested do so.
When such code is accepted by an implementation, what is most likely to
happen is that the compiler will generate code that creates an array
containing "foobar", and which attempts to write a pointer to the first
element of that array to the location where a[99] should be, if a had at
least 100 elements. Depending upon what that piece of memory is being
used for, the results could be catastrophic, or completely innocuous, or
somewhere in-between.
Both gcc and tcc are supposed to allocate 'a' on the stack, below
normal variables. 99 is large enough that a write to a[99] probably
does no harm to variables needed to finish the program, so there is
no observable effect.

I slightly modified the program to:

int
main(void) {
    int n = 5;
    char * arr[n];
    for(int i = 0; i < 99; i++) {
        arr[i] = 0;
    }
    return 0;
}

After that tcc produces an apparently infinite loop. The program compiled
using 'gcc arr2.c' (I stored the program in a file called 'arr2.c')
crashes. Using 'gcc -O arr2.c' apparently optimizes the stores to
nothing (arr is local and unused otherwise, so the stores do not have
any defined observable effect). Using 'tcc -b arr2.c' activates
bounds checking in tcc. Due to this the program exits with a sensible
error message:

$ ./a.out
arr2.c:6: at main: BCHECK: 0x7ffc41829d08 is outside of the region
arr2.c:6: at main: RUNTIME ERROR: invalid memory access
--
Waldek Hebisch
Janis Papanagnou
2025-02-10 02:01:41 UTC
Post by James Kuyper
Post by Janis Papanagnou
[...] my test-wise - deliberately
wrong! - assignment to 'a[99]' produced also no compiler complaints,
BTW, it produced also no core dump or any other runtime error.
(But it is obviously severely wrong code anyway.)
Accessing an array beyond its bounds has undefined behavior, whether or
not it is a VLA. "undefined behavior" means that the C standard imposes
no requirements on the behavior. In particular, it does not require a
diagnostic message, nor a core dump, nor any other kind of runtime error.
Yes. (There's reasons why I prefer programming languages with checks.)
Post by James Kuyper
[...]
In this particular case, however, it would be trivial to detect the
violation, but none of the compilers I've tested do so.
Yes, because it's a special case with a constant defined. But the
value may come from "outside", undetectable by the compiler and only
detectable during runtime if there's additional information maintained
(stored and checked).
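
As a rough sketch of what "additional information maintained (stored and checked)" could look like in C, the following keeps the element count next to the array and checks every store by hand. The type `struct checked_array` and the function `checked_set` are hypothetical names invented for this example, not part of any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper type: an array of string pointers that carries
   its own element count, so that every store can be checked. */
struct checked_array {
    const char **items;
    size_t       count;
};

/* Stores value at index only if index is in range.
   Returns true on success, false if the index is out of bounds. */
bool checked_set(struct checked_array *a, size_t index, const char *value)
{
    assert(a != NULL && a->items != NULL);
    if (index >= a->count)
        return false;
    a->items[index] = value;
    return true;
}
```

With a five-element array, a store through `checked_set` at index 99 is refused and returns false instead of writing outside the array.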
Post by James Kuyper
When such code is accepted by an implementation, what is most likely to
happen is that the compiler will generate code that creates an array
containing "foobar", and which attempts to write a pointer to the first
element of that array to the location where a[99] should be, if a had at
least 100 elements. Depending upon what that piece of memory is being
used for, the results could be catastrophic, or completely innocuous, or
somewhere in-between.
Yes. That matches exactly my fears or expectations here [with "C"].

Janis
David Brown
2025-02-10 10:39:40 UTC
Post by Janis Papanagnou
Post by James Kuyper
Post by Janis Papanagnou
[...] my test-wise - deliberately
wrong! - assignment to 'a[99]' produced also no compiler complaints,
BTW, it produced also no core dump or any other runtime error.
(But it is obviously severely wrong code anyway.)
Accessing an array beyond its bounds has undefined behavior, whether or
not it is a VLA. "undefined behavior" means that the C standard imposes
no requirements on the behavior. In particular, it does not require a
diagnostic message, nor a core dump, nor any other kind of runtime error.
Yes. (There's reasons why I prefer programming languages with checks.)
Post by James Kuyper
[...]
In this particular case, however, it would be trivial to detect the
violation, but none of the compilers I've tested do so.
Yes, because it's a special case with a constant defined. But the
value may come from "outside", undetectable by the compiler and only
detectable during runtime if there's additional information maintained
(stored and checked).
Post by James Kuyper
When such code is accepted by an implementation, what is most likely to
happen is that the compiler will generate code that creates an array
containing "foobar", and which attempts to write a pointer to the first
element of that array to the location where a[99] should be, if a had at
least 100 elements. Depending upon what that piece of memory is being
used for, the results could be catastrophic, or completely innocuous, or
somewhere in-between.
Yes. That matches exactly my fears or expectations here [with "C"].
Janis
Let's assume your full code is :

int main() {
    int n = 5;
    char * arr[n];

    arr[99] = "foobar";
}


In C, that means exactly the same as a do-nothing program:

int main() { }

There is no observable behaviour. So an optimising compiler (like gcc,
if you have enabled optimisation) will give a program that simply returns 0.

If optimisation is not enabled, or a weak compiler is used, the array
will be put on the stack and your assignment will write to a part of the
stack that exists in memory. It /might/ stomp over something important,
but there's a fair chance that - by luck - nothing bad happens.


The C language does not have much in the way of checks other than for
syntax, grammar and constraints. But C /implementations/ do far better,
as long as you enable them.

When you are checking code like this, I recommend you get in the habit
of using <https://godbolt.org> for convenience, as it makes it easy to
check snippets with the newest compiler versions. And use command-line
flags such as :

-std=c99 -Wpedantic -Wall -Wextra -O2

The "-Wpedantic" makes sure that all standards-required diagnostics are
produced (to the best of the compiler's abilities). "-Wall" should
always be used as a basis to give useful warnings. "-Wextra" is more
debatable - it warns on things that some people feel are good coding
practice and other people feel are poor coding practice. But I think it is
very useful for this kind of thing, even if you don't use it in your
main development. And it is a good idea to enable optimisation - it
makes warnings better, and it gives you a better idea of what your code
/really/ means.

So with this example code, you'd see the virtually empty generated
assembly, and a warning that "arr" is set but never used. (I am
disappointed that neither gcc nor clang can spot the clearly
out-of-bounds access.)

Sometimes gcc will give the best static error checking, sometimes clang
- use both tools.


The next step is to use run-time checkers. Compile the code with the
flag "-fsanitize=undefined" and you will get a run-time check and
run-time error on the array bounds violation. (You can do this from
within godbolt.org).


C (and C++) have a vast range of checks available to developers - you
just have to know how to use them.
Janis Papanagnou
2025-02-10 17:57:22 UTC
Post by David Brown
[...]
int main() {
int n = 5;
char * arr[n];
arr[99] = "foobar";
}
This assumption is incorrect (for my case), so all derived possible
implications like
Post by David Brown
int main() { }
[...]
are meaningless.

Janis
David Brown
2025-02-10 18:19:19 UTC
Post by Janis Papanagnou
Post by David Brown
[...]
int main() {
int n = 5;
char * arr[n];
arr[99] = "foobar";
}
This assumption is incorrect (for my case), so all derived possible
implications like
Post by David Brown
int main() { }
[...]
are meaningless.
The details will of course be different because your code is different
(you didn't show it, so all anyone can do is guess). But the type of
effects I described, and the way to get good information about your code
and how it works, is valid for a wide range of code. The fact that you
think C does not have checks, and that you dislike that your erroneous
code compiles and runs without you being informed of the errors, shows
that you could benefit from the kind of suggestions I gave about good
tool usage.

Understanding how to get the best from your tools is as important a part
of software development as understanding the details of the language you
are using. You'd learn more about both if you didn't skip and snip
useful help.
Michael S
2025-02-09 18:19:02 UTC
On Sun, 9 Feb 2025 18:46:44 +0100
Post by Janis Papanagnou
wrong! - assignment to 'a[99]' produced also no compiler complaints,
gcc produces a warning in this case, but only at optimization level 2
or higher.
clang does not warn at all, which is disappointing.

OTOH, for non-VLA clang (and MSVC) warn at any optimization level. gcc
behaves the same as with VLA, which is disappointing.
Michael S
2025-02-10 10:38:02 UTC
On Mon, 10 Feb 2025 02:44:26 +0100
Post by Michael S
On Sun, 9 Feb 2025 18:46:44 +0100
Post by Janis Papanagnou
wrong! - assignment to 'a[99]' produced also no compiler
complaints,
gcc produces warning in this case, but only at optimization level
of 2 or higher.
Which version of gcc?
14.1

My test code is:

void bar(char*[]);
int foo(void)
{
    int n = 10;
    char *a[n];
    bar(a);
    a[99] = "42";
    return a[3][2];
}
Tried with gcc 14.2 (x86-64) with -Wall -O3 (or -O2, same), it
doesn't give any warning whatsoever. (And yes, same with clang.)
Maybe in your test arr[] is not used later, so the compiler silently
optimizes away all accesses?
error: Array 'arr[5]' accessed at index 99, which is out of bounds.
[arrayIndexOutOfBounds]
arr[99] = "foobar";
I highly recommend using Cppcheck as a static analyzer (at the bare
minimum, there are better out there). Compilers are pretty basic in
terms of static analysis.
Opus
2025-02-10 22:08:35 UTC
Post by Michael S
On Mon, 10 Feb 2025 02:44:26 +0100
Post by Michael S
On Sun, 9 Feb 2025 18:46:44 +0100
Post by Janis Papanagnou
wrong! - assignment to 'a[99]' produced also no compiler
complaints,
gcc produces warning in this case, but only at optimization level
of 2 or higher.
Which version of gcc?
14.1
void bar(char*[]);
int foo(void)
{
int n = 10;
char *a[n];
bar(a);
a[99] = "42";
return a[3][2];
}
Tried with gcc 14.2 (x86-64) with -Wall -O3 (or -O2, same), it
doesn't give any warning whatsoever. (And yes, same with clang.)
Maybe in your test arr[] is not used later, so the compiler silently
optimizes away all accesses?
You're right. I was actually using arr[] to avoid this pitfall, but the
way I used it was not 'reading' arr[99], and so the compiler did what
you say.

Note that with your example, if you comment out the bar() call, you
won't get the warning (but you'll get another one about a[3][2] being
used uninitialized). And as you did, declaring an external function
bar() guarantees that the compiler can't guess what it does, so that
prevents any further optimization on the array access.

The underlying "issue" is that gcc analyzes code after the optimization
pass (or it does remove warnings that it detected before optimization on
'dead code'). This may be defended. There are quite a few tickets on
their bugzilla about this though, because it tends to surprise many
people, and while the example above is relatively simple, there are tons
of potential cases where it's way less trivial to understand.

I would personally favor giving a warning even if the code is optimized
away during optimization, and actually mentioning that the statement has
no effect (after optimization). I'm not sure the GCC team either cares,
or that it's even doable considering the architecture of the compiler.
Just a thought.

As I said, don't hesitate to use a third-party static analyzer to
complement compiler warnings.
Keith Thompson
2025-02-09 23:50:26 UTC
Janis Papanagnou <janis_papanagnou+***@hotmail.com> writes:
[...]
Post by Janis Papanagnou
I had other languages in mind.[*]
Janis
begin
integer n;
n := inint;
begin
integer array arr (1:n);
arr(5) := 9;
end
end
(Which is also understandable, since in Simula declarations must appear
before statements in any block.)
Apparently Simula doesn't allow variables to be initialized in their
declarations. Because of that, `n := inint;` (where inint is actually a
call to a value-returning procedure) is a separate statement. If
Simula supported C-style initializers, then presumably the inner block
would not be needed:

begin
integer n := inint;
integer array arr (1:n);
arr(5) := 9;
end
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
Janis Papanagnou
2025-02-10 02:12:18 UTC
Post by Keith Thompson
[...]
Post by Janis Papanagnou
I had other languages in mind.[*]
Janis
begin
integer n;
n := inint;
begin
integer array arr (1:n);
arr(5) := 9;
end
end
(Which is also understandable, since in Simula declarations must appear
before statements in any block.)
Apparently Simula doesn't allow variables to be initialized in their
declarations.
The Cim compiler I use actually supports a constant initialization -
but I'm positive this is non-standard! - like

begin
integer n = 10;
integer array arr (1:n);
arr(5) := 9;
end
Post by Keith Thompson
Because of that, `n := inint;` (where inint is actually a
call to a value-returning procedure) is a separate statement. If
Simula supported C-style initializers, then presumably the inner block
Indeed. (See non-standard example above.)
Post by Keith Thompson
begin
integer n := inint;
integer array arr (1:n);
arr(5) := 9;
end
This is not possible because 'n' isn't a constant here (and 'inint' is
a function, as you say). - So you're correct that the block would not
be an inherent necessity for that declaration [with Cim compiler].
A general statement for the standard Simula language cannot be given,
though, since it's undefined.

Janis
Janis Papanagnou
2025-02-10 06:27:22 UTC
Post by Janis Papanagnou
Post by Keith Thompson
[...]
Post by Janis Papanagnou
I had other languages in mind.[*]
Janis
begin
integer n;
n := inint;
begin
integer array arr (1:n);
arr(5) := 9;
end
end
(Which is also understandable, since in Simula declarations must appear
before statements in any block.)
Apparently Simula doesn't allow variables to be initialized in their
declarations.
The Cim compiler I use actually supports a constant initialization -
but I'm positive this is non-standard! - like
begin
integer n = 10;
integer array arr (1:n);
arr(5) := 9;
end
I should have noted that this (non-standard) syntax doesn't really
address the problem (since you have just a named constant that you
may use in several places, but the array is *not* really dynamic,
its memory demands are fixed and known for that block).

I think the primary point for an explanation of the necessity of a
block structure is different. - It is typical that stack memory is
allocated when a block is "entered". For the stack allocation the
size of all local objects should be known. But in the first example
above the outer block has only knowledge about the size of the 'n'
integer variable, but without a concrete value for 'n' we don't
know how much space the dynamic array will need. But if we open a
new (subordinated) block the array size is determined if declared
there.

BTW, this is also true for Algol 60, which is effectively a subset
of Simula.
Post by Janis Papanagnou
Post by Keith Thompson
Because of that, `n := inint;` (where inint is actually a
call to a value-returning procedure) is a separate statement. If
Simula supported C-style initializers, then presumably the inner block
Indeed. (See non-standard example above.)
Post by Keith Thompson
begin
integer n := inint;
integer array arr (1:n);
arr(5) := 9;
end
This is not possible because 'n' isn't a constant here (and 'inint' is
a function, as you say). - So you're correct that the block would not
be an inherent necessity for that declaration [with Cim compiler].
So, given my above stack memory argumentation, I'd have to withdraw
my agreement; and not only because it's irrelevant (in several ways).
Post by Janis Papanagnou
A general statement for the standard Simula language cannot be given,
though, since it's undefined.
Janis
Keith Thompson
2025-02-10 06:49:10 UTC
[...]
Post by Janis Papanagnou
Post by Janis Papanagnou
The Cim compiler I use actually supports a constant initialization -
but I'm positive this is non-standard! - like
begin
integer n = 10;
integer array arr (1:n);
arr(5) := 9;
end
I should have noted that this (non-standard) syntax doesn't really
address the problem (since you have just a named constant that you
may use in several places, but the array is *not* really dynamic,
its memory demands are fixed and known for that block).
I think the primary point for an explanation of the necessity of a
block structure is different. - It is typical that stack memory is
allocated when a block is "entered". For the stack allocation the
size of all local objects should be known. But in the first example
above the outer block has only knowledge about the size of the 'n'
integer variable, but without a concrete value for 'n' we don't
know how much space the dynamic array will need. But if we open a
new (subordinated) block the array size is determined if declared
there.
In C, the lifetime of a non-VLA object with automatic storage duration
(defined within a block without "static") extends from when execution
reaches the opening "{" of the enclosing block to when it reaches the
closing "}". There are well-defined ways you can even access such an
object in code that appears textually before its definition (by saving
its address and using a goto). This means that a conforming
implementation *must* allocate memory for non-VLA automatic objects on
block entry; waiting until the definition is encountered would be
non-conforming.

But a VLA object's lifetime begins, not when execution reaches the
opening "{", but when it reaches the object definition.
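
One consequence of this rule can be sketched as follows, assuming an implementation that supports automatic VLAs: the size expression is evaluated once, when execution reaches the definition, and changing the variable afterwards does not resize the array. The function name `vla_count_after_change` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the element count of a VLA that was defined while n had
   the value initial_n, measured after n has been changed. */
size_t vla_count_after_change(int initial_n, int later_n)
{
    assert(initial_n > 0);
    int n = initial_n;
    char *arr[n];       /* size expression is evaluated here, once */
    n = later_n;        /* does not resize arr */
    (void)n;
    /* For a VLA, sizeof is computed at run time. */
    return sizeof arr / sizeof arr[0];
}
```

Calling `vla_count_after_change(5, 10)` therefore reports 5 elements, not 10.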

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
Janis Papanagnou
2025-02-10 18:14:13 UTC
Post by Keith Thompson
[ Simula / Cim stuff snipped ]
[...]
But a VLA object's lifetime begins, not when execution reaches the
opening "{", but when it reaches the object definition.
Yes, I assumed so [for "C"].

My explanations were meant only for the Algol 60 and Simula rationale.
(But mind, that the whole thing of mentioning these other languages
was anyway only a side-track after someone mentioned Pascal and PL/I.
It was to understand what semantics are possible in principle. None
of these languages have any relevance for how "C" handles VLAs.)

The solution was already formulated elsethread long ago. And all the
rest of this thread was the CLC-typical OT spin-offs. - So I'm fine
with the insights on the topic. And I'm (slightly amused) following
and participating in the rest of the discussions. :-)

Janis
Andrey Tarasevich
2025-02-09 15:15:54 UTC
Post by Janis Papanagnou
Post by Andrey Tarasevich
Post by Janis Papanagnou
I've found examples on the Net where the arrays have been defined in a
function context and the size passed as parameter
f(int n) {
char * arr[n];
...
}
Yes, that would be a VLA.
Post by Janis Papanagnou
That reminded me on other languages where you'd need at least a block
context for dynamically sized arrays, like
int n = 5;
{
char * arr[n];
...
}
But a function body is in itself a block. Inside a function body you are
already in "a block context".
Post by Janis Papanagnou
Anyway. I tried it without function or block context
int n = 5;
char * arr[n];
...
and it seemed to work seamlessly like that (with GNU cc, -std=C99).
You mean you did this at file scope? No, VLAs are illegal at file scope.
And I was unable to repeat this feat in GCC.
Oh, sorry, no; above I had just written an excerpt. - Actually I had
those two examples above within a main() function. - Sorry again for
my inaccuracy.
What I meant was (with surrounding context) that I knew (from _other_
languages) a syntax like
main ()
{
int n = 5;
{
char * arr[n];
...
}
}
And in "C" (C99) I tried it *without* the _inner block_
main ()
{
int n = 5;
char * arr[n];
...
}
and it seemed to work that way. (In those other languages that wasn't
possible.)
Post by Andrey Tarasevich
Post by Janis Papanagnou
Q1: Is this a correct (portable) form?
VLA objects have to be declared locally. However, keep in mind that
support for local declarations of VLA _objects_ is now optional (i.e.
not portable). Support for variably-modified _types_ themselves (VLA
types) is mandatory. But you are not guaranteed to be able to declare an
actual VLA variable.
I fear I don't understand what you're saying here. - By "now" do you
mean newer versions of the C standards?
Yes, I mean C23 specifically.
Post by Janis Papanagnou
That you can rely only, say,
rely on it with C99 but maybe not before and not in later C standards
conforming compilers?
Things take some wild swings wrt VLA support as you progress through
the various C standards. Before C99 there was no VLA. In C99
everything VLA is required. In C11 everything VLA is optional. C23 takes
a hybrid approach: the whole thing is required, except support for
declaring automatic VLA objects is optional.

The latter means that you can declare VLA typedefs or VLA parameters in
functions, apply `sizeof` to VLA types and expressions, but not
necessarily declare such arrays locally:

void foo(int n, int m, int a[n][m])
// VLA parameter (will be adjusted to pointer to VLA) - supported
{
    typedef char A[n];
    // VLA typedef - supported

    double (*p)[n + m] = malloc(sizeof *p);
    // Pointer to VLA and `sizeof` on VLA - supported

    A x;
    short y[n] = {};
    // Both are automatic VLAs - optional
}
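
As a hedged illustration of the "supported" part only, the sketch below uses a pointer to a variably-modified array type with heap storage, so no automatic VLA object is declared. The function name `fill_and_sum` is hypothetical; the code assumes a compiler that accepts variably-modified types (C99, C23, or C11 where `__STDC_NO_VLA__` is not defined).

```c
#include <assert.h>
#include <stdlib.h>

/* Fills a rows x cols matrix with r * cols + c and returns the sum of
   all elements, or -1 on allocation failure. The matrix lives on the
   heap; only the pointer type is variably modified. */
long fill_and_sum(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);

    long (*m)[cols] = malloc(rows * sizeof *m);  /* pointer to VLA type */
    if (m == NULL)
        return -1;

    long total = 0;
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            m[r][c] = (long)(r * cols + c);
            total += m[r][c];
        }
    }
    free(m);
    return total;
}
```

The two-dimensional indexing `m[r][c]` works because the compiler knows the row length `cols` from the pointer type, even though `cols` is only known at run time.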
Post by Janis Papanagnou
For my purpose it would be okay to know whether with the C99 version
(that I used) it's okay, or whether that's some GNU specific extension
or some such.
Formally, if you want to declare local (automatic) VLAs, then the only
version of C standard with which you are completely okay is C99. Later
versions make things more problematic.

However, GNU seems to be dedicated to supporting everything VLA-related.
--
Best regards,
Andrey
Janis Papanagnou
2025-02-09 17:20:04 UTC
Post by Andrey Tarasevich
Post by Janis Papanagnou
That you can rely only, say,
rely on it with C99 but maybe not before and not in later C standards
conforming compilers?
Things take some wild swings wrt VLA support as you progress through
the various C standards. Before C99 there was no VLA. In C99
everything VLA is required. In C11 everything VLA is optional. C23 takes
a hybrid approach: the whole thing is required, except support for
declaring automatic VLA objects is optional.
Interesting, and scary! - Thanks!
Post by Andrey Tarasevich
[...]
Post by Janis Papanagnou
For my purpose it would be okay to know whether with the C99 version
(that I used) it's okay, or whether that's some GNU specific extension
or some such.
Formally, if you want to declare local (automatic) VLAs, then the only
version of C standard with which you are completely okay is C99. Later
versions make things more problematic.
Hmm.. - okay. - Thanks!
Post by Andrey Tarasevich
However, GNU seems to be dedicated to supporting everything VLA-related.
Janis
James Kuyper
2025-02-09 20:34:43 UTC
On 2/9/25 10:15, Andrey Tarasevich wrote:
...
Post by Andrey Tarasevich
Formally, if you want to declare local (automatic) VLAs, then the only
version of C standard with which you are completely okay is C99. Later
versions make things more problematic.
However, even with later versions, such code is OK so long as
__STDC_NO_VLA__ is not pre#defined by the implementation with a value of 1.
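
To make that check concrete, here is a minimal sketch. The helper name
total_length() and its fallback strategy are assumptions made for
illustration only, not something posted in this thread. It uses an
automatic VLA where the implementation allows one and falls back to
malloc() where __STDC_NO_VLA__ is predefined as 1.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: fills an n-element scratch array with `word`
   and returns the sum of the string lengths stored in it.
   Returns 0 if the fallback allocation fails. */
size_t total_length(size_t n, const char *word)
{
    assert(n > 0 && word != NULL);  /* a VLA size must be greater than zero */

#if defined(__STDC_NO_VLA__) && __STDC_NO_VLA__ == 1
    /* Automatic VLAs are not available: allocate dynamically instead. */
    const char **arr = malloc(n * sizeof *arr);
    if (arr == NULL)
        return 0;
#else
    /* Automatic VLA: size taken from the run-time value of n. */
    const char *arr[n];
#endif

    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        arr[i] = word;
        total += strlen(arr[i]);
    }

#if defined(__STDC_NO_VLA__) && __STDC_NO_VLA__ == 1
    free(arr);
#endif
    return total;
}
```

A call such as total_length(3, "foobar") behaves the same on both
paths; only the storage for the scratch array differs.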
James Kuyper
2025-02-09 20:27:29 UTC
...
Post by Janis Papanagnou
Post by Janis Papanagnou
Anyway. I tried it without function or block context
int n = 5;
char * arr[n];
...
Post by Janis Papanagnou
What I meant was (with surrounding context) that I knew (from _other_
languages) a syntax like
main ()
{
    int n = 5;
    {
        char * arr[n];
        ...
    }
}
And in "C" (C99) I tried it *without* the _inner block_
main ()
{
    int n = 5;
    char * arr[n];
    ...
}
You need to get more familiar with the relevant terminology. In C,
identifiers have a scope within which they may be used, and three of the
relevant kinds of scopes are "file scope", "function scope", and "block
scope". That's sufficiently similar to your mention of "function or
block context", to make people think that "context" might be meant as an
informal equivalent of "scope". That's why people assumed you were
talking about a file scope declaration.

The body of a function is a compound statement, which constitutes a
block. There is another, larger block that includes "The parameter type
list, [and] the attribute specifier sequence of the declarator that
follows the parameter type list, ...". Each of those two blocks has a
separate scope. Identifiers declared inside either of those blocks have
block scope.
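
As a rough illustration of those scopes, consider the sketch below. The
function scope_demo() is a made-up example, not code from this thread,
and it assumes an implementation that supports automatic VLAs. The
comments mark which scope each identifier has; the label is the one
identifier here that does not have block scope.

```c
#include <assert.h>

/* Hypothetical function, written only to show the different scopes. */
int scope_demo(int n)               /* n: block scope (parameter list) */
{
    assert(n > 0);
    int total = 0;                  /* block scope: the function body */
    {
        int arr[n];                 /* block scope: inner block; automatic VLA */
        for (int i = 0; i < n; i++) {
            arr[i] = i + 1;
            if (arr[i] == 3)
                goto done;          /* label is usable before its definition */
            total += arr[i];
        }
    }
done:                               /* label: visible throughout the function */
    return total;
}
```

The inner block is not needed for the VLA declaration itself; declaring
arr directly in the function body would be equally valid.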
"function scope" applies only to label names.
James Kuyper
2025-02-09 20:16:54 UTC
...
Post by Andrey Tarasevich
Post by Janis Papanagnou
Anyway. I tried it without function or block context
int n = 5;
char * arr[n];
...
and it seemed to work seamlessly like that (with GNU cc, -std=C99).
You mean you did this at file scope? No, VLAs are illegal at file scope.
And I was unable to repeat this feat in GCC.
While that is correct, it's incomplete. The relevant constraint is
"If an identifier is declared to be an object with static or thread
storage duration, it shall not have a variable length array type."
(6.7.6.2p2).

All objects declared at file scope have static storage duration, but
that rule also applies to objects declared at block scope if
they are declared with the "static" or "thread_local" keywords.
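
A small sketch of what that constraint permits and forbids at block
scope follows. The function storage_demo() is hypothetical and assumes
an implementation with automatic VLA support; the declarations that
would violate the constraint are left in comments so that the example
still compiles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function: returns the combined element count of a
   VLA and a fixed-size static array. */
size_t storage_demo(int n)
{
    assert(n > 0);

    char *arr[n];                /* OK: automatic storage duration, VLA */
    /* static char *bad[n]; */   /* constraint violation: static VLA */
    /* A thread_local array of size n would be rejected the same way. */
    static char *fixed[5];       /* OK: constant size, so not a VLA */

    /* sizeof on the VLA is evaluated at run time. */
    return sizeof arr / sizeof arr[0] + sizeof fixed / sizeof fixed[0];
}
```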
Tim Rentsch
2025-02-15 16:19:10 UTC
Post by Andrey Tarasevich
Post by Janis Papanagnou
I've found examples on the Net where the arrays have been defined
in a function context and the size passed as parameter
f(int n) {
char * arr[n];
...
}
Yes, that would be a VLA.
Post by Janis Papanagnou
That reminded me on other languages where you'd need at least a
block context for dynamically sized arrays, like
int n = 5;
{
char * arr[n];
...
}
But a function body is in itself a block. Inside a function body
you are already in "a block context".
Post by Janis Papanagnou
Anyway. I tried it without function or block context
int n = 5;
char * arr[n];
...
and it seemed to work seamlessly like that (with GNU cc,
-std=C99).
You mean you did this at file scope? No, VLAs are illegal at file
scope. And I was unable to repeat this feat in GCC.
Post by Janis Papanagnou
Q1: Is this a correct (portable) form?
VLA objects have to be declared locally. However, keep in mind
that support for local declarations of VLA _objects_ is now
optional (i.e. not portable). Support for variably-modified
_types_ themselves (VLA types) is mandatory. But you are not
guaranteed to be able to declare an actual VLA variable.
In regular English usage, there is no hyphen between "variably"
and "modified" in "variably modified types". The reason for this
rule is "variably" is an adverb, and thus can pertain only to the
adjective "modified", and not to the noun "types". A hyphen is
used only when needed to clarify ambiguity; because there is no
ambiguity, there is no hyphen.
