Discussion:
What I've learned in comp.lang.c
bart
2024-02-05 01:09:10 UTC
In no particular order.

* Software development can ONLY be done on a Unix-related OS

* It is impossible to develop any software, let alone C, on pure Windows

* You can spend decades developing and implementing systems languages at
the level of C, but you still apparently know nothing of the subject

* You can spend a decade developing whole-program compilers and a
suitably matched language, and somebody can still lecture you on exactly
what a whole-language compiler is, because you've got it wrong

* No matter how crazy the interface or behaviour of some Linux utility,
no one is ever going to admit there's anything wrong with it

* Every single tool I've written, is a toy.

* Every single project I've worked on, is a toy (even if it made my
company millions)

* No one should post or link code here, unless it passes '-std=c99
-pedantic-errors'

* Discussing build systems for C, is off-topic

* Discussing my C compiler, is off-topic, but discussing gcc is fine

* Nobody here apparently knows how to build a program consisting purely
of C source files, using only a C compiler.

* Simply enumerating the N files and submitting them to the compiler in
any of several easy methods seems to be out of the question. Nobody has
explained why.

* Nearly everyone here is working on massively huge and complex
projects, which all take from minutes to hours for a full build.

* Hardly anybody here has a project which can be built simply by
compiling and linking all the modules. Even Tim Rentsch's simplest
project has a dizzying set of special requirements.

* Funnily enough, every project I /have/ managed to build with my
compilers after eventually getting through the complexity, /has/ reduced
down to a simple list of .c files.

* The Tiny C compiler, is a toy. Even though you'd have trouble telling,
from the behaviour of a binary, whether or not it was built with tcc.

* Actually, any C compiler that is not gcc, clang, or possibly MSVC, is
a toy. Unless you have to buy it.

* There is nothing wrong with AT&T assembly syntax

* There's especially nothing wrong with AT&T syntax written as a series
of string literals, with extra % symbols, together with \n and \t escapes.

* There is not a single feature of my alternate systems language that is
superior to the C equivalent

* There is not even a single feature that is worth discussing as a
possible feature of C

* There is nothing in my decades of implementing such languages (even
implementing C), that makes my views on such possible features have any
weight at all

* Having fast compilation speed of C is of no use to anyone and
impresses nobody.

* Having code where you naughtily cast a function pointer to or from an
object pointer is a no-no. No matter that the whole of C is widely
regarded as unsafe.

* Nobody here is interested in a simple build system for C. Not even my
idea of a README simply listing the files needed, and any special steps,
to accompany the usual makefiles.

* There is no benefit at all in having a tool like a compiler, be a
small, self-contained executable.

* Generated C code is not real C code.

* I should use makefiles myself for my own language, even though the
build process is always one simple, indivisible command that usually
completes in 1/10th of a second.

* Makefiles should be for everything.

* There's no problem in having to specify those pesky .c extensions to
compiler input files, or adding that -o option

* But it's too much work to specify a filename to 'make', or to even
remember what your project is called

* Linux /does/ use .c and .s extensions to distinguish between file contents

* But Linux also uses a.out to mean both an executable and an object
file. Huh.

* C added a 'text' mode to convert \n to/from CRLF when Windows came
along.

* Somebody who's only developed under Unix, and using a plethora of
ready-made tools and utilities, is not in a bubble.

* But somebody who's developed under a range of other environments
spanning eras, is the one who's been in their own bubble.

* I was crazy to write '1M' lines of code (I've no idea how much) in my
private language

* I am apparently ignorant, a moron and might even be a BOT.

* I am allowed to have strong opinions, but I will always be wrong.



Shall I post this pile of crap or not?

I really need to get back to some of those pointless, worthless toy
projects of mine.

So here goes....
Kaz Kylheku
2024-02-05 05:58:55 UTC
Post by bart
In no particular order.
* Software development can ONLY be done on a Unix-related OS
* It is impossible to develop any software, let alone C, on pure Windows
I've developed on DOS, Windows as well as for DSP chips and some
microcontrollers. I find most of the crap that you say is simply wrong.

Speaking of Windows, the CL.EXE compiler does not know where its
include files are. You literally cannot do "cl program.c".
You have to give it options which tell it where the SDK is installed:
where the headers and libraries are.

The Visual Studio project-file-driven build system passes all
those details to every invocation of CL.EXE. Your project file (called
a "solution" nowadays) includes information like the path where your SDK
is installed. In the GUI there is some panel where you specify it.

If I'm going to be doing programming on Windows today, it's either going
to be some version of that CL.EXE compiler from Microsoft, or GCC.
Post by bart
* You can spend decades developing and implementing systems languages at
the level of C, but you still apparently know nothing of the subject
There is forty years of experience, and then there is eight years of
experience five times over.
Post by bart
* You can spend a decade developing whole-program compilers and a
suitably matched language, and somebody can still lecture you on exactly
what a whole-language compiler is, because you've got it wrong
Writing a compiler is pretty easy, because the bar can be set very low
while still calling it a compiler.

Whole-program compilers are easier because there are fewer requirements.
You have only one kind of deliverable to produce: the executable.
You don't have to deal with linkage and produce a linkable format.
Post by bart
* No matter how crazy the interface or behaviour of some Linux utility,
no one is ever going to admit there's anything wrong with it
That is false; the stuff has a lot of critics, mostly from the inside
now. (Linux outsiders are mostly a lunatic fringe nowadays. The tables
have turned.)

You don't seem to understand that the interfaces of tools that are not
directly invoked by people don't matter, as long as they are reliable.

And then, interfaces that are exposed to users are hard to change, even
if we don't like them, because changes break things. Everyone hates
breaking changes more than they hate the particular syntax of a tool.

The environment is infinitely customizable. Users have their private
environments which work the way they want. At the command line,
you can use aliases and shell functions to give yourself the ideal
commands you want.

You only have to use the standard commands when writing scripts to be
used by others. And even then, you can include functions which work
the way you want, and then use your functions.
Post by bart
* Discussing my C compiler, is off-topic, but discussing gcc is fine
GCC is maintained by people who know what a C compiler is, and GCC can
be asked to be one.

You've chosen not to read the C standard, which leaves you unqualified
to even write test cases to validate that something is a C compiler.

Your idea of writing a C compiler seems to be to pick some random
examples of code believed to be C and make them work. (Where "work"
means that they compile and show a few behaviors that look like
the expected ones.)

Basically, you don't present a very credible case that you've actually
written a C compiler.
Post by bart
* Nobody here apparently knows how to build a program consisting purely
of C source files, using only a C compiler.
* Simply enumerating the N files and submitting them to the compiler in
any of several easy methods seems to be out of the question. Nobody has
explained why.
* Nearly everyone here is working on massively huge and complex
projects, which all take from minutes to hours for a full build.
That's the landscape. Nobody is going to pay you for writing small
utilities in C. That sort of thing all went to scripting languages.
(It happens from time to time as a side task.)

I currently work on a firmware application that compiles to a 100
megabyte (stripped!) executable.
Post by bart
* There is not a single feature of my alternate systems language that is
superior to the C equivalent
The worst curve ball someone could throw you would be to
be eagerly interested in your language, and ask for guidance
in how to get it installed and start working in it.

Then you're screwed.

As long as you just post to comp.lang.c, you're safe from that.
Post by bart
* Having fast compilation speed of C is of no use to anyone and
impresses nobody.
Not as much as fast executable code, unfortunately.

If it takes 10 extra seconds of compilation to shave 100 milliseconds
off a program, it's worth it if millions of copies of that program are
used.

Most of GCC's run time is spent in optimizing. It's a lot faster
with -O0.

I just measured a 3.38X difference compiling a project with -O0 versus
its usual -O2. This means it's spending over 70% of its time on
optimizing (1 - 1/3.38 ≈ 70.4%).

The remaining 30% is still kind of slow.

But it's not due to scanning lots of header files.

If I run it with the "-fsyntax-only" option so that it parses all
the syntax, but doesn't produce output, it gets almost 4X faster
(versus -O0, and thus about 13.5X faster compared to -O2).

Mode: | -fsyntax-only | -O0 | -O2  |
Time: |      1.0      | 4.0 | 13.5 |

Thus, about 7.5% is spent on scanning, preprocessing and parsing,
22.2% on the intermediate code processing and target generation
activities, and 70.4% on optimization.

Is it due to decades of legacy code in GCC? Clang is a newer
implementation, so you might think it's faster than GCC. But it
manages only to be about the same.

Compilers that blaze through large amounts of code in the blink of an
eye are almost certainly dodging on the optimization. And because they
don't need the internal /architecture/ to support the kinds of
optimizations they are not doing, they can speed up the code generation
also. There is no need to generate an intermediate representation like
SSA; you can pretty much just parse the syntax and emit assembly code in
the same pass. Particularly if you only target one architecture.

A poorly optimizing retargetable compiler that emits an abstract
intermediate code will never be as blazingly fast as something equally
poorly optimizing that goes straight to code in one pass.
Post by bart
* Having code where you naughtily cast a function pointer to or from an
object pointer is a no-no.
Nobody said that, but it was pointed out that this isn't a feature of
the ISO C standard dialect. It's actually a common extension, widely
exploited by programs. There is nothing wrong with using it, but people
who know C understand that it's not "maximally portable". Most code
does not have to be anywhere near "maximally portable".
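
A minimal sketch of that extension in action, in the spirit of what
POSIX dlsym() forces on programs (the round trip through void * is not
guaranteed by ISO C, but works on mainstream platforms):

    #include <stdio.h>

    static int add1(int x) { return x + 1; }

    int main(void)
    {
        void *p = (void *)add1;            /* extension: not ISO C */
        int (*fp)(int) = (int (*)(int))p;  /* cast back before calling */
        printf("%d\n", fp(41));            /* 42 on typical platforms */
        return 0;
    }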
Post by bart
* There is no benefit at all in having a tool like a compiler, be a
small, self-contained executable.
Not as much as there used to be, decades ago.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
Chris M. Thomasson
2024-02-05 06:49:50 UTC
Post by Kaz Kylheku
Post by bart
In no particular order.
* Software development can ONLY be done on a Unix-related OS
* It is impossible to develop any software, let alone C, on pure Windows
I've developed on DOS,
TSR's?
Post by Kaz Kylheku
Windows as well as for DSP chips and some
microcontrollers. I find most of the crap that you say is simply wrong.
[...]
Kaz Kylheku
2024-02-05 07:03:20 UTC
Post by Chris M. Thomasson
Post by Kaz Kylheku
Post by bart
In no particular order.
* Software development can ONLY be done on a Unix-related OS
* It is impossible to develop any software, let alone C, on pure Windows
I've developed on DOS,
TSR's?
I did make a couple of TSRs back in the day, but only as a hobby.

Not in C.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
Chris M. Thomasson
2024-02-05 07:51:16 UTC
Post by Kaz Kylheku
Post by Chris M. Thomasson
Post by Kaz Kylheku
Post by bart
In no particular order.
* Software development can ONLY be done on a Unix-related OS
* It is impossible to develop any software, let alone C, on pure Windows
I've developed on DOS,
TSR's?
I did make a couple of TSRs back in the day, but only as a hobby.
Not in C.
Nice. I only messed around with them a couple of times. There was a cool
one called key correspondence (KEYCOR), iirc. It was
programmable, and could be used with any program. I used it for a
reporting system and to control WordPerfect 5.1. I still have it! lol.
For legacy purposes.
Chris M. Thomasson
2024-02-05 07:52:02 UTC
Post by Chris M. Thomasson
Post by Kaz Kylheku
Post by Chris M. Thomasson
Post by Kaz Kylheku
Post by bart
In no particular order.
* Software development can ONLY be done on a Unix-related OS
* It is impossible to develop any software, let alone C, on pure Windows
I've developed on DOS,
TSR's?
I did make a couple of TSRs back in the day, but only as a hobby.
Not in C.
Nice. I only messed around with them a couple of times. There was a cool
one called key correspondence (KEYCOR), iirc. It was
programmable, and could be used with any program. I used it for a
reporting system and to control WordPerfect 5.1. I still have it! lol.
For legacy purposes.
Writing WordPerfect 5.1 macros was a fun time.... ;^o
Jan van den Broek
2024-02-05 08:36:47 UTC
[Schnipp]
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Nice. I only messed around with them a couple of times. There was a cool
one called key correspondence (KEYCOR), iirc. It was
programmable, and could be used with any program. I used it for a
reporting system and to control WordPerfect 5.1. I still have it! lol.
For legacy purposes.
Writing WordPerfect 5.1 macros was a fun time.... ;^o
Writing my own macro-compiler was also fun.
--
Jan v/d Broek
***@dds.nl
Look out, here he comes again
The kid with the replaceable head
Michael S
2024-02-05 16:23:47 UTC
On Mon, 5 Feb 2024 05:58:55 -0000 (UTC)
Post by Kaz Kylheku
Post by bart
In no particular order.
* Software development can ONLY be done on a Unix-related OS
* It is impossible to develop any software, let alone C, on pure Windows
I've developed on DOS, Windows as well as for DSP chips and some
microcontrollers. I find most of the crap that you say is simply wrong.
Speaking of Windows, the CL.EXE compiler does not know where its
include files are. You literally cannot do "cl program.c".
You have to give it options which tell it where the SDK is installed:
where the headers and libraries are.
It depends on definitions.
cl.exe called from random command prompt, either cmd.exe or powershell,
does not know.
cl.exe called from the "x64 Native Tools Command Prompt for VS 2019" that I
have installed on the computer I'm writing this message on knows
very well where they are, because when I clicked on the shortcut the paths
were written into environment variables, respectively named Include and Lib.
So, from this prompt I can do "cl program.c".
In practice, I'd likely prefer "cl -W4 -O1 -MD program.c", but that's
because I am more concerned than most people about unimportant details.
Call it a defect of character.
Post by Kaz Kylheku
The Visual Studio project-file-driven build system passes all
those details to every invocation of CL.EXE. Your project file (called
a "solution" nowadays) includes information like the path where your
SDK is installed. In the GUI there is some panel where you specify
it.
If I'm going to be doing programming on Windows today, it's either
going to be some version of that CL.EXE compiler from Microsoft, or GCC.
Native C language gcc programming under MSYS2 and native C language
clang programming under MSYS2 have an extremely similar look and feel. I
can't think of any technical reasons to prefer one over the other.
Michael S
2024-02-05 16:32:33 UTC
On Mon, 5 Feb 2024 05:58:55 -0000 (UTC)
Post by Kaz Kylheku
Is it due to decades of legacy code in GCC? Clang is a newer
implementation, so you might think it's faster than GCC. But it
manages only to be about the same.
I still believe that "decades of legacy" are the main reason.
clang *was* much faster than gcc 10-12 years ago. Since then it has
accumulated a decade of legacy of its own. And this particular decade
mostly consisted of code written by people who are (a) less experienced
than the gcc maintainers and (b) care about speed of compilation even
less than the gcc maintainers. Well, as for the latter, I don't really
believe that is possible, but I need to offer a plausible explanation,
don't I?
David Brown
2024-02-05 19:53:41 UTC
Post by Michael S
On Mon, 5 Feb 2024 05:58:55 -0000 (UTC)
Post by Kaz Kylheku
Is it due to decades of legacy code in GCC? Clang is a newer
implementation, so you might think it's faster than GCC. But it
manages only to be about the same.
I still believe that "decades of legacy" are the main reason.
clang *was* much faster than gcc 10-12 years ago. Since then it has
accumulated a decade of legacy of its own. And this particular decade
mostly consisted of code written by people who are (a) less experienced
than the gcc maintainers and (b) care about speed of compilation even
less than the gcc maintainers. Well, as for the latter, I don't really
believe that is possible, but I need to offer a plausible explanation,
don't I?
Early clang was faster than gcc at compilation and static error checking.
And it had much nicer formats and outputs for its warnings. But it
wasn't close to gcc for optimisation and generated code efficiency, and
had less powerful checking.

Over time, clang has gained a lot more optimisation and is now similar
to gcc in code generation (each is better at some things), while gcc has
sped up some aspects and greatly improved the warning formats.

clang is now a similar speed to gcc because it does a similar job. It
turns out that doing a lot of analysis and code optimisation takes effort.
Kaz Kylheku
2024-02-05 20:53:50 UTC
Post by David Brown
Post by Michael S
On Mon, 5 Feb 2024 05:58:55 -0000 (UTC)
Post by Kaz Kylheku
Is it due to decades of legacy code in GCC? Clang is a newer
implementation, so you might think it's faster than GCC. But it
manages only to be about the same.
I still believe that "decades of legacy" are the main reason.
clang *was* much faster than gcc 10-12 years ago. Since then it has
accumulated a decade of legacy of its own. And this particular decade
mostly consisted of code written by people who are (a) less experienced
than the gcc maintainers and (b) care about speed of compilation even
less than the gcc maintainers. Well, as for the latter, I don't really
believe that is possible, but I need to offer a plausible explanation,
don't I?
Early clang was faster than gcc at compilation and static error checking.
And it had much nicer formats and outputs for its warnings. But it
wasn't close to gcc for optimisation and generated code efficiency, and
had less powerful checking.
Over time, clang has gained a lot more optimisation and is now similar
to gcc in code generation (each is better at some things), while gcc has
sped up some aspects and greatly improved the warning formats.
clang is now a similar speed to gcc because it does a similar job. It
turns out that doing a lot of analysis and code optimisation takes effort.
It takes more and more effort for diminishing results.

A compiler can spend a lot of time just searching for the conditions
that allow a certain optimization, where those conditions turn out to be
false most of the time. So that in a large code base, there will be just
a couple of "hits" (the conditions are met, and the optimization can
take place). Yet all the instruction sequences in every basic block in
every file had to be looked at to determine that.

Many of these conditions are specific to the optimization. Another
kind of optimization has its own conditions that don't reuse anything
from that one. So the more optimizations you add, the more work it takes
just to determine applicability.
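
As a tiny illustration of such an applicability test (a sketch, not any
particular compiler's code): before rewriting x * c as a shift, a
compiler must first establish that c is a constant power of two, and at
most candidate sites that test fails:

    /* true iff c is a nonzero power of two, i.e. x * c could be
       strength-reduced to a left shift */
    int is_power_of_two(unsigned c)
    {
        return c != 0 && (c & (c - 1)) == 0;
    }

Multiply that by dozens of optimizations, each with its own unrelated
preconditions, and the searching adds up even when nothing fires.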

The optimizer may have to iterate on the program graph. After certain
optimizations are applied, the program graph changes. And that may
"unlock" more opportunities to do optimizations that were not possible
before. But because the program graph changed, its properties have to be
recalculated, like liveness of variables/temporaries and whatnot.
More time.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
David Brown
2024-02-06 08:44:20 UTC
Post by Kaz Kylheku
Post by David Brown
Post by Michael S
On Mon, 5 Feb 2024 05:58:55 -0000 (UTC)
Post by Kaz Kylheku
Is it due to decades of legacy code in GCC? Clang is a newer
implementation, so you might think it's faster than GCC. But it
manages only to be about the same.
I still believe that "decades of legacy" are the main reason.
clang *was* much faster than gcc 10-12 years ago. Since then it has
accumulated a decade of legacy of its own. And this particular decade
mostly consisted of code written by people who are (a) less experienced
than the gcc maintainers and (b) care about speed of compilation even
less than the gcc maintainers. Well, as for the latter, I don't really
believe that is possible, but I need to offer a plausible explanation,
don't I?
Early clang was faster than gcc at compilation and static error checking.
And it had much nicer formats and outputs for its warnings. But it
wasn't close to gcc for optimisation and generated code efficiency, and
had less powerful checking.
Over time, clang has gained a lot more optimisation and is now similar
to gcc in code generation (each is better at some things), while gcc has
sped up some aspects and greatly improved the warning formats.
clang is now a similar speed to gcc because it does a similar job. It
turns out that doing a lot of analysis and code optimisation takes effort.
It takes more and more effort for diminishing results.
Yes.
Post by Kaz Kylheku
A compiler can spend a lot of time just searching for the conditions
that allow a certain optimization, where those conditions turn out to be
false most of the time. So that in a large code base, there will be just
a couple of "hits" (the conditions are met, and the optimization can
take place). Yet all the instruction sequences in every basic block in
every file had to be looked at to determine that.
This is always the case with optimisations. Each pass might only give a
few percent increase in speed - but when you have 50 passes, this adds
up to a lot. And some passes (that is, some types of optimisation) can
open up new opportunities if you redo previous passes. And the same
applies to static error checking - there is quite an overlap in the
kinds of analysis used for optimisations and for static error checking.
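
A sketch of that pass interaction (the function names are invented for
illustration):

    void log_everything(int x);                /* hypothetical logger */
    static int debug_level(void) { return 0; }

    int work(int x)
    {
        if (debug_level() > 0)   /* after inlining: if (0 > 0) */
            log_everything(x);   /* constant folding makes this dead */
        return x + 1;
    }

Inlining exposes the constant, constant propagation folds the condition,
and only then can dead code elimination delete the branch - three
passes, each enabling the next.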
Post by Kaz Kylheku
Many of these conditions are specific to the optimization. Another
kind of optimization has its own conditions that don't reuse anything
from that one. So the more optimizations you add, the more work it takes
just to determine applicability.
The optimizer may have to iterate on the program graph. After certain
optimizations are applied, the program graph changes. And that may
"unlock" more opportunities to do optimizations that were not possible
before. But because the program graph changed, its properties have to be
recalculated, like liveness of variables/temporaries and whatnot.
More time.
Yes.

For a great lot of code, it is not necessary to squeeze out as much
speed as possible. But IMHO it is usually a good idea to have as much
static error checking as you reasonably can without too high a risk of
false positives.

Major compilers aren't really bothered about the speed of compilation of
C code - it is usually fast enough that it is of little concern. Those
who build a lot use make (or other build tools), perhaps ccache, and
usually machines with plenty of cores and plenty of RAM.

It's C++ that is the concern, especially big projects. And there you
/do/ need at least some optimisation effort, because C++ is generally
full of little functions that are expected to "disappear" entirely by
inlining. So that is where the compiler developer effort goes for
compiler speed, analysis, and optimisation.

Programmers are notoriously bad at determining which bits of their code
need to be efficient. And if they know their compiler is poor at
optimising, they do "manual optimisation". They use pointers where
arrays would be clearer. They reuse "temp" variables instead of making
new ones. They write jumbles of "gotos" instead of breaking code into
multiple functions. They write "(x << 3) + x" instead of "x * 9". It
is much better to write the clearest source code you can, and let the
compiler do its job and generate efficient object code.
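
For instance (a sketch, easily checked on a current gcc or clang at -O2,
where both functions typically compile to the same single lea
instruction on x86-64):

    int mul9_clear(int x)  { return x * 9; }
    int mul9_manual(int x) { return (x << 3) + x; }

The "clever" version buys nothing and costs readability.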

It's never a bad thing if a compiler is faster. But IMHO it is more
important for the compiler to be /better/ - better warnings and checks
that catch issues earlier, and better optimisation because that allows
people to write code in the clearest, safest and most maintainable way
while still getting good results.
Chris M. Thomasson
2024-02-06 09:03:48 UTC
On 2/6/2024 12:44 AM, David Brown wrote:
[...]
Post by David Brown
Programmers are notoriously bad at determining which bits of their code
need to be efficient.
This brings me back to a code base I was asked to take a look at. Well,
the keyword register was all over the place! Spooky...



[...]
Michael S
2024-02-06 11:41:52 UTC
On Tue, 6 Feb 2024 09:44:20 +0100
Post by David Brown
Post by Kaz Kylheku
A compiler can spend a lot of time just searching for the conditions
that allow a certain optimization, where those conditions turn out
to be false most of the time. So that in a large code base, there
will be just a couple of "hits" (the conditions are met, and the
optimization can take place). Yet all the instruction sequences in
every basic block in every file had to be looked at to determine
that.
This is always the case with optimisations. Each pass might only
give a few percent increase in speed - but when you have 50 passes,
this adds up to a lot. And some passes (that is, some types of
optimisation) can open up new opportunities if you redo previous
passes.
Except that gcc, at least, by design never redoes previous passes. More so,
it does not even try to compare the result of optimization with a certain
pass vs the result without that pass and take the better of the two.

I don't know if the same applies to clang; I never had
conversations with clang maintainers (I've had plenty with gcc maintainers).
However, the bottom line for the last 2-3 years is that when I compare the
speed of gcc-compiled code vs clang-compiled code, both can do a good
job and both can do ordinary stupid things, but clang is much more
likely than gcc to do astonishingly stupid things. Like, for example,
vectorization that reduces the speed by a factor of 3 vs the
non-vectorized variant.
So, most likely, clang also proceeds pass after pass after pass and
never ever looks back. Seems like they took the lesson of Lot's wife
very seriously.
Post by David Brown
And the same applies to static error checking - there is
quite an overlap in the kinds of analysis used for optimisations and
for static error checking.
David Brown
2024-02-06 12:08:15 UTC
Post by Michael S
On Tue, 6 Feb 2024 09:44:20 +0100
Post by David Brown
Post by Kaz Kylheku
A compiler can spend a lot of time just searching for the conditions
that allow a certain optimization, where those conditions turn out
to be false most of the time. So that in a large code base, there
will be just a couple of "hits" (the conditions are met, and the
optimization can take place). Yet all the instruction sequences in
every basic block in every file had to be looked at to determine
that.
This is always the case with optimisations. Each pass might only
give a few percent increase in speed - but when you have 50 passes,
this adds up to a lot. And some passes (that is, some types of
optimisation) can open up new opportunities if you redo previous
passes.
Except that gcc, at least, by design never redoes previous passes. More so,
it does not even try to compare the result of optimization with a certain
pass vs the result without that pass and take the better of the two.
AFAIUI (I am not a gcc developer), gcc redoes certain types of
optimisations after later passes - even if it calls them different pass
numbers. For example, constant propagation and dead code elimination are
done early on in functions. Then, after inlining and IPA passes, they are
done again using the new information.

I expect you are correct that it does not try to compare the results
from pass to pass. I think that would quickly become infeasible. You can't
just compare the result of applying optimisation B after A to see if
it is better or worse than A alone, and then decide which to keep
before moving on to pass C. Maybe A was better than AB, but ABC is better
than AC. You'd need to keep comparing all sorts of combinations, and it
would be a scalability nightmare.
Post by Michael S
I don't know if the same applies to clang; I never had
conversations with clang maintainers (I've had plenty with gcc maintainers).
However, the bottom line for the last 2-3 years is that when I compare the
speed of gcc-compiled code vs clang-compiled code, both can do a good
job and both can do ordinary stupid things, but clang is much more
likely than gcc to do astonishingly stupid things. Like, for example,
vectorization that reduces the speed by a factor of 3 vs the
non-vectorized variant.
I see the same, though I have not used clang very seriously for real
work. It does, however, seem a bit over-enthusiastic about vectorising
code.
Post by Michael S
So, most likely, clang also proceeds pass after pass after pass and
never ever looks back. Seems like they took the lesson of Lot's wife
very seriously.
Post by David Brown
And the same applies to static error checking - there is
quite an overlap in the kinds of analysis used for optimisations and
for static error checking.
Lawrence D'Oliveiro
2024-02-06 23:23:11 UTC
They reuse "temp" variables instead of making new ones.
I like to limit the scope of my temporary variables. In C, this is as easy
as sticking a pair of braces around a few statements.
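
For example (a trivial sketch; the numbers are made up):

    #include <stdio.h>

    int main(void)
    {
        double raw = 3124.0;    /* pretend sensor reading */
        double celsius;
        {
            /* tmp exists only between these braces */
            double tmp = raw / 100.0;
            celsius = tmp - 5.0;
        }
        /* tmp is out of scope here */
        printf("%.1f\n", celsius);
        return 0;
    }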
David Brown
2024-02-07 07:54:12 UTC
Post by Lawrence D'Oliveiro
They reuse "temp" variables instead of making new ones.
I like to limit the scope of my temporary variables. In C, this is as easy
as sticking a pair of braces around a few statements.
Generally, you want to have the minimum practical scope for your local
variables. It's rare that you need to add braces just to make a scope
for a variable - usually you have enough braces in loops or conditionals
- but it happens.

However, the context here was compiler optimisation. Not all compilers
have good optimisation. In the embedded world, there are vast numbers
of C compilers, many of which are much more limited than the modern and
advanced tools most of us use today. These weaker compilers are much
rarer now, as are many of the ISAs they served - 32-bit ARM "M" cores
are dominant along with gcc. But in the old days, an embedded C
programmer had to write their code in a way that suited the compiler if
they wanted the best out of their microcontroller - and efficient code
means cheaper devices, lower power and longer battery life. Some of
these weaker tools would allocate registers to local variables on a
first come, first served basis, with no lifetime analysis or reuse
inside a function. Thus you re-used your temporary variables.

Making some "temp" variables and re-using them was also common for some
people in idiomatic C90 code, where all your variables are declared at
the top of the function.
Malcolm McLean
2024-02-07 08:59:13 UTC
Post by David Brown
Post by Lawrence D'Oliveiro
They reuse "temp" variables instead of making new ones.
I like to limit the scope of my temporary variables. In C, this is as easy
as sticking a pair of braces around a few statements.
Generally, you want to have the minimum practical scope for your local
variables.  It's rare that you need to add braces just to make a scope
for a variable - usually you have enough braces in loops or conditionals
- but it happens.
The two common patterns are to give each variable the minimum scope, or
to declare all variables at the start of the function and give them all
function scope.

The case for minimum scope is the same as the case for scope itself. The
variable is accessible where it is used and not elsewhere, which makes
it less likely it will be used in error, and means there are fewer names
to understand.

However there are also strong arguments for function scope. A function
is a natural unit. And all the variables used in that unit are listed
together and, ideally, commented. So at a glance you can see what is in
scope and what is being operated on. And there are only three levels of
scope. A variable is global, or it is file scope, or it is scoped to the
function.

I tend to prefer function scope for C. However I use a lot of C++ these
days, and in C++ local scope is often better, and in some cases even
necessary. So I find that I'm tending to use local scope in C more.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
Ben Bacarisse
2024-02-07 10:47:47 UTC
Post by David Brown
Post by Lawrence D'Oliveiro
They reuse "temp" variables instead of making new ones.
I like to limit the scope of my temporary variables. In C, this is as easy
as sticking a pair of braces around a few statements.
Generally, you want to have the minimum practical scope for your local
variables.  It's rare that you need to add braces just to make a scope
for a variable - usually you have enough braces in loops or conditionals
- but it happens.
The two common patterns are to give each variable the minimum scope, or to
declare all variables at the start of the function and give them all
function scope.
The term "function scope" has a specific meaning in C. Only labels have
function scope. I know you are not very interested in using exact
terms, but some people might like to know the details.

Since you want to argue for the peculiar (but common) practice of giving
names the largest possible scope (without altering their linkage) you
need a term for the outer-most block scope, but "function scope" is
taken.
The case for minimum scope is the same as the case for scope itself.
Someone might well misinterpret the term "minimum scope" since it would
require adding lots of otherwise redundant braces. I *think* you mean
declaring names at the point of first use. The resulting scope is not
minimum because it often extends beyond the point of last use.

Other people, not familiar with "modern" C, might interpret the term to
mean declaring names at the top of the inner-most appropriate block.
The
variable is accessible where it is used and not elsewhere, which makes it
less likely it will be used in error, and means there are fewer names to
understand.
The case for declaration at first use is much stronger than this. It
almost always allows for a meaningful initialisation at the same point,
so the initialisation does not need to be hunted down and checked. For
me, this is a big win. (Yes, some people then insist on a dummy
initialisation when the proper one isn't known, but that's a fudge that
is, to my mind, even worse.)
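
The contrast in a nutshell (a sketch):

    int sum_c90(const int *a, int len)
    {
        int i;           /* declared up top, initialised only later */
        int total = 0;
        for (i = 0; i < len; i++)
            total += a[i];
        return total;
    }

    int sum_modern(const int *a, int len)
    {
        int total = 0;
        for (int i = 0; i < len; i++)  /* declared and initialised at first use */
            total += a[i];
        return total;
    }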
However there are also strong arguments for function scope. A function is a
natural unit. And all the variables used in that unit are listed together
and, ideally, commented. So at a glance you can see what is in scope and
what is being operated on.
You should not need an inventory of what's being operated on. Any
function so complex that I can't tell immediately what declaration
corresponds to which name needs to be re-written. I'd argue that
this is also a big win for "short scopes". A policy that leads to early
triggers for refactoring is worth considering.
And there are only three levels of scope. A
variable is global, or it is file scope, or it is scoped to the
function.
You are mixing up scope and lifetime. C has no "global scope". A name
may have external linkage (which is probably what you are referring to),
but that is not directly connected to its scope.
I tend to prefer function scope for C.
We could call it outer-most block scope rather than re-use a term with
an existing, but different, technical meaning.
However I use a lot of C++ these
days, and in C++ local scope is often better, and in some cases even
necessary. So I find that I'm tending to use local scope in C more.
Interesting. Is it just that using C++ has given you what you would
think of as a bad habit in C, or has using C++ led you to see that your
old preference was not the best one?
--
Ben.
bart
2024-02-07 11:04:45 UTC
Post by Ben Bacarisse
However there are also strong arguments for function scope. A function is a
natural unit. And all the variables used in that unit are listed together
and, ideally, commented. So at a glance you can see what is in scope and
what is being operated on. [typos fixed]
You should not need an inventory of what's being operated on. Any
function so complex that I can't tell immediately what declaration
corresponds to which name needs to be re-written.
But if you keep functions small, e.g. the whole body is visible at the
same time, then there is less need for declarations to clutter up the
code. They can go at the top, so that you can literally just glance
there.
Post by Ben Bacarisse
And there are only three levels of scope. A
variable is global, or it is file scope, or it is scoped to the
function.
You are mixing up scope and lifetime. C has no "global scope". A name
may have external linkage (which is probably what you are referring to),
but that is not directly connected to its scope.
Funny, I use the same definitions of scope:

    int abc;         // inter-file scope, may be imported or exported
    static int def;  // file scope

    void F(void) {
        int ghi;     // function scope
    }

If I look inside my compiler, I can see this set of enums describing
scope (not C code):

    (function_scope, "Fn"),   ! within a function (note: imported/exported
                              ! names can be declared in a block scope)
    (local_scope,    "Loc"),  ! file scope, not exported
    (imported_scope, "Imp"),  ! imported from another module
    (exported_scope, "Exp")   ! file scope, exported
    end

Within a function, there is an additional mechanism to deal with block
scopes, plus another overall mechanism to deal with namespaces.
David Brown
2024-02-07 13:21:54 UTC
Post by bart
However there are also strong arguments for function scope. A function is a
natural unit. And all the variables used in that unit are listed together
and, ideally, commented. So at a glance you can see what is in scope and
what is being operated on. [typos fixed]
You should not need an inventory of what's being operated on.  Any
function so complex that I can't tell immediately what declaration
corresponds to which name needs to be re-written.
But if you keep functions small, e.g. the whole body is visible at the
same time, then there is less need for declarations to clutter up the
code. They can go at the top, so that you can literally just glance
there.
With a small enough function, the benefits of minimum practical scope
(or "define on first use") are reduced, but not removed. The perceived
benefits of "declare everything at the start of the function" disappear
entirely.
Post by bart
And there are only three levels of scope. A
variable is global, or it is file scope, or it is scoped to the
function.
You are mixing up scope and lifetime.  C has no "global scope".  A name
may have external linkage (which is probably what you are referring to),
but that is not directly connected to its scope.
For discussions of C, it's best to use the well-defined C terms for
scope and lifetime. Other languages may use different terms.
bart
2024-02-07 14:24:28 UTC
Post by David Brown
For discussions of C, it's best to use the well-defined C terms for
scope and lifetime.  Other languages may use different terms.
Many of the terms used in the C grammar remind me exactly of the 'twisty
little passages' variations from the original text Adventure game.

In my program, I choose to use identifiers that make more sense to me,
and that match my view of how the language works.
David Brown
2024-02-07 20:30:57 UTC
Post by bart
Post by David Brown
For discussions of C, it's best to use the well-defined C terms for
scope and lifetime.  Other languages may use different terms.
Many of the terms used in the C grammar remind me exactly of the 'twisty
little passages' variations from the original text Adventure game.
In my program, I choose to use identifiers that make more sense to me,
and that match my view of how the language works.
OK, I suppose. But if you want to talk about C with other people, it
makes sense to use the same terms they are using, in the same way.

I can certainly agree that there are bits of the C standards that are
not as clear as I would like. The definitions of scope are not among
those parts.
Ben Bacarisse
2024-02-07 15:36:11 UTC
Post by Ben Bacarisse
However there are also strong arguments for function scope. A function is a
natural unit. And all the variables used in that unit are listed together
and, ideally, commented. So at a glance you can see what is in scope and
what is being operated on. [typos fixed]
You should not need an inventory of what's being operated on. Any
function so complex that I can't tell immediately what declaration
corresponds to which name needs to be re-written.
But if you keep functions small, e.g. the whole body is visible at the same
time, then there is less need for declarations to clutter up the code. They
can go at the top, so that you can literally just glance there.
Declarations don't clutter up the code, just as the code does not
clutter up the declarations. That's just your own spin on the matter.
They are both important parts of a C program.
Post by Ben Bacarisse
And there are only three levels of scope. A
variable is global, or it is file scope, or it is scoped to the
function.
You are mixing up scope and lifetime. C has no "global scope". A name
may have external linkage (which is probably what you are referring to),
but that is not directly connected to its scope.
You can use any definition you like, provided you don't insist that others
use your own terms. I was just pointing out the problems
associated with using the wrong terms in a public post.

I'll cut the text where you use the wrong terms, because there is
nothing to be gained from correcting your usage.
--
Ben.
bart
2024-02-07 18:05:34 UTC
Post by Ben Bacarisse
Post by Ben Bacarisse
However there are also strong arguments for function scope. A function is a
natural unit. And all the variables used in that unit are listed together
and, ideally, commented. So at a glance you can see what is in scope and
what is being operated on. [typos fixed]
You should not need an inventory of what's being operated on. Any
function so complex that I can't tell immediately what declaration
corresponds to which name needs to be re-written.
But if you keep functions small, e.g. the whole body is visible at the same
time, then there is less need for declarations to clutter up the code. They
can go at the top, so that you can literally just glance there.
Declarations don't clutter up the code, just as the code does not
clutter up the declarations. That's just your own spin on the matter.
They are both important parts of a C program.
That sounds like your opinion against mine. It's nothing to do with
spin, whatever that means.

I would argue however that if you take a clear, cleanly written
language-neutral algorithm, and then introduce type annotations /within/
that code rather than segregated, then it is no longer quite as clear or
as clean looking.

As a related example, suppose you had this function:

void F(int a, double* b) {...}

All the parameters are specified with their names and types at the top.
Now imagine if only the names were given, but the types specified only
at their first usage within the body:

void F(a, b) {...}

Now you no longer have an instant picture of the interface to the
function. The declarations could also be shadowed within the body, so
you can't tell whether a definition for 'a' refers to a parameter
without checking for definitions in an outer scope.

Imagine further that even the parameter names were specified within the
body ...

I /like/ having a summary of both parameters and locals at the top. I
/like/ code looking clean, and as aligned as possible (some decls will
push code to the right). I /like/ knowing that there is only one
instance of a variable /abc/, and it is the one at the top.

So it might be my opinion but also my preference.
Post by Ben Bacarisse
Post by Ben Bacarisse
And there are only three levels of scope. A
variable is global, or it is file scope, or it is scoped to the
function.
You are mixing up scope and lifetime. C has no "global scope". A name
may have external linkage (which is probably what you are referring to),
but that is not directly connected to its scope.
You can use any definition you like, provided you don't insist that others
use your own terms. I was just pointing out the problems
associated with using the wrong terms in a public post.
I'll cut the text where you use the wrong terms, because there is
nothing to be gained from correcting your usage.
That's a shame. I think there is something to be gained by not sticking
slavishly to what the C standard says (which very few people will study)
and using more colloquial terms or ones that more people can relate to.

Apparently both 'typedef' and 'static' are forms of 'linkage'. But no
identifiers declared with those will ever be linked to anything!
Scott Lurndal
2024-02-07 18:26:06 UTC
Post by bart
Post by Ben Bacarisse
Post by Ben Bacarisse
However there are also strong arguments for function scope. A function is a
natural unit. And all the variables used in that unit are listed together
and, ideally, commented. So at a glance you can see what is in scope and
what is being operated on. [typos fixed]
You should not need an inventory of what's being operated on. Any
function so complex that I can't tell immediately what declaration
corresponds to which name needs to be re-written.
But if you keep functions small, e.g. the whole body is visible at the same
time, then there is less need for declarations to clutter up the code. They
can go at the top, so that you can literally just glance there.
Declarations don't clutter up the code, just as the code does not
clutter up the declarations. That's just your own spin on the matter.
They are both important parts of a C program.
That sounds like your opinion against mine. It's nothing to do with
spin, whatever that means.
I would argue however that if you take a clear, cleanly written
language-neutral algorithm, and then introduce type annotations /within/
that code rather than segregated, then it is no longer quite as clear or
as clean looking.
void F(int a, double* b) {...}
All the parameters are specified with their names and types at the top.
Now imagine if only the names were given,
Now imagine if the moon was made from green cheese. It's just as
likely, and neither are C.
bart
2024-02-07 19:53:53 UTC
Post by Scott Lurndal
Post by bart
Post by Ben Bacarisse
Declarations don't clutter up the code, just as the code does not
clutter up the declarations. That's just your own spin on the matter.
They are both important parts of a C program.
That sounds like your opinion against mine. It's nothing to do with
spin, whatever that means.
I would argue however that if you take a clear, cleanly written
language-neutral algorithm, and then introduce type annotations /within/
that code rather than segregated, then it is no longer quite as clear or
as clean looking.
void F(int a, double* b) {...}
All the parameters are specified with their names and types at the top.
Now imagine if only the names were given,
Now imagine if the moon was made from green cheese. It's just as
likely, and neither are C.
It's perfectly possible as an extension. Old C had something similar
that was halfway there.
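
(For reference, a K&R-style definition put the parameter names in the
parentheses and declared their types between the list and the body:

    void F(a, b)
    int a;
    double *b;
    {
        /* ... */
    }

though even there the types still sat at the top, not at first use.)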

But it was a hypothetical illustration to elicit a response to this
question: would it make it harder or easier to understand what the
function is doing?

Because it is related to whether the locals used by a function are
declared all at the top, or buried within the code at random places.

BTW I've just done a quick survey of some codebases; functions tend to
have 3 local variables on average.

Is it really worth spreading them out in nested block scopes?

Here is a histogram for tcc.c: the first column is how many locals, and
the second is how many functions with that number:

0 161
1 118
2 73
3 42
4 29
5 15
6 12
7 14
8 11
9 6
10 9
11 6
12 3
13 5
14 3
16 4
17 1
18 2
19 2
20 1
21 2
25 1
27 1
31 1
32 1
33 1
35 1

In one of my own programs, 92% of functions have 6 locals or fewer. (The
figures include extra temporary locals created as part of the
transpilation to C.)
Lawrence D'Oliveiro
2024-02-07 21:38:02 UTC
Post by bart
BTW I've just done a quick survey of some codebases; functions tend to
have 3 local variables on average.
Is it really worth spreading them out in nested block scopes?
If you write “average” functions, you know what the answer is.

Some of us don’t write “average” functions.
Malcolm McLean
2024-02-08 00:29:02 UTC
Post by Lawrence D'Oliveiro
Post by bart
BTW I've just done a quick survey of some codebases; functions tend to
have 3 local variables on average.
Is it really worth spreading them out in nested block scopes?
If you write “average” functions, you know what the answer is.
Some of us don’t write “average” functions.
Most functions are short and trivial. But those functions tend to be
easy to understand and unlikely to have bugs. What matters is how you
write the longer functions.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
David Brown
2024-02-07 20:37:31 UTC
Post by bart
That's a shame. I think there is something to be gained by not sticking
slavishly to what the C standard says (which very few people will study)
and using more colloquial terms or ones that more people can relate to.
There is something to be said for explaining the technical terms from
the C standards in more colloquial language to make it easier for others
to understand. There is nothing at all to be said for using C standard
terms in clearly and obviously incorrect ways. That's just going to
confuse these non-standard-reading C programmers when they try to find
out more, no matter where they look for additional information.
Post by bart
Apparently both 'typedef' and 'static' are forms of 'linkage'. But no
identifiers declared with those will ever be linked to anything!
Could you point to the paragraph of the C standards that justifies that
claim? Or are you perhaps mixing things up? (I can tell you the
correct answer, with references, if you are stuck - but I'd like to give
you the chance to show off your extensive C knowledge first.)
bart
2024-02-07 22:52:24 UTC
Post by David Brown
Post by bart
That's a shame. I think there is something to be gained by not
sticking slavishly to what the C standard says (which very few people
will study) and using more colloquial terms or ones that more people
can relate to.
There is something to be said for explaining the technical terms from
the C standards in more colloquial language to make it easier for others
to understand.  There is nothing at all to be said for using C standard
terms in clearly and obviously incorrect ways.  That's just going to
confuse these non-standard-reading C programmers when they try to find
out more, no matter where they look for additional information.
Post by bart
Apparently both 'typedef' and 'static' are forms of 'linkage'. But no
identifiers declared with those will ever be linked to anything!
Could you point to the paragraph of the C standards that justifies that
claim?  Or are you perhaps mixing things up?  (I can tell you the
correct answer, with references, if you are stuck - but I'd like to give
you the chance to show off your extensive C knowledge first.)
* The standard talks a lot about Linkage but there are no specific
lexical elements for those.

* Instead the standard uses lexical elements called 'storage-class
specifiers' to control what kind of linkage is applied to identifiers

* Because of this association, I use 'linkage symbol' to refer to those
particular tokens

* The tokens include 'typedef extern static'

6.2.2p3 says: "If the declaration of a file scope identifier for an
object or a function contains the storage-class specifier static, the
identifier has internal linkage."

So it talks about statics as having linkage of some kind. What did I
say? I said statics will never be linked to anything.

6.2.2p6 excludes typedefs (by omission). Or rather it says they have 'no
linkage', which is one of the three kinds of linkage (external,
internal, none).

So as far as I can see, statics and typedef are still lumped into the
class of entities that have a form of linkage, and are part of the set
of tokens that control linkage.

---------------------------------------------------

This to me is all a bit mixed up. Much as you dislike other languages
being brought in, they can give an enlightening perspective.

So for me, linking applies to all named entities that occupy memory, and
that have global/export scope.

But global/export scope applies also to all other named entities,
whether they occupy memory or not. I can show that in this chart:

                     M Scope?   M Link?   C Linkage?

    Function names      Y          Y      Y (internal/external)
    Variable names      Y          Y      Y (internal/external)
    Enum names          Y          N      ??
    Named constants     Y          N      --
    Type names          Y          N      Y (none)
    Macro names         Y          N      ??
    Module names        Y          N      --

(Type names include C's struct tags. Enum tags are not listed.)

In the M language, ALL user identifiers declared at file scope can be
imported and exported automatically by the language across modules.

This is the primary control method for visibility.

There is a special mechanism to import into a program/library, or export
from one. This is the only place linkage comes up, where those names
need to appear in EXE, DLL and OBJ file formats. Only functions and
variables (entities that have an address) are involved.

In the C column, ?? are identifiers that usually can't appear in a
declaration with a storage class. And -- is for things not meaningful in C.
Kaz Kylheku
2024-02-08 01:13:21 UTC
Post by bart
* The standard talks a lot about Linkage but there are no specific
lexical elements for those.
Yes; linkage doesn't have a dedicated phrase element in the syntax.
Post by bart
* Instead the standard uses lexical elements called 'storage-class
specifiers' to control what kind of linkage is applied to identifiers
Yes. "storage-class specifier" is just a syntactic category, a "part of
speech" of C.

Not everything that is syntactically a "storage class specifier"
determines the kind of storage for an object.

It's not really a great situation and it has gotten worse with the
introduction of new storage class keywords for alignment and whatnot.
Post by bart
* Because of this association, I use 'linkage symbol' to refer to those
particular tokens
Your one term for everything is equally flawed and just gratuitously
different.
Post by bart
* The tokens include 'typedef extern static'
6.2.2p3 says: "If the declaration of a file scope identifier for an
object or a function contains the storage-class specifier static, the
identifier has internal linkage."
Yes. If it's a function it has no storage class. If it's an object
at file scope, its storage duration is static. The concept of storage
class doesn't apply to anything at file scope, in other words.

The syntax is instead abused for determining linkage.
Post by bart
So it talks about statics as having linkage of some kind. What did I
say? I said statics will never be linked to anything.
File scope statics have internal linkage for the reason that they
are allowed to be multiply declared. You're thinking of linkage as some
object-file-level resolution mechanism.

In C, linkage just refers to the situation when multiple declarations of
an identifier are permitted, and refer to a single entity according
to some rule.

At file scope "static int x;" has linkage because you can
have a situation like this:

    static int x; // declaration / tentative definition

    void foo(void)
    {
        {
            extern int x; // links to the file scope static
        }
    }

    static int x; // declaration / tentative definition

    static int x = 42; // definition

The linkage is internal because all these occurrences of x do not
refer to a file scope x in another translation unit.

This internal linkage is not necessarily handled by a linker. Since it
happens in one translation unit, the compiler can take care of it so
that the resulting object file has resolved all the internal linkage;
then the linkage of multiple translation units into a single program
only deals with external linkage. That can make external linkage appear
more "real".
Post by bart
6.2.2p6 excludes typedefs (by omission). Or rather it says they have 'no
linkage', which is one of the three kinds of linkage (external,
internal, none).
So as far as I can see, statics and typedefs are still lumped into the
class of entities that have a form of linkage, and are part of the set
of tokens that control linkage.
No; typedef is just the "part of C speech" called "storage class
specifier". When a declaration has "typedef" storage class, it's
understood as defining a typedef name in that scope: file scope
or lexical.

The phrase element called "storage class" serves as a general "command
verb" kind of thing in the declaration (which may be omitted).

It should be renamed accordingly. Maybe "declaration kind" or
"declaration category" or "declaration class" or what have you.

Mainly, get rid of the word "storage".

Good names for entities are important. Sometimes the systems we use
don't get them right.

The naming of "storage class" is less important than the multiple
meanings of static and whatnot. Unlike the grammar terminology, those
can't be fixed without breaking programs.
Post by bart
This to me is all a bit mixed up. Much as you dislike other languages
being brought in, they can give an enlightening perspective.
Right, nobody here knows anything outside of C, or can think outside of
the C box, except for you.

You're the newsgroup's Prometheus.

The vultures eat your liver daily and everything.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
bart
2024-02-08 02:09:01 UTC
Permalink
Post by Kaz Kylheku
Post by bart
This to me is all a bit mixed up. Much as you dislike other languages
being brought in, they can give an enlightening perspective.
Right, nobody here knows anything outside of C, or can think outside of
the C box, except for you.
Well, quite. AFAIK, nobody here HAS (1) used a comparable language to C;
(2) over such a long term; (3) which they have invented themselves; (4)
which they have implemented themselves; (5) which is similar enough to C
yet different enough in how it works to give that perspective.

See, I gave an interesting comparison of how my module scheme works
orthogonally across all kinds of entities, compared with the confusing
mess of C, and you shut down that view.

You're never in a million years going to admit that my language has some
good points, are you? Exactly as I said in my OP.

So what's the rule here, that people can only think INSIDE the C box?

Is the point of this group only to show off your master knowledge of C,
or the ins and outs of 300 kinds of Linux systems?
Kaz Kylheku
2024-02-08 03:07:05 UTC
Permalink
Post by bart
Post by Kaz Kylheku
Post by bart
This to me is all a bit mixed up. Much as you dislike other languages
being brought in, they can give an enlightening perspective.
Right, nobody here knows anything outside of C, or can think outside of
the C box, except for you.
Well, quite. AFAIK, nobody here HAS (1) used a comparable language to C;
(2) over such a long term; (3) which they have invented themselves; (4)
which they have implemented themselves; (5) which is similar enough to C
yet different enough in how it works to give that perspective.
You've taken a perspective that is not transferable to others.

If one can only see something after using your own invention for many
years, and other people don't have that same invention and
implementation experience, then they just cannot see what you see.

You cannot teach (2) through (4), just like a basketball coach cannot
teach a player to be seven foot tall.
Post by bart
See, I gave an interesting comparison of how my module scheme works
orthogonally across all kinds of entities, compared with the confusing
mess of C, and you shut down that view.
You're never in a million years going to admit that my language has some
good points, are you? Exactly as I said in my OP.
I have no idea what it is; I've not seen the reference manual / spec,
and even if I did, I wouldn't have implemented it myself and used it for
a long time, so I don't have the right perspective.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
David Brown
2024-02-08 12:01:32 UTC
Permalink
Post by bart
Post by David Brown
Post by bart
That's a shame. I think there is something to be gained by not
sticking slavishly to what the C standard says (which very few people
will study) and using more colloquial terms or ones that more can
relate to.
There is something to be said for explaining the technical terms from
the C standards in more colloquial language to make it easier for
others to understand.  There is nothing at all to be said for using C
standard terms in clearly and obviously incorrect ways.  That's just
going to confuse these non-standard-reading C programmers when they
try to find out more, no matter where they look for additional
information.
Post by bart
Apparently both 'typedef' and 'static' are forms of 'linkage'. But no
identifiers declared with those will ever be linked to anything!
Could you point to the paragraph of the C standards that justifies
that claim?  Or are you perhaps mixing things up?  (I can tell you the
correct answer, with references, if you are stuck - but I'd like to
give you the chance to show off your extensive C knowledge first.)
* The standard talks a lot about Linkage but there are no specific
lexical elements for those.
* Instead the standard uses lexical elements called 'storage-class
specifiers' to control what kind of linkage is applied to identifiers
* Because of this association, I use 'linkage symbol' to refer to those
particular tokens
So I was correct that you were mixing things up, and can't provide a
reference in the C standards?

You are correct that there are no lexical elements that explicitly
control linkage, and that storage-class specifiers can affect linking.
That does not mean linkage is determined solely by storage-class
specifiers, nor does it mean all storage-class specifiers affect
linkage. They cover related, but separate concepts. (It's a bit like
scope and lifetime - they are related, but they are not the same thing.)
Post by bart
* The tokens include 'typedef extern static'
They also include _Thread_local, auto and register.

And pay attention to 6.7.1p5:
"""
The typedef specifier is called a "storage-class specifier" for
syntactic convenience only;
"""
Post by bart
6.2.2p3 says: "If the declaration of a file scope identifier for an
object or a function contains the storage-class specifier static, the
identifier has internal linkage."
So it talks about statics as having linkage of some kind. What did I
say? I said statics will never be linked to anything.
They have "internal linkage". This means they are allocated space (in
ram for objects, code space for functions) by the linker, but static
symbols from one translation unit do not link to the same names from
other units.

This is distinct from "external linkage", where space is allocated,
identical symbols from different units are linked together, you must
have no more than one definition (but you can have multiple
declarations), and the definition and declarations must match in type.

And it is distinct from "no linkage" - things that are not involved in
the linking process, and have no connection between the identifier and
an item in memory (code or data). Things like typedef names, non-static
local variables, struct tags, and macro names are amongst the things
that have no linkage.

So static objects, with internal linkage, /are/ linked - the use of the
identifier in the source code is linked to the linker-allocated memory
address (relative or absolute, depending on the kind of linking and kind
of target).
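
A two-file sketch of those distinctions (file and identifier names invented):

    /* a.c */
    static int x = 1;  /* internal linkage: this unit's own x */
    int y = 10;        /* external linkage: the program's single y */
    int a_x(void) { return x; }

    /* b.c */
    static int x = 2;  /* internal linkage: a different object from a.c's x */
    extern int y;      /* external linkage: refers to the y defined in a.c */
    int b_x(void) { return x; }

Each unit's x is linked to its own private location, while the two declarations of y are linked to one shared location.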
Post by bart
6.2.2p6 excludes typedefs (by omission). Or rather it says they have 'no
linkage', which is one of the three kinds of linkage (external,
internal, none).
It is not by omission - it is covered in 6.2.2p6.
Post by bart
So as far as I can see, statics and typedefs are still lumped into the
class of entities that have a form of linkage, and are part of the set
of tokens that control linkage.
No, you are mistaken. But I can understand how you got it wrong, and I
hope my post here can help clear it up.

Your mixup stems from a limited view of "linking". You are viewing the
term to mean something like "linking identical global identifiers from
different units so that they refer to the same object". That is part of
the process, but it /also/ means "linking identifiers and references
with code or data memory areas". That applies equally to static data
(with C "internal linkage") as to "global" data (with "external
linkage") - but it does /not/ apply to things with "no linkage".
Post by bart
---------------------------------------------------
This to me is all a bit mixed up. Much as you dislike other languages
being brought in, they can give an enlightening perspective.
They can't give much help with terms unless they are established
languages, and even then the terms can vary significantly between languages.

This is about your misunderstanding of the term "linkage" - at most,
references to your language could illustrate what you have got wrong.
But I think that has already been established and does not need extra help.
Post by bart
So for me, linking applies to all named entities that occupy memory,
Yes.
Post by bart
and
that have global/export scope.
No. It is that extra restriction that is wrong.

(And "global scope" is a /really/ bad term to use. Scope is about when
an identifier is visible in a program, and is not the same as linkage or
lifetime. I know what you are trying to say here, but it is not an
accurate term.)
Ben Bacarisse
2024-02-08 11:37:40 UTC
Permalink
Post by Ben Bacarisse
Post by Ben Bacarisse
However there are also strong arguments for function scope. A function is a
natural unit. And all the variables used in that unit are listed together
and, ideally, commented. So at a glance you can see what is in scope and
what is being operated on. [typos fixed]
You should not need an inventory of what's being operated on. Any
function so complex that I can't tell immediately what declaration
corresponds to which name needs to be re-written.
But if you keep functions small, e.g. the whole body is visible at the same
time, then there is less need for declarations to clutter up the code. They
can go at the top, so that you can literally just glance there.
Declarations don't clutter up the code, just as the code does not
clutter up the declarations. That's just your own spin on the matter.
They are both important parts of a C program.
That sounds like your opinion against mine. It's nothing to do with spin,
whatever that means.
It's spin, because the term is emotive. "Cluttering up" is how you feel
about it. The phrase is just a mildly pejorative one about appearances.
There's no substance there. To make a technical point you would have to
explain how, for example,

struct item *items;
...
n_elements = get_number_of_items(...);
items = malloc(n_elements * sizeof *items);
...

is technically better than

n_elements = get_number_of_items(...);
struct item *items = malloc(n_elements * sizeof *items);

I've explained (more than once) how I find reasoning about the direct
initialise at first use style easier with fewer distractions.
I would argue however that if you take a clear, cleanly written
language-neutral algorithm, and then introduce type annotations /within/
that code rather than segregated, then it is no longer quite as clear or as
clean looking.
I agree. That's one big win for languages like Haskell with
sophisticated type inference. But the discussion (here) should be about
C where the disagreement is only about where to put the declaration.
void F(int a, double* b) {...}
All the parameters are specified with their names and types at the top. Now
imagine if only the names were given, but the types specified only at their
point of first use:
void F(a, b) {...}
That's not a related example. No one is suggesting anything remotely
like this.

This is why I keep asking if you have some political (or PR) background.
There is no reason at all to present an example where type information
is removed from the function prototype because no one is suggesting
that. It's a straw-man that you can argue against where, presumably,
you don't want to argue in favour of splitting the declaration away from
the point of first use.
I /like/ having a summary of both parameters and locals at the top. I
/like/ code looking clean, and as aligned as possible (some decls will push
code to the right). I /like/ knowing that there is only one instance of a
variable /abc/, and it is the one at the top.
That's fine. I have other concerns that I feel trump rather subjective
notions of aesthetics.
Post by Ben Bacarisse
Post by Ben Bacarisse
And there are only three levels of scope. A
variable is global, or it is file scope, or it is scoped to the
function.
You are mixing up scope and lifetime. C has no "global scope". A name
may have external linkage (which is probably what you are referring to),
but that is not directly connected to its scope.
You can use any definition you like, provided you don't insist that others
use your own terms. I was just pointing out the problems
associated with using the wrong terms in a public post.
I'll cut the text where you use the wrong terms, because there is
nothing to be gained from correcting your usage.
That's a shame. I think there is something to be gained by not sticking
slavishly to what the C standard says (which very few people will study)
and using more colloquial terms or ones that more can relate to.
Avoiding incorrect use of technical terms never gets in the way of
writing clear and easy to understand explanations. Quite the reverse.
If you try to explain C's notions of scope and linkage by mixing them up
into terms like "global variables" you can only sow confusion.
Apparently both 'typedef' and 'static' are forms of 'linkage'. But no
identifiers declared with those will ever be linked to anything!
You /could/ explain what the term linkage means in relation to C
identifiers, but your preference is rarely to help people understand.
You'd rather just make a snide remark: "look, the C standard uses an
ordinary English word in a way that is not normal!".
--
Ben.
Malcolm McLean
2024-02-08 12:10:07 UTC
Permalink
Post by Ben Bacarisse
That sounds like your opinion against mine. It's nothing to do with spin,
whatever that means.
It's spin, because the term is emotive. "Cluttering up" is how you feel
about it. The phrase is just a mildly pejorative one about appearances.
There's no substance there. To make a technical point you would have to
explain how, for example,
struct item *items;
...
n_elements = get_number_of_items(...);
items = malloc(n_elements * sizeof *items);
...
is technically better than
n_elements = get_number_of_items(...);
struct item *items = malloc(n_elements * sizeof *items);
I've explained (more than once) how I find reasoning about the direct
initialise at first use style easier with fewer distractions.
items = malloc(n_elements * sizeof *items);

is shorter than

struct item *items = malloc(n_elements * sizeof *items);

and that is an objective statement about which there can be no dispute.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
David Brown
2024-02-08 12:24:35 UTC
Permalink
Post by Ben Bacarisse
That sounds like your opinion against mine. It's nothing to do with spin,
whatever that means.
It's spin, because the term is emotive.  "Cluttering up" is how you feel
about it.  The phrase is just a mildly pejorative one about appearances.
There's no substance there.  To make a technical point you would have to
explain how, for example,
    struct item *items;
    ...
    n_elements = get_number_of_items(...);
    items = malloc(n_elements * sizeof *items);
    ...
is technically better than
    n_elements = get_number_of_items(...);
    struct item *items = malloc(n_elements * sizeof *items);
I've explained (more than once) how I find reasoning about the direct
initialise at first use style easier with fewer distractions.
items = malloc(n_elements * sizeof *items);
is shorter than
struct item *items = malloc(n_elements * sizeof *items);
and that is an objective statement about which there can be no dispute.
But that is not the comparison.

struct item *items = malloc(n_elements * sizeof *items);

is shorter than:

struct item *items;
items = malloc(n_elements * sizeof *items);

You have to define the variable somewhere. Doing so where you initialise
it, when you first need it, is, without doubt, objectively shorter.
Opinions may differ on whether it is clearer, or "cluttered", but which
is shorter is not in doubt. (What relevance that might have, is much
more in doubt.)

Malcolm McLean
2024-02-07 12:44:33 UTC
Permalink
Post by Ben Bacarisse
Post by David Brown
Post by Lawrence D'Oliveiro
They reuse "temp" variables instead of making new ones.
I like to limit the scope of my temporary variables. In C, this is as easy
as sticking a pair of braces around a few statements.
Generally, you want to have the minimum practical scope for your local
variables.  It's rare that you need to add braces just to make a scope
for a variable - usually you have enough braces in loops or conditionals
- but it happens.
The two common patterns are to give each variable the minimum scope, or to
declare all variables at the start of the function and give them all
function scope.
The term "function scope" has a specific meaning in C. Only labels have
function scope. I know you are not very interested in using exact
terms, but some people might like to know the details.
To explain this, if we have

void function(void)
{
    int i;

    for (i = 0; i < 10; i++)
        dosomething();
    if (condition)
    {
        int i;

        for (i = 0; i < 11; i++)
            dosomething();
        if (i == 10)
            /* always false */ ;
    }
}

The first i is not in scope when we test for i == 10 and the test will
be false. So "function scope" isn't the term.

However if we have this:

void function(void)
{
label:
    dosomething();
    if (condition)
    {
    label:
        dosomething();
    }
    got label:
}

Then it is an error. Both labels are in scope and that isn't allowed.
Post by Ben Bacarisse
Since you want to argue for the peculiar (but common) practice of giving
names the largest possible scope (without altering their linkage) you
need a term for the outer-most block scope, but "function scope" is
taken.
So "function scope" isn't the correct term. So we need another. I expect
that at this point someone will jump in and say it must be "Malcolm
scope". As you say, it's common enough to need a term for it.
Post by Ben Bacarisse
The case for minimum scope is the same as the case for scope itself.
Someone might well misinterpret the term "minimum scope" since it would
require adding lots of otherwise redundant braces. I *think* you mean
declaring names at the point of first use. The resulting scope is not
minimum because it often extends beyond the point of last use.
Yes, I don't mean literally the minimum scope that would be possible by
artificially ending a block when a variable is used for the last time.
No one would do that. I mean that the variable is either declared at
point of first use or, if this isn't allowed because of the C version,
at the top of the block in which it is used. But also that variables are
not reused if in fact the value is discarded between statements or
especially between blocks.
Post by Ben Bacarisse
Other people, not familiar with "modern" C, might interpret the term to
mean declaring names at the top of the inner-most appropriate block.
Top of the block or point of first use?
Post by Ben Bacarisse
The
variable is accessible where it is used and not elsewhere, which makes it
less likely it will be used in error, and means there are fewer names to
understand.
The case for declaration at first use is much stronger than this. It
almost always allows for a meaningful initialisation at the same point,
so the initialisation does not need to be hunted down a checked. For
me, this is a big win. (Yes, some people then insist on a dummy
initialisation when the proper one isn't know, but that's a fudge that
is, to my mind, even worse.)
If you go for top of block and you don't have a value, you either
initialise, usually to zero, or leave it wild. Neither is ideal. But it
rarely makes a big difference. However if you go for policy two, all the
variables are either given initial values at the top of the function or
they are not given initial values at the top of the function, and so you
can easily check, and ensure that all the initial values are consistent
with each other.
Post by Ben Bacarisse
We could call it outer-most block scope rather than re-use a term with
an existing, but different, technical meaning.
The variable has scope within the function, within the whole of the
function, and the motive is that the function is the natural unit of
thought. So I think we need the word "function".
Post by Ben Bacarisse
However I use a lot of C++ these
days, and in C++ local scope is often better, and in some cases even
necessary. So I find that I'm tending to use local scope in C more.
Interesting. Is it just that using C++ has given you what you would
think of as a bad habit in C, or has using C++ led you to see that your
old preference was not the best one?
Not sure. If I thought it was a terrible habit of course I wouldn't do
it. I do think it makes the code look a little bit less clear. But it's
slightly easier to write and hack, which is why I do it.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
David Brown
2024-02-07 13:49:42 UTC
Permalink
Post by Malcolm McLean
Post by David Brown
Post by Lawrence D'Oliveiro
They reuse "temp" variables instead of making new ones.
I like to limit the scope of my temporary variables. In C, this is as easy
as sticking a pair of braces around a few statements.
Generally, you want to have the minimum practical scope for your local
variables.  It's rare that you need to add braces just to make a scope
for a variable - usually you have enough braces in loops or
conditionals
- but it happens.
The two common patterns are to give each variable the minimum scope, or to
declare all variables at the start of the function and give them all
function scope.
The term "function scope" has a specific meaning in C.  Only labels have
function scope.  I know you are not very interested in using exact
terms, but some people might like to know the details.
To explain this, if we have
void function(void)
{
   int i;
   for (i = 0; i < 10; i++)
      dosomething();
   if (condition)
   {
      int i;
      for (i = 0; i < 11; i++)
         dosomething();
      if (i == 10)
        /* always false */
   }
}
The first i is not in scope when we test for i == 10 and the test will
be false. So "function scope" isn't the term.
"Function scope" is not the term, because - as has been explained to you
- "function scope" has a specific meaning in C, and this is not it.

Everyone can figure out what you are trying to say - you mean the
outermost block scope of the function. It's just block scope, as normal.

(By the way, you do know that Thunderbird has a pretty good spell
checker? I don't want to get hung up on this, and don't want to start a
new branch or argument, but avoiding the silly typos in your posts would
improve them.)
Post by Malcolm McLean
void function(void)
{
label:
   dosomething();
   if (condition)
   {
   label:
      dosomething();
   }
   got label:
}
Then it is an error. Both labels are in scope and that isn't allowed.
Yes, that's because labels have function scope in C.
Post by Malcolm McLean
Since you want to argue for the peculiar (but common) practice of giving
names the largest possible scope (without altering their linkage) you
need a term for the outer-most block scope, but "function scope" is
taken.
So "function scope" isn't the correct term. So we need another. I expect
that at this point someone will jump in and say it must be "Malcolm
scope". As you say, it's common enough to need a term for it.
We don't need a new term. We have the terms in the C standards. Block
scope is fine.

Note that there is another very big difference between "function scope"
and "block scope". Labels in function scope are in scope within the
function, even before they are declared. For identifiers in block
scope, their scope does not start until they are declared.
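
A minimal example of that difference (assuming C99 or later for the mid-block declaration):

    void f(void)
    {
        goto done;  /* fine: labels have function scope, so "done" is
                       visible here even though it appears further down */
    done:
        ;
        int n = 0;  /* block scope: n is visible only from this
                       declaration to the end of the block */
        (void)n;
    }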
Post by Malcolm McLean
The case for minimum scope is the same as the case for scope itself.
Someone might well misinterpret the term "minimum scope" since it would
require adding lots of otherwise redundant braces.  I *think* you mean
declaring names at the point of first use.  The resulting scope is not
minimum because it often extends beyond the point of last use.
Yes, I don't mean literally the minimum scope that would be possible by
artificially ending a block when a variable is used for the last time.
No one would do that. I mean that the variable is either declared at
point of first use or, if this isn't allowed because of the C version,
at the top of the block in which it is used. But also that variables are
not reused if in fact the value is discarded between statements or
especially between blocks.
Other people, not familiar with "modern" C, might interpret the term to
mean declaring names at the top of the inner-most appropriate block.
Top of the block or point of first use?
In C90, you have to declare your variables before any statements within
the block. In C99, you can intermingle declarations and statements.
Thus even in C90, you can still have top of block declarations.
Post by Malcolm McLean
The
variable is accessible where it is used and not elsewhere, which makes it
less likely it will be used in error, and means there are fewer names to
understand.
The case for declaration at first use is much stronger than this.  It
almost always allows for a meaningful initialisation at the same point,
so the initialisation does not need to be hunted down a checked.  For
me, this is a big win.  (Yes, some people then insist on a dummy
initialisation when the proper one isn't know, but that's a fudge that
is, to my mind, even worse.)
If you go for top of block and you don't have a value, you either
initialise, usually to zero, or leave it wild. Neither is ideal.
Leaving it uninitialised is /much/ better, unless you are using weak
tools or don't know how to use them properly. (There can be
circumstances where code is too complex for compilers to be sure that a
variable is never used uninitialised, and you might find it appropriate
to give a dummy initialisation in that case. But such cases are rare.)

Even better, of course, is not to declare the variable at all until you
have something sensible to put in it. (And then consider making it
"const" if it does not change.)
Post by Malcolm McLean
But it
rarely makes a big difference. However if you go for policy two, all the
variables are either given initial values at the top of the function or
they are not given initial values at the top of the function, and so you
can easily check, and ensure that all the initial values are consistent
with each other.
If you declare your variables when you have a value for them, then the
initial values are all clear and consistent, and have no artificial
values, and in many cases, they never change. Having your variables
unchanging makes code /much/ easier to understand and check for correctness.
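
A small before-and-after sketch (read_count and cfg are invented stand-ins):

    /* top-of-block with a dummy value: the 0 is artificial, and the
       variable stays mutable for the rest of the function */
    int count = 0;
    count = read_count(cfg);

    /* declared at first use: one meaningful initialisation, and it can
       be const, so a reader knows it never changes */
    const int n = read_count(cfg);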
Post by Malcolm McLean
We could call it outer-most block scope rather than re-use a term with
an existing, but different, technical meaning.
The variable has scope within the function, within the whole of the
function, and the motive is that the function is the natural unit of
thought. So I think we need the word "function".
No, we don't. And no, the scope is /not/ the entire function.
Post by Malcolm McLean
However I use a lot of C++ these
days, and in C++ local scope is often better, and in some cases even
necessary. So I find that I'm tending to use local scope in C more.
Interesting.  Is it just that using C++ has given you what you would
think of as a bad habit in C, or has using C++ led you to see that your
old preference was not the best one?
Not sure. If I thought it was a terrible habit of course I wouldn't do
it. I do think it makes the code look a little bit less clear. But it's
slightly easier to write and hack, which is why I do it.
Ben Bacarisse
2024-02-07 16:13:14 UTC
Permalink
Post by Malcolm McLean
Post by Ben Bacarisse
Post by David Brown
Post by Lawrence D'Oliveiro
They reuse "temp" variables instead of making new ones.
I like to limit the scope of my temporary variables. In C, this is as easy
as sticking a pair of braces around a few statements.
Generally, you want to have the minimum practical scope for your local
variables.  It's rare that you need to add braces just to make a scope
for a variable - usually you have enough braces in loops or conditionals
- but it happens.
The two common patterns are to give each variable the minimum scope, or to
declare all variables at the start of the function and give them all
function scope.
The term "function scope" has a specific meaning in C. Only labels have
function scope. I know you are not very interested in using exact
terms, but some people might like to know the details.
To explain this, if we have
What is the "this" that you are explaining?
Post by Malcolm McLean
void function(void)
{
    int i;

    for (i = 0; i < 10; i++)
        dosomething();
    if (condition)
    {
        int i;

        for (i = 0; i < 11; i++)
            dosomething();
        if (i == 10)
            /* always false */ ;
    }
}
The first i is not in scope when we test for i == 10 and the test will be
false. So "function scope" isn't the term.
"function scope" is not the term because only labels have function
scope. This example does not explain anything about the term
"functions scope" -- even why it's the wrong term.
Post by Malcolm McLean
void function(void)
{
label:
    dosomething();
    if (condition)
    {
    label:
        dosomething();
    }
    got label:
(you mean "goto label;")
Post by Malcolm McLean
}
Then it is an error. Both labels are in scope and that isn't allowed.
The key thing about the scope of labels is that they can be used before
they are defined:

int *f(int *p)
{
    if (!p) goto error;
    ...
error:
    return p;
}
Post by Malcolm McLean
Post by Ben Bacarisse
Since you want to argue for the peculiar (but common) practice of giving
names the largest possible scope (without altering their linkage) you
need a term for the outer-most block scope, but "function scope" is
taken.
So "function scope" isn't the correct term. So we need another. I expect
that at this point someone will jump in and say it must be "Malcolm
scope". As you say, it's common enough to need a term for it.
I see no reason not to call it "the outer-most block scope".
Post by Malcolm McLean
Post by Ben Bacarisse
The case for minimum scope is the same as the case for scope itself.
Someone might well misinterpret the term "minimum scope" since it would
require adding lots of otherwise redundant braces. I *think* you mean
declaring names at the point of first use. The resulting scope is not
minimum because it often extends beyond the point of last use.
Yes, I don't mean literally the minimum scope that would be possible by
artificially ending a block when a variable is used for the last time. No
one would do that. I mean that the variable is either declared at point of
first use or, if this isn't allowed because of the C version, at the top of
the block in which it is used. But also that variables are not reused if in
fact the value is discarded between statements or especially between
blocks.
Post by Ben Bacarisse
Other people, not familiar with "modern" C, might interpret the term to
mean declaring names at the top of the inner-most appropriate block.
Top of the block or point of first use?
I don't know what you are asking. I was trying to point out these two
possible meanings for "minimum scope".
Post by Malcolm McLean
Post by Ben Bacarisse
The
variable is accessible where it is used and not elsewhere, which makes it
less likely it will be used in error, and means there are fewer names to
understand.
The case for declaration at first use is much stronger than this. It
almost always allows for a meaningful initialisation at the same point,
so the initialisation does not need to be hunted down a checked. For
me, this is a big win. (Yes, some people then insist on a dummy
initialisation when the proper one isn't know, but that's a fudge that
is, to my mind, even worse.)
If you go for top of block and you don't have a value, you either
initialise, usually to zero, or leave it wild. Neither is ideal. But it
rarely makes a big difference. However if you go for policy two, all the
variables are either given initial values at the top of the function or
they are not given initial values at the top of the function, and so you can
easily check, and ensure that all the initial values are consistent with
each other.
What?
Post by Malcolm McLean
Post by Ben Bacarisse
We could call it outer-most block scope rather than re-use a term with
an existing, but different, technical meaning.
The variable has scope within the function, within the whole of the
function, and the motive is that the function is the natural unit of
thought. So I think we need the word "function".
You need the word function. I don't.
--
Ben.
Keith Thompson
2024-02-07 16:21:20 UTC
Permalink
[...]
Post by Malcolm McLean
Post by Ben Bacarisse
Since you want to argue for the peculiar (but common) practice of giving
names the largest possible scope (without altering their linkage) you
need a term for the outer-most block scope, but "function scope" is
taken.
So "function scope" isn't the correct term. So we need another. I
expect that at this point someone will jump in and say it must be
"Malcolm scope". As you say, it's common enough to need a term for it.
Please, no, not "Malcolm scope". That's the kind of thing that gets
suggested as a last resort, or as a joke, when you insist on using
existing terminology with your own idiosyncratic meaning.

"Outermost block scope" is a clear and correct description of what
you're talking about. Though what you're probably talking about is
outermost block scope before any statements. Or just "at the top of the
function definition".

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */
David Brown
2024-02-07 13:01:27 UTC
Permalink
Post by Malcolm McLean
Post by David Brown
Post by Lawrence D'Oliveiro
They reuse "temp" variables instead of making new ones.
I like to limit the scope of my temporary variables. In C, this is as easy
as sticking a pair of braces around a few statements.
Generally, you want to have the minimum practical scope for your local
variables.  It's rare that you need to add braces just to make a scope
for a variable - usually you have enough braces in loops or
conditionals - but it happens.
The two common patterns are to give each variable the minimum scope, or
to declare all variables at the start of the function and give them all
function scope.
The case for minimum scope is the same as the case for scope itself. The
variable is accessible where it is used and not elsewhere, which makes
it less likely it will be used in error, and means there are fewer names
to understand.
It makes code simpler, clearer, easier to reuse, easier to see that it
is correct, and easier to see if there is an error. It is very much
easier for automatic tools (static warnings) to spot issues.
Post by Malcolm McLean
However there are also strong arguments for function scope.
Not in my experience and in my opinion.
Post by Malcolm McLean
A function
is a natural unit.
True, but irrelevant.
Post by Malcolm McLean
And all the variables used in that unit are listed
together and, ideally, commented.
In reality, not commented. And if commented, then commented incorrectly.

Rather than trying to write vague comments to say what something is or how
it is used, it is better to write the code so that it is clear. Giving
variables appropriate names is part of that. For the most part, I'd say
if you think a variable needs a comment, your code is not clear enough
or has poor structure.

It is /massively/ simpler and clearer to write :

for (int i = 0; i < 10; i++) { ... }

than

int i;

/* ... big gap ... */

for (i = 0; i < 10; i++) { ... }

It doesn't help if you have "int loop_index;" or add a comment to the
variable definition. Putting it at the loop itself is better.
Post by Malcolm McLean
So at a glance you can see what is in
scope and what is being operated on. And there are only three levels of
scope. A variable is global, or it is file scope, or it is scoped to the
function.
Every block is a new scope. Function scope in C is only for labels.
Post by Malcolm McLean
I tend to prefer function scope for C. However I use a lot of C++ these
days, and in C++ local scope is often better, and in some cases even
necessary. So I find that I'm tending to use local scope in C more.
I hate having to work with code written in long-outdated "declare
everything at the top of the function" style. I realise style and
experience are subjective, but I have not seen any code or any argument
that has led me to doubt my preferences.
Richard Harnden
2024-02-07 13:21:38 UTC
Permalink
Post by David Brown
Post by Malcolm McLean
Post by David Brown
Post by Lawrence D'Oliveiro
They reuse "temp" variables instead of making new ones.
I like to limit the scope of my temporary variables. In C, this is as easy
as sticking a pair of braces around a few statements.
Generally, you want to have the minimum practical scope for your
local variables.  It's rare that you need to add braces just to make
a scope for a variable - usually you have enough braces in loops or
conditionals - but it happens.
The two common patterns are to give each variable the minimum scope,
or to declare all variables at the start of the function and give them
all function scope.
The case for minimum scope is the same as the case for scope itself.
The variable is accessible where it is used and not elsewhere, which
makes it less likely it will be used in error, and means there are
fewer names to understand.
It makes code simpler, clearer, easier to reuse, easier to see that it
is correct, and easier to see if there is an error.  It is very much
easier for automatic tools (static warnings) to spot issues.
Post by Malcolm McLean
However there are also strong arguments for function scope.
Not in my experience and in my opinion.
Post by Malcolm McLean
A function is a natural unit.
True, but irrelevant.
Post by Malcolm McLean
And all the variables used in that unit are listed together and,
ideally, commented.
In reality, not commented.  And if commented, then commented incorrectly.
Rather than trying to write vague comments to say what something is or how
it is used, it is better to write the code so that it is clear.  Giving
variables appropriate names is part of that.  For the most part, I'd say
if you think a variable needs a comment, your code is not clear enough
or has poor structure.
    for (int i = 0; i < 10; i++) { ... }
than
    int i;
    /* ... big gap ... */
    for (i = 0; i < 10; i++) { ... }
It doesn't help if you have "int loop_index;" or add a comment to the
variable definition.  Putting it at the loop itself is better.
Post by Malcolm McLean
So at a glance you can see what is in scope and what is being operated
on. And there are only three levels of scope. A variable is global, or
it is file scope, or it is scoped to the function.
Every block is a new scope.  Function scope in C is only for labels.
Post by Malcolm McLean
I tend to prefer function scope for C. However I use a lot of C++
these days, and in C++ local scope is often better, and in some cases
even necessary. So I find that I'm tending to use local scope in C more.
We could have 'malcolm-scope' ?!

(sorry :) )
Post by David Brown
I hate having to work with code written in long-outdated "declare
everything at the top of the function" style.  I realise style and
experience are subjective, but I have not seen any code or any argument
that has led me to doubt my preferences.
Malcolm McLean
2024-02-07 13:42:36 UTC
Permalink
Post by David Brown
Post by Malcolm McLean
The case for minimum scope is the same as the case for scope itself.
The variable is accessible where it is used and not elsewhere, which
makes it less likely it will be used in error, and means there are
fewer names to understand.
It makes code simpler, clearer, easier to reuse, easier to see that it
is correct, and easier to see if there is an error.  It is very much
easier for automatic tools (static warnings) to spot issues.
This is all true, but only in one way. Whilst it's easier to see that there
are errors in one way, because you have to look at a smaller section of
code, it's harder in others, for example because that small section is
more cluttered. In my experience, automatic tools give too many
false warnings for correct code, and then programmers often rewrite the
code less clearly to suppress the warning.
Post by David Brown
Post by Malcolm McLean
However there are also strong arguments for function scope.
Not in my experience and in my opinion.
That's not a legitimate response. The correct thing to say is "you have
given a argment there but I don't think it is strong one". Unless you
are claiming to be experieenced in arguing with people over scope, and I
donlt think that is what yiu mean to say,
Post by David Brown
Post by Malcolm McLean
A function is a natural unit.
True, but irrelevant.
Post by Malcolm McLean
And all the variables used in that unit are listed together and,
ideally, commented.
In reality, not commented.  And if commented, then commented incorrectly.
Variable names mean something. The classic name for a variable is "x".
This usually means either "the value that is given" or "the horizontal
value on an axis". But it can of ciurse mean "a value which we shall
calculate that doesn;t have an abvous other name", or even maybe, "the
nunber of times the letter "x" appears in the data. It depnds on
context. However the imprtant thing is that x should always mean the
same thing within the same function. So if it's a real on the horizontal
axis of a graph, we don't also use "x" for an integer we need to
factorise, in the same function. And if it isn't clear, (x is such a
strong convention that it seldom needs a comment), we need to say how
"x" is being used and what it means in that function. Function and not
block is the unit for that.
Post by David Brown
Rather than trying to write vague comments to say what something is how
it is used, it is better to write the code so that it is clear.  Giving
variables appropriate names is part of that.  For the most part, I'd say
if you think a variable needs a comment, your code is not clear enough
or has poor structure.
I prefer short variable names because it is the mathematical convention
and because it makes complex expressions easier to read. But of course
then they can't be as meaningful. So to use a short name and add a
comment is a reasonable way to achieve both goals.
Post by David Brown
    for (int i = 0; i < 10; i++) { ... }
than
    int i;
    /* ... big gap ... */
    for (i = 0; i < 10; i++) { ... }
It doesn't help if you have "int loop_index;" or add a comment to the
variable definition.  Putting it at the loop itself is better.
This pattern is quite common in C.

for (i = 0; i < N; i++)
    if (x[i] == 0)
        break;
if (i == N) /* no zero found */

So you can't scope the counter to the loop.

i is always a loop index. Usually I just put one at the top so it is
hanging around and handy.
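
(For comparison, a sketch of one common C99-style alternative that keeps the counter scoped to the loop, with N and x taken from the example above:)

    int found = 0;  /* set when a zero is seen */
    for (int i = 0; i < N; i++)
    {
        if (x[i] == 0)
        {
            found = 1;
            break;
        }
    }
    if (!found) /* no zero found */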
Post by David Brown
I hate having to work with code written in long-outdated "declare
everything at the top of the function" style.  I realise style and
experience are subjective, but I have not seen any code or any argument
that has led me to doubt my preferences.
I quite often work with code which was written a very long time ago and
is still useful. That's one of the big strengths of C. It is subjective
however. It's not about making life easier for the compiler. It's about
what is clearer. That depends on the way people read code and think
about it, and that won't necessarily be the same for every person.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
David Brown
2024-02-07 15:17:54 UTC
Permalink
Post by Malcolm McLean
Post by David Brown
Post by Malcolm McLean
The case for minimum scope is the same as the case for scope itself.
The variable is accessible where it is used and not elsewhere, which
makes it less likely it will be used in error, and means there are
fewer names to understand.
It makes code simpler, clearer, easier to reuse, easier to see that it
is correct, and easier to see if there is an error.  It is very much
easier for automatic tools (static warnings) to spot issues.
This is all true, but only in one way. Whilst it's easier to see that there
are errors in one way, because you have to look at a smaller section  of
code, it's harder in others, for example because that small section is
more cluttered.
No, it is not - unless you write it very badly.
Post by Malcolm McLean
In my experience, automatic tools give too many
false warnings for correct code, and then programmers often rewrite the
code less clearly to suppress the warning.
You need to use good tools, and you need to know how to use them. It is
unfortunately the case that some people are poor programmers - they
write bad code, and they don't know how to get the best from their tools.

But is that an excuse for /you/ not to write the best code you can, in
the clearest and most maintainable manner, using the best practical
tools to help catch any errors?
Post by Malcolm McLean
Post by David Brown
Post by Malcolm McLean
However there are also strong arguments for function scope.
Not in my experience and in my opinion.
That's not a legitimate response. The correct thing to say is "you have
given a argment there but I don't think it is strong one".
My experience and opinion is that there are no strong arguments in
favour of "all declarations at the top of the function." That is what I
meant to say, and it is a legitimate response.
Post by Malcolm McLean
Unless you
are claiming to be experieenced in arguing with people over scope, and I
donlt think that is what yiu mean to say,
/Please/ get a spell checker! Or type more carefully.
Post by Malcolm McLean
Post by David Brown
Post by Malcolm McLean
A function is a natural unit.
True, but irrelevant.
Post by Malcolm McLean
And all the variables used in that unit are listed together and,
ideally, commented.
In reality, not commented.  And if commented, then commented incorrectly.
Variable names mean something. The classic name for a variable is "x".
This usually means either "the value that is given" or "the horizontal
value on an axis". But it can of ciurse mean "a value which we shall
calculate that doesn;t have an abvous other name", or even maybe, "the
nunber of times the letter "x" appears in the data. It depnds on
context. However the imprtant thing is that x should always mean the
same thing within the same function.
No.

The important thing is that the purpose of a variable should be clear
within its scope and use. It is completely artificial to suggest it
should be consistent within a function - you could equally well say it
should be consistent within a file, or within a block.
Post by Malcolm McLean
Post by David Brown
Rather than trying to write vague comments to say what something is or
how it is used, it is better to write the code so that it is clear.
Giving variables appropriate names is part of that.  For the most
part, I'd say if you think a variable needs a comment, your code is
not clear enough or has poor structure.
I prefer short variable names because it is the mathematical convention
and because it makes complex expressions easier to read. But of course
then they can't be as meaningful. So to use a short name and add a
comment is a reasonable way to achieve both goals.
Or, far better, use small scopes and then variables can have short names
without comments and be clear.
Post by Malcolm McLean
Post by David Brown
     for (int i = 0; i < 10; i++) { ... }
than
     int i;
     /* ... big gap ... */
     for (i = 0; i < 10; i++) { ... }
It doesn't help if you have "int loop_index;" or add a comment to the
variable definition.  Putting it at the loop itself is better.
This pattern is quite common in C.
for (i = 0; i < N; i++)
  if (x[i] == 0)
     break;
if (i == N) /* no zero found */
If you need to do that, you need a bigger scope for "i". But it would
be insane to use worse code style for 95% of your loops for the 5% (or
less) that need this.
Post by Malcolm McLean
So you can't scope the counter to the loop.
i is always a loop index. Usually I just put one at the top so it is
hanging around and handy.
Laziness is not good.
Post by Malcolm McLean
Post by David Brown
I hate having to work with code written in long-outdated "declare
everything at the top of the function" style.  I realise style and
experience are subjective, but I have not seen any code or any
argument that has led me to doubt my preferences.
I quite often work with code which was written a very long time ago and
is still useful. That's one of the big strengths of C. It is subjective
however. It's not about making life easier for the compiler. It's about
what is clearer. That depends on the way people read code and think
about it, and that won't necessarily be the same for every person.
Lawrence D'Oliveiro
2024-02-07 21:34:36 UTC
Permalink
Post by David Brown
It makes code simpler, clearer, easier to reuse, easier to see that it
is correct, and easier to see if there is an error. It is very much
easier for automatic tools (static warnings) to spot issues.
Here’s an example of how granular I like to make my scopes:

struct pollfd topoll[MAX_WATCHES + 1];
int total_timeout = -1; /* to begin with */
for (int i = 0; i < nr_watches; ++i)
  {
    DBusWatch * const watch = watches[i];
    struct pollfd * const entry = topoll + i;
    entry->fd = dbus_watch_get_unix_fd(watch);
    entry->events = 0; /* to begin with */
    if (dbus_watch_get_enabled(watch))
      {
        const int flags = dbus_watch_get_flags(watch);
        if ((flags & DBUS_WATCH_READABLE) != 0)
          {
            entry->events |= POLLIN | POLLERR;
          } /*if*/
        if ((flags & DBUS_WATCH_WRITABLE) != 0)
          {
            entry->events |= POLLOUT | POLLERR;
          } /*if*/
      } /*if*/
  } /*for*/
{
  struct pollfd * const entry = topoll + nr_watches;
  entry->fd = notify_receive_pipe;
  entry->events = POLLIN;
}
for (int i = 0; i < nr_timeouts; ++i)
  {
    DBusTimeout * const timeout = timeouts[i];
    if (dbus_timeout_get_enabled(timeout))
      {
        const int interval = dbus_timeout_get_interval(timeout);
        if (total_timeout < 0 or total_timeout > interval)
          {
            total_timeout = interval;
          } /*if*/
      } /*if*/
  } /*for*/
const long timeout_start = get_milliseconds();
bool got_io;
{
  const int sts = poll(topoll, nr_watches + 1, total_timeout);
  fprintf(stderr, "poll returned status %d\n", sts);
  if (sts < 0)
    {
      perror("doing poll");
      die();
    } /*if*/
  got_io = sts > 0;
}
Scott Lurndal
2024-02-07 16:21:25 UTC
Permalink
Post by Malcolm McLean
Post by David Brown
Post by Lawrence D'Oliveiro
They reuse "temp" variables instead of making new ones.
I like to limit the scope of my temporary variables. In C, this is as easy
as sticking a pair of braces around a few statements.
Generally, you want to have the minimum practical scope for your local
variables.  It's rare that you need to add braces just to make a scope
for a variable - usually you have enough braces in loops or conditionals
- but it happens.
The two common patterns are to give each variable the minimum scope, or
to declare all variables at the start of the function and give them all
function scope.
The case for minimum scope is the same as the case for scope itself. The
variable is accessible where it is used and not elsewhere, which makes
it less likely it will be used in error, and means there are fewer names
to understand.
And it means the compiler can re-use the local storage (if any was
allocated) for subsequent minimal scope variables (or even same scope
if the compiler knows the original variable is never used again),
so long as the address of the variable isn't taken.
Michael S
2024-02-08 11:26:57 UTC
Permalink
On Wed, 07 Feb 2024 16:21:25 GMT
Post by Scott Lurndal
Post by Malcolm McLean
Post by David Brown
Post by Lawrence D'Oliveiro
They reuse "temp" variables instead of making new ones.
I like to limit the scope of my temporary variables. In C, this is as easy
as sticking a pair of braces around a few statements.
Generally, you want to have the minimum practical scope for your
local variables.  It's rare that you need to add braces just to
make a scope for a variable - usually you have enough braces in
loops or conditionals
- but it happens.
The two common patterns are to give each variable the minimum scope,
or to declare all variables at the start of the function and give
them all function scope.
The case for minimum scope is the same as the case for scope itself.
The variable is accessible where it is used and not elsewhere, which
makes it less likely it will be used in error, and means there are
fewer names to understand.
And it means the compiler can re-use the local storage (if any was
allocated) for subsequent minimal scope variables (or even same scope
if the compiler knows the original variable is never used again),
so long as the address of the variable isn't taken.
That's completely orthogonal to the scope of declaration, at least as
long as the compiler is not completely idiotic.
Ben Bacarisse
2024-02-07 10:04:25 UTC
Permalink
Post by David Brown
Making some "temp" variables and re-using them was also common for some
people in idiomatic C90 code, where all your variables are declared at the
top of the function.
The comma suggests (I think) that it is C90 that mandates that all one's
variables are declared at the top of the function. But that's not the
case (as I am sure you know). The other reading -- that this is done in
idiomatic C90 code -- is also something that I'd question, but not
something that I'd want to argue.

I comment just because there seems to be a myth that "old C" had to have
all the declarations at the top of a function. That was true once, but
so long ago as to be irrelevant. Even K&R C allowed declarations at the
top of a compound statement.
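
For instance, even under C90 rules something like this is valid, because the declaration sits at the top of the inner compound statement (f and use are invented names):

    void f(int n)
    {
        if (n > 0) {
            int twice = 2 * n;  /* legal in C90: top of a compound statement */
            use(twice);
        }
    }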
--
Ben.
David Brown
2024-02-07 13:51:46 UTC
Permalink
Post by Ben Bacarisse
Post by David Brown
Making some "temp" variables and re-using them was also common for some
people in idiomatic C90 code, where all your variables are declared at the
top of the function.
The comma suggests (I think) that it is C90 that mandates that all one's
variables are declared at the top of the function. But that's not the
case (as I am sure you know).
Yes.
Post by Ben Bacarisse
The other reading -- that this is done in
idiomatic C90 code -- is also something that I'd question, but not
something that I'd want to argue.
"Idiomatic" is perhaps not the best word. (And "idiotic" is too
strong!) I mean written in a way that is quite common in C90 code.
Post by Ben Bacarisse
I comment just because there seems to be a myth that "old C" had to have
all the declarations at the top of a function. That was true once, but
so long ago as to be irrelevant. Even K&R C allowed declarations at the
top of a compound statement.
It's good to make it clear.
Ben Bacarisse
2024-02-07 15:30:12 UTC
Permalink
Post by Ben Bacarisse
Post by David Brown
Making some "temp" variables and re-using them was also common for some
people in idiomatic C90 code, where all your variables are declared at the
top of the function.
The comma suggests (I think) that it is C90 that mandates that all one's
variables are declared at the top of the function. But that's not the
case (as I am sure you know).
Yes.
Post by Ben Bacarisse
The other reading -- that this is done in
idiomatic C90 code -- is also something that I'd question, but not
something that I'd want to argue.
"Idiomatic" is perhaps not the best word. (And "idiotic" is too strong!)
I mean written in a way that is quite common in C90 code.
The most common meaning of "idiomatic", and the one I usually associate
with it in this context, is "containing expressions that are natural and
correct". That's not how I would describe eschewing declarations in
inner blocks.
--
Ben.
Malcolm McLean
2024-02-07 15:45:09 UTC
Permalink
Post by Ben Bacarisse
Post by Ben Bacarisse
Post by David Brown
Making some "temp" variables and re-using them was also common for some
people in idiomatic C90 code, where all your variables are declared at the
top of the function.
The comma suggests (I think) that it is C90 that mandates that all one's
variables are declared at the top of the function. But that's not the
case (as I am sure you know).
Yes.
Post by Ben Bacarisse
The other reading -- that this is done in
idiomatic C90 code -- is also something that I'd question, but not
something that I'd want to argue.
"Idiomatic" is perhaps not the best word. (And "idiotic" is too strong!)
I mean written in a way that is quite common in C90 code.
The most common meaning of "idiomatic", and the one I usually associate
with it in this context, is "containing expressions that are natural and
correct". That's not how I would describe eschewing declarations in
inner blocks.
No. It means writing the code in a way which is common in C and has
certain advantages, but is not so in other languages.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
David Brown
2024-02-07 20:44:26 UTC
Permalink
Post by Malcolm McLean
Post by Ben Bacarisse
Post by Ben Bacarisse
Post by David Brown
Making some "temp" variables and re-using them was also common for some
people in idiomatic C90 code, where all your variables are declared at the
top of the function.
The comma suggests (I think) that it is C90 that mandates that all one's
variables are declared at the top of the function.  But that's not the
case (as I am sure you know).
Yes.
Post by Ben Bacarisse
  The other reading -- that this is done in
idiomatic C90 code -- is also something that I'd question, but not
something that I'd want to argue.
"Idiomatic" is perhaps not the best word.  (And "idiotic" is too
strong!)
I mean written in a way that is quite common in C90 code.
The most common meaning of "idiomatic", and the one I usually associate
with it in this context, is "containing expressions that are natural and
correct".  That's not how I would describe eschewing declarations in
inner blocks.
Some people do feel it is more "natural" to have all their declarations
at the start of their functions (and never declare variables in any
inner block scopes). It's common, and their code can be correct. You
and I both think there are usually better ways to structure code, but
does that mean it is not "idiomatic" ? I'm not sure there is a good
answer here. Unfortunately the C standards don't define the term
"idiomatic" :-(

If you can think of a better term to use here, I'd be happy to hear it -
otherwise I think we all know the kind of code structure I meant, which
was the most important point.
Post by Malcolm McLean
No. It means writing the code in a way which is common in C and has
certain advantages, but is not so in other languages.
An idiom in C could also be an idiom in C++, Python, or any other
language. Nothing in "idiomatic" implies that it is unique to a
particular language, just that it is commonly used in that language.
Malcolm McLean
2024-02-08 00:33:35 UTC
Permalink
Post by David Brown
Post by Malcolm McLean
Post by Ben Bacarisse
The most common meaning of "idiomatic", and the one I usually associate
with it in this context, is "containing expressions that are natural and
correct".  That's not how I would describe eschewing declarations in
inner blocks.
No. It means writing the code in a way which is common in C and has
certain advantages, but is not so in other languages.
An idiom in C could also be an idiom in C++, Python, or any other
language.  Nothing in "idiomatic" implies that it is unique to a
particular language, just that it is commonly used in that language.
We must be able to point to at least one other language where it is not
the idiom, in order to say that it is an idiom.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
Kaz Kylheku
2024-02-08 01:30:25 UTC
Permalink
Post by Malcolm McLean
Post by David Brown
Post by Malcolm McLean
Post by Ben Bacarisse
The most common meaning of "idiomatic", and the one I usually associate
with it in this context, is "containing expressions that are natural and
correct".  That's not how I would describe eschewing declarations in
inner blocks.
No. It means writing the code in a way which is common in C and has
certain advantages, but is not so in other languages.
An idiom in C could also be an idiom in C++, Python, or any other
language.  Nothing in "idiomatic" implies that it is unique to a
particular language, just that it is commonly used in that language.
We must be able to point to at least one other language where it is not
the idiom, in order to say that it is an idiom.
There are two meanings of *idiom*.

The "strong" meaning of idiom is that it's a meaning arbitrarily
assigned to a canned combination of words which otherwise make no
sense or are ungrammatical.

The "weak" meaning refers to some often used phrase.

Your proposed rule has a logical flaw because it requires us to confirm
that something is not an idiom in order to confirm that it is.
Even though that is in separate languages, it is still a problem.

Suppose that all the literal translations of some phrase X into all
known languages are naively considered idioms by all their respective
speakers.

Then according to your criterion, none of the languages have the right
to consider it to be an idiom.

Suppose that English speakers are the first to realize this problem, and
choose to stop considering the English translation of X an idiom.

At that point, the other remaining languages may keep it as an idiom:
their speakers can now point to at least one language where it isn't.

Problem is, the choice of which group must stop treating it as an idiom,
so that the others may, is arbitrary.

This is not a well-founded definition for a term!
Lawrence D'Oliveiro
2024-02-08 01:38:12 UTC
Permalink
Post by Malcolm McLean
We must be able to point to at least one other language where it is not
the idiom, in order to say that it is an idiom.
How about pointing to alternative ways it might be said in the same
language, and then proclaiming that “for some reason, nobody who uses the
language is supposed to do it that way”?
Malcolm McLean
2024-02-08 02:21:53 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Malcolm McLean
We must be able to point to at least one other language where it is not
the idiom, in order to say that it is an idiom.
How about pointing to alternative ways it might be said in the same
language, and then proclaiming that “for some reason, nobody who uses the
language is supposed to do it that way”?
So how do you say "My French is lousy" in idiomatic French?
According to a Frenchman, it is "Doucement. Le Francais n'est pas ma
langue maternelle."
Now that means literally "Softly. The French is not my maternal language".
You wouldn't say that in English. You'd say "go easy" instead of
"softly". It would be "French" rather than "the French". And whilst you
might say "maternal language" it would be rare. Normally it would be
"native language".
So the French has one idiom and the English another, and we say things
in a slightly different way. What is the convention in one is not so in
the other, and that is what makes it idiom.

And of course the Frenchman made the point that whilst his information
was correct, to actually use his translation would be self-refuting.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
Kaz Kylheku
2024-02-08 03:07:35 UTC
Permalink
Post by Malcolm McLean
Post by Lawrence D'Oliveiro
Post by Malcolm McLean
We must be able to point to at least one other language where it is not
the idiom, in order to say that it is an idiom.
How about pointing to alternative ways it might be said in the same
language, and then proclaiming that “for some reason, nobody who uses the
language is supposed to do it that way”?
So how do you say "My French is lousy" in idiomatic French?
Je parle Quebecois.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
Ben Bacarisse
2024-02-08 11:45:07 UTC
Permalink
Post by Ben Bacarisse
Post by Ben Bacarisse
Post by David Brown
Making some "temp" variables and re-using them was also common for some
people in idiomatic C90 code, where all your variables are declared at the
top of the function.
The comma suggests (I think) that it is C90 that mandates that all one's
variables are declared at the top of the function. But that's not the
case (as I am sure you know).
Yes.
Post by Ben Bacarisse
The other reading -- that this is done in
idiomatic C90 code -- is also something that I'd question, but not
something that I'd want to argue.
"Idiomatic" is perhaps not the best word. (And "idiotic" is too strong!)
I mean written in a way that is quite common in C90 code.
The most common meaning of "idiomatic", and the one I usually associate
with it in this context, is "containing expressions that are natural and
correct". That's not how I would describe eschewing declarations in
inner blocks.
No. It means writing the code in a way which is common in C and has certain
advantages, but is not so in other languages.
Where do you get your superior knowledge of English from, and is there a
way anyone else can hope to achieve your level of competence?
--
Ben.
Malcolm McLean
2024-02-08 12:15:01 UTC
Permalink
Post by Ben Bacarisse
No. It means writing the code in a way which is common in C and has certain
advantages, but is not so in other languages.
Where do you get your superior knowledge of English from, and is there a
way anyone else can hope to achieve your level of competence?
Degree in English literature.
Places are difficult to obtain but not impossible. You need to convince
the dons that you deserve one as many other people will be after them.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
Michael S
2024-02-05 17:02:09 UTC
Permalink
On Mon, 5 Feb 2024 05:58:55 -0000 (UTC)
Post by Kaz Kylheku
Post by bart
* Nearly everyone here is working on massively huge and complex
projects, which all take from minutes to hours for a full build.
That's the landscape. Nobody is going to pay you for writing small
utilities in C. That sort of thing all went to scripting languages.
(It happens from time to time as a side task.)
I currently work on a firmware application that compiles to a 100
megabyte (stripped!) executable.
My last-but-one firmware project compiles from scratch in 0m1.623s
despite using bloated STmicro libraries and headers.
On Windows, with antivirus running, using a 10 y.o. PC.
With a brand-new CPU, bare-metal Linux and a modern NVMe SSD it will likely
finish 3 times faster.
Windows by itself is not a measurable slowdown, but antivirus is, and
until now I didn't find a way to get antivirus-free Windows at work.

Projects that small are not typical in my embedded development
practice. But embedded projects that, on somewhat beefier 5 y.o.
hardware, compile from scratch in less than 5 sec are typical.

As to PC development, the project that I am trying to fix right now uses
link-time code generation, so it takes ~8 seconds (VS 2019, msbuild,
command line tools) to rebuild when just one file has changed. I accept it
because it's not my own project. If it was mine, I'd probably want to
improve it. Besides, I have grown older and am more tolerant of delays
than I was 25-30 years ago.
Lawrence D'Oliveiro
2024-02-05 23:28:04 UTC
Permalink
Post by Michael S
Windows by itself is not a measurable slowdown, but antivirus is, and
until now I didn't find a way to get antivirus-free Windows at work.
But if you don’t have antivirus on your build machine, the sad fact of
development on Windows is that there are viruses that will insinuate
themselves into the build products.
Richard Harnden
2024-02-05 23:40:55 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Michael S
Windows by itself is not a measurable slowdown, but antivirus is, and
until now I didn't find a way to get antivirus-free Windows at work.
But if you don’t have antivirus on your build machine, the sad fact of
development on Windows is that there are viruses that will insinuate
themselves into the build products.
Reflections on Trusting Trust?
Michael S
2024-02-05 23:46:14 UTC
Permalink
On Mon, 5 Feb 2024 23:28:04 -0000 (UTC)
Post by Lawrence D'Oliveiro
Post by Michael S
Windows by itself is not a measurable slowdown, but antivirus is,
and until now I didn't find a way to get antivirus-free Windows at
work.
But if you don’t have antivirus on your build machine, the sad fact
of development on Windows is that there are viruses that will
insinuate themselves into the build products.
No, if I use Windows there is no danger of viruses like these.
Besides, it's not like antivirus could have helped against viruses if
I was stupid enough to catch them. On the contrary, I suspect that the
presence of antivirus increases the attack surface.
David Brown
2024-02-06 08:54:28 UTC
Permalink
Post by Michael S
On Mon, 5 Feb 2024 23:28:04 -0000 (UTC)
Post by Lawrence D'Oliveiro
Post by Michael S
Windows by itself is not a measurable slowdown, but antivirus is,
and until now I didn't find a way to get antivirus-free Windows at
work.
But if you don’t have antivirus on your build machine, the sad fact
of development on Windows is that there are viruses that will
insinuate themselves into the build products.
No, if I use Windows there is no danger of viruses like these.
Besides, it's not like antivirus could have helped against viruses if
I was stupid enough to catch them. On the contrary, I suspect that the
presence of antivirus increases the attack surface.
My experience is that antivirus programs rarely catch anything unless
the user is very gullible, or very unlucky. I have seen antivirus
programs block valid programs with false positives more often than I
have seen them catch actual malware. (And that's company wide, not just
my machines.) There is no major antivirus software that has not killed
at least some Windows machines by false-positive blocking of critical
Windows components.

And yes, there have been many successful attacks and hacks that get into
Windows machines via flaws in the massively over-complicated "security"
software.
Chris M. Thomasson
2024-02-06 00:03:26 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Michael S
Windows by itself is not a measurable slowdown, but antivirus is, and
until now I didn't find a way to get antivirus-free Windows at work.
But if you don’t have antivirus on your build machine, the sad fact of
development on Windows is that there are viruses that will insinuate
themselves into the build products.
There can be viruses hidden in source code for public domain code...
Build it and they will come! ;^o
Chris M. Thomasson
2024-02-06 00:06:02 UTC
Permalink
Post by Chris M. Thomasson
Post by Lawrence D'Oliveiro
Post by Michael S
Windows by itself is not a measurable slowdown, but antivirus is, and
until now I didn't find a way to get antivirus-free Windows at work.
But if you don’t have antivirus on your build machine, the sad fact of
development on Windows is that there are viruses that will insinuate
themselves into the build products.
There can be viruses hidden in source code for public domain code...
Build it and they will come! ;^o
Other viruses can be built, not infected... Run it, BAM!!
David Brown
2024-02-06 08:50:02 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Michael S
Windows by itself is not a measurable slowdown, but antivirus is, and
until now I didn't find a way to get antivirus-free Windows at work.
But if you don’t have antivirus on your build machine, the sad fact of
development on Windows is that there are viruses that will insinuate
themselves into the build products.
Nonsense. Well, /almost/ nonsense. When thinking about security, you
should not rule out anything entirely.

And of course there are those two or three unfortunate people that have
to work with embedded Windows.
Chris M. Thomasson
2024-02-06 09:01:31 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Michael S
Windows by itself is not a measurable slowdown, but antivirus is, and
until now I didn't find a way to get antivirus-free Windows at work.
But if you don’t have antivirus on your build machine, the sad fact of
development on Windows is that there are viruses that will insinuate
themselves into the build products.
Nonsense.  Well, /almost/ nonsense.  When thinking about security, you
should not rule out anything entirely.
And of course there are those two or three unfortunate people that have
to work with embedded Windows.
;^)
Lawrence D'Oliveiro
2024-02-06 23:24:41 UTC
Permalink
Post by David Brown
And of course there are those two or three unfortunate people that have
to work with embedded Windows.
I thought this has pretty much gone away, pushed aside by Linux.
David Brown
2024-02-07 07:56:15 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by David Brown
And of course there are those two or three unfortunate people that have
to work with embedded Windows.
I thought this has pretty much gone away, pushed aside by Linux.
It was never common in the first place, and yes, it is almost entirely
non-existent now. I'm sure there are a few legacy products still
produced that use some kind of embedded Windows, but few more than that
- which is what I was hinting at in my post.
Michael S
2024-02-07 10:09:50 UTC
Permalink
On Wed, 7 Feb 2024 08:56:15 +0100
Post by David Brown
Post by Lawrence D'Oliveiro
Post by David Brown
And of course there are those two or three unfortunate people that
have to work with embedded Windows.
I thought this has pretty much gone away, pushed aside by Linux.
It was never common in the first place, and yes, it is almost
entirely non-existent now. I'm sure there are a few legacy products
still produced that use some kind of embedded Windows, but few more
than that
- which is what I was hinting at in my post.
Is there any digital oscilloscope that is not Windows under the hood?
How about medical equipment?
The first question is mostly rhetorical, the second is not.
David Brown
2024-02-07 14:03:14 UTC
Permalink
Post by Michael S
On Wed, 7 Feb 2024 08:56:15 +0100
Post by David Brown
Post by Lawrence D'Oliveiro
Post by David Brown
And of course there are those two or three unfortunate people that
have to work with embedded Windows.
I thought this has pretty much gone away, pushed aside by Linux.
It was never common in the first place, and yes, it is almost
entirely non-existent now. I'm sure there are a few legacy products
still produced that use some kind of embedded Windows, but few more
than that
- which is what I was hinting at in my post.
Is there any digital oscilloscope that is not Windows under the hood?
Yes, most that I know of. (There are some older ones that are Windows,
and high-end ones almost never used Windows.)
Post by Michael S
How about medical equipment?
A great deal.
Post by Michael S
The first question is mostly rhetorical, the second is not.
It used to be more common to have embedded Windows. Embedded Linux, and
RTOS's with GUI's (using, for example, QT) have long ago taken over.

There are some hold-outs, of course - no company wants to re-do their
systems and software if they can avoid it, and if they made the bad bet
to use embedded Windows before, they may stick to it.
Scott Lurndal
2024-02-07 16:25:45 UTC
Permalink
Post by David Brown
Post by Lawrence D'Oliveiro
Post by David Brown
And of course there are those two or three unfortunate people that have
to work with embedded Windows.
I thought this has pretty much gone away, pushed aside by Linux.
It was never common in the first place, and yes, it is almost entirely
non-existent now. I'm sure there are a few legacy products still
produced that use some kind of embedded Windows, but few more than that
- which is what I was hinting at in my post.
Wind river is still popular, I believe, but the linux kernel + busybox is
probably the most common.
David Brown
2024-02-07 20:49:52 UTC
Permalink
Post by Scott Lurndal
Post by David Brown
Post by Lawrence D'Oliveiro
Post by David Brown
And of course there are those two or three unfortunate people that have
to work with embedded Windows.
I thought this has pretty much gone away, pushed aside by Linux.
It was never common in the first place, and yes, it is almost entirely
non-existent now. I'm sure there are a few legacy products still
produced that use some kind of embedded Windows, but few more than that
- which is what I was hinting at in my post.
Wind river is still popular, I believe, but the linux kernel + busybox is
probably the most common.
VxWorks, you mean? Yes, that is still used in what might be called
"big" embedded systems. There are other RTOS's that have been common
for embedded systems with screens (and no one would bother with embedded
Windows without a screen!), including QNX, Integrity, eCOS, and Nucleus.

(There are many small RTOS's, but they are competing in a different field.)
Chris M. Thomasson
2024-02-07 21:04:18 UTC
Permalink
Post by Scott Lurndal
Post by David Brown
Post by Lawrence D'Oliveiro
Post by David Brown
And of course there are those two or three unfortunate people that have
to work with embedded Windows.
I thought this has pretty much gone away, pushed aside by Linux.
It was never common in the first place, and yes, it is almost entirely
non-existent now.  I'm sure there are a few legacy products still
produced that use some kind of embedded Windows, but few more than that
- which is what I was hinting at in my post.
Wind river is still popular, I believe, but the linux kernel + busybox is
probably the most common.
VxWorks, you mean?  Yes, that is still used in what might be called
"big" embedded systems.  There are other RTOS's that have been common
for embedded systems with screens (and no one would bother with embedded
Windows without a screen!), including QNX, Integrity, eCOS, and Nucleus.
(There are many small RTOS's, but they are competing in a different field.)
Fwiw, I think the last one I used was Quadros, a long time ago.
Michael S
2024-02-07 21:37:06 UTC
Permalink
On Wed, 7 Feb 2024 21:49:52 +0100
Post by David Brown
Post by Scott Lurndal
Post by David Brown
Post by Lawrence D'Oliveiro
Post by David Brown
And of course there are those two or three unfortunate people
that have to work with embedded Windows.
I thought this has pretty much gone away, pushed aside by Linux.
It was never common in the first place, and yes, it is almost
entirely non-existent now. I'm sure there are a few legacy
products still produced that use some kind of embedded Windows,
but few more than that
- which is what I was hinting at in my post.
Wind river is still popular, I believe, but the linux kernel +
busybox is probably the most common.
VxWorks, you mean? Yes, that is still used in what might be called
"big" embedded systems. There are other RTOS's that have been common
for embedded systems with screens (and no one would bother with
embedded Windows without a screen!),
Then our company and I personally are no-ones 1.5 times.

The first time it was WinCE on a small Arm-based board that served as
the Ethernet interface and control-plane controller for big boards that
were important building blocks of very expensive industrial
equipment. The equipment as a whole was not ours; we were a
sub-contractor for this particular piece. This instance of Windows
never had a display or keyboard.
We still make a few boards per year, more than 15 years later.

The second one was/is [part of] our own product, a regular Windows
Embedded, starting with XP, then 7, then 10. It runs on an SBC that
functions as the host of a Compact PCI frame with various I/O boards,
mostly of our own making. The SBC does both control-plane and partial
data-plane processing and handles Ethernet communication with the rest
of the system. It's a completely different industry; the system as a
whole is not nearly as expensive as the first one, but still expensive
enough for this particular computer to be a small part of the total
cost.
The system does have connectors for display, keyboard and mouse.
Sometimes it is handy to connect them during manufacturing testing. But
they are never connected in the fully assembled product. However, since
they exist, in relation to this system I count myself as half-no-one
rather than a full no-one.
Post by David Brown
including QNX, Integrity, eCOS,
and Nucleus.
(There are many small RTOS's, but they are competing in a different field.)
David Brown
2024-02-08 07:52:12 UTC
Permalink
Post by Michael S
On Wed, 7 Feb 2024 21:49:52 +0100
Post by David Brown
Post by Scott Lurndal
Post by David Brown
Post by Lawrence D'Oliveiro
Post by David Brown
And of course there are those two or three unfortunate people
that have to work with embedded Windows.
I thought this has pretty much gone away, pushed aside by Linux.
It was never common in the first place, and yes, it is almost
entirely non-existent now. I'm sure there are a few legacy
products still produced that use some kind of embedded Windows,
but few more than that
- which is what I was hinting at in my post.
Wind river is still popular, I believe, but the linux kernel +
busybox is probably the most common.
VxWorks, you mean? Yes, that is still used in what might be called
"big" embedded systems. There are other RTOS's that have been common
for embedded systems with screens (and no one would bother with
embedded Windows without a screen!),
Then our company and I personally are no-ones 1.5 times.
You are just a rounding error :-)

But it is interesting to hear of exceptions to the general trend.
Post by Michael S
The first time it was WinCE on a small Arm-based board that served as
the Ethernet interface and control-plane controller for big boards that
were important building blocks of very expensive industrial
equipment. The equipment as a whole was not ours; we were a
sub-contractor for this particular piece. This instance of Windows
never had a display or keyboard.
We still make a few boards per year, more than 15 years later.
The second one was/is [part of] our own product, a regular Windows
Embedded, starting with XP, then 7, then 10. It runs on an SBC that
functions as the host of a Compact PCI frame with various I/O boards,
mostly of our own making. The SBC does both control-plane and partial
data-plane processing and handles Ethernet communication with the rest
of the system. It's a completely different industry; the system as a
whole is not nearly as expensive as the first one, but still expensive
enough for this particular computer to be a small part of the total
cost.
The system does have connectors for display, keyboard and mouse.
Sometimes it is handy to connect them during manufacturing testing. But
they are never connected in the fully assembled product. However, since
they exist, in relation to this system I count myself as half-no-one
rather than a full no-one.
Post by David Brown
including QNX, Integrity, eCOS,
and Nucleus.
(There are many small RTOS's, but they are competing in a different field.)
Lawrence D'Oliveiro
2024-02-07 21:41:46 UTC
Permalink
... the linux kernel + busybox is probably the most common.
That “Ingenuity” Mars helicopter, which recently met its end after breaking
a rotor blade, ran Linux and other open-source software.

It was very much an afterthought project, a proof of concept, only meant
to last maybe 30 days. It ended up making dozens of flights over 3 years.
bart
2024-02-07 02:18:51 UTC
Permalink
Post by Kaz Kylheku
Writing a compiler is pretty easy, because the bar can be set very low
while still calling it a compiler.
Whole-program compilers are easier because there are fewer requirements.
You have only one kind of deliverable to produce: the executable.
You don't have to deal with linkage and produce a linkable format.
David Brown suggested that they were harder than I said. You're saying
they are easier.

BTW your statements are wrong, but I'm not going to argue about it.

My whole-program compiler is here:

https://github.com/sal55/langs/blob/master/MCompiler.md

It has a dozen different outputs.
Post by Kaz Kylheku
GCC is maintained by people who know what a C compiler is, and GCC can
be asked to be one.
So what is it when it's not a C compiler? What language is it compiling
here:

c:\qx>gcc qc.c
c:\qx>

This program passes. Mine does the same:

c:\qx>mcc qc.c
Compiling qc.c to qc.exe

Whatever language mcc processes must be similar to the one that gcc
processes.

Yet it is true that gcc can be tuned to a particular standard, dialect,
set of extensions and a set of user-specified behaviours. Which means it
can also compile some Frankensteinian version of 'C' that anyone can devise.

Mine at least is a more rigid subset.
Post by Kaz Kylheku
Your idea of writing a C compiler seems to be to pick some random
examples of code believed to be C and make them work. (Where "work"
means that they compile and show a few behaviors that look like
the expected ones.)
That's what most people expect!
Post by Kaz Kylheku
Basically, you don't present a very credible case that you've actually
written a C compiler.
Well, don't believe it if you don't want. There are 1000s of amateur 'C'
compilers about; it must be the most favoured language for such projects
(since it looks deceptively simple).

Among such compilers, mine is quite accomplished by comparison. One task
it is used for is to take APIs defined by C header files and turn them
into bindings in my two languages. It does that as well as any such tool
can. So fuck you.
Post by Kaz Kylheku
I currently work on a firmware application that compiles to a 100
megabyte (stripped!) executable.
And yet 90% of the executables on my PC are under 1MB. SOMEBODY must be
writing small programs!

The NASM.EXE program is a bit larger at 1.3MB for example; that's 98.7%
smaller than your giant program.

You want to make me feel bad about my stuff because you work on a big
project and mine are small. Let me go and find that length of rope then...
Post by Kaz Kylheku
Post by bart
* There is not a single feature of my alternate systems language that is
superior to the C equivalent
The worst curve ball someone could throw you would be to
be eagerly interested in your language, and ask for guidance
in how to get it installed and start working in it.
That happened 2-3 years ago and I was able to help out. However I'm not
pushing my actual language, which is anyway volatile as it is a vehicle
for new ideas, I was only discussing the utility of certain features.

Surely somebody can do that without going to the trouble of creating and
implementing a whole language, and using the feature over years, as
proof of concept.

But when someone actually does that, THEN they are not worth listening to?

I mean, where is YOUR lower-level system language? Where is anybody's? I
don't mean the Zigs and Rusts because that would be like comparing a
40-tonne truck with a car.

My language is a modernish family car compared with C's Model T.
Post by Kaz Kylheku
Not as much as fast executable code, unfortunately.
And yet most people code in Python and JavaScript and a whole pile of
slow languages.
Post by Kaz Kylheku
Compilers that blaze through large amounts of code in the blink of an
eye are almost certainly dodging on the optimization.
Yes, probably. But the optimisation is overrated. Do you really need
optimised code to test each of those 200 builds you're going to do today?

Not for a language at the level of C. (Maybe for C++ code as it needs it
to collapse the mountain of redundant code that templates etc will produce.)

For the programs I write, gcc -O3 makes them typically 1.5 to 2.0 times
faster, for 100 times the compile time.

And if I do want the boost, I can transpile to C and use gcc -O3. I don't
need the super-optimisation within my own product.
Post by Kaz Kylheku
And because they
don't need the internal /architecture/ to support the kinds
optimizations they are not doing, they can speed up the code generation
also. There is no need to generate an intermediate representation like
SSA; you can pretty much just parse the syntax and emit assembly code in
the same pass. Particularly if you only target one architecture.
A poorly optimizing retargetable compiler that emits an abstract
intermediate code will never be as blazingly fast as something equally
poorly optimizing that goes straight to code in one pass.
My non-C compiler uses multiple passes including an IL stage. It is not
much slower than TCC which is one pass, but generally produces faster code.

It can compile itself at about 15Hz. (That is, 15 new generations per
second. Unoptimised.)
Post by Kaz Kylheku
Post by bart
* There is no benefit at all in having a tool like a compiler, be a
small, self-contained executable.
Not as much as there used to, decades ago.
Simplicity is always good. Somebody deletes one of the 1000s of files of
your gcc installation. Is it something that is essential? Who knows.

But if your compiler is the one file mm.exe, it's easy to spot if it's
missing!
David Brown
2024-02-07 08:30:22 UTC
Permalink
Post by bart
Post by Kaz Kylheku
Writing a compiler is pretty easy, because the bar can be set very low
while still calling it a compiler.
Whole-program compilers are easier because there are fewer requirements.
You have only one kind of deliverable to produce: the executable.
You don't have to deal with linkage and produce a linkable format.
David Brown suggested that they were harder than I said. You're saying
they are easier.
I described what /I/ see as "whole program compilers", and where I see
them being used as serious tools that give better results than
traditional compile-and-link toolchains. The key here is whole program
/optimisation/ and static analysis. And I think there can be little
doubt that this is a far harder task than the much more limited tools
you are talking about.

Maybe it was unreasonable of me to conflate "whole program compiler" and
"whole program optimiser", even though I see no real-world use of the
former without the latter. Using your definition of the term, your tool
is a "whole program compiler".

And I think Kaz was using the term in the same way as you do when he
says he thinks it is easier. I don't know either way, but it would
certainly skip several things that are otherwise necessary in a
traditional setup - assembly generation, an assembler, and a linker.
You also don't have to deal with linking object files from other sources.

(For the record, I think there are many things that cannot be done with C and
traditional compile-link setups, that could be done with some kind of
whole-program analysis and a suitable language. Rust's borrow checker,
and XMOS XC's thread analysis are two examples.)
Post by bart
Post by Kaz Kylheku
GCC is maintained by people who know what a C compiler is, and GCC can
be asked to be one.
So what is it when it's not a C compiler? What language is it compiling
You walked right into that one - how many times has the difference
between standard C and sort-of-C been explained to you? As always, I
must point out that a tool does not have to be standards compliant -
that's a choice of the tool developer. But when the distinction is
made, and Kaz was clearly making that distinction, a "C compiler" is one
that follows the C standards (one or more published version) accurately
in terms of what it accepts or does not accept, the minimum guaranteed
behaviour, and the minimum required diagnostics. As has been explained
many times, "gcc" is not, in those terms, a "C compiler" by default - it
needs flags to put it in a compliant mode. Your tool, AFAIK, has never
claimed to be a standards-compliant C compiler.
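(To illustrate, the flags are well known; something like

c:\qx>gcc -std=c99 -pedantic-errors qc.c

asks gcc for a conforming C99 mode, whereas plain "gcc qc.c" accepts
its default GNU C dialect.)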
Post by bart
Whatever language mcc processes must be similar to the one that gcc
processes.
Yes. Both accept some version of sort-of-C, with a common subset. (The
common subset in this example code may also, by coincidence, be standard
C. I haven't looked at it to see.)
Post by bart
Yet it is true that gcc can be tuned to a particular standard, dialect,
set of extensions and a set of user-specified behaviours. Which means it
can also compile some Frankensteinian version of 'C' that anyone can devise.
Mine at least is a more rigid subset.
Your idea of "rigid" is other people's idea of "inflexible". Rigid is
fine for one user.
Post by bart
Post by Kaz Kylheku
Your idea of writing a C compiler seems to be to pick some random
examples of code believed to be C and make them work.  (Where "work"
means that they compile and show a few behaviors that look like
the expected ones.)
That's what most people expect!
No, it is not. /I/ expect a compiler to be written by people who have
extensive knowledge of the C standards and who do their best to get the
compiler correct /by design/. Not by luck or trial and error. By
/design/. And I expect it to have an extensive test suite of both
simple code and extreme code and corner cases, because even the best
designers can get things wrong sometimes and testing helps catch bugs.
Malcolm McLean
2024-02-07 09:04:20 UTC
Permalink
Post by bart
Post by Kaz Kylheku
Basically, you don't present a very credible case that you've actually
written a C compiler.
Well, don't believe it if you don't want. There are 1000s of amateur 'C'
compilers about; it must be the most favoured language for such projects
(since it looks deceptively simple).
It's absolutely clear to me that Bart has written a C compiler, and this
statement by Kaz is ridiculous.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
Kaz Kylheku
2024-02-07 23:24:16 UTC
Permalink
Post by bart
Post by Kaz Kylheku
Writing a compiler is pretty easy, because the bar can be set very low
while still calling it a compiler.
Whole-program compilers are easier because there are fewer requirements.
You have only one kind of deliverable to produce: the executable.
You don't have to deal with linkage and produce a linkable format.
David Brown suggested that they were harder than I said. You're saying
they are easier.
I'm saying it's somewhat easier to make a compiler which produces an
object file than to produce a compiler that produces object files *and*
a linker that combines them.

There is all that code you don't have to write to produce object files,
read them, and link them. You don't have to solve the problem of how to
represent unresolved references in an externalized form in a file.
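As a rough sketch (a made-up format, not any real one), this is the
kind of record a linkable object format has to externalize, and which a
straight-to-executable compiler can instead keep as a transient
in-memory fixup:

struct reloc {
    unsigned section;  /* which section contains the hole to patch */
    unsigned offset;   /* where in that section */
    unsigned symbol;   /* index into an externalized symbol table */
    int      kind;     /* e.g. absolute vs. PC-relative */
};

Designing, writing, and re-reading records like these is work you skip
entirely if the only deliverable is the finished executable.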

David made it clear he was referring to whole program optimization.
Post by bart
Post by Kaz Kylheku
GCC is maintained by people who know what a C compiler is, and GCC can
be asked to be one.
So what is it when it's not a C compiler? What language is it compiling
c:\qx>gcc qc.c
c:\qx>
Yes, sorry. It is compiling C also: a certain revision of GNU C,
which is a family of dialects in the C family.
Post by bart
Mine at least is a more rigid subset.
Rigid? Where is this subset documented, other than in the code?

GNU C is documented, and tested.
Post by bart
Post by Kaz Kylheku
Your idea of writing a C compiler seems to be to pick some random
examples of code believed to be C and make them work. (Where "work"
means that they compile and show a few behaviors that look like
the expected ones.)
That's what most people expect!
That may be a verbal way of expressing what a lot of developers
want, but it has to be carefully interpreted to avoid a fallacy.

"Most people" expect the C compiler to work on /their/ respective code
they care about, which is different based on who you ask. The more
people you include in a sample of "most people", the more code that is.

Most people don't just expect a compiler to work on /your/ few examples.
Post by bart
Post by Kaz Kylheku
Basically, you don't present a very credible case that you've actually
written a C compiler.
Well, don't believe it if you don't want.
Oh I want to believe; I just can't do that which I want, without
proper evidence.

Do you have a reference manual for your C dialect, and is it covered by
tests? What programs and constructs are required to work in your C dialect?
What are required to be diagnosed? What is left undefined?

If you make changes to the compiler which accidentally cause it to stray
from the dialect, how are you informed?
Post by bart
The NASM.EXE program is a bit larger at 1.3MB for example; that's 98.7%
smaller than your giant program.
That's amazingly large for an assembler. Is that stripped of debug info?
Post by bart
I mean, where is YOUR lower-level system language? Where is anybody's? I
don't mean the Zigs and Rusts because that would be like comparing a
40-tonne truck with a car.
I'm not interested in working on lower-level systems languages.

I work on the implementation of a Lisp dialect.

As far as low-level systems goes, I'm quite satisfied with the C
language and its mainstream implementations.
Post by bart
Post by Kaz Kylheku
Compilers that blaze through large amounts of code in the blink of an
eye are almost certainly dodging on the optimization.
Yes, probably. But the optimisation is overrated. Do you really need
optimised code to test each of those 200 builds you're going to do today?
Yes, because of the principle that you should test what you ship.
Post by bart
Post by Kaz Kylheku
Post by bart
* There is no benefit at all in having a tool like a compiler, be a
small, self-contained executable.
Not as much as there used to, decades ago.
Simplicity is always good. Somebody deletes one of the 1000s of files of
your gcc installation. Is it something that is essential? Who knows.
That someone will have to hack the superuser account, since those
files are writable only to root, sitting in directories that are
writable only to root.

You will know when something complains about the file not being found.

(A problem will arise if the file is part of a search, such that another
file will be found if that one is missing.)
Post by bart
But if your compiler is the one file mm.exe, it's easy to spot if it's
missing!
What if a bit randomly flips in mm.exe? Is it in a byte that is
essential? Who knows ... I don't see where this is going.

Software installations big and small can be damaged.

It seems disadvantageous for a compiler to have no satellite files. If
you have to fix something in <stdlib.h>, and that's buried in the
executable, you have to roll out a whole new mm.exe. The user has to
believe you when you say that you changed nothing else.

If the satellite files are kept reasonably small in number, and not
proliferated throughout a complex tree, that could be a good thing.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
bart
2024-02-08 01:46:57 UTC
Permalink
Post by Kaz Kylheku
Post by bart
Post by Kaz Kylheku
Writing a compiler is pretty easy, because the bar can be set very low
while still calling it a compiler.
Whole-program compilers are easier because there are fewer requirements.
You have only one kind of deliverable to produce: the executable.
You don't have to deal with linkage and produce a linkable format.
David Brown suggested that they were harder than I said. You're saying
they are easier.
I'm saying it's somewhat easier to make a compiler which produces an
object file than to produce a compiler that produces object files *and*
a linker that combines them.
Is there a 'than' missing above? Otherwise it's contradictory.
Post by Kaz Kylheku
There is all that code you don't have to write to produce object files,
read them, and link them. You don't have to solve the problem of how to
represent unresolved references in an externalized form in a file.
Programs that generate object files usually invoke other people's linkers.

But your comments are simplistic. EXE formats can be as hard to generate
as OBJ files. You still have to resolve the dynamic imports into an EXE.

You need to either have a ready-made language designed for whole-program
work, or you need to devise one.

Plus, if the minimal compilation unit is now all N source modules of a
project rather than just 1 module, then you'd better have a pretty fast
compiler, and some strategies for dealing with scale.

If your project involves only the OBJ format, then you can also choose to
devise your own simple file format, in which case linking is a trivial
though fiddly task.
Post by Kaz Kylheku
David made it clear he was referring to whole program optimization.
Post by bart
Post by Kaz Kylheku
GCC is maintained by people who know what a C compiler is, and GCC can
be asked to be one.
So what is it when it's not a C compiler? What language is it compiling
c:\qx>gcc qc.c
c:\qx>
Yes, sorry. It is compiling C also: a certain revision of GNU C,
which is a family of dialects in the C family.
Post by bart
Mine at least is a more rigid subset.
Rigid? Where is this subset documented, other than in the code?
In early versions of the compiler, there was a specification. Now, it's
a personal tool and I don't bother. So shoot me.
Post by Kaz Kylheku
GNU C is documented, and tested.
Post by bart
Post by Kaz Kylheku
Your idea of writing a C compiler seems to be to pick some random
examples of code believed to be C and make them work. (Where "work"
means that they compile and show a few behaviors that look like
the expected ones.)
That's what most people expect!
That may be a verbal way of expressing what a lot of developers
want, but it has to be carefully interpreted to avoid a fallacy.
"Most people" expect the C compiler to work on /their/ respective code
they care about, which is different based on who you ask. The more
people you include in a sample of "most people", the more code that is.
Most people don't just expect a compiler to work on /your/ few examples.
A C compiler that works on any arbitrary existing code is years of
effort. That's hard enough to achieve even with better and more mature
products than mine.
Post by Kaz Kylheku
Post by bart
Post by Kaz Kylheku
Basically, you don't present a very credible case that you've actually
written a C compiler.
Well, don't believe it if you don't want.
Oh I want to believe; I just can't do that which I want, without
proper evidence.
Do you have a reference manual for your C dialect, and is it covered by
tests? What programs and constructs are required to work in your C dialect?
What are required to be diagnosed? What is left undefined?
So no one can claim to write a 'C' compiler unless it does everything as
well as gcc which started in 1987, has had hordes of people working with
it, and has had feedback from myriads of users?

I had some particular aims with my project, most of which were achieved,
boxes ticked.
Post by Kaz Kylheku
Post by bart
The NASM.EXE program is a bit larger at 1.3MB for example; that's 98.7%
smaller than your giant program.
That's amazingly large for an assembler. Is that stripped of debug info?
The as.exe assembler for gcc/TDM 10.3 is 1.8MB. For mingw 13.2 it was 1.5MB.

Mine is about 100KB, but it covers a subset of x86 opcodes and outputs
only a limited number of formats.

But the size of NASM is not an issue; it's an example of the modestly
sized applications that seem to be rare. People here claim their apps are always so
massive and complicated that a 'toy' compiler will never work.
Post by Kaz Kylheku
Post by bart
Yes, probably. But the optimisation is overrated. Do you really need
optimised code to test each of those 200 builds you're going to do today?
Yes, because of the principle that you should test what you ship.
Then you're being silly. You're not shipping build#139 of 200 that day,
not even #1000 that week. You're debugging a logic bug that is nothing
to do with optimisation.
Kaz Kylheku
2024-02-08 02:50:08 UTC
Permalink
Post by bart
Post by Kaz Kylheku
Post by bart
Post by Kaz Kylheku
Writing a compiler is pretty easy, because the bar can be set very low
while still calling it a compiler.
Whole-program compilers are easier because there are fewer requirements.
You have only one kind of deliverable to produce: the executable.
You don't have to deal with linkage and produce a linkable format.
David Brown suggested that they were harder than I said. You're saying
they are easier.
I'm saying it's somewhat easier to make a compiler which produces an
object file than to produce a compiler that produces object files *and*
^^^^
Post by bart
Post by Kaz Kylheku
a linker that combines them.
Is there a 'than' missing above? Otherwise it's contradictory.
Other "than" that one? Hmm.
Post by bart
Post by Kaz Kylheku
There is all that code you don't have to write to produce object files,
read them, and link them. You don't have to solve the problem of how to
represent unresolved references in an externalized form in a file.
Programs that generate object files usually invoke other people's linkers.
But your comments are simplistic. EXE formats can be as hard to generate
as OBJ files. You still have to resolve the dynamic imports into an EXE.
Generating just the EXE format is objectively less work than generating
OBJ files and linking them into that ... same EXE format, right?
Post by bart
You need to either have a ready-made language designed for whole-program
work, or you need to devise one.
Plus, if the minimal compilation unit is now all N source modules of a
project rather than just 1 module, then you'd better have a pretty fast
compiler, and some strategies for dealing with scale.
Easy; just drop language conformance, diagnostics, optimization.
Post by bart
Post by Kaz Kylheku
Post by bart
Well, don't believe it if you don't want.
Oh I want to believe; I just can't do that which I want, without
proper evidence.
Do you have a reference manual for your C dialect, and is it covered by
tests? What programs and constructs are required to work in your C dialect?
What are required to be diagnosed? What is left undefined?
So no one can claim to write a 'C' compiler unless it does everything as
well as gcc which started in 1987, has had hordes of people working with
it, and has had feedback from myriads of users?
Nope; unless it is documented so that there is a box, where it says what
is in the box, and some way to tell that what's on the box is in the
box.
Post by bart
I had some particular aims with my project, most of which were achieved,
boxes ticked.
Post by Kaz Kylheku
Post by bart
The NASM.EXE program is bit larger at 1.3MB for example, that's 98.7%
smaller than your giant program.
That's amazingly large for an assembler. Is that stripped of debug info?
The as.exe assembler for gcc/TDM 10.3 is 1.8MB. For mingw 13.2 it was 1.5MB.
"as" on Ubuntu 18, 32 bit.

$ size /usr/bin/i686-linux-gnu-as
text data bss dec hex filename
430315 12544 37836 480695 755b7 /usr/bin/i686-linux-gnu-as

Still pretty large. Always use the "size" utility, rather than raw
file size. This has 430315 bytes of code, 12544 of non-zero static data, 37836
bytes of zeroed data (not part of the executable size).

That's still large for an assembler, but at least it's not larger
than GNU Awk.
Post by bart
Post by Kaz Kylheku
Post by bart
Yes, probably. But the optimisation is overrated. Do you really need
optimised code to test each of those 200 builds you're going to do today?
Yes, because of the principle that you should test what you ship.
Then you're being silly. You're not shipping build#139 of 200 that day,
If I make a certain change for build #139, and that part of the code
(function or entire source file) is not touched until build #1459 which
ships, that compiled code remains the same! So in fact, the #139 version of
that code is what build #1459 ships with. That code is being tested as part of
#140, #141, #142, ... even while some other things are changing.

You should not be doing all your development and developer testing with
unoptimized builds so that only QA people test optimized code before
shipping.

Every test, even of a private build, is a potential opportunity to find
something wrong with some optimized code that would end up shipping
otherwise.

Here is another reason to work with optimized code. If you have to debug
at the machine language level, optimized code is shorter and way more readable.
And it can help you understand logic bugs, because the compiler performs
logical analysis in doing optimizations. The optimized code shows you what your
calculation reduced to, and can even help you see a better way of writing the
code, like a tutor.
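A small illustration (what you actually get depends on the compiler
and version): given

int sum_upto(int n)
{
    int total = 0, i;
    for (i = 0; i < n; i++)
        total += i;
    return total;
}

gcc or clang at -O2 can reduce the loop to a closed-form n*(n-1)/2
calculation with no loop in the generated code; seeing that in the
disassembly is exactly the kind of hint that can lead you to a better
way of writing the source.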
Post by bart
not even #1000 that week. You're debugging a logic bug that is nothing
to do with optimisation.
Though debugging logic bugs that have nothing to do with optimization can be
somewhat impeded by optimization, it's still better to prioritize working with
the code in the intended shipping state.

You can drop to an unoptimized build when necessary.

Pretty much that only happens when

1. It is just a logic bug, but you have to resort to a debugger, and
the optimizations are interfering with being able to see variable values.

2. You suspect it does have to do with optimization, so you see if the
issue goes away in the unoptimized build.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
bart
2024-02-08 11:08:06 UTC
Permalink
Post by Kaz Kylheku
Post by bart
But your comments are simplistic. EXE formats can be as hard to generate
as OBJ files. You still have to resolve the dynamic imports into an EXE.
Generating just the EXE format is objectively less work than generating
OBJ files and linking them into that ... same EXE format, right?
That depends:

* Are you generating OBJ files, or just ASM files and using a 3rd party
assembler?

* Are you producing OBJ files and relying on a 3rd party linker, or also
writing the linker?

* If the latter, are you using an official binary OBJ format, or
devising your own? The latter can make it a lot simpler.

And also, what exactly do you mean by a whole-program compiler?

I don't mean a C compiler which takes N source files at the same time,
compiles them internally, and links them internally. That's just
wrapping up all the steps involved in independent compilation into one
package.
Post by Kaz Kylheku
Post by bart
Plus, if the minimal compilation unit is now all N source modules of a
project rather than just 1 module, then you'd better have a pretty fast
compiler, and some strategies for dealing with scale.
Easy; just drop language conformance, diagnostics, optimization.
You're sceptical about something, I'm not sure what. Maybe you're used
to compilers taking forever to turn source into binary, so that you're
suspicious of anything that figures out how to do it faster.

Have you considered that, when recompiling after a one-line change, you
don't really need the same in-depth analysis that you did 30 seconds ago?

/I/ am suspicious of compilers that produce a benchmark that completes
in 0.0 seconds, since it is most likely shirking the task that has been set.
Post by Kaz Kylheku
"as" on Ubuntu 18, 32 bit.
$ size /usr/bin/i686-linux-gnu-as
text data bss dec hex filename
430315 12544 37836 480695 755b7 /usr/bin/i686-linux-gnu-as
Still pretty large. Always use the "size" utility, rather than raw
file size. This has 430315 bytes of code, 12544 of non-zero static data, 37836
bytes of zeroed data (not part of the executable size).
Raw file size is fine. The 'size' thing on my WSL shows that 'as' is
about 700KB for text and data.

BTW the same exercise on the 1.3MB NASM.EXE shows the code as 0.3MB,
the rest is data. On my mcc compiler:

Compiling cc.m---------- to cc.exe
Code size: 186,647 bytes
Idata size: 86,392
Zdata size: 1,333,240
EXE size: 277,504
Post by Kaz Kylheku
That's still large for an assembler, but at least it's not larger
than GNU Awk.
So Awk is another product that is still smaller than 1MB. Maybe there
are more such programs than was thought!

But can you imagine being the developer of such a program, and being
told your product is a toy. Or being denied the use of a fast compiler,
because such a compiler would not scale to something 100x the size.
Post by Kaz Kylheku
Post by bart
Post by Kaz Kylheku
Post by bart
Yes, probably. But the optimisation is overrated. Do you really need
optimised code to test each of those 200 builds you're going to do today?
Yes, because of the principle that you should test what you ship.
Then you're being silly. You're not shipping build#139 of 200 that day,
If I make a certain change for build #139, and that part of the code
(function or entire source file) is not touched until build #1459 which
ships, that compiled code remains the same! So in fact, the #139 version of
that code is what build #1459 ships with. That code is being tested as part of
#140, #141, #142, ... even while some other things are changing.
You should not be doing all your development and developer testing with
unoptimized builds so that only Q&A people test optimized code before
shipping.
Every test, even of a private build, is a potential opportunity to find
something wrong with some optimized code that would end up shipping
otherwise.
OK. This is entirely up to you. But then you can't complain when builds
routinely take minutes or even hours.

On the stuff I do, a whole-program build completes in about the time it
takes you to take your finger off the Enter key.
Post by Kaz Kylheku
Here is another reason to work with optimized code. If you have to debug
at the machine language level, optimized code is shorter and way more readable.
I think the exact opposite is true, since optimised code can bear little
relation to the source code. The code may even have been elided.
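
A contrived example of what I mean:

    /* An optimising compiler will typically fold this whole loop
       away into a single constant: */
    int sum100(void) {
        int total = 0;
        for (int i = 1; i <= 100; i++)
            total += i;
        return total;  /* gcc -O2 emits little more than: mov eax,5050; ret */
    }

The loop, the counter and 'total' all vanish from the machine code,
so stepping through it against the source tells you very little.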
Post by Kaz Kylheku
And it can help you understand logic bugs, because the compiler performs
logical analysis in doing optimizations. The optimized code shows you what your
calculation reduced to, and can even help you see a better way of writing the
code, like a tutor.
Post by bart
not even #1000 that week. You're debugging a logic bug that is nothing
to do with optimisation.
Though debugging logic bugs that have nothing to do with optimization can be
somewhat impeded by optimization, it's still better to prioritize working with
the code in the intended shipping state.
You can drop to an unoptimized build when necessary.
Pretty much that only happens when
1. It is just a logic bug, but you have to resort to a debugger, and
the optimizations are interfering with being able to see variable values.
2. You suspect it does have to do with optimization, so you see if
the issue goes away in the unoptimized build.
My method is to do 99.99% of builds unoptimised. The program may or may
not be in C.

If I want the finished program to be faster, then I can transpile to C
if necessary, and invoke an optimising C compiler. Or I can just supply
some C source to somebody and they can do what they like.

Millions of people code in scripting languages where there is no deep
analysis or optimisation at all. And yet they manage. Behind the scenes
there will be a fast bytecode compiler that does little other than
generate code that corresponds 99% to the source code.
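
For instance, the code generator for such a language can be close to a
transliteration (a minimal sketch, with invented opcodes):

    /* naive stack-machine codegen for: c = a + b */
    enum op { PUSH_VAR, ADD, STORE_VAR };

    struct instr { enum op op; int operand; };

    void emit_add_assign(struct instr *out, int a, int b, int c) {
        out[0] = (struct instr){PUSH_VAR,  a};  /* push variable a   */
        out[1] = (struct instr){PUSH_VAR,  b};  /* push variable b   */
        out[2] = (struct instr){ADD,       0};  /* pop two, push sum */
        out[3] = (struct instr){STORE_VAR, c};  /* pop result into c */
    }

One or two instructions per source operation and no analysis at all,
which is why such compilers are quick.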

But you're saying that as soon as you step over the line into AOT
compiled code, nothing less than full -O3 with a million other options
will do FOR EVERY SINGLE COMPILATION, even if the only change is to add
an extra space to a string literal because some message annoyingly
doesn't line up?

Okay....
Michael S
2024-02-08 11:10:10 UTC
Permalink
On Thu, 8 Feb 2024 02:50:08 -0000 (UTC)
Post by Kaz Kylheku
Post by bart
Post by Kaz Kylheku
Post by bart
Post by Kaz Kylheku
Writing a compiler is pretty easy, because the bar can be set
very low while still calling it a compiler.
Whole-program compilers are easier because there are fewer
the executable. You don't have to deal with linkage and produce
a linkable format.
David Brown suggested that they were harder than I said. You're
saying they are easier.
I'm saying it's somewhat easier to make a compiler which produces
an object file than to produce a compiler that produces object
files *and*
^^^^
Post by bart
Post by Kaz Kylheku
a linker that combines them.
Is there a 'than' missing above? Otherwise it's contradictory.
Other "than" that one? Hmm.
Post by bart
Post by Kaz Kylheku
There is all that code you don't have to write to produce object
files, read them, and link them. You don't have to solve the
problem of how to represent unresolved references in an
externalized form in a file.
Programs that generate object files usually invoke other people's linkers.
But your comments are simplistic. EXE formats can be as hard to
generate as OBJ files. You still have to resolve the dynamic
imports into an EXE.
Generating just the EXE format is objectively less work than
generating OBJ files and linking them into that ... same EXE format,
right?
Post by bart
You need to either have a ready-made language designed for
whole-program work, or you need to devise one.
Plus, if the minimal compilation unit is now all N source modules
of a project rather than just 1 module, then you'd better have a
pretty fast compiler, and some strategies for dealing with scale.
Easy; just drop language conformance, diagnostics, optimization.
Post by bart
Post by Kaz Kylheku
Post by bart
Well, don't believe it if you don't want.
Oh I want to believe; I just can't do that which I want, without
proper evidence.
Do you have a reference manual for your C dialect, and is it
covered by tests? What programs and constructs are required to
work in your C dialect? What are required to be diagnosed? What is
left undefined?
So no one can claim to write a 'C' compiler unless it does
everything as well as gcc which started in 1987, has had hordes of
people working with it, and has had feedback from myriads of users?
Nope; unless it is documented so that there is a box, where it says
what is in the box, and some way to tell that what's on the box is in
the box.
Post by bart
I had some particular aims with my project, most of which were
achieved, boxes ticked.
Post by Kaz Kylheku
Post by bart
The NASM.EXE program is a bit larger at 1.3MB for example, that's
98.7% smaller than your giant program.
That's amazingly large for an assembler. Is that stripped of debug info?
The as.exe assembler for gcc/TDM 10.3 is 1.8MB. For mingw 13.2 it was 1.5MB.
"as" on Ubuntu 18, 32 bit.
$ size /usr/bin/i686-linux-gnu-as
text data bss dec hex filename
430315 12544 37836 480695 755b7 /usr/bin/i686-linux-gnu-as
Still pretty large. Always use the "size" utility, rather than raw
file size. This has 430315 bytes of code, 12544 of non-zero static
data, 37836 bytes of zeroed data (not part of the executable size).
$ /mingw32/bin/as.exe --version
GNU assembler (GNU Binutils) 2.40
Copyright (C) 2023 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms
of the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `i686-w64-mingw32'.

$ size /mingw32/bin/as.exe
text data bss dec hex filename
2941952 10392 43416 2995760 2db630
C:/bin/msys64a/mingw32/bin/as.exe


$ /mingw64/bin/as.exe --version
GNU assembler (GNU Binutils) 2.39
Copyright (C) 2022 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms
of the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `x86_64-w64-mingw32'.

$ size /mingw64/bin/as.exe
text data bss dec hex filename
2966156 14776 45216 3026148 2e2ce4
C:/bin/msys64a/mingw64/bin/as.exe
Post by Kaz Kylheku
That's still large for an assembler, but at least it's not larger
than GNU Awk.
$ awk --version
GNU Awk 5.1.1, API: 3.1 (GNU MPFR 4.1.0-p13, GNU MP 6.2.1)
Copyright (C) 1989, 1991-2021 Free Software Foundation.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.


$ size /usr/bin/awk.exe
text data bss dec hex filename
619497 15884 23104 658485 a0c35 C:/bin/msys64a/usr/bin/awk.exe


Looks like on msys2 GNU as is much larger than GNU awk.
Tim Rentsch
2024-02-05 07:36:39 UTC
Permalink
Post by bart
[...]
* Hardly anybody here has a project which can be built simply by
compiling and linking all the modules.
Not everyone is working on a project where the deliverable is a
single executable. It's much more difficult to work on a project
where the deliverables form a set of related files that make up a
third-party library, and target multiple platforms.
Post by bart
Even Tim Rentsch's simplest project has a dizzying set of special
requirements.
This statement is a misrepresentation, and undoubtedly a deliberate
one. Furthermore, how it is expressed is petty and childish.
Scott Lurndal
2024-02-05 14:52:39 UTC
Permalink
Post by Tim Rentsch
Post by bart
[...]
* Hardly anybody here has a project which can be built simply by
compiling and linking all the modules.
Not everyone is working on a project where the deliverable is a
single executable. It's much more difficult to work on a project
where the deliverables form a set of related files that make up a
third-party library, and target multiple platforms.
Post by bart
Even Tim Rentsch's simplest project has a dizzying set of special
requirements.
This statement is a misrepresentation, and undoubtedly a deliberate
one. Furthermore, how it is expressed is petty and childish.
I have reached the point where it's not worth my time to respond
to bart, even to correct his misrepresentations of what I and
others have said.
Kenny McCormack
2024-02-05 22:58:32 UTC
Permalink
In article <XC6wN.397626$p%***@fx15.iad>,
Scott Lurndal <***@pacbell.net> wrote:
...
Post by Scott Lurndal
I have reached the point where it's not worth my time to respond
to bart, even to correct his misrepresentations of what I and
others have said.
Being killfiled by Scotty is almost as good a thing as being kf'd by
Keith. Keep it up!
--
Life's big questions are big in the sense that they are momentous. However, contrary to
appearances, they are not big in the sense of being unanswerable. It is only that the answers
are generally unpalatable. There is no great mystery, but there is plenty of horror.
(https://en.wikiquote.org/wiki/David_Benatar)
Jim Jackson
2024-02-05 18:01:47 UTC
Permalink
Post by Tim Rentsch
Post by bart
[...]
* Hardly anybody here has a project which can be built simply by
compiling and linking all the modules.
Not everyone is working on a project where the deliverable is a
single executable. It's much more difficult to work on a project
where the deliverables form a set of related files that make up a
third-party library, and target multiple platforms.
Post by bart
Even Tim Rentsch's simplest project has a dizzying set of special
requirements.
This statement is a misrepresentation, and undoubtedly a deliberate
one. Furthermore, how it is expressed is petty and childish.
Did you expect anything else?
Malcolm McLean
2024-02-05 08:29:32 UTC
Permalink
Post by bart
In no particular order.
* Software development can ONLY be done on a Unix-related OS
* It is impossible to develop any software, let alone C, on pure Windows
* You can spend decades developing and implementing systems languages at
the level of C, but you still apparently know nothing of the subject
The tone's currently rather bad, and somehow it has developed that you
and I are on one side and pretty much everyone else on the other. We
both have open source projects which are or at least attempt to be
actually useful to other people, whilst I don't think many of the others
can say that, and maybe that's the underlying reason. But who knows.

I'm trying to improve the tone. It's hard because people have got lots
of motivations for posting, and some of them aren't very compatible with
a good humoured, civilised group. And we've got a lot of bad behaviour,
not all of it directed at us by any means. However, whilst you're very
critical of other people's design decisions, I've rarely if ever heard
you go on from that to criticise someone's general character. But
finally tolerance has snapped.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
bart
2024-02-07 01:35:35 UTC
Permalink
Post by Malcolm McLean
Post by bart
In no particular order.
* Software development can ONLY be done on a Unix-related OS
* It is impossible to develop any software, let alone C, on pure Windows
* You can spend decades developing and implementing systems languages
at the level of C, but you still apparently know nothing of the subject
The tone's currently rather bad, and somehow it has developed that you
and I are on one side and pretty much everyone else on the other. We
both have open source projects which are or at least attempt to be
actually useful to other people, whilst I don't think many of the others
can say that, and maybe that's the underlying reason. But who knows.
I'm trying to improve the tone. It's hard because people have got lots
of motivations for posting, and some of them aren't very compatible with
a good humoured, civilised group. And we've got a lot of bad behaviour,
not all of it directed at us by any means. However, whilst you're very
critical of other people's design decisions, I've rarely if ever heard
you go on from that to criticise someone's general character.
Well we've both posted code of sizeable, actual and practical projects.
Very few on the 'other side' have. Maybe it's proprietary or there are
other reasons. But it means their own output can't be criticised here.

Myself, I've also pretty much given up on discussing new features for C
or new directions. The thread on build systems lies outside the
language. But few regulars are that interested in that side of it; only
in what C does right now.

From what I can see, the most fascinating topics for them are pedantic
details of the C standard, and the most low level technical details of
Unix-like systems.
Post by Malcolm McLean
But finally tolerance has snapped.
I usually argue against ideas, not people. But there are only so many
personal insults that you can take.
Kaz Kylheku
2024-02-07 02:26:02 UTC
Permalink
Post by bart
Well we've both posted code of sizeable, actual and practical projects.
Very few on the 'other side' have. Maybe it's proprietary or there are
other reasons. But it means their own output can't be criticised here.
Posting large amounts of code into discussion groups isn't practical,
and is against netiquette.

The right thing is to host your code somewhere (which it behooves you to
do for obvious other reasons) and post a link to it.

People used to share code via comp.sources.*. Some well-known old
projects first made their appearance that way. E.g. Dick Grune posted
the first version of CVS in what was then called mod.sources in 1986.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
bart
2024-02-07 10:47:52 UTC
Permalink
Post by Kaz Kylheku
Post by bart
Well we've both posted code of sizeable, actual and practical projects.
Very few on the 'other side' have. Maybe it's proprietary or there are
other reasons. But it means their own output can't be criticised here.
Posting large amounts of code into discussion groups isn't practical,
and against netiquette.
The right thing is to host your code somewhere (which it behooves you to
do for obvious other reasons) and post a link to it.
People used to share code via comp.sources.*. Some well-known old
projects first made their appearance that way. E.g. Dick Grune posted
the first version of CVS in what was then called mod.sources in 1986.
Directly including source code as part of a post is not that practical
beyond a few hundred lines of code.

Clearly that wouldn't count as 'sizeable'. That would need to be done
via a link.
Dan Purgert
2024-02-05 11:03:11 UTC
Permalink
Post by bart
[...]
* Nobody here apparently knows how to build a program consisting purely
of C source files, using only a C compiler.
What does this one mean, exactly? I thought the "compiler" and "linker"
were separate tools used under the umbrella term "compiling".
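
That is, my mental model is something like this (invented file names):

    /* a.c */
    int f(void);
    int main(void) { return f(); }

    /* b.c */
    int f(void) { return 42; }

    $ cc -c a.c            # compiler: a.c -> a.o
    $ cc -c b.c            # compiler: b.c -> b.o
    $ cc a.o b.o -o prog   # linker (via the driver) combines them

So "using only a C compiler" reads oddly to me, since the last step
isn't the compiler proper.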
--
|_|O|_|
|_|_|O| Github: https://github.com/dpurgert
|O|O|O| PGP: DDAB 23FB 19FA 7D85 1CC1 E067 6D65 70E5 4CE7 2860
David Brown
2024-02-05 12:15:16 UTC
Permalink
Post by bart
In no particular order.
<snip>
Post by bart
Shall I post this pile of crap or not?
No. It is, after all, a pile of crap.

What we have learned about Bart in c.l.c. :

* Bart generally does not read what other people post.

* Bart exaggerates /everything/ - including things no one ever wrote.

* Bart extrapolates /everything/ to get a binary black-or-white
all-or-nothing straw man that he can complain about. If one person says
they do X and not Y, Bart takes that to mean /everyone/ does X and /no
one/ does Y.

* Bart prefers tilting at windmills to learning about C, or indeed
anything in the real world.

* Bart has no interest in what anyone else does, wants or needs.
Post by bart
I really need to get back to some of those pointless, worthless toy
projects of mine.
No, you should go back to doing something you actually /like/. Stop
fighting imaginary battles about problems that don't exist. Stop
winding yourself into a frenzy over nothing.

You don't like C. Find something else that you /do/ like, and use that
instead.

I can't speak for anyone else, but I'd rather you were happily doing
something you are comfortable with, instead of endless gripes and rants
here.
Ben Bacarisse
2024-02-05 14:09:34 UTC
Permalink
Post by David Brown
Post by bart
In no particular order.
<snip>
Post by bart
Shall I post this pile of crap or not?
No. It is, after all, a pile of crap.
Why do you reply so much? I get why Bart posts, but not why you reply
so often. It's not as if gcc (to take but one example) needs to be
defended from people saying it does it all wrong!

He sometimes posts things that, in my opinion, benefit from a reply.
His "C is so mysterious, how can any one use it posts", for example,
benefit from a reply or two that explains what C's rules really are so
that people coming along later will know what the facts of the matter
are. But there have been none of those lately.
--
Ben.
Lawrence D'Oliveiro
2024-02-05 23:29:05 UTC
Permalink
Post by bart
* Software development can ONLY be done on a Unix-related OS
Does the term “butthurt” mean anything to you?