Discussion:
What if the pre-processor was moved into the parser?
Thiago Adams
2017-02-14 23:37:40 UTC
I was wondering how to eliminate the pre-processor
while keeping compatibility with existing C code.

Let's say we move the pre-processor into the parser in
a new C2 language. (file extensions .h2 and .c2)

I want to keep the original includes of system headers
as they are:

Let's say there is a flag inside the compiler:
classic pre-processor ON or OFF.

So in a file.c2 that is "C2", the pre-processor mode starts as OFF.

//When we include a file with the .h extension,
//the C pre-processor flag is ON:
//normal C pre-processing using DEBUG, WIN32 etc.

//#include itself is not an ordinary pre-processor command in this file.
#include <stdio.h>

//pre-processor flag is OFF after the include

//The include here is not a pre-processor command;
//the pre-processor stays OFF because the file is .h2
#include "myfile.h2"

Now, let's see the new rules


#define is inside the C grammar.
It must be used in some specific places.
The same for #if, #else...
The defined symbol is no longer a macro; it's a symbol inside
the AST.
The #if is a branch inside the AST, like a normal if/else.
A #define seen while the pre-processor is ON is not the same thing
(don't mix them) as a #define seen while it is OFF.


sizeof and other new features can be added to #if expressions.
For instance, we can ask whether a function F or some variable was defined.
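A sketch of what such an #if might look like (hypothetical C2 syntax, not valid C today; `declared` is an invented name for the "was this symbol defined" query):

```
// hypothetical C2: #if is a node in the AST, evaluated after parsing
#if sizeof(long) == 8
    #define WORD_BITS (64)
#else
    #define WORD_BITS (32)
#endif

// asking whether a function or variable was already declared
#if declared(my_function)
    int r = my_function();
#endif
```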

Macro expansion?
Beyond simple macros, I have no idea what the new rules should be,
so I don't know what would be allowed. In the worst case, only
simple #define would work.
Why do this? For many reasons:
safer code, refactoring tools, better tooling, etc.



What do you think?
fir
2017-02-15 08:05:23 UTC
Post by Thiago Adams
I was wondering how to eliminate the pre-processor
while keeping compatibility with existing C code.
Let's say we move the pre-processor into the parser in
a new C2 language. (file extensions .h2 and .c2)
i am using the C2 name for my improved c with my set of extensions and improvements, i have used it for like 10 years, though recently i wondered whether it
is the best name i can find..

d would maybe not be bad (the logical consequence is d) but it is already used, still dont know; (would it be illegal to use the d name then?)

C2 i use more like a codename; i chose also a sign for that - slaine be that sign

http://minddetonator.htw.pl/c2.jpg

(slaine has an axe which looks like c; also slaine afair was an old irishman who was fighting
the fomorians (same as the firbolgs maybe) - in old irish fir means men/people, and i recently
got to know my fir-ish nick has old irish connotations; it seems to fit fine)
Post by Thiago Adams
<snip>
fir
2017-02-15 08:57:50 UTC
Post by fir
<snip>
recent additions were resizable local arrays
in a heap/stack way, optionally omitting type declarations for locals and those parent
paths of parent functions;

before that it was automatic chunk support and syntax for multimodule methods* (also some
smaller things)

*maybe good that i refreshed that, coz it
seems to dim in my memory and this is interesting


[Character a] void hit() [Character b]
{
    b hp -= a hit;
}

void main()
{
    Character jim, alan;
    jim hit alan;

    printf("alan health %d", alan.hp);
}
fir
2017-02-15 15:21:38 UTC
Post by fir
before it was automatic chunk support an syntax for multimodule methods* (also some
smaller things)
*maybe good that i refreshed thet coz it
seems to dim in my memory and this is interesting
[Character a] void hit() [Character b]
{
b hp -= a hit;
}
void main()
{
Character jim, alan;
jim hit alan;
printf("alan health %d", alan.hp);
}
this above was a very interesting remark (to me)
but still i thought this example was not fully understood (by me)

some clarification would be to remark that in
such C2 a function could be a method and a function at once - here above hit(5) would be a function call, alan hit jim would be a method-way call (alan hit(5) jim would be both at once)
such a distinguishment into two categories/families could be useful (and still needs some understanding)
fir
2017-02-15 15:24:03 UTC
Post by fir
<snip>
with this view also classical c functions are methods (if a function reads or changes module variables it is a method, if it takes or returns values it is a function - if both, it is both)
fir
2017-02-15 16:10:37 UTC
Post by fir
<snip>
with this view also classical c functions are methods (if a function reads or changes module variables it is a method, if it takes or returns values it is a function - if both, it is both)
and if it touches more modules it is a multimethod
(at least i call it by this name: 2-method, 3-method, n-method)

//

it can maybe be noted that c supports modules but in a weak/partial form (if it can be said
that way, 'support', because c sees it i think another way, so what c supports is something and this something only meshes with modules maybe)

i personally was describing how one can add
more modular support to c (it was a pack of notices, possibly not finished (i am also
not 100% sure if this just needs to go in,
and no other way is better, but at some level it seems probably ok); maybe later i will add
yet more to this) [it will no doubt be interesting to really/fully understand it]

//

i once wrote about what i call f/s
unification (the idea of unifying the two main families you got in c: functions and structures
(entities)); now this m/f division seems also curious (interesting) to me - (as modules were expected, this m/f was somewhat unexpected)
Thiago Adams
2017-02-15 11:05:36 UTC
Some interesting links about refactoring of C programs and about
the difficulties caused by pre-processing.

PROGRAM REFACTORING IN THE PRESENCE OF PREPROCESSOR DIRECTIVES
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1012.294&rep=rep1&type=pdf

Refactoring C with Conditional Compilation
http://www.informatik.uni-bremen.de/st/lehre/Arte-fakt/Seminar/papers/17/Refactoring%20C%20with%20Conditional%20Compilation.pdf

Challenges of Refactoring C Programs
https://www.researchgate.net/profile/Alejandra_Garrido2/publication/234805773_Challenges_of_Refactoring_C_Programs/links/0c96051cc23aaa9dd1000000.pdf

(My interest is also source-to-source transformation.)
Post by Thiago Adams
Macro expansion?
Besides simple macros I don't have idea the new rules.
So I don't known what would be allowed. For the worst case, only
simple #define would work.
MISRA C has this rule:

"Rule 19.4 (required): C macros shall only expand to a braced
initialiser, a constant,a string literal, a parenthesized expression,
a type qualifier, a storage class specifier, or a do-while-zero
construct."

This is interesting, and this rule (or something similar)
could be applied to the grammar that includes "pre-processing".
BartC
2017-02-15 11:38:33 UTC
Post by Thiago Adams
Some interesting links about refactoring of C programs and about
the difficulties caused by pre-processing.
PROGRAM REFACTORING IN THE PRESENCE OF PREPROCESSOR DIRECTIVES
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1012.294&rep=rep1&type=pdf
OK, 220 pages to tell us what everyone knows. (Did the author succeed in
re-factoring arbitrary C code that may be full of conditionals and that
may depend on a specific compiler?)
Post by Thiago Adams
Refactoring C with Conditional Compilation
http://www.informatik.uni-bremen.de/st/lehre/Arte-fakt/Seminar/papers/17/Refactoring%20C%20with%20Conditional%20Compilation.pdf
Challenges of Refactoring C Programs
https://www.researchgate.net/profile/Alejandra_Garrido2/publication/234805773_Challenges_of_Refactoring_C_Programs/links/0c96051cc23aaa9dd1000000.pdf
(My interest is also source-to-source transformation.)
Post by Thiago Adams
Macro expansion?
Besides simple macros I don't have idea the new rules.
So I don't known what would be allowed. For the worst case, only
simple #define would work.
"Rule 19.4 (required): C macros shall only expand to a braced
initialiser, a constant,a string literal, a parenthesized expression,
a type qualifier, a storage class specifier, or a do-while-zero
construct."
This is interesting and this rule (or something similar)
could be applied for the grammar that include "pre-processing".
So your proposal was to forget the existing CPP, and create a new one
that is properly integrated into the main syntax.

MISRA's approach is to restrict the use of macros to a small set of
well-structured common uses.

My approach, in my own languages, was not to have such a powerful
preprocessor as C's at all. And that hasn't presented any particular
problems in decades of coding.

(I have used include files; I have very rarely used conditional code;
and I occasionally use a simple parameterless macro: just a straight
text replacement.)

I think, if there was a pressing need for something, then I would
introduce a new language feature for it without resorting to macros as
seems to be the C solution for everything. And programs full of macros,
as well as being less readable and causing problems for refactoring, are
slower to compile.

Since it seems you are thinking of creating a C replacement, you can
start by investigating existing uses of macros, which may have to go
further than MISRA, and seeing how they would look as proper features
directly in the language.

(Include files I don't see as problematical; a re-factoring program
can read the file and make use of the information; then the include
file is re-factored separately. The end result will still use #include.)
--
Bartc
Thiago Adams
2017-02-15 12:28:06 UTC
Post by BartC
<snip>
So your proposal was to forget the existing CPP, and create a new one
that is properly integrated into the main syntax.
Yes, but at the same time staying compatible when including system headers or
normal C code. (That activates "CPP ON" for compatibility.)
Post by BartC
<snip>
(Include files I don't see as problematical; a re-factor program
can read the file and make use of the information; then the include
file is re-factored separately. The end result will still use #include.)
My suggestion was to make C' work without pre-processing in a way
that most people would not notice. Nothing new to learn.

I don't see #include and #if, #else .. as problem for a parser.

The C# language, for instance, which has a lot of refactoring tools,
has #if etc.

C# Preprocessor Directives
https://msdn.microsoft.com/en-us/library/ed8yd1ha.aspx

"Although the compiler does not have a separate preprocessor, the directives described in this section are processed as if there were one. "

The grammar of C# : http://www.cs.vu.nl/grammarware/browsable/CSharp/grammar.html

The use of #define is limited.
BartC
2017-02-15 13:22:11 UTC
Post by Thiago Adams
Post by BartC
So your proposal was to forget the existing CPP, and create a new one
that is properly integrated into the main syntax.
My suggestion was to make C' work without pre-processing in a way
that most of people would not notice. Nothing new to learn.
I don't see #include and #if, #else .. as problem for a parser.
How does it interact with normal 'if'?

Both ifs have a well-formed nested structure but what happens if these
two lots of structure conflict? Would the following be OK for example:

#if X
if (cond) {
#endif
stmt1;
}

I can see difficulties with supporting this in a recursive-descent parser.

If you stipulate that what's in each #if branch must be a
self-contained, well-formed bit of code, then it becomes harder to tell
the difference between #if and if.

I suppose one difference might be that the contents of a false #if
branch need not compile, beyond having to contain valid syntax. But that
just brings up the issue of what is allowed in a #if 0 branch and what
isn't.
Post by Thiago Adams
The grammar of C# : http://www.cs.vu.nl/grammarware/browsable/CSharp/grammar.html
The use of #define is limited.
You can say that. A #define symbol is either defined or undefined; there
is no replacement text associated with it. That's even simpler than what
I use!
--
Bartc
Thiago Adams
2017-02-15 13:47:40 UTC
Post by BartC
<snip>
How does it interact with normal 'if'?
Both ifs have a well-formed nested structure but what happens if these
two lots of structure conflict? Would the following be OK for example:
#if X
if (cond) {
#endif
stmt1;
}
I can see difficulties with supporting this in a recursive-descent parser.
If you stipulate that what's in each #if branch must be a
self-contained, well-formed bit of code, then it becomes harder to tell
the difference between #if and if.
I suppose one difference might be that the contents of a false #if
branch need not compile, beyond having to contain valid syntax. But that
just brings up the issue of what is allowed in a #if 0 branch and what
isn't.
Post by Thiago Adams
The grammar of C# : http://www.cs.vu.nl/grammarware/browsable/CSharp/grammar.html
The use of #define is limited.
You can say that. A #define symbol is either defined or undefined; there
is no replacement text associated with it. That's even simpler than what
I use!
I tried in C#:

class Program
{
    static void Main(string[] args)
    {
#if DEBUG
        int i = 0;
        if (i != 0)
        {
#else
        int i = 1;
#endif
        }
    }
}

I was not expecting this to compile. But it compiled.

So the region inside #if doesn't need to be valid.

The grammar at
http://www.cs.vu.nl/grammarware/browsable/CSharp/grammar.html

has a production for this (conditional-section).

So, some problems of macros are still there in C#.
For refactoring, for instance:

class Program
{
#if DEBUG
    static void f2()
    {
#else
    static void f2()
    {
#endif
    }

    static void Main(string[] args)
    {
        f2();
    }
}

When I rename f2, only the active branch (DEBUG) was renamed.
But the problem of "rebuilding" the source is not there, because
the #if is inside the AST.
Thiago Adams
2017-02-15 18:06:29 UTC
On Wednesday, February 15, 2017 at 10:28:15 AM UTC-2, Thiago Adams wrote:
[...]
Post by Thiago Adams
Yes, but at same time being compatible when including system header or
normal C code. (Activates "CPP ON" for compatibility)
More details about this idea:

Let's say I have "my.c" that includes "sqlite.h"

#include <sqlite.h>

When including sqlite.h, the C preprocessor will work normally.
So, my.c will not see any macro during this expansion.
After the sqlite.h inclusion we still have the macro definitions
present. (hash table of defines)

If I include another file.h the macro definitions will be there.

#include <sqlite.h>
#include "file.h"

When I return to my.c (the file being compiled) then I am
in "preprocessor off" mode.

Let's say I have:

int main()
{
    int i = 2 * A;
}

and A was defined inside "file.h"

#define A 1 + 2

When the parser sees A it will check if A is a macro.
If it is, the parser will check if A is a "compatible"
macro (constant, constant-expression ...).
If not, the code will not compile.

Note that when we are including file.h we are not going to see
the definition (#define) of A. This definition will be "eaten".
But we see the result of the definition. (hash map of defines after inclusion)

#defines outside of #includes are parsed by the new grammar.
#defines inside of includes are parsed normally (as they are today).

So, what is the advantage?

I can rebuild my.c from the AST and I have safer macros.
Thiago Adams
2017-02-15 13:24:21 UTC
On Wednesday, February 15, 2017 at 9:38:51 AM UTC-2, Bart wrote:
...
Post by BartC
I think, if there was a pressing need for something, then I would
introduce a new language feature for it without resorting to macros as
seems to be the C solution for everything. And programs full of macros,
as well as being less readable and causing problems for refactoring, are
slower to compile.
Since it seems you are thinking of creating a C replacement, you can
start by investigating existing uses of macros, which may have to go
further then MISRA, and seeing how they would look as proper features
directly in the language.
Some samples to make the idea clear:
I will allow an empty define, a constant, or a constant-expression with ( ).

C' Grammar:

define-declaration:
#define identifier
#define identifier constant
#define identifier ( constant-expression )

Here #define is no longer a preprocessor directive.
The constant-expression can be evaluated and the value
used as a constant. (Like an enum constant.)

Sample C' language with no preprocessor:

#define DEBUG

//This will not compile (missing ( ) )
//#define A 1 + 2

//This will compile
#define A (1 + 2)


#ifdef DEBUG
int i = 2 * A;
printf("%d", i); //prints 6
#endif

Today I use macros for initializers.
To cover this case I would have to add
some macro-initializer without a "designator".

define-declaration:
#define identifier
#define identifier constant
#define identifier ( constant-expression )
#define identifier macro-initializer


Sample:

#define POINT_INIT {0, 0}

Point pt = POINT_INIT;

So macros could be:
- Simple named identifier for #if usage
- Constants (like enums)
- Macro-Initializer
BartC
2017-02-15 13:57:28 UTC
Post by Thiago Adams
Post by BartC
Since it seems you are thinking of creating a C replacement, you can
start by investigating existing uses of macros, which may have to go
further then MISRA, and seeing how they would look as proper features
directly in the language.
I will allow empty define, constant or constant-expression with ()
#define identifier
#define identifier constant
#define identifier ( constant-expression )
Here #define is no more preprocessor.
The constant-expression can be evaluated and the value
is used as a constant. (Like enum constant)
#define DEBUG
//This will not compile (missing ( ) )
//#define A 1 + 2
//This will compile
#define A (1 + 2)
- Simple named identifier for #if usage
- Constants (like enums)
- Macro-Initializer
If I concentrate on the use of #define for constant values: this is a
feature that does not have to be in any sort of preprocessor at all.

For years I've had named constants as a feature of my languages.
Examples (I use 'const' but 'constant' is shown here to avoid confusion
with C's 'const'):

constant maxlevel = 50 # type is int
constant pi = 3.14159265359 # type is double
constant x = pi*maxlevel+1 # 158.08, type double
constant real fsix = 6 # type is double
constant infile = "input" # type is char*

(Type is optional and picked up from the expression.)

This declares a fully typed, fully scoped named constant, unlike normal
#define, but a bit like C's enums. These names can't be used as lvalues.
They can be used for array bounds and switch-case values.

A very easy feature to support, and eliminates half the uses of #define.

And probably much more efficient than how C uses #define:

#define x (pi*maxlevel+1)

First, the "(...)" are necessary here because you don't know how the
surrounding code, when the macro is invoked, will interact with the *
and + operators.

Second, if 'x' is used 100 times, then it is necessary to expand, parse,
type-analyse and reduce (to a single constant) "(pi*maxlevel+1)" 100 times.

Compare that with using my 'constant x' which (once resolved) is just
replaced by 158.08 of known type double.

It's one of these no-brainer features that some languages stubbornly
refuse to adopt.
--
Bartc
Tim Rentsch
2017-02-19 18:27:03 UTC
Post by Thiago Adams
Post by BartC
Since it seems you are thinking of creating a C replacement, you can
start by investigating existing uses of macros, which may have to go
further then MISRA, and seeing how they would look as proper features
directly in the language.
[...]
- Simple named identifier for #if usage
- Constants (like enums)
- Macro-Initializer
If I concentrate on the use of #define for constant values. This is a
feature that does not have to be in any sort of preprocessor at all.
It does if the symbols being defined are used in #if or
other preprocessor conditionals.
Ben Bacarisse
2017-02-15 15:16:48 UTC
Post by Thiago Adams
...
<snip>
I will allow empty define, constant or constant-expression with ()
#define identifier
#define identifier constant
#define identifier ( constant-expression )
See below...
Post by Thiago Adams
Here #define is no more preprocessor.
The constant-expression can be evaluated and the value
is used as a constant. (Like enum constant)
#define DEBUG
//This will not compile (missing ( ) )
//#define A 1 + 2
//This will compile
#define A (1 + 2)
If you make this a part of the language, then there is no need to insist
on parentheses.

But I would urge the use of something other than #define (#substitute?).
In fact, there would no longer be any need to use the clumsy
line-oriented syntax of the preprocessor.

<snip>
--
Ben.
Thiago Adams
2017-02-15 16:14:36 UTC
Post by Ben Bacarisse
<snip>
If you make this a part of the language, then there is no need to insist
on parentheses.
The idea is that if you have valid C code that works,
I want to copy-paste it to C' and get the same result.

If
#define X 1 + 2
was allowed, a big surprise would happen in the final result.

Incompatibilities should be caught at compile time.

--
One difference between #define and enums is that double types would be allowed.

#define PI 3.1415
Ben Bacarisse
2017-02-15 18:01:35 UTC
<snip>
Post by Thiago Adams
<snip>
The idea is that if you have a valid code in C that works
I want to copy-paste it to C' and get the same result.
If
#define X 1 + 2
was allowed, a big surprise would happen in the final result.
Right. But my point is that if you restrict #define to permit only
certain forms, you can (and should) also restrict what it means. You can
simply say that after

#define X 1 + 2

X means (1 + 2). And then you notice you don't need the special
line-oriented syntax and you can use a more comfortable syntax.

<snip>
--
Ben.
BartC
2017-02-15 18:25:07 UTC
Post by Ben Bacarisse
Right. But my point is that if you restrict #define to permit only
certain forms, you can (and should) also restrict what it means. You can
simply say that after
#define X 1 + 2
X means (1 + 2). And then you notice you don't need the special
line-oriented syntax and you can use a more comfortable syntax.
That's the best part of the preprocessor, in that it is line-oriented
(no semicolons) and mandates consistent block delimiters (no {,} in a
dozen placement styles, or often none at all).

In that respect it's higher level than C.
--
bartc
Ben Bacarisse
2017-02-15 19:14:30 UTC
Post by BartC
<snip>
That's the best part of the preprocessor, in that it is line-oriented
(no semicolons) and mandates consistent block delimiters (no {,} in a
dozen placement styles, or often none at all).
You've become so tribal that you are prepared to argue against yourself!
I was echoing your thoughts -- that with a sufficiently restricted
#define you might as well have a "constant" definition.
--
Ben.
s***@casperkitty.com
2017-02-15 18:25:12 UTC
Post by Ben Bacarisse
Right. But my point is that if you restrict #define to permit only
certain forms, you can (and should) also restrict what it means. You can
simply say that after
#define X 1 + 2
X means (1 + 2). And then you notice you don't need the special
line-oriented syntax and you can use a more comfortable syntax.
Breaking away from line-oriented syntax would make it possible to have a
macro change the meaning of local or global "preprocessor" symbols. I
think there are advantages to having a language specified in such a way
that doesn't make metaprogramming Turing-complete (which would make the
question of whether a particular program could be compiled, undecidable)
but still allows limited looping constructs. Without an ability to replace
expressions with simple values, however, compilation could easily become
very slow if repeated invocations of a macro yield 1, 1+(1), 1+(1+(1)),
etc. rather than 1, 2, 3, etc.
BartC
2017-02-15 18:45:16 UTC
Post by s***@casperkitty.com
Post by Ben Bacarisse
Right. But my point is that if you restrict #define to permit only
certain forms, you can (and should) also restrict what it means. You can
simply say that after
#define X 1 + 2
X means (1 + 2). And then you notice you don't need the special line
oriented syntax and you can use a more comfortable syntax.
Breaking away from line-oriented syntax would make it possible to have a
macro change the meaning of local or global "preprocessor" symbols. I
think there are advantages to having a language specified in such a way
that doesn't make metaprogramming Turing-complete (which would make the
question of whether a particular program could be compiled, undecidable)
but still allows limited looping constructs. Without an ability to replace
expressions with simple values, however, compilation could easily become
very slow if repeated invocations of a macro yield 1, 1+(1), 1+(1+(1)),
etc. rather than 1, 2, 3, etc.
This is a program that is very slow to compile because of heavy use of
macros:

http://pastebin.com/PfRbGHns

(Originally posted by BGB as part of gcc bug report, as gcc was taking
an inordinately long time to compile it. Although most of that was
dealing with the result of the preprocessing.)

This 1000-line file took gcc 1.5 seconds to preprocess (-E).

With my own cpp, preprocessing was up to three orders of magnitude slower
than I would have expected from a file of that size.
--
bartc
GOTHIER Nathan
2017-02-15 19:03:33 UTC
On Wed, 15 Feb 2017 18:45:16 +0000
Post by BartC
This is a program that is very slow to compile because of heavy use of
http://pastebin.com/PfRbGHns
Heavy use of C macros is bad use of C!
Post by BartC
(Originally posted by BGB as part of gcc bug report, as gcc was taking
an inordinately long time to compile it. Although most of that was
dealing with the result of the preprocessing.)
This 1000-line file took gcc 1.5 seconds to preprocess (-E).
With my own cpp, preprocessing was up to 3 magnitudes slower than I
would have expected from a file of that size.
--
bartc
Why do people try to implement bad function-like macros when they can easily
implement good C functions? It's like people implementing 3D engines or video
codecs in JavaScript...

Once more, I'm not very fond of VeryLongIdentifiersThatBreakMyFingers. Good
identifiers should be fewer than 10 meaningful characters.
j***@verizon.net
2017-02-15 18:12:43 UTC
Post by Thiago Adams
I was wondering how to eliminate pre-processor
keeping compatibility with existing C code.
Since there's essentially no feature of C pre-processing that hasn't already been used by existing code, if compatibility with existing C code is important to you, then it would seem you were proposing to implement all of those features, but to do so without having a distinct pre-processor step. In principle, that should be possible, though I don't know that there's any big advantage to be achieved by doing it that way.

On Wednesday, February 15, 2017 at 8:24:34 AM UTC-5, Thiago Adams wrote:
...
Post by Thiago Adams
I will allow empty define, constant or constant-expression with ()
#define identifier
#define identifier constant
#define identifier ( constant-expression )
That, on the other hand, makes it quite clear that you're proposing a new language, C', in which only a small subset of the features of C's preprocessing is supported. C' will, inherently, be incompatible with any existing C code that makes use of any of those features not supported by C', so I assume that you're reducing the importance of compatibility with existing C code?
s***@casperkitty.com
2017-02-15 18:51:20 UTC
Post by j***@verizon.net
That, on the other hand, makes it quite clear that you're proposing a new language. C', in which only a small subset of the features of C's preprocessing is supported. C' will, inherently, be incompatible with any existing C code that makes use of any of those features not supported by C', so I assume that you're reducing the importance of compatibility with existing C code?
IMHO, a good derived language should strive to ensure that outside of
contrived scenarios, code which has meaning in the original language
will either have the same meaning in the new language, or else it will
fail compilation but be easily replaceable with code that would have
the same meaning in both languages and be as good or better than the
original code in the original language.

For example, if I were designing a language I would distinguish between
values which are derived from arithmetic expressions that might wrap or
overflow, and values which are not, and I would forbid conversion of
such values directly to larger types, *even with a cast operator*. Thus,
given "uint32_t x,y; uint64_t z;" I would allow "x+=y; z=x;" but I would
not allow "z=x+y;" nor even "z=(uint64_t)(x+y);". I would suggest that
even in C, "z=(uint32_t)(x+y);" would be clearer and thus "better" than
the forms without the uint32_t cast. Not only does the uint32_t cast help
make it obvious what the code actually does, but more importantly *it
shows what the author of the code is expecting it to do*.

Adding the aforementioned rules would make the resulting language
incompatible with some C programs, but the vast majority of such programs
could readily be adapted to work with the new language without making them
less suitable for use in C (the only situations I can see that might be
problematic would involve macro constructs, and I doubt they'd pose much
difficulty outside of contrived scenarios).
Thiago Adams
2017-02-15 20:14:13 UTC
Post by j***@verizon.net
Post by Thiago Adams
I was wondering how to eliminate pre-processor
keeping compatibility with existing C code.
Since there's essentially no feature of C pre-processing that hasn't already been used by existing code, if compatibility with existing C code is important to you, then it would seem you were proposing to implement all of those features, but to do so without having a distinct pre-processor step. In principle, that should be possible, though I don't know that there's any big advantage to be achieved by doing it that way.
...
Post by Thiago Adams
I will allow empty define, constant or constant-expression with ()
#define identifier
#define identifier constant
#define identifier ( constant-expression )
That, on the other hand, makes it quite clear that you're proposing a new language. C', in which only a small subset of the features of C's preprocessing is supported. C' will, inherently, be incompatible with any existing C code that makes use of any of those features not supported by C', so I assume that you're reducing the importance of compatibility with existing C code?
Yes. Exactly.
I want compatibility with existing code.

So, inside #include we are in normal mode.(as it is today)

---file.h---
#define INLINE inline

INLINE void F();
---

---file.c---
#include "file.h"
-----------------------

We can use #define without restriction (as it is today).
The parser (when parsing file.c) will enter inside the include
and it will see only preprocessed tokens:

inline void F();

(No macros here)


But, if you put a define outside:

---file.c---
#include "file.h"
#define X 1 + 2
-----------------------

Now this #define is parsed with the new grammar rules (see the alternative below).

The parser will not see any preprocessor tokens from file.h,
but it still has in memory the (original) #defines created
inside file.h.
So the INLINE macro is still present in memory.

Now, if I try:

---file.c---
#include "file.h"

INLINE void f2();
-----------------------

This INLINE is a macro created inside file.h.
If this macro doesn't follow the grammar rules
(as if it had been defined inside file.c) I can generate
an error.

So, one option is instead of adding grammar rules for
#define, we could allow everything as it is today.

Then, at expansion we could apply the rules.

For instance:

--file.c--

#define X 1 + 2

#include "file2.h" //file 2 will see X as normal macro


int main()
{
//Now, only here, at expansion we will check the macro
//and complain
int i = 2 * X;
}

Again, what is the advantage?

I can rebuild file.c. (refactoring for instance)

The usage of macros inside file.c follows some
rules and is safer.
BartC
2017-02-15 20:25:32 UTC
Post by Thiago Adams
Post by j***@verizon.net
That, on the other hand, makes it quite clear that you're proposing a new language. C', in which only a small subset of the features of C's preprocessing is supported. C' will, inherently, be incompatible with any existing C code that makes use of any of those features not supported by C', so I assume that you're reducing the importance of compatibility with existing C code?
Yes. Exactly.
I want compatibility with existing code.
So, inside #include we are in normal mode.(as it is today)
---file.h---
#define INLINE inline
INLINE void F();
---
---file.c---
#include "file.h"
-----------------------
We can use #define without restriction (as it is today).
The parser (when parsing file.c) will enter inside the include
inline void F();
(No macros here)
---file.c---
#include "file.h"
#define X 1 + 2
-----------------------
You've already lost me.

If someone sees a code fragment containing '#define', is that processed
according to the old or new rules? What happens when people copy and
paste across files, or just want to share code?

It's too confusing.

Why not use something different from '#' (I think @ is available), or
change the directives, so #newdefine or #Define or whatever?
--
bartc
Thiago Adams
2017-02-15 20:50:45 UTC
Post by BartC
Post by Thiago Adams
Post by j***@verizon.net
That, on the other hand, makes it quite clear that you're proposing a new language. C', in which only a small subset of the features of C's preprocessing is supported. C' will, inherently, be incompatible with any existing C code that makes use of any of those features not supported by C', so I assume that you're reducing the importance of compatibility with existing C code?
Yes. Exactly.
I want compatibility with existing code.
So, inside #include we are in normal mode.(as it is today)
---file.h---
#define INLINE inline
INLINE void F();
---
---file.c---
#include "file.h"
-----------------------
We can use #define without restriction (as it is today).
The parser (when parsing file.c) will enter inside the include
inline void F();
(No macros here)
---file.c---
#include "file.h"
#define X 1 + 2
-----------------------
You've already lost me.
If someone sees a code fragment containing '#define', is that processed
according to the old or new rules? What happens when people copy and
paste across files, or just want to share code.
It's too confusing.
change the directives, so #newdefine or #Define or whatever?
There are two options and I think this is
creating confusion.

1 - Preprocessor doesn't exist anymore.
Everything must follow new rules.
Some existing code will not compile.

2 - Preprocessor still working when inside the included files.
New rules are applied only at the translation unit outside the
include.

Here the description of 2:


The program starts parsing "file.c"
Let's say there is an option "-parse using new macro rules"

We have two files. (file.c and file.h)

---file.h---
#define INLINE inline

INLINE void F();
---

---file.c---
#include "file.h"
-----------------------

At the beginning we are parsing file.c.
We can see all tokens.

So the parser will see the token "#include".

When it sees #include it enters a "MACRO ON MODE".
From then on the parser will not see any preprocessor
tokens: #include, #define, etc.

It will see only:

inline void F();

and then the parsing of file.h finishes.

We are again at the start state, and the parser can
see #include, #define.

The parsed macros from file.h are still alive in memory.

There is a hash map: ["INLINE"] -> "inline"
#define INLINE inline

If you include another file for instance file2.h after file.h
it would see INLINE and it could expand INLINE because
inside the include we are in "MACRO ON MODE".


But if you try to expand the INLINE macro inside file.c

---file.c---
#include "file.h"

INLINE void f3(); // <- here

-----------------------

Then the INLINE macro must follow the rules for expansion.
The rules would be applied only at expansion inside the .c file.

---------

For the option 1 the rules are applied everywhere.
BartC
2017-02-16 11:50:38 UTC
Post by Thiago Adams
Post by BartC
You've already lost me.
If someone sees a code fragment containing '#define', is that processed
according to the old or new rules? What happens when people copy and
paste across files, or just want to share code.
It's too confusing.
change the directives, so #newdefine or #Define or whatever?
There are two options and I think this is
creating confusion.
1 - Preprocessor doesn't exist anymore.
Everything must follow new rules.
Some existing code will not compile.
2 - Preprocessor still working when inside the included files.
New rules are applied only at the translation unit outside the
include.
The program starts parsing "file.c"
Let's say there is an option "-parse using new macro rules"
We have two files. (file.c and file.h)
---file.h---
#define INLINE inline
INLINE void F();
---
---file.c---
#include "file.h"
-----------------------
At the beginning we are parsing file.c.
We can see all tokens.
So the parser will see the token "#include".
When it sees #include it enters a "MACRO ON MODE".
From then on the parser will not see any preprocessor
tokens: #include, #define, etc.
inline void F();
and then the parsing of file.h finishes.
We are again at the start state, and the parser can
see #include, #define.
The parsed macros from file.h are still alive in memory.
There is a hash map: ["INLINE"] -> "inline"
#define INLINE inline
If you include another file for instance file2.h after file.h
it would see INLINE and it could expand INLINE because
inside the include we are in "MACRO ON MODE".
But if you try to expand the INLINE macro inside file.c
---file.c---
#include "file.h"
INLINE void f3(); // <- here
-----------------------
Then the INLINE macro must follow the rules for expansion.
The rules would be applied only at expansion inside .c file.
---------
For the option 1 the rules are applied everywhere.
Sorry, but the fact that you need all this explanation is not a
good sign. (I tried to understand but my brain seems unwilling.) Compare
that to this:

#define X A B C // old rules
.define Y D E F // new rules

Do you see how much simpler it is to explain? And how much easier for
someone to glance at a bit of code and know immediately, without any
other context, whether it is interpreted under the old or new rules?

(Actually you can't use ".define" because a normal "." can occur at the
start of a line. "@" is maybe a better choice but I didn't like it.
#Define is another option. Whatever it is, you need something to
distinguish the new '#define' from the old '#define'!)
--
Bartc
Thiago Adams
2017-02-16 12:44:19 UTC
On Thursday, February 16, 2017 at 9:50:48 AM UTC-2, Bart wrote:
[...]
Post by BartC
Sorry, but the fact that you need all this explanation is not a
good sign. (I tried to understand but my brain seems unwilling.) Compare
#define X A B C // old rules
.define Y D E F // new rules
Do you see how much simpler it is to explain? And how much easier for
someone to glance at a bit of code and know immediately, without any
other context, whether it is interpreted under the old or new rules?
(Actually you can't use ".define" because a normal "." can occur at the
#Define is another option. Whatever it is, you need something to
distinguish '#define' from '#define'!)
The explanation is long because that is how it works.

The "manual" is much simpler:

- [ ] Use c preprocessor (all expansions are allowed)
- [X] Use parser-preprocessor
[X] Allow expansion of constants
[X] Allow expansion of do while()
..etc..

The idea is that the programmer will not notice the difference
in most cases.

There is no need for .define @define.

(
I created this confusion because at the beginning I suggested
changing the grammar for #define. Then I realized that the rules
are better applied at expansion and the code is more compatible with C.
)

#define will allow everything (normally)
because the rules are checked at expansion.

I understand when you say that some C' language could have
a syntax for constants.

But I think the key for real and immediate use is to be very
compatible with C. Also for general success.
I think C++ is very successful because of this.
I'm not sure any other language like D or Rust will be more
used than C++, because C++ easily compiles C and integrates with
Windows, Android, Mac, Linux, embedded.

Also, C is interesting because it is simple. Adding new rules
can make it complicated.
C++ programmers will say "don't use macros!"
But many real programs are using macros in a safe way.
However, even in a program that is correct, there are potential problems.
Moving the pre-processor to the parser will allow more checking.
Thiago Adams
2017-02-16 13:28:24 UTC
Post by Thiago Adams
[...]
Post by BartC
Sorry, but the fact that you need all this explanation is not a
good sign. (I tried to understand but my brain seems unwilling.) Compare
#define X A B C // old rules
.define Y D E F // new rules
Do you see how much simpler it is to explain? And how much easier for
someone to glance at a bit of code and know immediately, without any
other context, whether it is interpreted under the old or new rules?
(Actually you can't use ".define" because a normal "." can occur at the
#Define is another option. Whatever it is, you need something to
distinguish '#define' from '#define'!)
The explanation is long because that is how it works.
- [ ] Use c preprocessor (all expansions are allowed)
- [X] Use parser-preprocessor
[X] Allow expansion of constants
[X] Allow expansion of do while()
..etc..
The idea is that the programmer will not notice the difference
in most cases.
(
I created this confusion because at the beginning I suggested
changing the grammar for #define. Then I realized that the rules
are better applied at expansion and the code is more compatible with C.
)
#define will allow everything (normally)
because the rules are checked at expansion.
One more sample: (from MISRA C)

/* The following are compliant */

#define PI 3.14159F /* Constant */
#define XSTAL 10000000 /* Constant */
#define CLOCK (XSTAL/16) /* Constant expression */
#define PLUS2(X) ((X) + 2) /* Macro expanding to expression */
#define STOR extern /* storage class specifier */
#define INIT(value){ (value), 0, 0} /* braced initialiser */
#define CAT (PI) /* parenthesised expression */
#define FILE_A "filename.h" /* string literal */
#define READ_TIME_32() \
do { \
DISABLE_INTERRUPTS (); \
time_now = (uint32_t)TIMER_HI << 16; \
time_now = time_now | (uint32_t)TIMER_LO; \
ENABLE_INTERRUPTS (); \
} while (0) /* example of do-while-zero */

/* the following are NOT compliant */

#define int32_t long /* use typedef instead */
#define STARTIF if( /* unbalanced () and language redefinition */
#define CAT PI /* non-parenthesised expression */

-------

Let's say:

/* The following are compliant */

#define PI 3.14159F /* Constant */
#define CAT (PI) /* parenthesised expression */

And I create a new macro X

#define X CAT

Now X is OK, not because I parsed "X CAT"
but because the expansion is (3.14159F).

The rule that can be applied at the #define
definition is a warning if the argument is not
inside parentheses.

A more sophisticated expansion could check the rules
during each expansion.

#define X 1 + 2
#define Y 2 * X

Here the expansion, checked only at the end, would be: 2 * 1 + 2

But if the rules were checked during expansion.

int i = Y; // 2 * X
// X is not a valid expansion
BartC
2017-02-16 14:32:11 UTC
Post by Thiago Adams
But, think the key for real and immediate use is to be very
compatible with C. Also for general success.
I think C++ is very successful because of this.
I wonder why C doesn't just disappear then? As every C program, with
some tweaks as necessary, will compile as C++.
Post by Thiago Adams
I not sure if any other language like D ou Rust will be more
used than C++ because C++ is easy to compile C and integrate with
Window, Android, Mac, Linux, embedded.
Also, C is interesting because it is simple. Adding new rules
can make it complicated.
But C++ is complicated, and above you suggest that has not stopped it
being successful!
Post by Thiago Adams
C++ programmers will say "don't use macros!"
But many real programs are using macros in a safe way.
Most of the MISRA examples you gave involved constants, which I said
elsewhere are an easy feature that ought to be built-in to the language.

The others are:

#define PLUS2(X) ((X) + 2) /* Macro expanding to expression */

Inline function. (That such macros remove the need to specify a type is
something that needs thinking about, perhaps as a new language feature,
but which must under no circumstances involve introducing templates).

#define STOR extern /* storage class specifier */

Redefining the language (wasn't this prohibited?).

#define INIT(value){ (value), 0, 0} /* braced initialiser */

IMO, also redefining the language (even if it is a new version of C). An
initialiser with multiple elements is enclosed in {...}. This gets rid
of the braces.

#define CAT (PI) /* parenthesised expression */

Inline function.

#define READ_TIME_32() \
do { \
DISABLE_INTERRUPTS (); \
time_now = (uint32_t)TIMER_HI << 16; \
time_now = time_now | (uint32_t)TIMER_LO; \
ENABLE_INTERRUPTS (); \
} while (0) /* example of do-while-zero */

Inline function.

So you can go through programs and look at how macros can be eliminated.
For new code at least, because you need to support existing code.

If your intention is to enforce macro usage guidelines, a bit like
MISRA, to make it easier to analyse code, then that's simpler in that
you don't need to create a new language!
--
bartc
s***@casperkitty.com
2017-02-16 17:13:37 UTC
Post by BartC
Post by Thiago Adams
But, think the key for real and immediate use is to be very
compatible with C. Also for general success.
I think C++ is very successful because of this.
I wonder why C doesn't just disappear then? As every C program, with
some tweaks as necessary, will compile as C++.
The ability to compile a C program as C++ is only useful if the resulting
program will run with known semantics. There are some constructs which
have defined behavior in C but have no convenient direct equivalent in C++.

Of course, the same sort of problem can exist along the path from C89
to later versions unless a compiler continues to support old semantics.
Some constructs in C89 have no practical equivalent in C99 (e.g. unless
one interprets C89 in a fashion that would not allow allocated storage to
hold objects of non-character type, it would be hard to write an efficient
function whose behavior was semantically equivalent to the following in
all cases that would be defined in C89).

void copy_as_bytes(void *dest, void *src, size_t n)
{ memmove(dest, src, n); }

The number of cases where C constructs have no C++ equivalent, however, is
larger than the number of cases where C89 constructs have no C99 equivalent.
Post by BartC
#define PLUS2(X) ((X) + 2) /* Macro expanding to expression */
Inline function. (That such macros remove the need to specifier type is
something that needs thinking about, perhaps as a new language feature,
but which must under no circumstance involved introducing templates).
Macros can do some things inline functions can't. It may be practical to
give some of those abilities to inline or other functions (e.g. include a
real pass-by-reference syntax, and add some special parameter types whose
values would be generated by the compiler rather than the caller source
code). For example, given:

void inc_and_log_it(__caller_file cf, __caller_line cl,
char const *msg, unsigned &var)
{
fprintf(logfile, "%s:%u\n", msg, var);
var++;
}

a function call inc_and_log_it("FRED", x); would be interpreted as though the
function had been

void inc_and_log_it(char const *cf, size_t cl,
char const *msg, unsigned * restrict var)
{
fprintf(logfile, "%s:%u\n", msg, *var);
(*var)++;
}

and the call had been inc_and_log_it(__FILE__, __LINE__, "FRED", &x); except that
a compiler would be entitled to treat x afterward as though its address
had not been taken.
Post by BartC
#define STOR extern /* storage class specifier */
Redefining the language (wasn't this prohibited?).
A common pattern in some circles is to implement the "don't repeat yourself"
principle with external variables by having a header file like foo.h say
something like:

#ifndef FOO_C
#define FOO_EXTERN extern
#else
#define FOO_EXTERN
#endif

and then precede each declaration within the file with FOO_EXTERN. This
avoids the need to keep two sets of declarations in sync, and would be even
more useful if "extern" had priority over the presence of an initializer,
thus allowing e.g.

FOO_EXTERN const int my_array[] = { 1, 2, 3, 4, 5, 6 };

to be treated in files other than foo.c as a declaration of a complete
array type whose size would automatically match the number of items
therein. I've seen no way to define such things which would work as nicely
as that construct could. There are ways of achieving such functionality in
C99 by defining a macro containing all the elements and then using that
both for the initialization and for expressions that compute the size, but
nothing as clean and simple as the above.
Post by BartC
#define INIT(value){ (value), 0, 0} /* braced initialiser */
IMO, also redefining the language (even if it is a new version of C). An
initialiser with multiple elements is enclosed in {...}. This gets rid
of the braces.
A macro containing a list of comma-separated items without braces is more
dubious than one which contains the braces, though the former may be needed
in some contexts where two or more arrays need to be initialized with data
which is largely identical.
Post by BartC
#define CAT (PI) /* parenthesised expression */
Inline function.
#define READ_TIME_32() \
do { \
DISABLE_INTERRUPTS (); \
time_now = (uint32_t)TIMER_HI << 16; \
time_now = time_now | (uint32_t)TIMER_LO; \
ENABLE_INTERRUPTS (); \
} while (0) /* example of do-while-zero */
Inline function.
Compilers have a lot of freedom with functions declared inline. Consider
a macro to reverse all the bits in a 32-bit word using a combination of
masks and shifts. Invoking such a macro with a compile-time constant will
yield a compile-time constant result. An inline function, by contrast,
would on many compilers be likely to yield something far less efficient
(even though the function should devolve to a simple constant, many compilers
may decide for each function that it will always be inlined or never be
inlined, and since the combination of shifts and masks necessary to reverse
a non-constant value would be fairly complicated, many compilers would decide
never to inline the function).
Keith Thompson
2017-02-16 18:38:20 UTC
***@casperkitty.com writes:
[...]
Post by s***@casperkitty.com
The ability to compile a C program as C++ is only useful if the resulting
program will run with known semantics. There are some constructs which
have defined behavior in C but have no convenient direct equivalent in C++.
For example?

[...]
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2017-02-16 19:55:53 UTC
Post by Keith Thompson
[...]
Post by s***@casperkitty.com
The ability to compile a C program as C++ is only useful if the resulting
program will run with known semantics. There are some constructs which
have defined behavior in C but have no convenient direct equivalent in C++.
For example?
I'm not a C++ expert by any means, but by my understanding it requires
that after one union member is used, any attempt to use another member
without first using placement new yields UB. Adding placement new might
not be a problem except that the Standard makes no guarantee about the
contents of any portion of the newly-selected member--even parts which
would share a Common Initial Sequence with the previously accessed member.
Keith Thompson
2017-02-16 20:24:12 UTC
Post by s***@casperkitty.com
Post by Keith Thompson
[...]
Post by s***@casperkitty.com
The ability to compile a C program as C++ is only useful if the
resulting program will run with known semantics. There are some
constructs which have defined behavior in C but have no convenient
direct equivalent in C++.
For example?
I'm not a C++ expert by any means, but by my understanding it requires
that after one union member is used, any attempt to use another member
without first using placement new yields UB. Adding placement new might
not be a problem except that the Standard makes no guarantee about the
contents of any portion of the newly-selected member--even parts which
would share a Common Initial Sequence with the previously accessed member.
C++ has the common initial sequence rule; C++11 9.2p19 [class.mem] and
9.3p1 [class.union].

I don't see anything in the C++ standard that permits writing to one
member of a union and reading another, as C permits, but I could very
easily be missing something.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
David Brown
2017-02-16 20:32:42 UTC
Post by s***@casperkitty.com
Post by Keith Thompson
[...]
Post by s***@casperkitty.com
The ability to compile a C program as C++ is only useful if the resulting
program will run with known semantics. There are some constructs which
have defined behavior in C but have no convenient direct equivalent in C++.
For example?
I'm not a C++ expert by any means, but by my understanding it requires
that after one union member is used, any attempt to use another member
without first using placement new yields UB. Adding placement new might
not be a problem except that the Standard makes no guarantee about the
contents of any portion of the newly-selected member--even parts which
would share a Common Initial Sequence with the previously accessed member.
Placement new constructs a new object at a given address. You can't use
it as a way to access existing data as though it were a different type.
If you have a C++ union containing classes with constructors and
destructors, then to change which type you store in a union you must
first explicitly destruct the existing member, then use placement new to
construct a new one. This is /not/ type-punning - it is the equivalent
in C of simply assigning to a different member (since C objects have no
constructors or destructors).

C++11 and C++14 explicitly allow access to a common initial sequence,
just like C. I assume that also applies to earlier C++ standards, but I
don't have them handy.

Regarding access to POD members with no constructors or destructors, C++
treats unions mostly like C89 - if you have a union of a float and an
int, and assign to the float, you are not allowed to read out the
representation as the int. (In C89, this is implementation-defined
behaviour - in C++, it is not discussed and therefore undefined.) This
is hardly surprising, since C++ effectively forked from C89, and the
"type punning" feature of unions was not standardised until C99.

However, the reason "type punning" via unions was standardised in C99 is
that compilers already implemented it and people relied upon it. I
suspect you will find that any C++ compiler will support type punning
via unions, at least when constructors and destructors are not involved
- if you use type punning to gain access to an invalid object, that will
be UB. Certainly C++ does not /require/ that such type punning is UB.
Keith Thompson
2017-02-16 20:55:31 UTC
David Brown <***@hesbynett.no> writes:
[...]
Post by David Brown
However, the reason "type punning" via unions was standardised in C99 is
that compilers already implemented it and people relied upon it. I
suspect you will find that any C++ compiler will support type punning
via unions, at least when constructors and destructors are not involved
- if you use type punning to gain access to an invalid object, that will
be UB. Certainly C++ does not /require/ that such type punning is UB.
I'm not sure that last sentence makes sense. Either it's undefined
behavior or it isn't. If the C++ standard doesn't define the behavior,
then it's UB -- even if all implementations behave consistently. UB
doesn't mean "it blows up".

The C++ definition is similar to C's. Quoting C++11:

undefined behavior [defns.undefined]

behavior for which this International Standard imposes no
requirements [ Note: Undefined behavior may be expected when
this International Standard omits any explicit definition
of behavior or when a program uses an erroneous construct
or erroneous data. Permissible undefined behavior ranges from
ignoring the situation completely with unpredictable results, to
behaving during translation or program execution in a documented
manner characteristic of the environment (with or without the
issuance of a diagnostic message), to terminating a translation
or execution (with the issuance of a diagnostic message).
Many erroneous program constructs do not engender undefined
behavior; they are required to be diagnosed. — end note ]
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
David Brown
2017-02-16 21:41:27 UTC
Post by Keith Thompson
[...]
Post by David Brown
However, the reason "type punning" via unions was standardised in C99 is
that compilers already implemented it and people relied upon it. I
suspect you will find that any C++ compiler will support type punning
via unions, at least when constructors and destructors are not involved
- if you use type punning to gain access to an invalid object, that will
be UB. Certainly C++ does not /require/ that such type punning is UB.
I'm not sure that last sentence makes sense. Either it's undefined
behavior or it isn't. If the C++ standard doesn't define the behavior,
then it's UB -- even if all implementations behave consistently. UB
doesn't mean "it blows up".
Supercat said that it "requires [that it] yields UB", which I
interpreted as him thinking the compiler /had/ to leave the behaviour
here without definition. It is not defined by the C Standards, but
there is no requirement for an implementation to leave it without a
definition.
Post by Keith Thompson
undefined behavior [defns.undefined]
behavior for which this International Standard imposes no
requirements [ Note: Undefined behavior may be expected when
this International Standard omits any explicit definition
of behavior or when a program uses an erroneous construct
or erroneous data. Permissible undefined behavior ranges from
ignoring the situation completely with unpredictable results, to
behaving during translation or program execution in a documented
manner characteristic of the environment (with or without the
issuance of a diagnostic message), to terminating a translation
or execution (with the issuance of a diagnostic message).
Many erroneous program constructs do not engender undefined
behavior; they are required to be diagnosed. — end note ]
Keith Thompson
2017-02-16 22:02:31 UTC
Post by David Brown
Post by Keith Thompson
[...]
Post by David Brown
However, the reason "type punning" via unions was standardised in C99 is
that compilers already implemented it and people relied upon it. I
suspect you will find that any C++ compiler will support type punning
via unions, at least when constructors and destructors are not involved
- if you use type punning to gain access to an invalid object, that will
be UB. Certainly C++ does not /require/ that such type punning is UB.
I'm not sure that last sentence makes sense. Either it's undefined
behavior or it isn't. If the C++ standard doesn't define the behavior,
then it's UB -- even if all implementations behave consistently. UB
doesn't mean "it blows up".
Supercat said that it "requires [that it] yields UB", which I
interpreted as him thinking the compiler /had/ to leave the behaviour
here without definition. It is not defined by the C Standards, but
there is no requirement for an implementation to leave it without a
definition.
Supercat was either incorrect or stating something trivial, depending on
how you interpret it. If a construct has undefined behavior, that means
only that the standard does not define its behavior. It places no
requirements of any kind on implementations. If a construct's behavior
is not defined by the standard and an implementation chooses to define
its behavior, then it still has "undefined behavior" as the standard
defines and uses that phrase.

I'm not sure what it means for something to "yield UB".
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2017-02-16 23:08:35 UTC
Post by David Brown
Supercat said that it "requires [that it] yields UB", which I
interpreted as him thinking the compiler /had/ to leave the behaviour
here without definition. It is not defined by the C Standards, but
there is no requirement for an implementation to leave it without a
definition.
I got two related concepts tangled up:

1. Once a union member has been accessed, access to any other member
"requires" that it first be constructed with placement new.

2. Once a union member has been accessed, access to any other member
without first constructing the latter with placement new "yields
UB".

From what I can tell, unions only exist in C++ for the purpose of allowing
C++ to interact with C code that uses such types. C needs to allow for the
possibility that code will reuse storage without using something like a
placement new to let the compiler know, because C doesn't *have* anything
like a placement new that code *could* use to let the compiler know storage
will be reused.
s***@casperkitty.com
2017-02-16 21:32:50 UTC
Post by David Brown
Regarding access to POD members with no constructors or destructors, C++
treats unions mostly like C89 - if you have a union of a float and an
int, and assign to the float, you are not allowed to read out the
representation as the int. (In C89, this is implementation-defined
behaviour - in C++, it is not discussed and therefore undefined.) This
is hardly surprising, since C++ effectively forked from C89, and the
"type punning" feature of unions was not standardised until C99.
Type punning, whether via pointers or unions, is crucial in some kinds of
programming, but not needed at all in others. Further, the cost of
supporting it can be much larger on some platforms than others (e.g. on
systems with separate floating-point and integer pipes, support for type
punning may necessitate code that forces one pipeline to stall while waiting
for an action on the other to complete). Consequently, what behaviors would
be reasonable for an implementation will depend upon the target platform and
application field.
Post by David Brown
However, the reason "type punning" via unions was standardised in C99 is
that compilers already implemented it and people relied upon it. I
suspect you will find that any C++ compiler will support type punning
via unions, at least when constructors and destructors are not involved
- if you use type punning to gain access to an invalid object, that will
be UB. Certainly C++ does not /require/ that such type punning is UB.
A guiding principle behind C89 seems to have been that if some
implementations define a behavior for some actions but others might benefit
from not having to do so, then absent some reason to change the status quo,
the actions should be deemed to invoke Undefined Behavior. I see no evidence
that the authors of C89 were intending that C89 be regarded as exhaustively
describing all the guarantees that any programmer would ever be entitled to
expect even when using implementations that were supposed to be suitable for
things like system programming. I think most compiler writers in the 1990s
understood that.

Neither the C Standard nor the C++ Standard requires compilers to generate
bogus code in cases where an implementation that processed things according
to a precise load-store model would yield a predictable result, but since
the C++ Standard was written to describe a new language rather than an
existing one, the concept of "Undefined Behavior" came to be viewed a bit
differently from how it had been viewed in C.
Ian Collins
2017-02-17 08:19:56 UTC
Post by s***@casperkitty.com
Post by David Brown
Regarding access to POD members with no constructors or destructors, C++
treats unions mostly like C89 - if you have a union of a float and an
int, and assign to the float, you are not allowed to read out the
representation as the int. (In C89, this is implementation-defined
behaviour - in C++, it is not discussed and therefore undefined.) This
is hardly surprising, since C++ effectively forked from C89, and the
"type punning" feature of unions was not standardised until C99.
Type punning, whether via pointers or unions, is crucial in some kinds of
programming, but not needed at all in others.
One reason C++ is gaining std::variant (a type-safe union).
--
Ian
GOTHIER Nathan
2017-02-17 11:01:20 UTC
On Fri, 17 Feb 2017 21:19:56 +1300
Post by Ian Collins
Post by s***@casperkitty.com
Post by David Brown
Regarding access to POD members with no constructors or destructors, C++
treats unions mostly like C89 - if you have a union of a float and an
int, and assign to the float, you are not allowed to read out the
representation as the int. (In C89, this is implementation-defined
behaviour - in C++, it is not discussed and therefore undefined.) This
is hardly surprising, since C++ effectively forked from C89, and the
"type punning" feature of unions was not standardised until C99.
Type punning, whether via pointers or unions, is crucial in some kinds of
programming, but not needed at all in others.
One reason C++ is gaining std::variant (a type-safe union).
--
Ian
Nothing is safe from bad programmers... In the new technology field too many
people believe "new" means "better" because they don't understand how it works.
I consider C++ (and C11 too with annex K) as a regression since it postulates
the programmer is stupid.
s***@casperkitty.com
2017-02-17 17:16:53 UTC
Post by GOTHIER Nathan
Nothing is safe from bad programmers... In the new technology field too many
people believe "new" means "better" because they don't understand how it works.
I consider C++ (and C11 too with annex K) as a regression since it postulates
the programmer is stupid.
I question the value of taking badly designed library functions like
strncat and defining bounds-checked versions which retain the original
problems, rather than providing well-designed bounds-checked versions.
I also think that in many cases a model where functions behave safely
[e.g. clipping an over-length string] but set a latching error flag
would likely be more amenable to optimization than one which uses
error handlers. A compiler given code like:

action1();
action2();
if (!error) ....

would be able to perform action1 and action2 in parallel, but a compiler
would have to run the actions in sequence if action1() might call an error
handler that never returns.

Nonetheless, I don't see Annex K as being nearly as harmful as the notion
that the Standard makes any effort to forbid all forms of unreasonable
behavior, and that the quality of an implementation is not reliant upon its
creator using good judgment as to what behaviors are reasonable for a
particular target platform and application field.
GOTHIER Nathan
2017-02-17 17:55:02 UTC
On Fri, 17 Feb 2017 09:16:53 -0800 (PST)
Post by s***@casperkitty.com
Nonetheless, I don't see Annex K as being nearly as harmful as the notion
that the Standard makes any effort to forbid all forms of unreasonable
behavior, and that the quality of an implementation is not reliant upon its
creator using good judgment as to what behaviors are reasonable for a
particular target platform and application field.
Like a seat belt, bounds-checked functions provide a false sense of security
and can be a source of collateral damage. They were included in C11 only
because of Microsoft's bad programmers. Consequently, the C committee failed
to provide a de facto standard to C programmers.
Ian Collins
2017-02-17 19:07:39 UTC
Post by GOTHIER Nathan
On Fri, 17 Feb 2017 21:19:56 +1300
Post by Ian Collins
Post by s***@casperkitty.com
Post by David Brown
Regarding access to POD members with no constructors or destructors, C++
treats unions mostly like C89 - if you have a union of a float and an
int, and assign to the float, you are not allowed to read out the
representation as the int. (In C89, this is implementation-defined
behaviour - in C++, it is not discussed and therefore undefined.) This
is hardly surprising, since C++ effectively forked from C89, and the
"type punning" feature of unions was not standardised until C99.
Type punning, whether via pointers or unions, is crucial in some kinds of
programming, but not needed at all in others.
One reason C++ is gaining std::variant (a type-safe union).
Nothing is safe from bad programmers... In the new technology field too many
people believe "new" means "better" because they don't understand how it works.
I consider C++ (and C11 too with annex K) as a regression since it postulates
the programmer is stupid.
That response proves it...
--
Ian
GOTHIER Nathan
2017-02-17 19:12:27 UTC
On Sat, 18 Feb 2017 08:07:39 +1300
Post by Ian Collins
Post by GOTHIER Nathan
Nothing is safe from bad programmers... In the new technology field too many
people believe "new" means "better" because they don't understand how it works.
I consider C++ (and C11 too with annex K) as a regression since it postulates
the programmer is stupid.
That response proves it...
--
Ian
Your anger against my answer only proves that you need diapers to write
C-like programs with C++. :-P
Ian Collins
2017-02-17 22:04:41 UTC
Post by GOTHIER Nathan
On Sat, 18 Feb 2017 08:07:39 +1300
Post by Ian Collins
Post by GOTHIER Nathan
Nothing is safe from bad programmers... In the new technology field too many
people believe "new" means "better" because they don't understand how it works.
I consider C++ (and C11 too with annex K) as a regression since it postulates
the programmer is stupid.
That response proves it...
{please don't quote signatures}
Post by GOTHIER Nathan
Your anger against my answer only proves that you need diapers to write
C-like programs with C++. :-P
There was no anger and we don't have diapers in my part of the world.

While there wasn't any anger there is incredulity that some programmers
continue to resist language features designed to make programming both
easier and safer. Mind you I'm still amazed at the resistance I
encounter to writing unit tests.

I guess there will always be people who refuse to wear their seat-belts
when they drive, so I shouldn't really be surprised at anything these days.
--
Ian
BartC
2017-02-17 22:38:30 UTC
Post by Ian Collins
Post by GOTHIER Nathan
Your anger against my answer only proves that you need diapers to write
C-like programs with C++. :-P
There was no anger and we don't have diapers in my part of the world.
While there wasn't any anger there is incredulity that some programmers
continue to resist language features designed to make programming both
easier and safer. Mind you I'm still amazed at the resistance I
encounter to writing unit tests.
I guess there will always be people who refuse to wear their seat-belts
when they drive, so I shouldn't really be surprised at anything these days.
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.

Otherwise there are plenty of languages that are easier and safer than
either C or C++.
--
Bartc
Ian Collins
2017-02-17 22:47:33 UTC
Post by BartC
Post by Ian Collins
Post by GOTHIER Nathan
Your anger against my answer only proves that you need diapers to write
C-like programs with C++. :-P
There was no anger and we don't have diapers in my part of the world.
While there wasn't any anger there is incredulity that some programmers
continue to resist language features designed to make programming both
easier and safer. Mind you I'm still amazed at the resistance I
encounter to writing unit tests.
I guess there will always be people who refuse to wear their seat-belts
when they drive, so I shouldn't really be surprised at anything these days.
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
Otherwise there are plenty of languages that are easier and safer than
either C or C++.
But they aren't used in the same space.
--
Ian
s***@casperkitty.com
2017-02-17 22:55:10 UTC
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
More importantly, it might require using some kind of restraint which doesn't
meet all the applicable safety standards that would be applicable to seat
belts, but would nonetheless meet the requirements of the purpose to which
they would be put. Unfortunately, to continue the analogy, some car makers
seem to feel that any seat belts which didn't meet the official safety
standards should be considered useless, and a car should make itself "more
efficient" by eliminating them altogether.
Ian Collins
2017-02-17 23:11:38 UTC
Post by s***@casperkitty.com
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
More importantly, it might require using some kind of restraint which doesn't
meet all the applicable safety standards that would be applicable to seat
belts, but would nonetheless meet the requirements of the purpose to which
they would be put.
At least if the seat belt, even a compromised one, is fitted the driver
has a choice. If it's not there, no matter how careful the driver, he
is still at risk...
--
Ian
Chris M. Thomasson
2017-02-17 23:26:07 UTC
Post by Ian Collins
Post by s***@casperkitty.com
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
More importantly, it might require using some kind of restraint which doesn't
meet all the applicable safety standards that would be applicable to seat
belts, but would nonetheless meet the requirements of the purpose to which
they would be put.
At least if the seat belt, even a compromised one, is fitted the driver
has a choice. If it's not there, no matter how careful the driver, he
is still at risk...
Must think about a reason why some want to keep the corks on the ends of
all the forks in a language as a whole. Well, we have to keep unsafe
programmers safe, and C is too dangerous for basically any programmer to
use correctly; some postulate that C is far too generous in the access it
grants the programmer. IMVHO, that type of mentality should not be
used to destroy C and/or C++.
s***@casperkitty.com
2017-02-18 00:15:31 UTC
Post by Ian Collins
Post by s***@casperkitty.com
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
More importantly, it might require using some kind of restraint which doesn't
meet all the applicable safety standards that would be applicable to seat
belts, but would nonetheless meet the requirements of the purpose to which
they would be put.
At least if the seat belt, even a compromised one, is fitted the driver
has a choice. If it's not there, no matter how careful the driver, he
is still at risk...
The problem is that modern C compilers will sometimes use the fact that a
code path doesn't check for and avoid overflow in one computation whose
result will end up being irrelevant to requirements as justification for
omitting programmer-supplied overflow checks in cases where they would
actually matter.

For example, if a programmer needs a function which returns x*y+z if
x and y are in the range +/- 10,000, or any value otherwise, the code most
compilers would naturally generate given:

long long mulcomp(int x, int y, long long z) { return x*y + z; }

would satisfy that requirement except in cases where the compiler percolates
overflow-based assumptions to the surrounding code. Depending upon nearby
register usage, it might be faster for the compiler to use a multiply
instruction that yields a 32-bit result and sign-extend it, or use one that
yields a 64-bit result, and given the application requirements it would be
helpful to let the compiler choose whichever one it saw fit.

The problem is that if modern C compilers happen to realize that x would
need to be within a certain range to prevent overflow, they may decide to
omit checks elsewhere that would try to ensure x doesn't exceed 60,000.
The only way a programmer can ensure that a compiler won't omit the checks
that are actually needed would be to make the code needlessly slow and
clunky in cases where it shouldn't matter.
Ian Collins
2017-02-18 00:22:47 UTC
Post by s***@casperkitty.com
Post by Ian Collins
Post by s***@casperkitty.com
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
More importantly, it might require using some kind of restraint which doesn't
meet all the applicable safety standards that would be applicable to seat
belts, but would nonetheless meet the requirements of the purpose to which
they would be put.
At least if the seat belt, even a compromised one, is fitted the driver
has a choice. If it's not there, no matter how careful the driver, he
is still at risk...
The problem is that modern C compilers will sometimes use the fact that a
code path doesn't check for and avoid overflow in one computation whose
result will end up being irrelevant to requirements as justification for
omitting programmer-supplied overflow checks in cases where they would
actually matter.
That wasn't the topic of this sub-thread.
--
Ian
s***@casperkitty.com
2017-02-18 00:37:48 UTC
Post by Ian Collins
Post by s***@casperkitty.com
Post by Ian Collins
Post by s***@casperkitty.com
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
More importantly, it might require using some kind of restraint which doesn't
meet all the applicable safety standards that would be applicable to seat
belts, but would nonetheless meet the requirements of the purpose to which
they would be put.
At least if the seat belt, even a compromised one, is fitted the driver
has a choice. If it's not there, no matter how careful the driver, he
is still at risk...
The problem is that modern C compilers will sometimes use the fact that a
code path doesn't check for and avoid overflow in one computation whose
result will end up being irrelevant to requirements as justification for
omitting programmer-supplied overflow checks in cases where they would
actually matter.
That wasn't the topic of this sub-thread.
It is precisely the subject of my earlier analogy: the compiler decides
to eliminate the safety-check code (seatbelt) because it decides that there
isn't enough safety-check code to satisfy it, even though there would have
been enough to meet requirements if the compiler had just left it alone.
David Brown
2017-02-18 10:57:08 UTC
Post by s***@casperkitty.com
Post by Ian Collins
Post by s***@casperkitty.com
Post by Ian Collins
Post by s***@casperkitty.com
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
More importantly, it might require using some kind of restraint which doesn't
meet all the applicable safety standards that would be applicable to seat
belts, but would nonetheless meet the requirements of the purpose to which
they would be put.
At least if the seat belt, even a compromised one, is fitted the driver
has a choice. If it's not there, no matter how careful the driver, he
is still at risk...
The problem is that modern C compilers will sometimes use the fact that a
code path doesn't check for and avoid overflow in one computation whose
result will end up being irrelevant to requirements as justification for
omitting programmer-supplied overflow checks in cases where they would
actually matter.
That wasn't the topic of this sub-thread.
It is precisely the subject of my earlier analogy: the compiler decides
to eliminate the safety-check code (seatbelt) because it decides that there
isn't enough safety-check code to satisfy it, even though there would have
been enough to meet requirements if the compiler had just left it alone.
You /think/ it is precisely the subject of /every/ thread in this group.
We've heard it all before.

And do you write your shopping lists like your function specifications?
"If the bananas are not too green, buy 4 - if not, any number of bananas
will do".
Richard Heathfield
2017-02-18 07:13:51 UTC
Post by Ian Collins
Post by s***@casperkitty.com
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
More importantly, it might require using some kind of restraint which doesn't
meet all the applicable safety standards that would be applicable to seat
belts, but would nonetheless meet the requirements of the purpose to which
they would be put.
At least if the seat belt, even a compromised one, is fitted the driver
has a choice. If it's not there, no matter how careful the driver, he
is still at risk...
What a terrible analogy!

A computer programmer has many advantages over a driver. He can drive
the journey in perfect safety a million times, creating all kinds of
artificial hazards for his car to negotiate. Whenever there is a smash
and virtual bodies end up strewn over the landscape, he can chuckle to
himself that he's glad he's not an analogy while he investigates which
particular hazard caused which particular problem. Then the virtual
corpses pick themselves up, dust themselves down, and get willingly back
into the car for another crack at it.

They say the coward dies a thousand deaths, but the test dummy is much
braver, and dies a million.

Only when they can safely negotiate every hazard he can imagine (and
quite a few that his pals Janet and Charlie can imagine because their
imaginations are more fevered than his) will he even consider letting
the car go into production.

Next you'll be saying that assertions are like lifeboats. (And I'm ready
for you if you do!)

Unit tests, young man! Learn all about unit tests! Then you'll find that
you crash a lot more than before, but you will do so in the safety of
the lab, where men are fake men, women are fake women, and small furry
creatures from Alpha Centauri are /fake/ small furry creatures from
Alpha Centauri.
--
Richard Heathfield
"Usenet is a strange place" - dmr 29 July 1999
No small, furry creatures from Alpha Centauri were harmed
during the writing of this article.
Ian Collins
2017-02-18 07:26:25 UTC
Post by Richard Heathfield
Post by Ian Collins
Post by s***@casperkitty.com
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
More importantly, it might require using some kind of restraint which doesn't
meet all the applicable safety standards that would be applicable to seat
belts, but would nonetheless meet the requirements of the purpose to which
they would be put.
At least if the seat belt, even a compromised one, is fitted the driver
has a choice. If it's not there, no matter how careful the driver, he
is still at risk...
What a terrible analogy!
Well maybe, but it didn't start out as an analogy, more of a reflection
on my continuing amazement at the stupidity of the human race :) You
can lead a horse to water and all that...
Post by Richard Heathfield
A computer programmer has many advantages over a driver. He can drive
the journey in perfect safety a million times, creating all kinds of
artificial hazards for his car to negotiate. Whenever there is a smash
and virtual bodies end up strewn over the landscape, he can chuckle to
himself that he's glad he's not an analogy while he investigates which
particular hazard caused which particular problem. Then the virtual
corpses pick themselves up, dust themselves down, and get willingly back
into the car for another crack at it.
They say the coward dies a thousand deaths, but the test dummy is much
braver, and dies a million.
Only when they can safely negotiate every hazard he can imagine (and
quite a few that his pals Janet and Charlie can imagine because their
imaginations are more fevered than his) will he even consider letting
the car go into production.
Next you'll be saying that assertions are like lifeboats. (And I'm ready
for you if you do!)
Unit tests, young man! Learn all about unit tests! Then you'll find that
you crash a lot more than before, but you will do so in the safety of
the lab, where men are fake men, women are fake women, and small furry
creatures from Alpha Centauri are /fake/ small furry creatures from
Alpha Centauri.
Did you miss: "Mind you I'm still amazed at the resistance I
encounter to writing unit tests."

:)
--
Ian
Richard Heathfield
2017-02-18 09:17:04 UTC
Post by Ian Collins
Post by Richard Heathfield
Post by Ian Collins
Post by s***@casperkitty.com
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
More importantly, it might require using some kind of restraint which doesn't
meet all the applicable safety standards that would be applicable to seat
belts, but would nonetheless meet the requirements of the purpose to which
they would be put.
At least if the seat belt, even a compromised one, is fitted the driver
has a choice. If it's not there, no matter how careful the driver, he
is still at risk...
What a terrible analogy!
Well maybe, but it didn't start out as an analogy, more of a reflection
on my continuing amazement at the stupidity of the human race :) You
can lead a horse to water and all that...
Post by Richard Heathfield
A computer programmer has many advantages over a driver. He can drive
the journey in perfect safety a million times, creating all kinds of
artificial hazards for his car to negotiate. Whenever there is a smash
and virtual bodies end up strewn over the landscape, he can chuckle to
himself that he's glad he's not an analogy while he investigates which
particular hazard caused which particular problem. Then the virtual
corpses pick themselves up, dust themselves down, and get willingly back
into the car for another crack at it.
They say the coward dies a thousand deaths, but the test dummy is much
braver, and dies a million.
Only when they can safely negotiate every hazard he can imagine (and
quite a few that his pals Janet and Charlie can imagine because their
imaginations are more fevered than his) will he even consider letting
the car go into production.
Next you'll be saying that assertions are like lifeboats. (And I'm ready
for you if you do!)
Unit tests, young man! Learn all about unit tests! Then you'll find that
you crash a lot more than before, but you will do so in the safety of
the lab, where men are fake men, women are fake women, and small furry
creatures from Alpha Centauri are /fake/ small furry creatures from
Alpha Centauri.
Did you miss: "Mind you I'm still amazed at the resistance I
encounter to writing unit tests."
I am well aware of your penchant for unit tests. That's *precisely* why
I replied as I did.
--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within
GOTHIER Nathan
2017-02-18 00:18:07 UTC
On Fri, 17 Feb 2017 22:38:30 +0000
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety. (Benjamin FRANKLIN, November 11
1755)
Post by BartC
Otherwise there are plenty of languages that are easier and safer than
either C or C++.
Indeed. Some programmers should give up C and use so-called safe languages like
Basic or LISP.
Ian Collins
2017-02-18 00:23:43 UTC
Post by GOTHIER Nathan
On Fri, 17 Feb 2017 22:38:30 +0000
Post by BartC
People use C for various reasons including extra control, and that might
require not wearing a seat-belt.
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety. (Benjamin FRANKLIN, November 11
1755)
In this instance no liberty is being surrendered.
--
Ian
GOTHIER Nathan
2017-02-18 00:45:57 UTC
On Sat, 18 Feb 2017 13:23:43 +1300
Post by Ian Collins
In this instance no liberty is being surrendered.
I agree nobody is forced to implement annex K in a C11 conforming compiler. I
only observe that the C committee pushed forward a confidential
implementation from Microsoft as an annex to the C standard while pretending to
build the C standard on widespread practices. Similarly, I'm not forced to use
those so-called safe functions, so why did the C committee decide to write
annex K, if not through the lobbying of Microsoft's bad programmers?

You should admit that whatever security features you can provide to
programmers, bad behaviors hurt in programs and in life in general. These
security features look like magic pills to cure fat people without teaching
them how to feed their bodies in a healthy manner. Once more, hiding bugs under
the carpet makes them more difficult to correct.
Ian Collins
2017-02-18 03:18:04 UTC
Post by GOTHIER Nathan
On Sat, 18 Feb 2017 13:23:43 +1300
Post by Ian Collins
In this instance no liberty is being surrendered.
I agree nobody is forced to implement annex K in a C11 conforming compiler.
I wasn't really considering annex K of C11 when it comes to adding
safety features; I was thinking of the bits of C++ that allow one to
write safer code. I see the Microsoft so-called safe functions as being
more of an unattached seat belt!
Post by GOTHIER Nathan
You should admit that whatever security features you can provide to
programmers, bad behaviors hurt in programs and in life in general. These
security features look like magic pills to cure fat people without teaching
them how to feed their bodies in a healthy manner. Once more, hiding bugs under
the carpet makes them more difficult to correct.
If the language stops me mixing apples and oranges or helps me to
automate resource management, I'm all for it.
--
Ian
Richard Heathfield
2017-02-18 07:19:50 UTC
On 18/02/17 03:18, Ian Collins wrote:
<snip>
Post by Ian Collins
If the language stops me mixing apples and oranges
There's nothing so terribly wrong about mixing apples and oranges. It
makes for a rather delicious fruit drink.
Post by Ian Collins
or helps me to
automate resource management, I'm all for it.
You know what they say: Lisp is for people who think resource management
is far too important to be left to the programmer, and C is for people
who think resource management is far too important to be left to the
machine.
--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within
Ian Collins
2017-02-18 07:30:29 UTC
Post by Richard Heathfield
<snip>
Post by Ian Collins
If the language stops me mixing apples and oranges
There's nothing so terribly wrong about mixing apples and oranges. It
makes for a rather delicious fruit drink.
Until you reach into that bag of tasty apples and bite into an orange...
Post by Richard Heathfield
Post by Ian Collins
or helps me to
automate resource management, I'm all for it.
You know what they say: Lisp is for people who think resource management
is far too important to be left to the programmer, and C is for people
who think resource management is far too important to be left to the
machine.
And C++ sits on the fence!
--
Ian
Richard Heathfield
2017-02-18 09:18:21 UTC
Post by Ian Collins
Post by Richard Heathfield
<snip>
Post by Ian Collins
If the language stops me mixing apples and oranges
There's nothing so terribly wrong about mixing apples and oranges. It
makes for a rather delicious fruit drink.
Until you reach into that bag of tasty apples and bite into an orange...
You have to mix them in the right way. Unit test your fruit. If you find
yourself biting into an orange, re-arrange your universe a little and
re-test.
Post by Ian Collins
Post by Richard Heathfield
Post by Ian Collins
or helps me to
automate resource management, I'm all for it.
You know what they say: Lisp is for people who think resource management
is far too important to be left to the programmer, and C is for people
who think resource management is far too important to be left to the
machine.
And C++ sits on the fence!
Er, quite so.
--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within
Malcolm McLean
2017-02-18 15:15:44 UTC
Post by Ian Collins
And C++ sits on the fence!
C++ is actually quite clever in that it has evolved into what is
in effect a garbage-collected language without ever actually having
to mandate GC machinery.
Modern C++ programmers don't use new or delete, or write destructors,
it's all done by std:: objects, like containers and smart pointers.
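A minimal sketch of the style Malcolm describes (the names here are illustrative, not from the thread): ownership lives entirely in std:: objects, so no explicit new or delete appears anywhere.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Widget {
    std::string name;
    explicit Widget(std::string n) : name(std::move(n)) {}
};

// Heap allocation without new/delete: make_unique allocates, and the
// vector of unique_ptrs frees every Widget automatically when the result
// goes out of scope.
std::vector<std::unique_ptr<Widget>> makeWidgets() {
    std::vector<std::unique_ptr<Widget>> w;
    w.push_back(std::make_unique<Widget>("apple"));
    w.push_back(std::make_unique<Widget>("orange"));
    return w;
}
```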
GOTHIER Nathan
2017-02-18 07:43:01 UTC
On Sat, 18 Feb 2017 07:19:50 +0000
Post by Richard Heathfield
You know what they say: Lisp is for people who think resource management
is far too important to be left to the programmer, and C is for people
who think resource management is far too important to be left to the
machine.
This makes me think of how some people prefer to take a bus instead of driving
a car because they are not so confident in their own skills... :o)
Richard Heathfield
2017-02-18 09:22:01 UTC
Post by GOTHIER Nathan
On Sat, 18 Feb 2017 07:19:50 +0000
Post by Richard Heathfield
You know what they say: Lisp is for people who think resource management
is far too important to be left to the programmer, and C is for people
who think resource management is far too important to be left to the
machine.
This makes me think of how some people prefer to take a bus instead of driving
a car because they are not so confident in their own skills... :o)
And in many cases they're right. In fact, they don't have to know how to
drive /at all/ if they take the bus - as long as they don't mind the
journey taking a bit longer than it would by car.

Also, the bus may be cheaper than a car if one doesn't travel very
often, and it means the roads are less cluttered with cars, making the
journey more pleasant for those who do know how to drive cars.

A fine analogy, in fact --- but beware! Most buses don't supply seat-belts.
--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within
s***@casperkitty.com
2017-02-18 17:44:38 UTC
Post by Richard Heathfield
You know what they say: Lisp is for people who think resource management
is far too important to be left to the programmer, and C is for people
who think resource management is far too important to be left to the
machine.
I'd say LISP is for people who prefer to work with nested collections of
values, as values, rather than as a type of resource. Ritchie's language
is for people who want the power to manage memory as a resource which can
be repurposed as the programmer sees fit, even though that means that only
fixed-sized objects can behave as values.
BartC
2017-02-18 11:33:34 UTC
Post by Ian Collins
Post by GOTHIER Nathan
On Sat, 18 Feb 2017 13:23:43 +1300
Post by Ian Collins
In this instance no liberty is being surrendered.
I agree nobody is forced to implement annex K in a C11 conforming compiler.
I wasn't really considering annex K of C11 when it comes to adding
safety features; I was thinking of the bits of C++ that allow one to
write safer code. I see the Microsoft so-called safe functions as being
more of an unattached seat belt!
Post by GOTHIER Nathan
You should admit that whatever security features you can provide to
programmers, bad behaviors hurt in programs and in life in general. These
security features look like magic pills to cure fat people without teaching
them how to feed their bodies in a healthy manner. Once more, hiding bugs under
the carpet makes them more difficult to correct.
If the language stops me mixing apples and oranges or helps me to
automate resource management, I'm all for it.
How does C++ manage that? Can it literally stop you mixing apples and
oranges as easily as you can in Pascal or Ada?
--
Bartc
David Brown
2017-02-18 12:25:13 UTC
Post by BartC
Post by Ian Collins
Post by GOTHIER Nathan
On Sat, 18 Feb 2017 13:23:43 +1300
Post by Ian Collins
In this instance no liberty is being surrendered.
I agree nobody is forced to implement annex K in a C11 conforming compiler.
I wasn't really considering annex K of C11 when it comes to adding
safety features; I was thinking of the bits of C++ that allow one to
write safer code. I see the Microsoft so-called safe functions as being
more of an unattached seat belt!
Post by GOTHIER Nathan
You should admit that whatever security features you can provide to
programmers, bad behaviors hurt in programs and in life in general. These
security features look like magic pills to cure fat people without teaching
them how to feed their bodies in a healthy manner. Once more, hiding bugs under
the carpet makes them more difficult to correct.
If the language stops me mixing apples and oranges or helps me to
automate resource management, I'm all for it.
How does C++ manage that? Can it literally stop you mixing apples and
oranges as easily as you can in Pascal or Ada?
Since we are talking about programming languages, we cannot possibly be
"literally" mixing apples and oranges. But you are not the first person
to misuse the word "literally".

Here is a quick example - real life C++ code would probably use
templates to avoid duplication when creating a variety of fruit, and may
allow more advanced things like Richard's fruit salads. But it shows
the point:

class apples {
int x;
public:
apples(int y = 0) : x(y) {}
apples operator + (apples y) { return x + y.x; }
};

class oranges {
int x;
public:
oranges(int y = 0) : x(y) {}
oranges operator + (oranges y) { return x + y.x; }
};

apples addApples(apples a, apples b) {
return a + b;
}

oranges addOranges(oranges a, oranges b) {
return a + b;
}

apples mix(apples a, oranges b) {
return a + b;
// ERROR!! no match for "operator + "
}
BartC
2017-02-18 13:47:37 UTC
Post by David Brown
Post by BartC
Post by Ian Collins
If the language stops me mixing apples and oranges or helps me to
automate resource management, I'm all for it.
How does C++ manage that? Can it literally stop you mixing apples and
oranges as easily as you can in Pascal or Ada?
Since we are talking about programming languages, we cannot possibly be
"literally" mixing apples and oranges. But you are not the first person
to misuse the word "literally".
And since we /are/ talking about programming languages, it should be
clear that I mean representations of apples and oranges, rather than
using them metaphorically to mean any kind of disparate data.
Post by David Brown
Here is a quick example - real life C++ code would probably use
templates to avoid duplication when creating a variety of fruit, and may
allow more advanced things like Richard's fruit salads. But it shows
class apples {
int x;
apples(int y = 0) : x(y) {}
apples operator + (apples y) { return x + y.x; }
};
class oranges {
int x;
oranges(int y = 0) : x(y) {}
oranges operator + (oranges y) { return x + y.x; }
};
apples addApples(apples a, apples b) {
return a + b;
}
oranges addOranges(oranges a, oranges b) {
return a + b;
}
apples mix(apples a, oranges b) {
return a + b;
// ERROR!! no match for "operator + "
}
And this is rather what I expected. C++ doesn't directly stop you mixing
apples and oranges, but you have to invent and then implement a way of
doing that.

Now the problem domain shifts from your user program, where you are
concerned about type-safety, to implementing a solution to that
type-safety which itself is subject to its own sets of errors. And you
seem to be suggesting yet a further shift into template programming to
get around /that/ (because devising templates is completely foolproof).

So C++ doesn't really provide much of a practical solution, unless
having to DIY half the compiler is a solution. What you want is a
language which takes care of this stuff for you:

type Apples is new Integer;
type Oranges is new Integer;
a,b:Apples;
x,y:Oranges;
....
x := 5; -- 5 oranges
a := x; -- Type Error

So leave this sort of thing to Ada I think.
--
bartc
Jean-Marc Bourguet
2017-02-18 14:23:27 UTC
Post by BartC
And this is rather what I expected. C++ doesn't directly stop you mixing
apples and oranges,
Considering that for most practical purposes, C is a subset of C++ (the
major difference is that void* are not implicitly convertible to other
pointer types), that wasn't unexpected.
Post by BartC
but you have invent and then implement a way of doing that.
C++ tends to give importance to providing building bricks and letting you -- or
library providers -- build the usable stuff.
Post by BartC
So C++ doesn't really provide much of a practical solution, unless having
to DIY half the compiler is a solution. What you want is a language which
type Apples is new Integer;
type Oranges is new Integer;
a,b:Apples;
x,y:Oranges;
....
x := 5; -- 5 oranges
a := x; -- Type Error
So leave this sort of thing to Ada I think.
Well, I don't know how to write an Ada library which would correctly check
units (a time multiplied by an acceleration gives a speed) -- but I admit
that I haven't written any Ada code for 15 years --, I do know how to build
such a C++ library, and there are some available.
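The kind of compile-time unit checking Jean-Marc mentions can be sketched in a few lines. Dimensions are tracked as template parameters (exponents of metres and seconds), so mismatched units fail to compile. All names here are illustrative; real libraries such as Boost.Units are far more complete.

```cpp
// Dimensional quantity: M and S are the exponents of metres and seconds.
template<int M, int S>
struct Quantity {
    double value;
    explicit Quantity(double v) : value(v) {}
};

// Multiplication adds the dimension exponents: s * m/s^2 = m/s
template<int M1, int S1, int M2, int S2>
Quantity<M1 + M2, S1 + S2> operator*(Quantity<M1, S1> a, Quantity<M2, S2> b) {
    return Quantity<M1 + M2, S1 + S2>(a.value * b.value);
}

using Seconds      = Quantity<0, 1>;   // s
using Acceleration = Quantity<1, -2>;  // m/s^2
using Speed        = Quantity<1, -1>;  // m/s

Speed speedAfter(Seconds t, Acceleration a) {
    return t * a;        // type-checks: a time times an acceleration is a speed
    // return t * t;     // would not compile: s^2 is not m/s
}
```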

Considering the sophistication of C++ libraries, the fact that something
like this is not in very common usage says more about the people using C++
than about the language.

Yours,
--
Jean-Marc
BartC
2017-02-18 14:53:21 UTC
Post by Jean-Marc Bourguet
Post by BartC
And this is rather what I expected. C++ doesn't directly stop you mixing
apples and oranges,
Considering that for most practical purposes, C is a subset of C++ (the
major difference is that void* are not implicitly convertible to other
pointer types), that wasn't unexpected.
Post by BartC
but you have invent and then implement a way of doing that.
C++ tends to give importance to providing building bricks and letting you -- or
library providers -- build the usable stuff.
Post by BartC
So C++ doesn't really provide much of a practical solution, unless having
to DIY half the compiler is a solution. What you want is a language which
type Apples is new Integer;
type Oranges is new Integer;
a,b:Apples;
x,y:Oranges;
....
x := 5; -- 5 oranges
a := x; -- Type Error
So leave this sort of thing to Ada I think.
Well, I don't know how to write an Ada library which would correctly check
units (a time multiplied by an acceleration gives a speed) -- but I admit
that I haven't written any Ada code for 15 years --, I do know how to build
such a C++ library, and there are some available.
Have a look at Frink (https://frinklang.org/). I think this is the same
one I came across some years ago, which takes units, dimensions,
prefixes and so on to the extreme.

Once you see what is involved in doing this stuff comprehensively, you
will just feel like giving up. You will find that, after all, a simple
'double' or 'int' will do for most things!
--
Bartc
BartC
2017-02-18 16:56:44 UTC
Post by BartC
So C++ doesn't really provide much of a practical solution, unless
having to DIY half the compiler is a solution. What you want is a
type Apples is new Integer;
type Oranges is new Integer;
a,b:Apples;
x,y:Oranges;
....
x := 5; -- 5 oranges
a := x; -- Type Error
So leave this sort of thing to Ada I think.
BTW I tested with this handy site http://rextester.com/.

That shows Hello World in one of some 40 languages, and lets you modify
the code then compile and run the result within the browser. (No messy
installation that then doesn't work!)

(One interesting aspect is how long other languages take to compile even
Hello World compared with C (the 'vc' version excepted). That's those
that require compilation.

A very rough measure, but there must be reasons why C++, D and so on
take longer.)
--
Bartc
Robert Wessel
2017-02-19 04:50:05 UTC
Post by BartC
Post by David Brown
Post by BartC
Post by Ian Collins
If the language stops me mixing apples and oranges or helps me to
automate resource management, I'm all for it.
How does C++ manage that? Can it literally stop you mixing apples and
oranges as easily as you can in Pascal or Ada?
Since we are talking about programming languages, we cannot possibly be
"literally" mixing apples and oranges. But you are not the first person
to misuse the word "literally".
And since we /are/ talking about programming languages, it should be
clear that I mean representations of apples and oranges, rather than
using them metaphorically to mean any kind of disparate data.
Post by David Brown
Here is a quick example - real life C++ code would probably use
templates to avoid duplication when creating a variety of fruit, and may
allow more advanced things like Richard's fruit salads. But it shows
class apples {
int x;
apples(int y = 0) : x(y) {}
apples operator + (apples y) { return x + y.x; }
};
class oranges {
int x;
oranges(int y = 0) : x(y) {}
oranges operator + (oranges y) { return x + y.x; }
};
apples addApples(apples a, apples b) {
return a + b;
}
oranges addOranges(oranges a, oranges b) {
return a + b;
}
apples mix(apples a, oranges b) {
return a + b;
// ERROR!! no match for "operator + "
}
And this is rather what I expected. C++ doesn't directly stop you mixing
apples and oranges, but you have to invent and then implement a way of
doing that.
Now the problem domain shifts from your user program, where you are
concerned about type-safety, to implementing a solution to that
type-safety which itself is subject to its own sets of errors. And you
seem to be suggesting yet a further shift into template programming to
get around /that/ (because devising templates is completely foolproof).
OTOH, the template only needs to be written once.
GOTHIER Nathan
2017-02-19 08:45:29 UTC
On Sat, 18 Feb 2017 22:50:05 -0600
Post by Robert Wessel
OTOH, the template only needs to be written once.
Like any structure-based object with inheritance.
David Brown
2017-02-19 12:35:59 UTC
Post by BartC
Post by David Brown
Post by Ian Collins
If the language stops me mixing apples and oranges or helps me to
automate resource management, I'm all for it.
Here is a quick example - real life C++ code would probably use
templates to avoid duplication when creating a variety of fruit, and may
allow more advanced things like Richard's fruit salads. But it shows
class apples {
int x;
apples(int y = 0) : x(y) {}
apples operator + (apples y) { return x + y.x; }
};
class oranges {
int x;
oranges(int y = 0) : x(y) {}
oranges operator + (oranges y) { return x + y.x; }
};
apples addApples(apples a, apples b) {
return a + b;
}
oranges addOranges(oranges a, oranges b) {
return a + b;
}
apples mix(apples a, oranges b) {
return a + b;
// ERROR!! no match for "operator + "
}
And this is rather what I expected. C++ doesn't directly stop you mixing
apples and oranges, but you have to invent and then implement a way of
doing that.
C++ lets you define apples and oranges in a way that stops them being
mixed. No language can directly stop you mixing apples and oranges
unless it directly supports apples and oranges.

If you prefer, you can use time units which are part of the standard
library, and don't need to be defined manually. Below, the first
function is typesafe and allowed, the second function produces a
compile-time error.

#include <chrono>

using namespace std;

chrono::seconds addASecond1(chrono::seconds timetaken) {
return timetaken + chrono::seconds(1);
}

chrono::seconds addASecond2(chrono::seconds timetaken) {
return timetaken + 1;
}


Out of the box, C++ cannot add seconds and integers - even though
"seconds" is in many ways just a kind of integer.
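The flip side of the chrono example above: durations of *different* units do mix safely, with the result expressed in the finer unit; only bare integers are rejected. A minimal sketch:

```cpp
#include <chrono>

// Mixing seconds and minutes is fine -- the library converts minutes to
// seconds at compile time. Adding a bare int is a compile-time error.
std::chrono::seconds totalTime() {
    std::chrono::seconds s(90);
    std::chrono::minutes m(2);
    return s + m;        // 90 s + 120 s, yields chrono::seconds
    // return s + 1;     // error: no operator+ for seconds and int
}
```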
Post by BartC
Now the problem domain shifts from your user program, where you are
concerned about type-safety, to implementing a solution to that
type-safety which itself is subject to its own sets of errors. And you
seem to be suggesting yet a further shift into template programming to
get around /that/ (because devising templates is completely foolproof).
Template programming just saves a good deal of typing in cases like
this. That is all.
Post by BartC
So C++ doesn't really provide much of a practical solution, unless
having to DIY half the compiler is a solution. What you want is a
type Apples is new Integer;
type Oranges is new Integer;
a,b:Apples;
x,y:Oranges;
....
x := 5; -- 5 oranges
a := x; -- Type Error
So leave this sort of thing to Ada I think.
It would be nice if C++ made it that simple, but that does not mean that
C++ is not a practical solution. Often you will want to add more
features to your classes - your apples will have a "makeAPie()" method,
while your oranges will have"squeezeOutTheJuice()" and "mixWithVodka()"
methods. By the time you have made a full class out of them, then the
difference is smaller. You also might have very different requirements
and restrictions about your apples, compared to integers - it makes
sense to divide an apple by an integer, but not to divide an integer by
an apple. Again, if you want to express every feature and restriction,
you need to apply the effort in Ada or C++.

However, it would be nice if there were some more convenient helpers and
classes for C++ out of the box. I guess they will come with C++20, once
concepts are part of the standard.
James R. Kuyper
2017-02-18 16:24:36 UTC
...
Post by David Brown
Post by BartC
How does C++ manage that? Can it literally stop you mixing apples and
oranges as easily as you can in Pascal or Ada?
Since we are talking about programming languages, we cannot possibly be
"literally" mixing apples and oranges. But you are not the first person
to misuse the word "literally".
Did he misuse it? He applied it to "stop", not to "apples and oranges".
It seems to me that he's asking you whether C++ can actually stop the
comparison. You demonstrated how C++ can make it difficult to do such
things. However, writing a useful class that wraps a count of the number
of apples in such a way that it cannot be added to the count of oranges
is difficult, mainly because it will almost certainly need a way for the
user to access the actual number, as an unwrapped integer value. Once
it's been unwrapped, they can do anything they want to with that number,
Robert Wessel
2017-02-19 04:48:26 UTC
On Sat, 18 Feb 2017 11:24:36 -0500, "James R. Kuyper"
Post by James R. Kuyper
...
Post by David Brown
Post by BartC
How does C++ manage that? Can it literally stop you mixing apples and
oranges as easily as you can in Pascal or Ada?
Since we are talking about programming languages, we cannot possibly be
"literally" mixing apples and oranges. But you are not the first person
to misuse the word "literally".
Did he misuse it? He applied it to "stop", not to "apples and oranges".
It seems to me that he's asking you whether C++ can actually stop the
comparison. You demonstrated how C++ can make it difficult to do such
things. However, writing a useful class that wraps a count of the number
of apples in such a way that it cannot be added to the count of oranges
is difficult, mainly because it will almost certainly need a way for the
user to access the actual number, as an unwrapped integer value. Once
it's been unwrapped, they can do anything they want to with that number,
OTOH, it's not possible to stop a sufficiently determined programmer
from developing a conversion from a counting type Apple to a counting
type Orange, no matter what the language does. For example, you could
binary search an Apple count, while simultaneously applying the same
adjustments to an Orange count. At the end of the process you've
converted an Apple type to an Orange type.
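Robert's binary-search trick can be sketched as follows. The counting types are hypothetical: Apples deliberately exposes no accessor, only construction from int and comparison, yet the hidden count can still be recovered.

```cpp
// A counting type with no accessor -- only construction and comparison.
class Apples {
    int x;
public:
    explicit Apples(int y) : x(y) {}
    bool operator<(Apples o) const { return x < o.x; }
};

class Oranges {
    int x;
public:
    explicit Oranges(int y) : x(y) {}
    int get() const { return x; }
};

// Recover the hidden Apple count by binary search over candidate values,
// then build an Orange count from it -- a conversion the type system
// never sanctioned.
Oranges convert(Apples a, int maxCount) {
    int lo = 0, hi = maxCount;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (Apples(mid) < a)
            lo = mid + 1;
        else
            hi = mid;
    }
    return Oranges(lo);
}
```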
Malcolm McLean
2017-02-18 18:04:14 UTC
Post by David Brown
Here is a quick example - real life C++ code would probably use
templates to avoid duplication when creating a variety of fruit, and may
allow more advanced things like Richard's fruit salads. But it shows
class apples {
int x;
apples(int y = 0) : x(y) {}
apples operator + (apples y) { return x + y.x; }
};
class oranges {
int x;
oranges(int y = 0) : x(y) {}
oranges operator + (oranges y) { return x + y.x; }
};
apples addApples(apples a, apples b) {
return a + b;
}
oranges addOranges(oranges a, oranges b) {
return a + b;
}
apples mix(apples a, oranges b) {
return a + b;
// ERROR!! no match for "operator + "
}
But apples and oranges are here both simply wrappers for int. Presumably
there's a need to ban operations other than addition, and two programmers
in different parts of the same large program have independently realised
that this is a requirement, and devised their own solutions.
And they don't fit together. Not because it's unsafe to add apples
and oranges, but because that's a distinction the compiler can make.

Bit-based programming does away with this. An "apple" is a 32 bit int,
and the bits mean something, which bit-based programming says you
must identify. The comparison with C++ is a bit unfair as it's only
a scrap example snippet, but the rule hasn't been followed. We
don't know what the (presumed) 32 bits of apples.x actually mean,
we don't know why it is allowed to add but not subtract or multiply
them.
David Brown
2017-02-19 16:12:30 UTC
Post by Malcolm McLean
Post by David Brown
Here is a quick example - real life C++ code would probably use
templates to avoid duplication when creating a variety of fruit, and may
allow more advanced things like Richard's fruit salads. But it shows
class apples {
int x;
apples(int y = 0) : x(y) {}
apples operator + (apples y) { return x + y.x; }
};
class oranges {
int x;
oranges(int y = 0) : x(y) {}
oranges operator + (oranges y) { return x + y.x; }
};
apples addApples(apples a, apples b) {
return a + b;
}
oranges addOranges(oranges a, oranges b) {
return a + b;
}
apples mix(apples a, oranges b) {
return a + b;
// ERROR!! no match for "operator + "
}
But apples and oranges are here both simply wrappers for int. Presumably
there's a need to ban operations other than addition, and two programmers
in different parts of the same large program have independently realised
that this is a requirement, and devised their own solutions.
That is a problem for the management of the project - not a programming
language.
Post by Malcolm McLean
And they don't fit together. Not because it's unsafe to add apples
and oranges, but because that's a distinction the compiler can make.
It is unsafe to add apples and oranges - therefore we tell the compiler
to make the distinction.
Post by Malcolm McLean
Bit-based programming does away with this. An "apple" is a 32 bit int,
and the bits mean something, which bit-based programming says you
must identify.
Your solution to getting compiler help in avoiding logical mistakes such
as mixing apples and oranges, is to reduce everything to a bunch of
bits? So that now we can not only add our apples and oranges without
even a warning, we can also re-interpret the bits as floats, bit masks,
and probably even pointers. Marvellous!
Post by Malcolm McLean
The comparison with C++ is a bit unfair as it's only
a scrap example snippet, but the rule hasn't been followed. We
don't know what the (presumed) 32 bits of apples.x actually mean,
we don't know why it is allowed to add but not subtract or multiply
them.
Clearly a more complete example would allow things like subtracting
apples (perhaps with checks to avoid negative quantities), multiplying
by non-negative integers, and the construction of fruit salads. But
that would be a lot of extra code in the posting, for no benefit.
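A fuller sketch along the lines David suggests: subtraction guarded against negative quantities, multiplication by a non-negative integer only, and still no implicit mixing with plain ints. Illustrative code, not from the original post.

```cpp
#include <stdexcept>

class Apples {
    int x;
public:
    explicit Apples(int y = 0) : x(y) {}
    int count() const { return x; }
    Apples operator+(Apples y) const { return Apples(x + y.x); }
    // Subtraction checks that the result cannot go negative.
    Apples operator-(Apples y) const {
        if (y.x > x)
            throw std::domain_error("cannot have a negative number of apples");
        return Apples(x - y.x);
    }
    // Scaling by a non-negative count is meaningful; dividing an int by an
    // apple is not, so no such operator exists.
    Apples operator*(unsigned n) const { return Apples(x * static_cast<int>(n)); }
    // No operator+(int) and no conversion to int, so apples stay apples.
};
```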
Malcolm McLean
2017-02-19 19:11:00 UTC
Post by David Brown
Your solution to getting compiler help in avoiding logical mistakes such
as mixing apples and oranges, is to reduce everything to a bunch of
bits? So that now we can not only add our apples and oranges without
even a warning, we can also re-interpret the bits as floats, bit masks,
and probably even pointers. Marvellous!
Exactly.
Now you can't have type aliasing. (That's a situation where "apples" and "oranges" are separate identifiers for essentially the same thing, in
this case an integer that only supports addition.)
Type aliasing is a bigger problem than type mismatching. Type mismatching
is just a bug which 99% of the time will come out on the first informal
test run. Type aliasing goes deep into the structure of the program.
Post by David Brown
Post by Malcolm McLean
The comparison with C++ is a bit unfair as it's only
a scrap example snippet, but the rule hasn't been followed. We
don't know what the (presumed) 32 bits of apples.x actually mean,
we don't know why it is allowed to add but not subtract or multiply
them.
Clearly a more complete example would allow things like subtracting
apples (perhaps with checks to avoid negative quantities), multiplying
by non-negative integers, and the construction of fruit salads. But
that would be a lot of extra code in the posting, for no benefit.
That's fair enough. But if x is a count of apples, rather than, say,
a weight in Newtons or a mass in kilograms, the code should make
that clear. The ability to tag it with the identifier apples gives
you the illusion that you've documented it when really you haven't.
David Brown
2017-02-19 20:14:40 UTC
Post by Thiago Adams
Post by David Brown
Your solution to getting compiler help in avoiding logical mistakes
such as mixing apples and oranges, is to reduce everything to a
bunch of bits? So that now we can not only add our apples and
oranges without even a warning, we can also re-interpret the bits
as floats, bit masks, and probably even pointers. Marvellous!
Exactly.
I'm sorry - I thought the sarcasm was so obvious that I would not have
to explain it.
Post by Thiago Adams
Now you can't have type aliasing. (That's a situation where "apples"
and "oranges" are separate identifiers for essentially the same
thing, in this case an integer that only supports addition). Type
aliasing is a bigger problem than type mismatching. Type mismatching
is just a bug which 99% of the time will come out on the first
informal test run. Type aliasing goes deep into the structure of the
program.
I suppose your justification for these claims is going to be found in
the usual places, alongside your statistics proving your rule of three,
rule of two, and so on.
Post by Thiago Adams
Post by David Brown
The comparison with C++ is a bit unfair as it's only a scrap
example snippet, but the rule hasn't been followed. We don't know
what the (presumed) 32 bits of apples.x actually mean, we don't
know why it is allowed to add but not subtract or multiply them.
Clearly a more complete example would allow things like subtracting
apples (perhaps with checks to avoid negative quantities),
multiplying by non-negative integers, and the construction of fruit
salads. But that would be a lot of extra code in the posting, for
no benefit.
That's fair enough. But if x is a count of apples, rather than, say,
a weight in Newtons or a mass in kilograms, the code should make
that clear. The ability to tag it with the identifier apples gives
you the illusion that you've documented it when really you haven't.
<sarcasm>
I see. The ability to tag it with "64-bit integer" gives you all the
documentation you need. Or I suppose it would be tagged with "64-bit
float", because integers are only useful as array indices.
</sarcasm>

The point of using strong typing in situations like this is to make
logical errors in the program into compiler errors, so that they can be
caught as easily and as early as possible. Whether it is worth the
effort or not will depend on the circumstances, such as the complexity
of the code, and the balance between the effort spent writing such
classes, and the effort spent chasing bugs that they would have made
impossible. C++ gives you that choice, with greater convenience and
less cost than C.
GOTHIER Nathan
2017-02-19 20:22:09 UTC
On Sun, 19 Feb 2017 21:14:40 +0100
Post by David Brown
The point of using strong typing in situations like this is to make
logical errors in the program into compiler errors, so that they can be
caught as easily and as early as possible. Whether it is worth the
effort or not will depend on the circumstances, such as the complexity
of the code, and the balance between the effort spent writing such
classes, and the effort spent chasing bugs that they would have made
impossible. C++ gives you that choice, with greater convenience and
less cost than C.
C++ is way overhyped and GTK+ proved that.
Malcolm McLean
2017-02-19 20:33:47 UTC
Post by GOTHIER Nathan
On Sun, 19 Feb 2017 21:14:40 +0100
Post by David Brown
The point of using strong typing in situations like this is to make
logical errors in the program into compiler errors, so that they can be
caught as easily and as early as possible. Whether it is worth the
effort or not will depend on the circumstances, such as the complexity
of the code, and the balance between the effort spent writing such
classes, and the effort spent chasing bugs that they would have made
impossible. C++ gives you that choice, with greater convenience and
less cost than C.
C++ is way overhyped and GTK+ proved that.
To be fair GTK+ is out, Baby X is still in prototype.
Malcolm McLean
2017-02-19 20:31:11 UTC
Post by David Brown
Post by Malcolm McLean
That's fair enough. But if x is a count of apples, rather than, say,
a weight in Newtons or a mass in kilograms, the code should make
that clear. The ability to tag it with the identifier apples gives
you the illusion that you've documented it when really you haven't.
<sarcasm>
I see. The ability to tag it with "64-bit integer" gives you all the
documentation you need. Or I suppose it would be tagged with "64-bit
float", because integers are only useful as array indices.
</sarcasm>
The point of using strong typing in situations like this is to make
logical errors in the program into compiler errors, so that they can be
caught as easily and as early as possible. Whether it is worth the
effort or not will depend on the circumstances, such as the complexity
of the code, and the balance between the effort spent writing such
classes, and the effort spent chasing bugs that they would have made
impossible. C++ gives you that choice, with greater convenience and
less cost than C.
Bits could logically mean anything, and there will be some situations,
such as when we have memory image of an external binary file, where
we've got to have a flexible and complex bit description language.

But mostly we know that the bits will be 64 bit integers, and those
integers will be counts of things in memory or indexes into memory
arrays. If the bits are data, they will be doubles, because most things
we can measure are continuous. We've also got strings and (Ben actually
pointed this one out) graph edges, or pointers.

The strings can't be 64 bits, everything else can be.

And now we have hardly any type problems. Someone can convert a
real to an integer, but the program will crash the instant he tries
to use it as an index, assuming a reasonable level of memory
protection.
We don't need complicated rules for specifying that if you add
apples to oranges you get a fruit salad. We just say "those bits
are the number of apples (64 bit int like all other ints), those
bits are the number of oranges". Now if you add apples to oranges you
get the number of pieces of fruit, which is maybe legitimate,
but the compiler can't hope to be able to model reality in that way.
Gareth Owen
2017-02-19 21:09:39 UTC
Post by David Brown
I suppose your justification for these claims is going to be found in
the usual places, alongside your statistics proving your rule of
three, rule of two, and so on.
Yes, and if you're interested in retrieving them I can refer you to a
qualified proctologist.

Andrey Tarasevich
2017-02-16 22:48:58 UTC
Post by Keith Thompson
[...]
Post by s***@casperkitty.com
The ability to compile a C program as C++ is only useful if the resulting
program will run with known semantics. There are some constructs which
have defined behavior in C but have no convenient direct equivalent in C++.
For example?
[...]
#include <stdio.h>

void foo(void) { printf("foo\n"); }
void bar(int a) { printf("bar %d\n", a); }
void baz(int x, int y) { printf("baz %d %d\n", x, y); }

int main()
{
void (*const a[3])() = { foo, bar, baz };
a[0]();
a[1](42);
a[2](4, 2);
}

Although, if I remember correctly, C is planning to obsolete the
"flexible" '()' parameter list.
--
Best regards,
Andrey Tarasevich
Keith Thompson
2017-02-16 23:46:07 UTC
Post by Thiago Adams
Post by Keith Thompson
[...]
Post by s***@casperkitty.com
The ability to compile a C program as C++ is only useful if the resulting
program will run with known semantics. There are some constructs which
have defined behavior in C but have no convenient direct equivalent in C++.
For example?
[...]
#include <stdio.h>
void foo(void) { printf("foo\n"); }
void bar(int a) { printf("bar %d\n", a); }
void baz(int x, int y) { printf("baz %d %d\n", x, y); };
int main()
{
void (*const a[3])() = { foo, bar, baz };
a[0]();
a[1](42);
a[2](4, 2);
}
Although, if I remember correctly, C is planning to obsolete the
"flexible" '()' parameter list.
Parameter lists using () have been "obsolescent" since C89/C90, and the
C99 rationale explicitly warns that they're likely to be removed from a
future standard. I'm not aware of any plans by the committee to actually do so.

It is possible to do the equivalent without using old-style function
declarations. I won't claim that it's particularly convenient.

#include <stdio.h>

void foo(void) { printf("foo\n"); }
void bar(int a) { printf("bar %d\n", a); }
void baz(int x, int y) { printf("baz %d %d\n", x, y); }

int main(void) {
typedef void (func0)(void);
typedef void (func1)(int);
typedef void (func2)(int, int);
func0 *const a[] = { (func0*)foo, (func0*)bar, (func0*)baz };
((func0*)a[0])();
((func1*)a[1])(42);
((func2*)a[2])(4, 2);
}

Some might prefer to define typedefs for function pointer types rather
than function types. You could also do the whole thing without
typedefs, but IMHO they aid in readability.

I used `void (void)` as a "generic" function type, which happens to
match the type of foo(). That might be error-prone in some cases. An
alternative is to define an otherwise unused struct type and define your
"generic" function type to take or return a value of that type, so that
all uses of the generic function type require a pointer cast.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Thiago Adams
2017-02-15 20:31:03 UTC
On Wednesday, February 15, 2017 at 6:14:22 PM UTC-2, Thiago Adams wrote:
[...]
Post by j***@verizon.net
That, on the other hand, makes it quite clear that you're proposing a new language. C', in which only a small subset of the features of C's preprocessing is supported. C' will, inherently, be incompatible with any existing C code that makes use of any of those features not supported by C', so I assume that you're reducing the importance of compatibility with existing C code?
If the preprocessor was completely disabled and re-created
as new grammar rules, then it would be hard to achieve 100%
compatibility.
If we had 90%, probably the 10% remaining would be weird macros.

If the C language was moving into this direction, weird macros
could generate warnings preparing for the future.

#define X 1 + 2

//warning this macro can be invalid in future versions of C
Thiago Adams
2017-02-16 11:33:18 UTC
[...]
Post by j***@verizon.net
That, on the other hand, makes it quite clear that you're proposing a new language. C', in which only a small subset of the features of C's preprocessing is supported. C' will, inherently, be incompatible with any existing C code that makes use of any of those features not supported by C', so I assume that you're reducing the importance of compatibility with existing C code?
At the end, the idea is to move the preprocessor into the parser, with
options.

The parser will understand #define, #include, etc., and at the point
of expansion of some macro it will expand it normally; after expansion it
will check the rules to see whether that expansion is allowed.

The compiler can have many flags.

One flag is: expand everything normally with the preprocessor.
The parser will not see preprocessor tokens.

The other flags are used when the parser sees the preprocessor tokens
and does the expansions itself.

For each rule we can have flags

* Allow expansion of constant macro
* Allow expansion of ( expression ) macro
* Allow expansion of type-qualifier
* Allow expansion of storage-class-specifier (etc)
* Allow expansion of do statement while(0);
etc..

When we compile a file.c you can pass the flags to the compiler.

file.c
#include <stdio.h>

But when the compiler sees <stdio.h> it can automatically set flags
for include directories (flag: expand this header normally).

The include is like a macro-expansion of that file using some flags
for that expansion.

Sample of grammar for expansion:

MISRA C
"C macros shall only expand to a braced initialiser, a constant,
a string literal, a parenthesised expression, a type qualifier, a
storage class specifier, or a do-while-zero construct."

initializer:
assignment-expression
{ initializer-list }
{ initializer-list , }
macro-identifier that expands to initializer

primary-expression:
identifier
constant
string-literal
( expression )
generic-selection
macro-identifier that expands to ( expression ) or constant or string-literal


type-qualifier:
const
restrict
volatile
_Atomic
macro-identifier that expands to const, restrict, volatile or _Atomic

storage-class-specifier:
typedef
extern
static
_Thread_local
auto
register
macro-identifier that expands to typedef, extern, static, _Thread_local, auto or register


iteration-statement:
while ( expression ) statement
do statement while ( expression ) ;
for ( expressionopt ; expressionopt ; expressionopt ) statement
for ( declaration expressionopt ; expressionopt ) statement
macro-identifier that expands to do statement while ( 0 ) ;



The implementation of this expansion can be done in place with context.
For instance:

#define X(a) do a++; while (0)

int i = 0;
X(i);

The parser will see X at iteration-statement.
Then it expands X(i). The result of the expansion is
"do i++; while (0)".
It can try to build a temporary AST node for this expansion
using this context (i).

Then the parser checks the node to see whether it matches the rules.
It can also report compile errors at the point of expansion.

Some macros are incomplete

#define BEGIN do {
#define END } while(0);

BEGIN

END

Flags:

* Allow incomplete expansion at function scope
* Allow incomplete expansion at struct-union scope
* Allow incomplete expansion at global scope

etc..

The expansion flags could be by region:

#pragma push preprocessor rules 1, 54, 54

#pragma pop preprocessor
j***@verizon.net
2017-02-15 20:47:48 UTC
...
Post by Thiago Adams
Post by j***@verizon.net
That, on the other hand, makes it quite clear that you're proposing a new
language. C', in which only a small subset of the features of C's
preprocessing is supported. C' will, inherently, be incompatible with any
existing C code that makes use of any of those features not supported by
C', so I assume that you're reducing the importance of compatibility with
existing C code?
Yes. Exactly.
I want compatibility with existing code.
---file.c---
#include "file.h"
#define X 1 + 2
-----------------------
Now this #defined is parsed with new grammar rules. (see alternative below)
So, existing code which contains such a #define that uses C features not supported by the new grammar rules for C' will not be compatible with C'. Correct?

So why did you say "Yes, exactly. I want compatibility with existing code." I would have expected you to express that idea by starting out with "Yes, exactly, I'm willing to sacrifice compatibility with existing code under some circumstances.", followed by an explanation of what those circumstances are.
Thiago Adams
2017-02-15 21:06:08 UTC
Post by j***@verizon.net
...
Post by Thiago Adams
Post by j***@verizon.net
That, on the other hand, makes it quite clear that you're proposing a new
language. C', in which only a small subset of the features of C's
preprocessing is supported. C' will, inherently, be incompatible with any
existing C code that makes use of any of those features not supported by
C', so I assume that you're reducing the importance of compatibility with
existing C code?
Yes. Exactly.
I want compatibility with existing code.
---file.c---
#include "file.h"
#define X 1 + 2
-----------------------
Now this #defined is parsed with new grammar rules. (see alternative below)
So, existing code which contains such a #define that uses C features not supported by the new grammar rules for C' will not be compatible with C'. Correct?
So why did you say "Yes, exactly. I want compatibility with existing code." I would have expected you to express that idea by starting out with "Yes, exactly, I'm willing to sacrifice compatibility with existing code under some circumstances.", followed by an explanation of what those circumstances are.
I was trying to say:
"Yes, exactly! This is the problem! But, I want compatibility."

There are two options.
I have explained in the other reply.

These options could be a compiler option, and you could
try to compile in "restricted macro mode" (option 1).

If the library you are including fails to compile, you could
use a less restrictive mode where only expansions outside includes
must follow the rules.

If you still have problems, then the library requires
you to use some weird macro in your code, and you must disable
the restricted mode.


Another option could be:

#pragma classic-preprocessor on

#pragma classic-preprocessor off

(But in this case I would not be able to rebuild the file
inside this pragma)
Thiago Adams
2017-02-15 21:15:38 UTC
Post by Thiago Adams
Post by j***@verizon.net
...
Post by Thiago Adams
Post by j***@verizon.net
That, on the other hand, makes it quite clear that you're proposing a new
language. C', in which only a small subset of the features of C's
preprocessing is supported. C' will, inherently, be incompatible with any
existing C code that makes use of any of those features not supported by
C', so I assume that you're reducing the importance of compatibility with
existing C code?
Yes. Exactly.
I want compatibility with existing code.
---file.c---
#include "file.h"
#define X 1 + 2
-----------------------
Now this #defined is parsed with new grammar rules. (see alternative below)
So, existing code which contains such a #define that uses C features not supported by the new grammar rules for C' will not be compatible with C'. Correct?
So why did you say "Yes, exactly. I want compatibility with existing code." I would have expected you to express that idea by starting out with "Yes, exactly, I'm willing to sacrifice compatibility with existing code under some circumstances.", followed by an explanation of what those circumstances are.
"Yes, exactly! This is the problem! But, I want compatibility."
There are two options.
I have explained in the other reply.
These options could be a compiler option and you could
try to compile in "restrict macro mode". (option 1)
If the library you are including failed to compile you could
use less restrict mode where only expansion outside include
will follow rules.
If you still have problems then the library requires
you use some weird macro in your code. You must disable
the restrict mode.
Looking at some real library, <sqlite3.h>

We can see that most of the #defines are for #if or constants.

The ones that are not don't need to be used by the final
user of the library.

So:

#include <sqlite3.h>


The user can use the macro SQLITE_VERSION [1] that is defined inside
sqlite3.h in a very safe and compatible way.


[1]
#define SQLITE_VERSION "3.7.7.1"
Patrick.Schluter
2017-02-18 11:36:45 UTC
Take a look at the D language, which got rid of the preprocessor completely.

https://dlang.org/spec/spec.html

By doing that, it enabled a slew of techniques and solutions that opened
up the power and awesomeness of CTFE (Compile Time Function Evaluation).

#include was replaced by an import mechanism that allows fast compiles
(the file inclusion system of C and C++ is very heavy).

To replace most #define macros, the following features have been added or
expanded:
- enums: can be defined as any type and can be built programmatically at
compile time.
- mixins: allow composing code from strings at compile time. A bit
like program macros, but at parse time (like what you propose in your post).
- static if: allows code selection at compile time (C++'s constexpr if
came from a proposal made by the D language creators; unfortunately C++'s
version is flawed).
This led to beefing up the template system, which fortunately uses a
much more readable syntax than any of its competitors, making the
meta-programming capabilities of that language unique and very powerful,
and all that in a language very familiar to C programmers.
Patrick.Schluter
2017-02-18 11:41:42 UTC
Post by Patrick.Schluter
Take a look at D language, that got rid of the preprocessor completely.
https://dlang.org/spec/spec.html
By doing that, it enabled a slew of techniques and solutions that opened
the power and awesomeness of CTFE (Compile Time Function Evaluation).
#include was replaced by an import mechanism that allowed fast compiles
(the file inclusion system of C and C++ are very heavy).
To replace most #define macros following features have been added or
- enums: can be defined as any type and can be built programmatically at
compile time.
- mixins: allows to compose code with strings at compile time. A bit
like program Macros but at parse time (like what you propose in your post).
- static if: allows code selection at compile time (C++ constexpr if
came from a proposition made by D language creators, unfortunately C++'s
version is flawed).
This lead to beefing up the template system, which fortunately uses a
much more readable syntax than any of its concurrents, to make the
meta-programing capabilities of that language unique and very powerful
and all that, in a language very familiar to C programmers.
To make things clear, I'm not advocating D per se. I'm only suggesting
to take a look at these features to see what is possible and what the
consequences or requirements are when doing something other compiler
authors have already tried (here Walter Bright).