Discussion:
C Macros Badly Defined?
bartc
2017-05-15 10:46:04 UTC
Take a look at this macro invocation:

#include <stdio.h>

#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2

int main(void) {

#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}

It calls printf with arguments 10,20,30 or 100,200,30, depending on C. But it
splits the macro invocation across conditionals, which causes compile
errors with Pelles C, lccwin, DMC, and MSVC. It compiles with gcc and
(surprisingly) Tiny C. Also (using online compilers) with clang.

The thing is that every so often you come across troublesome macros like
this that only work on some compilers. But why should that be the case?
Are C's preprocessor and macro expansion rules really so poorly defined
that so many compilers get it wrong? (I certainly thought so when I
tried to implement a preprocessor earlier this year.)

Maybe you can get more consistent behaviour by doing multiple passes,
so that all the #ifs are done first for example, then the macro expansion.

But then maybe someone will have a macro expansion that generates
#-directives, or other macro invocations created from parts joined
together with ## or that use #-stringifying, that gcc will somehow
manage to compile as expected! Then that becomes the benchmark for what
is expected to work.

So, does anyone actually know EXACTLY what the capabilities of the C
macro system are? Or do compilers just make them up as they go along?
With gcc in the lead. (I don't intend to make this work in my own
implementation. I believe there should be clearly-defined limits to what
is possible and what is considered reasonable.)

(This is not a made-up example; the following was posted in
comp.lang.python today in "How to install Python package from source on
If you're using 3.6, you'll have to build from source. The package has
a single C extension without external dependencies, so it should be a
straight-forward build if you have Visual Studio 2015+ installed with
the C/C++ compiler for x86. Ideally it should work straight from pip.
But I tried and it failed in 3.6.1 due to the new PySlice_GetIndicesEx
macro. Apparently MSVC doesn't like preprocessor code like this in
#if PY_MAJOR_VERSION >= 3
if (PySlice_GetIndicesEx(item, Py_SIZE(self),
#else
if (PySlice_GetIndicesEx((PySliceObject*)item, Py_SIZE(self),
#endif
&start, &stop, &step, &slicelength) < 0) {
It fails with a C1057 error (unexpected end of file in macro
expansion). The build will succeed if you copy the common line with
`&start` to each case and comment out the original line, such that the
macro invocation isn't split across an #if / #endif. This is an ugly
consequence of making PySlice_GetIndicesEx a macro. I wonder if it
could be written differently to avoid this problem.
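
A minimal sketch of that workaround, applied to the M example from the top of the thread rather than to the Python extension itself. Each branch of the conditional holds a complete invocation, so no directive appears between the macro's parentheses. The buffer-writing form of M and the helper name format_triple are assumptions made here so the result can be checked; they are not from the original posts.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed variant of M: formats into a buffer instead of printing. */
#define M(buf, n, a, b, c) snprintf((buf), (n), "%d %d %d", (a), (b), (c))
#define C 2

/* Hypothetical helper, not from the thread. */
int format_triple(char *out, size_t n) {
#if C >= 3
    return M(out, n, 10, 20, 30);    /* whole invocation inside this branch */
#else
    return M(out, n, 100, 200, 30);  /* whole invocation inside this branch */
#endif
}
```

With C defined as 2, format_triple fills the buffer with "100 200 30". The cost is the duplicated common argument, which is the ugliness the quoted message complains about.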
--
bartc
Ben Bacarisse
2017-05-15 11:32:30 UTC
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
It either calls printf with arguments 10,20,30 or 100,200,30. But it
splits the macro invocation with conditionals, that causes compile
errors with pelles c, lccwin, dmc, and MSVC. It compiles with gcc and
(surprisingly) tiny C. Also (using online compilers) with clang.
Why is it surprising that tcc compiles it?
Post by bartc
The thing is that every so often you come across troublesome macros
like this that only work on some compilers. But why should that be the
case?
Compilers have bugs.
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. Do you think the
wording of the standard needs to be improved and, if so, how?

<snip>
Post by bartc
But then maybe someone will have a macro expansion that generates
#-directives
Macros can't (validly) expand to directives.

<snip>
--
Ben.
Tim Rentsch
2017-05-15 13:48:26 UTC
Post by Ben Bacarisse
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
It either calls printf with arguments 10,20,30 or 100,200,30. But it
splits the macro invocation with conditionals, that causes compile
errors with pelles c, lccwin, dmc, and MSVC. It compiles with gcc and
(surprisingly) tiny C. Also (using online compilers) with clang.
Why is it surprising that tcc compiles it?
Post by bartc
The thing is that every so often you come across troublesome macros
like this that only work on some compilers. But why should that be the
case?
Compilers have bugs.
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. [...]
Does that mean you think the behavior is defined rather than
undefined? To me it looks like undefined behavior.
Ben Bacarisse
2017-05-15 14:18:09 UTC
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
<snip>
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. [...]
Does that mean you think the behavior is defined rather than
undefined? To me it looks like undefined behavior.
I thought it was defined. Is it not?
--
Ben.
Tim Rentsch
2017-05-15 18:22:23 UTC
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
<snip>
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. [...]
Does that mean you think the behavior is defined rather than
undefined? To me it looks like undefined behavior.
I thought it was defined. Is it not?
Section 6.10.3 paragraph 11 (discussing arguments for function-like
macro calls) says this:

[...] If there are sequences of preprocessing tokens within the
list of arguments that would otherwise act as preprocessing
directives, the behavior is undefined.

I believe this sentence applies in the code shown above, which
therefore would mean undefined behavior.
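
One way to stay clear of that sentence, sketched under the assumption that only the argument values differ between the branches: let the conditional define the arguments, then write the invocation once with no directive inside its parentheses. FIRST, SECOND, the buffer-writing M, and format_selected are names invented for this sketch.

```c
#include <stddef.h>
#include <stdio.h>

#define C 2

/* The conditional only chooses the values; it never interrupts a call. */
#if C >= 3
#define FIRST  10
#define SECOND 20
#else
#define FIRST  100
#define SECOND 200
#endif

/* Assumed variant of M that writes into a buffer so it can be checked. */
#define M(buf, n, a, b, c) snprintf((buf), (n), "%d %d %d", (a), (b), (c))

/* Hypothetical helper: a single, unbroken invocation of M. */
int format_selected(char *out, size_t n) {
    return M(out, n, FIRST, SECOND, 30);
}
```

FIRST and SECOND are each one macro argument, so the argument count seen by M does not depend on which branch was taken.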
Ben Bacarisse
2017-05-15 19:40:44 UTC
Post by Tim Rentsch
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
<snip>
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. [...]
Does that mean you think the behavior is defined rather than
undefined? To me it looks like undefined behavior.
I thought it was defined. Is it not?
Section 6.10.3 paragraph 11 (discussing arguments for function-like
[...] If there are sequences of preprocessing tokens within the
list of arguments that would otherwise act as preprocessing
directives, the behavior is undefined.
I believe this sentence applies in the code shown above, which
therefore would mean undefined behavior.
I'm not sure. It definitely applies to

M(
#if C>=3
10,20,
#else
100,200,
#endif
30)

but in the example above I am not sure the #else or the #endif are there
when the arguments are being collected. The wording in the standard
talks about the "group" (shown in the syntax as between the
#else and the #endif) being "processed". I took that to mean that the
bounding directives are no longer there when expansion happens, but I can
see (now) that that's not the only way to take it.
--
Ben.
Tim Rentsch
2017-05-16 08:58:38 UTC
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
<snip>
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. [...]
Does that mean you think the behavior is defined rather than
undefined? To me it looks like undefined behavior.
I thought it was defined. Is it not?
Section 6.10.3 paragraph 11 (discussing arguments for function-like
[...] If there are sequences of preprocessing tokens within the
list of arguments that would otherwise act as preprocessing
directives, the behavior is undefined.
I believe this sentence applies in the code shown above, which
therefore would mean undefined behavior.
I'm not sure. It definitely applies to
M(
#if C>=3
10,20,
#else
100,200,
#endif
30)
but in the example above I am not sure the #else or the #endif are there
when the arguments are being collected. The wording in the standard
talks about the "group" (shown in the syntax as between the
#else and the #endif) being "processed". I took that to mean that the
bounding directives are no longer there when expansion happens, but I can
see (now) that that's not the only way to take it.
I see. The idea that the subsequent directives might have
disappeared (ie, by the time the macro arguments are collected)
had not occurred to me. I don't have an airtight argument that
it's wrong but it does seem highly unlikely (at least it does to
me) that this is what was intended. Do you have a different
take on that?

FWIW, gcc -pedantic gives a diagnostic on the original program
(and gcc -pedantic-errors causes the translation to fail).
Ben Bacarisse
2017-05-16 12:49:58 UTC
Post by Tim Rentsch
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
<snip>
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. [...]
Does that mean you think the behavior is defined rather than
undefined? To me it looks like undefined behavior.
I thought it was defined. Is it not?
Section 6.10.3 paragraph 11 (discussing arguments for function-like
[...] If there are sequences of preprocessing tokens within the
list of arguments that would otherwise act as preprocessing
directives, the behavior is undefined.
I believe this sentence applies in the code shown above, which
therefore would mean undefined behavior.
I'm not sure. It definitely applies to
M(
#if C>=3
10,20,
#else
100,200,
#endif
30)
but in the example above I am not sure the #else or the #endif are there
when the arguments are being collected. The wording in the standard
talks about the "group" (shown in the syntax as between the
#else and the #endif) being "processed". I took that to mean that the
bounding directives are no longer there when expansion happens, but I can
see (now) that that's not the only way to take it.
I see. The idea that the subsequent directives might have
disappeared (ie, by the time the macro arguments are collected)
had not occurred to me. I don't have an airtight argument that
it's wrong but it does seem highly unlikely (at least it does to
me) that this is what was intended. Do you have a different
take on that?
Nothing more than what I said: that the wording seems to suggest that
only the group is "processed", but that verb is not entirely clear. The
description seems to imply two phases -- a sort of parsing of the whole
conditional inclusion and then processing the selected group. It's not
even really an argument so much as an impression formed on first reading.

<snip>
--
Ben.
Tim Rentsch
2017-05-20 04:47:08 UTC
Post by Tim Rentsch
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
<snip>
Post by Tim Rentsch
Post by Ben Bacarisse
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. [...]
Does that mean you think the behavior is defined rather than
undefined? To me it looks like undefined behavior.
I thought it was defined. Is it not?
Section 6.10.3 paragraph 11 (discussing arguments for function-like
[...] If there are sequences of preprocessing tokens within the
list of arguments that would otherwise act as preprocessing
directives, the behavior is undefined.
I believe this sentence applies in the code shown above, which
therefore would mean undefined behavior.
I'm not sure. It definitely applies to
M(
#if C>=3
10,20,
#else
100,200,
#endif
30)
but in the example above I am not sure the #else or the #endif are there
when the arguments are being collected. The wording in the standard
talks about the "group" (shown in the syntax as between the
#else and the #endif) being "processed". I took that to mean that the
bounding directives are no longer there when expansion happens, but I can
see (now) that that's not the only way to take it.
I see. The idea that the subsequent directives might have
disappeared (ie, by the time the macro arguments are collected)
had not occurred to me. I don't have an airtight argument that
it's wrong but it does seem highly unlikely (at least it does to
me) that this is what was intended. Do you have a different
take on that?
Nothing more than what I said: that the wording seems to suggest that
only the group is "processed", but that verb is not entirely clear. The
description seems to imply two phases -- a sort of parsing of the whole
conditional inclusion and then processing the selected group. It's not
even really an argument so much as an impression formed on first reading.
First let me explicitly acknowledge the last sentence there, and
say I don't consider the current discussion to be an argument.

To satisfy my own curiosity though, and me being the person I am,
I wondered if the question could be answered more resolutely, and
went back to dig into the Standard again. I believe the passage
below (section 5.1.1.2, p1, subparagraph 4) does that for us:

4. Preprocessing directives are executed, macro invocations
are expanded, and _Pragma unary operator expressions are
executed. If a character sequence that matches the syntax
of a universal character name is produced by token
concatenation (6.10.3.3), the behavior is undefined. A
#include preprocessing directive causes the named header
or source file to be processed from phase 1 through phase
4, recursively. All preprocessing directives are then
deleted.

Note the last sentence. Preprocessing directives are not deleted
until macro calls have finished being expanded. Thus collecting
arguments will have encountered a preprocessing directive (if
nothing else then the #endif, since #endif by itself is not a
group), which gives an answer that looks (to me) pretty airtight.

And now I'm sure you will be glad to return to more interesting
topics. :)

bartc
2017-05-15 13:52:45 UTC
Post by Ben Bacarisse
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
It either calls printf with arguments 10,20,30 or 100,200,30. But it
splits the macro invocation with conditionals, that causes compile
errors with pelles c, lccwin, dmc, and MSVC. It compiles with gcc and
(surprisingly) tiny C. Also (using online compilers) with clang.
Why is it surprising that tcc compiles it?
Because the multi-pass approach that would be the only way I could think
of to get over some of these problems, is not compatible with very fast
compilation. Also tcc has bugs with other examples (so is not just
lifting the same preprocessor code from gcc).
Post by Ben Bacarisse
Post by bartc
The thing is that every so often you come across troublesome macros
like this that only work on some compilers. But why should that be the
case?
Compilers have bugs.
Yes, you might expect variability with compilers built by small teams
(or one-man efforts) such as Pelles C, lccwin, DMC. But then there is also
MSVC (apparently even the version with VS2015), which I doubt was
created with a small team.
Post by Ben Bacarisse
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. Do you think the
wording of the standard needs to be improved and, if so, how?
I would prefer that the possibilities were purposely kept simple. As it
is now, a macro call such as:

M(A,B)

where A and B are arbitrary expressions, could conceivably be split up
like this:

M .. ( .. A .. , .. B .. )

where I use .. to indicate points of discontinuity (and .. can occur
within A and B of course). ".." can include:

* Any combination of spaces, tabs and newlines
* Any // or /*..*/ comments
* The end of one include file and/or the start of another
* The end of #if branch and/or the start of another
* Any combination of the above.

Since M, with no args, and M(...) are treated differently, you need to
check whether "(" follows M. But that is not easy when you need to
consider all the above that can occur between M and (.

That's not all, as 'M' and elements of A and B could be synthesised from
other macro expansions. Although not, oddly, "(", which must be an
actual parenthesis.
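
The check for "(" after the macro name can be shown in a few lines of C. A function-like macro name is only treated as an invocation when the next preprocessing token is "("; white space, comments, and newlines in between are skipped, and any other token leaves the name alone. The names scale, via_macro, via_function, and with_gap are hypothetical, chosen only for this sketch.

```c
/* Function-like macro: doubles its argument. */
#define scale(x) ((x) * 2)

/* Not an invocation: the token after `scale` is ')', so the name is left
   alone and this defines an ordinary function that triples its argument. */
int (scale)(int x) { return x * 3; }

/* `scale` followed directly by '(' is a macro invocation. */
int via_macro(void) { return scale(5); }

/* Here `scale` is followed by ')', so the function is called instead. */
int via_function(void) { return (scale)(5); }

/* A comment and a newline between the name and '(' do not prevent
   the invocation. */
int with_gap(void) {
    return scale /* comment, then a newline */
        (5);
}
```

Here via_macro and with_gap both return 10, while via_function returns 15. The gaps in with_gap are the harmless kind; the include-file and #if cases in the list above are where compilers start to disagree.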

(Doing a quick test, splitting M(A,B) across include files doesn't work
with gcc. Why not? Where does it say in the Standard that this is not
possible? Or is it the case that if gcc can't manage it, no other
compiler needs to bother?)
Post by Ben Bacarisse
<snip>
Post by bartc
But then maybe someone will have a macro expansion that generates
#-directives
Macros can't (validly) expand to directives.
I think #pragmas can be generated. However, what's to stop gcc from
allowing directives to be generated? Then others will have to follow.

Here's a test of some old macro examples, and the results with various
compilers. They're all different! This is rather scary actually; what
other differences could there be in real code, which don't manifest
themselves so obviously?

People like you keep saying that the macro system is perfectly defined,
yet the experts writing the actual compilers seem to have a bit of
trouble! (Perhaps you would like to help them out...)

This is the C file being tested (the numeric labels are included but
omitted from outputs):

//---------------------------------
#define a(x) mac_a(x)
#define b
#define d(x) (x)
#define e(x, y, z) x y ## z
#define f a d d (x)
#define g a b (c)
#define h(x, y) x y
#define hash #
#define i j j
#define j i

#define F(a) a*G
#define G(a) F(a)
#define A(x,y) if(x==y)
#define B A(

1: a b (c)
2: a d (c)
3: d(a b (c))
4: d(a d (c))
5: f
6: e(a,,a)
7: e(a, c d, e)
8: h(hash, zz)
9: i
10: d(i)
11: F(2)(9)
12: B 10,20);
//---------------------------------

-E output of various compilers:

     gcc           Pelles C      lccwin64      tcc           msvc008

1:   a (c)         a (c)         a b (c)       a (c)         a (c)
2:   a (c)         a(c)          a d (c)       a(c)          a(c)
3:   (mac_a(c))    (mac_a(c))    (a b (c))     (mac_a(c))    (mac_a(c))
4:   (mac_a(c))    (mac_a(c))    (a d (c))     (mac_a(c))    (mac_a(c))
5:   a d (x)       a d(x)        a d(x)        a d(x)        a (x)
6:   a a           a a           aa            a a           a a
7:   a c de        a c de        a c de        a c de        a c de
8:   # zz          # zz          # zz          # zz          # zz
9:   i i           i i           i i           i i           i i
10:  (i i)         (i i)         (i i)         (i i)         (i i)
11:  2*9*G         2 * F(9)      2 *F(9)       2*9*G         2*9*G
12:  if(10==20);   A( 10,20);    A( 10,20);    error         if(==)if(==) 10,20;

mcc (my compiler)

1: a ( c )
2: a ( c )
3: ( mac_a ( c ) )
4: ( mac_a ( c ) )
5: mac_a ( x )
6: a a
7: a c de
8: # zz
9: i i
10: ( i i )
11: 2 * 9 * G
12: error

Other errors and warnings that some compilers gave are not shown.

DMC couldn't be easily tested as its -e option doesn't work properly.
But there were a few errors shown.
--
bartc
Thiago Adams
2017-05-15 14:11:51 UTC
Post by bartc
Post by Ben Bacarisse
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
It either calls printf with arguments 10,20,30 or 100,200,30. But it
splits the macro invocation with conditionals, that causes compile
errors with pelles c, lccwin, dmc, and MSVC. It compiles with gcc and
(surprisingly) tiny C. Also (using online compilers) with clang.
Why is it surprising that tcc compiles it?
Because the multi-pass approach that would be the only way I could think
of to get over some of these problems, is not compatible with very fast
compilation. Also tcc has bugs with other examples (so is not just
lifting the same preprocessor code from gcc).
I do it separately from the "macro expansion core", in one pass.
The "macro expansion core" does multiple passes. (I am using that algorithm I put here)
When the scanner reads M it checks whether it is a macro. Then it collects the macro call arguments, reading only tokens that are included.

The result of the macro call is M(10,20,30). This "M(10,20,30)" is sent to
the "macro expansion core".
The result of the expansion is pushed (similar to #include, but using a string instead of a file).
bartc
2017-05-15 14:29:32 UTC
Post by Thiago Adams
Post by bartc
Post by Ben Bacarisse
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
It either calls printf with arguments 10,20,30 or 100,200,30. But it
splits the macro invocation with conditionals, that causes compile
errors with pelles c, lccwin, dmc, and MSVC. It compiles with gcc and
(surprisingly) tiny C. Also (using online compilers) with clang.
Why is it surprising that tcc compiles it?
Because the multi-pass approach that would be the only way I could think
of to get over some of these problems, is not compatible with very fast
compilation. Also tcc has bugs with other examples (so is not just
lifting the same preprocessor code from gcc).
I do it separately from the "macro expansion core", in one pass.
The "macro expansion core" does multiple passes. (I am using that algorithm I put here)
When the scanner reads M it checks whether it is a macro. Then it collects the macro call arguments, reading only tokens that are included.
The result of the macro call is M(10,20,30). This "M(10,20,30)" is sent to
the "macro expansion core".
The result of the expansion is pushed (similar to #include, but using a string instead of a file).
I can compile the above with a one-line change. (But I can't make it
permanent as it would probably fail with everything else!)

I use three levels of tokenising function. Level 1 is the lowest, only
level 2 looks at #-directives. The code that assembles macro arguments
calls level 1. If I make it call level 2, that works, but I already know
it will screw up if I keep it like that. Maybe if I duplicate the
detection of #-directives within that argument handling code...

(I promised myself I wouldn't touch the preprocessing module unless it
was essential. Horrible language to have to implement which as far as I
know only works by chance. Fortunately I don't have to compile that
Python-related code where this macro crops up.)
--
bartc
Ben Bacarisse
2017-05-15 15:41:26 UTC
Post by bartc
Post by Ben Bacarisse
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
It either calls printf with arguments 10,20,30 or 100,200,30. But it
splits the macro invocation with conditionals, that causes compile
errors with pelles c, lccwin, dmc, and MSVC. It compiles with gcc and
(surprisingly) tiny C. Also (using online compilers) with clang.
Why is it surprising that tcc compiles it?
Because the multi-pass approach that would be the only way I could
think of to get over some of these problems, is not compatible with
very fast compilation. Also tcc has bugs with other examples (so is
not just lifting the same preprocessor code from gcc).
OK. I don't see any obvious need for anything multi-pass in this case.

<snip>
Post by bartc
Post by Ben Bacarisse
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. Do you think the
wording of the standard needs to be improved and, if so, how?
I would prefer that the possibilities were purposely kept simple.
Ah, that's not what I meant. I was not asking you to invent a macro
language you'd prefer, I was asking if you can suggest a way in which
the wording of (or, for that matter, any other changes to) the standard
would make the current intended semantics clearer.

This is a good case to consider since there is clearly some
disagreement. (There's a high probability that I'm wrong about it being
defined but it's certain that it's not as clear as I thought it was.)

<snip>
Post by bartc
Post by Ben Bacarisse
Post by bartc
But then maybe someone will have a macro expansion that generates
#-directives
Macros can't (validly) expand to directives.
I think #pragmas can be generated.
In some cases, the pp-tokens that follow 'pragma' may be expanded and in
some cases that expansion is forbidden, but that's not really got much
to do with the question of macros expanding to directives.
Post by bartc
However, what's to stop gcc from allowing directives to be generated?
Then others will have to follow.
Has every compiler implemented every gcc extension? I don't think so.
Post by bartc
Here's a test of some old macro examples, and the results with various
compilers. They're all different! This is rather scary actually; what
other differences could there be in real code, which don't manifest
themselves so obviously?
People like you keep saying that the macro system is perfectly
defined,
No, I don't. In fact I remember saying that macro expansion is
notoriously hard to specify, didn't I?

It's true that I don't think C's macro language is quite as hard as you
like to make out, but then that's a general remark about you and C -- it
all seems way more complicated to you than it does to me. You also keep
taking that to mean I think it's all simple. I don't.
Post by bartc
yet the experts writing the actual compilers seem to have a
bit of trouble! (Perhaps you would like to help them out...)
I have, in fact, supplied a patch to tcc and submitted bug reports to
lccwin32 but that's by the by because I am just as likely to make
mistakes as anyone else.
Post by bartc
This is the C file being tested (the numeric labels are included but
//---------------------------------
#define a(x) mac_a(x)
#define b
#define d(x) (x)
#define e(x, y, z) x y ## z
#define f a d d (x)
#define g a b (c)
#define h(x, y) x y
#define hash #
#define i j j
#define j i
#define F(a) a*G
#define G(a) F(a)
#define A(x,y) if(x==y)
#define B A(
1: a b (c)
2: a d (c)
3: d(a b (c))
4: d(a d (c))
5: f
6: e(a,,a)
7: e(a, c d, e)
8: h(hash, zz)
9: i
10: d(i)
11: F(2)(9)
12: B 10,20);
//---------------------------------
Please give the flags you pass. It may not make any difference (I don't
know all of these compilers) but some may default to peculiar
non-conforming modes or to old standards. (I think I've said this
before. Showing what gcc does in its default mode is almost pointless
when trying to work out what should happen in standard C.)
Post by bartc
     gcc           Pelles C      lccwin64      tcc           msvc008
1:   a (c)         a (c)         a b (c)       a (c)         a (c)
2:   a (c)         a(c)          a d (c)       a(c)          a(c)
3:   (mac_a(c))    (mac_a(c))    (a b (c))     (mac_a(c))    (mac_a(c))
4:   (mac_a(c))    (mac_a(c))    (a d (c))     (mac_a(c))    (mac_a(c))
5:   a d (x)       a d(x)        a d(x)        a d(x)        a (x)
6:   a a           a a           aa            a a           a a
7:   a c de        a c de        a c de        a c de        a c de
8:   # zz          # zz          # zz          # zz          # zz
9:   i i           i i           i i           i i           i i
10:  (i i)         (i i)         (i i)         (i i)         (i i)
11:  2*9*G         2 * F(9)      2 *F(9)       2*9*G         2*9*G
12:  if(10==20);   A( 10,20);    A( 10,20);    error         if(==)if(==) 10,20;
It would be much more helpful if you said which ones are contentious or
badly defined by the standard. And why show 7, 8, 9 and 10? They all
seem to give the same output. Is there any thing to debate about those
cases?

Here is the table edited to show the interesting things:

     gcc           Pelles C      lccwin64      tcc           msvc008

1:                               a b (c)
2:                               a d (c)
3:                               (a b (c))
4:                               (a d (c))
5:                                                           a (x)
6:                               aa
12:                A( 10,20);    A( 10,20);    error         if(==)if(==) 10,20;

You've found some bugs in lccwin64, one in each of Pelles C and tcc and
two in an old version of MSVC. (In line 11 both outputs are
acceptable). Are there any contentious results here? I.e. do you think
there is anything other than some bugs?
Post by bartc
mcc (my compiler)
5: mac_a ( x )
12: error
(Again I've left only what appears to be significant). Do you think
these are bugs or is there some debate to be had about what 5 and 12
should be?

<snip>
--
Ben.
s***@casperkitty.com
2017-05-15 15:53:16 UTC
Post by Ben Bacarisse
Ah, that's not what I meant. I was not asking you to invent a macro
language you'd prefer, I was asking if you can suggest a way in which
the wording of (or, for that matter, any other changes to) the standard
would make the current intended semantics clearer.
IMHO, many things could be made much clearer if the authors of the Standard
were to explicitly recognize places which different implementations would
be allowed to interpret differently at their leisure--not saying that the
behavior was Undefined, but rather allowing implementations to choose in
Unspecified fashion from among a few discrete possibilities, one of which
may be refusal to process ambiguous code.

It's useful to have a category of programs which will be processed
identically by all implementations, but requiring that implementations jump
through hoops to ensure uniform handling of corner-case constructs which
would only arise in compiler-test scenarios doesn't seem very helpful.
bartc
2017-05-15 17:31:07 UTC
Post by Ben Bacarisse
Post by bartc
I would prefer that the possibilities were purposely kept simple.
Ah, that's not what I meant. I was not asking you to invent a macro
language you'd prefer,
I don't need to invent, it already exists. But I'm saying it is
ill-defined and badly constrained.
Post by Ben Bacarisse
Post by bartc
However, what's to stop gcc from allowing directives to be generated?
Then others will have to follow.
Has every compiler implemented every gcc extension? I don't think so.
There will be pressure to do so if it wants to compile code developed
with gcc and using those extensions.
Post by Ben Bacarisse
Post by bartc
12: if(10==20); A( 10,20); A( 10,20); error if(==)if(==) 10,20;
(I've now managed to test DMC. Results are the same as gcc except for
#12 which makes it crash.)
Post by Ben Bacarisse
It would be much more helpful if you said which ones are contentious or
badly defined by the standard. And why show 7, 8, 9 and 10? They all
seem to give the same output. Is there any thing to debate about those
cases?
You're assuming I know which ones are correct! But in the absence of
that knowledge, I'm taking gcc to give definitive versions.

Most examples were first posted by anti-spam, and I think all were
intended to be tricky, especially testing the first version of a
preprocessor (that is, before you have to rewrite it then hack it around
before most of the above work).
Post by Ben Bacarisse
gcc Pelles C lccwin64 tcc msvc008
1: a b (c)
2: a d (c)
3: (a b (c))
4: (a d (c))
5: a (x)
6: aa
12: A( 10,20); A( 10,20); error if(==)if(==) 10,20;
You've found some bugs in lccwin64, one in each of Pelles C and tcc and
two in an old version of MSVC. (In line 11 both outputs are
acceptable). Are there any contentious results here? I.e. do you think
there is anything other than some bugs?
Yes, that a preprocessor is difficult to get right. And I think it's
because it's poorly specified. (I expect most PPs are either based on an
existing, working one, or gradually evolve when it's found they won't
compile some existing program that makes creative use of the PP.)

If my examples are taken further and stringified as in the following:

#define str2(x) #x
#define str(x) str2(x)

puts(str(a b (c)));

then I get yet another assorted bunch of results. More bugs presumably.

I think that whoever dreamt up the preprocessor should also have
provided a reference implementation. Then at least it'll be easier to
test. But it's still going to be a matter of trial and error.
Post by Ben Bacarisse
Post by bartc
mcc (my compiler)
5: mac_a ( x )
12: error
(Again I've left only what appears to be significant). Do you think
these are bugs or is there some debate to be had about what 5 and 12
should be?
The majority verdict on #5 is that it should be 'a d (x)'. So my code
should really be changed (I thought I'd fixed 1-11 actually).

#12 I'm not too worried about, as it's a made-up example and most
compilers seem to go wrong. Yet, I don't really know whether it should
have worked or not. If you're allowed to have #-directives in the middle
of a macro-call, then why shouldn't #12 work too? After all gcc managed
to generate what might have been expected.
--
bartc
Keith Thompson
2017-05-15 18:56:21 UTC
bartc <***@freeuk.com> writes:
[...]
Post by bartc
You're assuming I know which ones are correct! But in the absence of
that knowledge, I'm taking gcc to give definitive versions.
[...]

I don't know why you would make that assumption.

In particular, in cases where the behavior is undefined, the particular
behavior shown by gcc doesn't tell you anything.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Ben Bacarisse
2017-05-15 18:57:50 UTC
Post by bartc
Post by Ben Bacarisse
Post by bartc
I would prefer that the possibilities were purposely kept simple.
Ah, that's not what I meant. I was not asking you to invent a macro
language you'd prefer,
I don't need to invent, it already exists. But I'm saying it is
ill-defined and badly constrained.
You've said that many times already. I was inviting you to be
productive and to say what parts of the standard could be made clearer
and/or more explicit. Even if no changes are ever made to it, it would
make for a useful thread.

<snip>
Post by bartc
Post by Ben Bacarisse
It would be much more helpful if you said which ones are contentious or
badly defined by the standard. And why show 7, 8, 9 and 10? They all
seem to give the same output. Is there any thing to debate about those
cases?
You're assuming I know which ones are correct!
Not at all. I am asking you to say which ones are contentious and why. If
you don't know the correct output for any of these, say so -- that they are
all up for debate and we could have a productive discussion about the
rules and what they mean.
Post by bartc
But in the absence of that knowledge, I'm taking gcc to give
definitive versions.
That's a reasonable starting point, but it breaks down for anything
undefined, implementation specific or where the implementation is given
a free hand (as in number 11 for example).

Anyway, I see you have not posted the flags being used in your tests so
I'm not sure there's any value in this discussion -- at least not in the
sense of clarifying what the standard intends.

<snip>
Post by bartc
Post by Ben Bacarisse
gcc Pelles C lccwin64 tcc msvc008
1: a b (c)
2: a d (c)
3: (a b (c))
4: (a d (c))
5: a (x)
6: aa
12: A( 10,20); A( 10,20); error if(==)if(==) 10,20;
You've found some bugs in lccwin64, one in each of Pelles C and tcc and
two in an old version of MSVC. (In line 11 both outputs are
acceptable). Are there any contentious results here? I.e. do you think
there is anything other than some bugs?
Yes, that a preprocessor is difficult to get right.
I think you mean "no, my only point is that this is hard". You
obviously don't want to discuss the actual rules.

<snip>
Post by bartc
#define str2(x) #x
#define str(x) str2(x)
puts(str(a b (c)));
then I get yet another assorted bunch of results. More bugs
presumably.
I doubt there will be any fewer; that's almost certain. But with no idea
what flags you are using, none of these may be bugs.
Post by bartc
I think that whoever dreamt up the preprocessor should also have
provided a reference implementation.
Yes, that is one way to specify these things. The reference
implementation is probably better off being very abstract. Haskell
anyone?
Post by bartc
Post by Ben Bacarisse
Post by bartc
mcc (my compiler)
5: mac_a ( x )
12: error
(Again I've left only what appears to be significant). Do you think
these are bugs or is there some debate to be had about what 5 and 12
should be?
The majority verdict on #5 is that it should be 'a d (x)'. So my code
should really be changed (I thought I'd fixed 1-11 actually).
#5 is just "f" with these defines:

#define a(x) mac_a(x)
#define d(x) (x)
#define f a d d (x)

So, as far as I can see, this is what happens. I'll write <x> for the
token x and <_> for the space token (I don't think newline tokens
feature here).

<f> is replaced by <a><_><d><_><d><_><(><x><)>. This replacement is
scanned for macros to expand along with any tokens that follow <f> (but
let's say there are none).

<a> not followed by <_>*<(> has no expansion so I believe we can now
output the first tokens (i.e. no repeated scanning happens)

result: <a><_>

<d> (again not followed by <_>*<(>) expands to nothing:

result: <a><_><d><_>

<d><_><(> is a function-like macro invocation, so we must collect the
arguments. This involves scanning (with no expansion) for a matching
<)> treating <,> at the outer level as a separator. That's easy. The
argument list has one token list: <x>.

Before substituting <x> we expand any macros, but there are none. This
round of expansion is done in isolation as if it were a short, separate
source file. There are no # or ## tokens in sight so we can simply
replace <x> in the replacement list with <x> from the macro arguments.

replacement list: <(><x><)>

Again, this list is scanned for macros to expand (including, this time, any
tokens that might follow in the input stream) but there is nothing to
do. We now know the final result:

result: <a><_><d><_><(><x><)>

Note that the output of -E need not reflect the actual spacing provided
it makes no difference to the final program.
Post by bartc
#12 I'm not too worried about, as it's a made-up example and most
compilers seem to go wrong. Yet, I don't really know whether it should
have worked or not.
I think it should. I can't see any justification for it not working,
but I'm far from infallible.
Post by bartc
If you're allowed to have #-directives in the middle of a macro-call,
then why shouldn't #12 work too? After all gcc managed to generate
what might have been expected.
No, you are not allowed pp directives in the middle of a macro call[1]
but I don't see the connection with #12. #12 relies on the fact that the
replacement list is scanned along with the remaining tokens:

#define A(x,y) if(x==y)
#define B A(

so "B 1,2)" expands to "A(| 1,2)" (using | to mark where the replacement
list ends). This replacement list is scanned along with the remaining
tokens, so a valid call of A is seen.

[1] 6.10.3 p11:

"The sequence of preprocessing tokens bounded by the outside-most
matching parentheses forms the list of arguments for the function-like
macro. The individual arguments within the list are separated by comma
preprocessing tokens, but comma preprocessing tokens between matching
inner parentheses do not separate arguments. If there are sequences of
preprocessing tokens within the list of arguments that would otherwise
act as preprocessing directives, the behavior is undefined."
--
Ben.
bartc
2017-05-15 19:31:41 UTC
Post by Ben Bacarisse
Post by bartc
then I get yet another assorted bunch of results. More bugs
presumably.
I doubt there will be any fewer; that's almost certain. But with no idea
what flags you are using, none of these may be bugs.
I just don't think that flags have much to do with it. Except
perhaps for gcc (and clang which uses the same flags), where apparently
they can be used to make gcc do anything.

And if they do, then they shouldn't.
Post by Ben Bacarisse
Post by bartc
I think that whoever dreamt up the preprocessor should also have
provided a reference implementation.
Yes, that is one way to specify these things. The reference
implementation is probably better off being very abstract. Haskell
anyone?
Perhaps, after all, the C standard version is better!
Post by Ben Bacarisse
Post by bartc
The majority verdict on #5 is that it should be 'a d (x)'. So my code
should really be changed (I thought I'd fixed 1-11 actually).
#define a(x) mac_a(x)
#define d(x) (x)
#define f a d d (x)
So, as far as I can see, this is what happens. I'll write <x> for the
token x and <_> for the space token (I don't think newline tokens
feature here).
<f> is replaced by <a><_><d><_><d><_><(><x><)>. This replacement is
scanned for macros to expand along with any tokens that follow <f> (but
let's say there are none).
<a> not followed by <_>*<(> has no expansion so I believe we can now
output the first tokens (i.e. no repeated scanning happens)
result: <a><_>
result: <a><_><d><_>
<d><_><(> is a function-like macro invocation, so we must collect the
arguments. This involves scanning (with no expansion) for a matching
<)> treating <,> at the outer level as a separator. That's easy. The
argument list has one token list: <x>.
Before substituting <x> we expand any macros, but there are none. This
round of expansion is done in isolation as if it were a short, separate
source file. There are no # or ## tokens in sight so we can simply
replace <x> in the replacement list with <x> from the macro arguments.
replacement list: <(><x><)>
Again, this list is scanned for macros to expand (including, this time, any
tokens that might follow in the input stream) but there is nothing to do.
result: <a><_><d><_><(><x><)>
Note that the output of -E need not reflect the actual spacing provided
it makes no difference to the final program.
OK, I'll have a closer look later on. An older version of my
preprocessor expanded this properly. A fix to solve another problem,
that involved repeatedly rescanning any expanded sequence, has
introduced a bug. That suggests a rewrite is in order, but I don't have
the inclination to do that (and the problem hasn't come up in real code
yet).
Post by Ben Bacarisse
No, you are not allowed pp directives in the middle of a macro call[1]
but I don't see the connection with #12. #12 relies on the fact that the
#define A(x,y) if(x==y)
#define B A(
so "B 1,2)" expands to "A(| 1,2)" (using | to mark where the replacement
list ends). This replacement list is scanned along with the remaining
tokens, so a valid call of A is seen.
"The sequence of preprocessing tokens bounded by the outside-most
matching parentheses forms the list of arguments for the function-like
macro. The individual arguments within the list are separated by comma
preprocessing tokens, but comma preprocessing tokens between matching
inner parentheses do not separate arguments. If there are sequences of
preprocessing tokens within the list of arguments that would otherwise
act as preprocessing directives, the behavior is undefined."
If it's so perfectly clear, why do so many compilers have trouble?
--
bartc
Ben Bacarisse
2017-05-15 21:08:57 UTC
Post by bartc
Post by Ben Bacarisse
Post by bartc
then I get yet another assorted bunch of results. More bugs
presumably.
I doubt there will be any fewer; that's almost certain. But with no idea
what flags you are using, none of these may be bugs.
I just don't think that flags have much to do with it. Except
perhaps for gcc (and clang which uses the same flags), where
apparently they can be used to make gcc do anything.
I really don't know why you can't just tell us. Your remark about it
not mattering suggests that you are using no flags at all (save for -E)
for any of the compilers you listed. Is that correct?
Post by bartc
And if they do, then they shouldn't.
Nonsense. It would be entirely reasonable for a compiler to support
some legacy behaviour in either its default mode or with some special
flags.

<snip>
Post by bartc
Post by Ben Bacarisse
No, you are not allowed pp directives in the middle of a macro call[1]
but I don't see the connection with #12. #12 relies on the fact that the
#define A(x,y) if(x==y)
#define B A(
so "B 1,2)" expands to "A(| 1,2)" (using | to mark where the replacement
list ends). This replacement list is scanned along with the remaining
tokens, so a valid call of A is seen.
"The sequence of preprocessing tokens bounded by the outside-most
matching parentheses forms the list of arguments for the function-like
macro. The individual arguments within the list are separated by comma
preprocessing tokens, but comma preprocessing tokens between matching
inner parentheses do not separate arguments. If there are sequences of
preprocessing tokens within the list of arguments that would otherwise
act as preprocessing directives, the behavior is undefined."
If it's so perfectly clear, why do so many compilers have trouble?
That's rhetoric. Instead, can you say what you think is unclear about

#define A(x,y) if(x==y)
#define B A(
B 10,20)

? It's entirely possible that the erroneous results you report are not
due to misunderstanding but are the consequences of incorrect fixes
elsewhere. Or they may simply be due to incorrect implementation
(i.e. despite understanding the words in the standard).

But I found lccwin64 and PellesC (8.0) get example 12 right.

Using no flags and

Pelles Compiler Driver, Version 8.00.0
Copyright (c) Pelle Orinius 2002-2015

Logiciels/Informatique lcc-win (64 bits) version 4.1.
Compilation date: Oct 27 2016 16:34:50

but even a very old version of lccwin32 gets 12 right.
--
Ben.
bartc
2017-05-15 22:50:27 UTC
Post by Ben Bacarisse
I really don't know why you can't just tell us. Your remark about it
not mattering suggests that you are using no flags at all (save for -E)
Actually, I did use -E or equivalent to get the output, and nothing
else. Would source be preprocessed differently when it was actually
compiled? I was anyway /only/ looking at preprocessing.
Post by Ben Bacarisse
Post by bartc
If it's so perfectly clear, why do so many compilers have trouble?
That's rhetoric. Instead, can you say what you think is unclear about
#define A(x,y) if(x==y)
#define B A(
B 10,20)
Why do you think I thought up the example? I wanted one where a macro
call was synthesised from elements at different levels of macro
expansion including level zero (direct from original source). Just to
see what would happen.
Post by Ben Bacarisse
? It's entirely possible that the erroneous results you report are not
due to misunderstanding but are the consequences of incorrect fixes
elsewhere. Or they may simply be due to incorrect implementation
(i.e. despite understanding the words in the standard).
But I found lccwin64 and PellesC (8.0) get example 12 right.
Using no flags and
Pelles Compiler Driver, Version 8.00.0
Copyright (c) Pelle Orinius 2002-2015
Logiciels/Informatique lcc-win (64 bits) version 4.1.
Compilation date: Oct 27 2016 16:34:50
but even a very old version of lccwin32 gets 12 right.
I've tried PellesC and lccwin again, and results are variable. In the
original code, the macros for #11 and #12 were positioned just before
each invocation. Also, part of #11 was commented out which I hadn't
realised (I redid all tests for an uncommented #11, but only looked at
and updated the #11 results).

Anyway, the upshot is that processing:

#define F(a) a*G
#define G(a) F(a)

11: F(2)//(9)

#define A(x,y) if(x==y)
#define B A(
12: B 10,20);

with Pelles C 64-bit 8.0.170 gives this -E output (some blanks and
#lines removed):

11: 2 *G

#define A(x,y) if(x==y)

12: A( 10,20);

But this input:

#define F(a) a*G
#define G(a) F(a)

11: F(2)(9)

#define A(x,y) if(x==y)
#define B A(
12: B 10,20);

produces:

11: 2 * F(9)

12: if(10 == 20);

So something funny is going on, judging from that #define being sent to
the output. I've observed something similar with lccwin64.
--
bartc
Keith Thompson
2017-05-15 23:13:45 UTC
Post by bartc
Post by Ben Bacarisse
I really don't know why you can't just tell us. Your remark about it
not mattering suggests that you are using no flags at all (save for -E)
Actually, I did use -E or equivalent to get the output, and nothing
else. Would source be preprocessed differently when it was actually
compiled? I was anyway /only/ looking at preprocessing.
gcc does not fully conform to any edition of the C standard by
default. I don't know how or whether the various "-std=..." options
affect the behavior of the preprocessor, but I would suggest using
"-std=c11 -pedantic-errors -E" if you're looking at the behavior
of the preprocessor in the context of the current C standard.
(But even that can be less than entirely useful for code whose
behavior is undefined.)

[...]
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Ben Bacarisse
2017-05-16 00:36:42 UTC
Post by bartc
Post by Ben Bacarisse
I really don't know why you can't just tell us. Your remark about it
not mattering suggests that you are using no flags at all (save for -E)
Actually, I did use -E or equivalent to get the output, and nothing
else.
Finally.
Post by bartc
Would source be preprocessed differently when it was actually
compiled? I was anyway /only/ looking at preprocessing.
Eh? I just want to know what flags you used. It's really not that odd
to ask when you posted compiler results.
Post by bartc
Post by Ben Bacarisse
Post by bartc
If it's so perfectly clear, why do so many compilers have trouble?
That's rhetoric. Instead, can you say what you think is unclear about
#define A(x,y) if(x==y)
#define B A(
B 10,20)
Why do you think I thought up the example? I wanted one where a macro
call was synthesised from elements at different levels of macro
expansion including level zero (direct from original source). Just to
see what would happen.
But what is unclear about it? Presumably you've read what the language
standard says about macro expansion. It's not many paragraphs. Did I
get it right in my explanation? Do you see a murky area where the words
could mean something else? I want to talk technical. You just seem to
want to bang the "ooh, it's all so complicated" drum.
Post by bartc
Post by Ben Bacarisse
? It's entirely possible that the erroneous results you report are not
due to misunderstanding but are the consequences of incorrect fixes
elsewhere. Or they may simply be due to incorrect implementation
(i.e. despite understanding the words in the standard).
But I found lccwin64 and PellesC (8.0) get example 12 right.
Using no flags and
Pelles Compiler Driver, Version 8.00.0
Copyright (c) Pelle Orinius 2002-2015
Logiciels/Informatique lcc-win (64 bits) version 4.1.
Compilation date: Oct 27 2016 16:34:50
but even a very old version of lccwin32 gets 12 right.
I've tried PellesC and lccwin again, and results are variable. In the
original code, the macros for #11 and #12 were positioned just before
each invocation. Also, part of #11 was commented out which I hadn't
realised (I redid all tests for an uncommented #11, but only looked at
and updated the #11 results).
Always post the code that gives the results you are reporting, please!

<snip cases>
Post by bartc
So something funny is going on, judging from that #define being sent
to the output.
It's just a bug. Here's a small test case if you want to report it.

#define F()
F
#define X
X

There is obviously a bug in the code that looks ahead for '(' after
seeing a function-like macro name. The following #define is not
processed as a directive but it does persuade the scanner that no '(' is
coming.
Post by bartc
I've observed something similar with lccwin64.
Yes, lccwin64 gives the same results when given an input that actually
trips it up. Both PellesC and lccwin64 are based on Dave Hanson's lcc,
so I would bet the bug dates back to lcc's preprocessor. Both have been
extensively developed so they have obviously diverged from the original
lcc, but the coincidence suggests a common cause.

(Quick search... Source of lcc's cpp is available and, yes, it
generated the same faulty output.)
--
Ben.
bartc
2017-05-16 09:15:44 UTC
Post by Ben Bacarisse
Post by bartc
Would source be preprocessed differently when it was actually
compiled? I was anyway /only/ looking at preprocessing.
Eh? I just want to know what flags you used. It's really not that odd
to ask when you posted compiler results.
You know I avoid using compiler flags except for ones such as -E, -c or -O3.
Post by Ben Bacarisse
But what is unclear about it? Presumably you've read what the language
standard says about macro expansion. It's not many paragraphs. Did I
get it right in my explanation? Do you see a murky area where the words
could mean something else? I want to talk technical. You just seem to
want to bang the "ooh, it's all so complicated" drum.
It's not just me that thinks it's complicated. In the original #12
macro, 5 compilers out of 6 got it wrong. Now it turns out two of those
results were erroneous. But that still leaves half getting it wrong
(with those two turning out to have an unrelated bug).

I think (this is going back several months) I tried such an example to
see how much effort I should put into getting it right. So if the mighty
MSVC got it wrong (even with the 2008 version, but 2008 is still
comparatively recent), then I probably didn't need to bother (as there
were plenty of more pressing matters to get on with).

What it would mean in practice is that if it turns out I can't compile a
particular source, then neither can MSVC.

In the long term, that might even lead to people avoiding using those
dodgier macros. Whereas if all compilers accepted them, then the bar
would be raised even higher.
Post by Ben Bacarisse
There is obviously a bug in the code that looks ahead for '(' after
seeing a function-like macro name.
I don't have that bug because I deliberately limit the possibilities, so
that a macro call must look like one of these two:

M(A)
M (A)

That means there will be a set of source codes that my preprocessor will
fail on because they will do one of these, or worse:

M (A)
M
(A)
M /*comment*/ (A)

etc. But I don't care. If the C standard had such a restriction (and in
some places single spaces /are/ significant), then that bug in LCC
wouldn't have existed.
--
bartc
Philip Lantz
2017-05-17 02:17:49 UTC
Post by bartc
Post by Ben Bacarisse
Post by bartc
Would source be preprocessed differently when it was actually
compiled? I was anyway /only/ looking at preprocessing.
Eh? I just want to know what flags you used. It's really not that odd
to ask when you posted compiler results.
You know I avoid using compiler flags except for ones such as -E, -c or -O3.
If you're going to use gcc to help you determine what the standard requires,
I strongly suggest that you use a flag that tells it to compile some standard
flavor of C, rather than its own privately-defined C-like language.
Philip Lantz
2017-05-17 02:12:38 UTC
Post by bartc
Post by Ben Bacarisse
I really don't know why you can't just tell us. Your remark about it
not mattering suggests that you are using no flags at all (save for -E)
Actually, I did use -E or equivalent to get the output, and nothing
else. Would source be preprocessed differently when it was actually
compiled? I was anyway /only/ looking at preprocessing.
You seem to have completely missed his point. A compiler may treat its
input as C11, C99, C89, K&R, or some private C-like language, depending
on the flags.* Do you expect the preprocessor rules to be identical for
all these possibilities? (I don't know whether they are or not, but I
certainly wouldn't expect them to be without checking.)

* And of course gcc might treat it as Fortran, but I think we can ignore
that for our current purposes.
s***@casperkitty.com
2017-05-17 15:48:12 UTC
Post by Philip Lantz
* And of course gcc might treat it as Fortran, but I think we can ignore
that for our current purposes.
If there exists a conforming C compiler that would interpret a file that
began with __FORTRAN as a FORTRAN-77 program (allowable behavior given
the presence of an implementation-reserved identifier) then if such a
file would be processed as a usable FORTRAN program it would also be a
"conforming C program" because there would be at least one conforming C
implementation which could process it.
Robert Wessel
2017-05-17 16:19:53 UTC
Post by s***@casperkitty.com
Post by Philip Lantz
* And of course gcc might treat it as Fortran, but I think we can ignore
that for our current purposes.
If there exists a conforming C compiler that would interpret a file that
began with __FORTRAN as a FORTRAN-77 program (allowable behavior given
the presence of an implementation-reserved identifier) then if such a
file would be processed as a usable FORTRAN program it would also be a
"conforming C program" because there would be at least one conforming C
implementation which could process it.
Piffle! There's no need for an ugly language declaration like that,
it's perfectly possible to write programs that can be compiled as C or
Fortran, or even run as a shell script.

http://www.ioccc.org/1986/applin/applin.c

As to whether or not such an ability reflects the original intention
of C (later ruined by the ANSI committee), I'll leave for others to
debate. ;-)
s***@casperkitty.com
2017-05-17 17:22:59 UTC
Post by Robert Wessel
Post by s***@casperkitty.com
If there exists a conforming C compiler that would interpret a file that
began with __FORTRAN as a FORTRAN-77 program (allowable behavior given
the presence of an implementation-reserved identifier) then if such a
file would be processed as a usable FORTRAN program it would also be a
"conforming C program" because there would be at least one conforming C
implementation which could process it.
Piffle! There's no need for an ugly language declaration like that,
it's perfectly possible to write programs that can be compiled as C or
Fortran, or even run as a shell script.
http://www.ioccc.org/1986/applin/applin.c
My point was that the appearance of an identifier starting with __ would
give a C compiler latitude to treat everything else in arbitrary fashion,
so the only requirement for such a file to be conforming would be the
existence of a conforming compiler that would process it usefully. The
example might have been improved, however, by having the magic line be

const __FORTRAN=77;

which would then allow the program to be a valid FORTRAN file as well as a
valid C file, at least if a lowercase "C" is a valid comment indicator (the
system on which I programmed FORTRAN didn't *have* lowercase letters, so
I don't know).
Hans-Peter Diettrich
2017-05-18 02:17:56 UTC
Post by s***@casperkitty.com
Post by Philip Lantz
* And of course gcc might treat it as Fortran, but I think we can ignore
that for our current purposes.
If there exists a conforming C compiler that would interpret a file that
began with __FORTRAN as a FORTRAN-77 program (allowable behavior given
the presence of an implementation-reserved identifier) then if such a
file would be processed as a usable FORTRAN program it would also be a
"conforming C program" because there would be at least one conforming C
implementation which could process it.
IMO it's a matter of the compiler front-ends, which languages are
accepted. But in most cases distinct compilers for different languages
are built, even if these use the gcc infrastructure and code generation
back-ends.

DoDi
Philip Lantz
2017-05-19 01:40:56 UTC
Post by Hans-Peter Diettrich
Post by s***@casperkitty.com
Post by Philip Lantz
* And of course gcc might treat it as Fortran, but I think we can ignore
that for our current purposes.
If there exists a conforming C compiler that would interpret a file that
began with __FORTRAN as a FORTRAN-77 program (allowable behavior given
the presence of an implementation-reserved identifier) then if such a
file would be processed as a usable FORTRAN program it would also be a
"conforming C program" because there would be at least one conforming C
implementation which could process it.
IMO it's a matter of the compiler front-ends, which languages are
accepted. But in most cases distinct compilers for different languages
are built, even if these use the gcc infrastructure and code generation
back-ends.
Regardless of that, if you don't know what options the compiler was invoked
with, you have /no/ idea what it is going to do, and you can't draw any
conclusions from what messages it prints.
Philip Lantz
2017-05-19 01:36:58 UTC
Post by s***@casperkitty.com
Post by Philip Lantz
* And of course gcc might treat it as Fortran, but I think we can ignore
that for our current purposes.
If there exists a conforming C compiler that would interpret a file that
began with __FORTRAN as a FORTRAN-77 program (allowable behavior given
the presence of an implementation-reserved identifier) then if such a
file would be processed as a usable FORTRAN program it would also be a
"conforming C program" because there would be at least one conforming C
implementation which could process it.
You completely missed my point. I wasn't talking about any of that. Bart
thinks that command line options don't matter. Consider this, which shows
that command line options matter quite a bit:

$ cat > f.c
C This is Fortran
print *,"This is Fortran."
end

$ gcc -x f77 -c f.c

$ cat c.f
/* this is C */
main() { printf("this is C\n"); }

$ gcc -x c -c c.f
c.f: In function 'main':
c.f:2:10: warning: incompatible implicit declaration of built-in function 'printf' [enabled by default]
main() { printf("this is C\n"); }
^

$ gcc c.f
c.f:1.1:

/* this is C */
1
Error: Non-numeric character in statement label at (1)
c.f:1.2:

/* this is C */
1
Error: Invalid character in name at (1)
c.f:2.1:

main() { printf("this is C\n"); }
1
Error: Non-numeric character in statement label at (1)
c.f:2.1:

main() { printf("this is C\n"); }
1
Error: Unclassifiable statement at (1)
c.f:2.33:

main() { printf("this is C\n"); }
1
Error: Invalid character in name at (1)
bartc
2017-05-16 09:57:41 UTC
Post by bartc
Post by bartc
The majority verdict on #5 is that it should be 'a d (x)'. So my code
should really be changed (I thought I'd fixed 1-11 actually).
#define a(x) mac_a(x)
#define d(x) (x)
#define f a d d (x)
So, as far I can see, this is what happens. I'll write <x> for the
token x and <_> for the space token (I don't think newline tokens
feature here).
<f> is replaced by <a><_><d><_><d><_><(><x><)>. This replacement is
scanned for macros to expand along with any tokens that follow <f> (but
let's say there are none).
<a> not followed by <_>*<(> has no expansion so I believe we can now
output the first tokens (i.e. no repeated scanning happens)
result: <a><_>
result: <a><_><d><_>
<d><_><(> is a function-like macro invocation, so we must collect the
arguments. This involves scanning (with no expansion) for a matching
<)> treating <,> at the outer level as a separator. That's easy. The
argument list has one token list: <x>.
Before substituting <x> we expand any macros, but there are none. This
round of expansion is done in isolation as if it were a short, separate
source file. There are no # or ## tokens in sight so we can simply
replace <x> in the replacement list with <x> from the macro arguments.
replacement list: <(><x><)>
Again, this list is scanned for macros to expand (including, this time, any
tokens that might follow in the input stream but there is nothing to
result: <a><_><d><_><(><x><)>
Now that this first <d> is followed by <(>, why wouldn't it now be
expanded? I believe that is what MSVC must have done.

Is it because once a macro name has been processed once (and either it
has been expanded, or it couldn't be expanded because it wasn't followed
by "(") then it's no longer eligible to be expanded again?

(That is not the problem I get. I now fail to expand this #5 properly
because I added a loop. That was to get around this example:

#define info(x) (mem(x))
#define llink(x) info(x+1)
#define prevbreak llink
#define serial info

serial(prevbreak(10));

Output should be: (mem((mem(10+1))));

All compilers manage this except TCC which produces (mem(info(10+1)));
As did mine before I made the fix, but that broke example #5 (however
this one occurs in a real program so it takes priority).

The issue here is that at some point, a macro name is produced that is
not at first followed by "(", so is not expanded, but later it is.
Similar to my point about why the d(x) in "a d (x)" wasn't expanded above.)
--
bartc
Thiago Adams
2017-05-16 14:02:02 UTC
Post by bartc
The majority verdict on #5 is that it should be 'a d (x)'. So my code
should really be changed (I thought I'd fixed 1-11 actually).
It would be very nice if you published the samples/results on your GitHub.

As a comment, Microsoft is planning to fix the preprocessor, not because of C, but because of C++.

https://blogs.msdn.microsoft.com/vcblog/2017/05/10/c17-features-in-vs-2017-3/

"[C] C99 preprocessor support is still partial, in that variadic macros mostly work. We’re planning to overhaul the preprocessor before marking this as complete."

C++ perspective:
"This document details the changes that need to be made to the working draft to resynchronize the preprocessor and translation phases of C++ with C99"
Working draft changes for C99 preprocessor synchronization
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1653.htm
Ben Bacarisse
2017-05-16 15:26:33 UTC
Post by bartc
Post by bartc
Post by bartc
The majority verdict on #5 is that it should be 'a d (x)'. So my code
should really be changed (I thought I'd fixed 1-11 actually).
#define a(x) mac_a(x)
#define d(x) (x)
#define f a d d (x)
So, as far I can see, this is what happens. I'll write <x> for the
token x and <_> for the space token (I don't think newline tokens
feature here).
<f> is replaced by <a><_><d><_><d><_><(><x><)>. This replacement is
scanned for macros to expand along with any tokens that follow <f> (but
let's say there are none).
<a> not followed by <_>*<(> has no expansion so I believe we can now
output the first tokens (i.e. no repeated scanning happens)
result: <a><_>
result: <a><_><d><_>
<d><_><(> is a function-like macro invocation, so we must collect the
arguments. This involves scanning (with no expansion) for a matching
<)> treating <,> at the outer level as a separator. That's easy. The
argument list has one token list: <x>.
Before substituting <x> we expand any macros, but there are none. This
round of expansion is done in isolation as if it were a short, separate
source file. There are no # or ## tokens in sight so we can simply
replace <x> in the replacement list with <x> from the macro arguments.
replacement list: <(><x><)>
Again, this list is scanned for macros to expand (including, this time, any
tokens that might follow in the input stream but there is nothing to
result: <a><_><d><_><(><x><)>
Now that this first <d> is followed by <(>, why wouldn't it now be
expanded? I believe that is what MSVC must have done.
Do you think it should be expanded in this case:

#define d(x) [x]
#define p (y)
d p

? Replacement lists are re-scanned (once, I believe) but not the
result. Once tokens have been emitted, the scanning does not back up to
see if subsequent tokens now make a valid call.
Post by bartc
Is it because once a macro name has been processed once (and either it
has been expanded, or it couldn't be expanded because it wasn't
followed by "(") then it's no longer eligible to be expanded again?
No, that wording applies to macros that have been expanded -- they are
not recognised when scanning the replacement list. d has not been
expanded here and it does not appear in a replacement list anymore.
It's been through the algorithm and out the other end.
Post by bartc
(That is not the problem I get. I now fail to expand this #5 properly
#define info(x) (mem(x))
#define llink(x) info(x+1)
#define prevbreak llink
#define serial info
serial(prevbreak(10));
Output should be: (mem((mem(10+1))));
All compilers manage this except TCC which produces (mem(info(10+1)));
As did mine before I made the fix, but that broke example #5 (however
this one occurs in a real program so it takes priority).
This sort of "balloon animal" debugging -- where you squeeze it into
shape in one place and a bug pops up somewhere else --
usually indicates a need to step back and review the overall algorithm.
At least that's how I deal with this sort of situation.
Post by bartc
The issue here is that at some point, a macro name is produced that is
not at first followed by "(", so is not expanded, but later it
is. Similar to my point about why the d(x) in "a d (x)" wasn't
expanded above.)
You do have to get the sequence right. I started to go through this
example but I failed to find a good notation to show the nested
expansions and I ended up thinking it was not helping. I might revisit
it if I get time.
--
Ben.
Hans-Peter Diettrich
2017-05-18 01:27:33 UTC
Post by Ben Bacarisse
<snip>
Post by bartc
Post by Ben Bacarisse
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. Do you think the
wording of the standard needs to be improved and, if so, how?
I would prefer that the possibilities were purposely kept simple.
Ah, that's not what I meant. I was not asking you to invent a macro
language you'd prefer, I was asking if you can suggest a way in which
the wording of (or, for that matter, any other changes to) the standard
would make the current intended semantics clearer.
IMO the problem arises from translation phase 4, which mixes macro
expansion with preprocessor directive handling. If macro expansion
started only after all macro arguments had been collected, then at least
no preprocessor directives could occur in the argument list. This
convention would also produce the same result regardless of whether a
function or a function-like macro is compiled.
Post by Ben Bacarisse
This is a good case to consider since there is clearly some
disagreement. (There's a high probability that I'm wrong about it being
defined but it's certain that it's not as clear as I thought it was.)
<snip>
Post by bartc
Post by Ben Bacarisse
Post by bartc
But then maybe someone will have a macro expansion that generates
#-directives
Macros can't (validly) expand to directives.
Synthetic preprocessor directive construction should be
disallowed/ignored, because it leads to self-modifying code. This would
be in accordance with tokenization, where it is impossible to construct
comments from /##/, as MSVC did some time ago.
Post by Ben Bacarisse
Post by bartc
6: e(a,,a)
AFAIK empty macro arguments are not allowed.

DoDi
Ben Bacarisse
2017-05-18 02:25:46 UTC
Post by Hans-Peter Diettrich
Post by Ben Bacarisse
<snip>
Post by bartc
Post by Ben Bacarisse
Post by bartc
Are C's preprocessor and macro expansion rules really so poorly
defined that so many compilers get it wrong?
Not, I think, in this case. It seems very clear. Do you think the
wording of the standard needs to be improved and, if so, how?
I would prefer that the possibilities were purposely kept simple.
Ah, that's not what I meant. I was not asking you to invent a macro
language you'd prefer, I was asking if you can suggest a way in which
the wording of (or, for that matter, any other changes to) the standard
would make the current intended semantics clearer.
IMO the problem arises from translation phase 4, which mixes macro
expansion with preprocessor directive handling. If macro expansion
started only after all macro arguments had been collected, then at least
no preprocessor directives could occur in the argument list.
Macro expansion does start only after the arguments are collected. At
least it can be arranged that way. Arguments are also expanded but that
does not have to happen until after they are all collected.
Post by Hans-Peter Diettrich
This
convention also would produce the same result, regardless of whether a
function or a functional macro is compiled.
I don't follow this, but then I think I've misunderstood what you mean
above.

<snip>
Post by Hans-Peter Diettrich
Post by Ben Bacarisse
Post by bartc
6: e(a,,a)
AFAIK empty macro arguments are not allowed.
No, they are allowed.
--
Ben.
James Kuyper
2017-05-18 02:35:46 UTC
...
Post by Hans-Peter Diettrich
Post by bartc
6: e(a,,a)
AFAIK empty macro arguments are not allowed.
6.10.3p4 mentions "arguments consisting of no preprocessing tokens",
which implies that it's permissible for such arguments to exist. Section
7 of the Foreword mentions "empty macro arguments" as one of the features
introduced in C99.
Tim Rentsch
2017-05-20 01:38:18 UTC
Post by James Kuyper
...
Post by Hans-Peter Diettrich
Post by bartc
6: e(a,,a)
AFAIK empty macro arguments are not allowed.
6.10.3p4 mentions "arguments consisting of no preprocessing tokens",
which implies that it's permissible for such arguments to exist. Section
7 of the Foreword mentions "empty macro arguments" as one of the features
introduced in C99.
If you will excuse a micro-quibble... They are mentioned as one
of the /changes/ in C99. Empty macro arguments were actually
introduced in the original standard, in section G.5.12 of C90
(and IIANM under a different numbering in C89), as a "Common
Extension".
Thiago Adams
2017-05-15 13:00:50 UTC
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
It either calls printf with arguments 10,20,30 or 100,200,30. But it
splits the macro invocation with conditionals, that causes compile
errors with pelles c, lccwin, dmc, and MSVC. It compiles with gcc and
(surprisingly) tiny C. Also (using online compilers) with clang.
It's working in my preprocessor, not because I understood the standard, but because the way I did it happened to work. I assumed that the scanner will see

'M' '(' '10' ',' '20', '\n' '\n' '\n' '30' ')'

The extra '\n' are for #else and #endif

How about splitting a macro call with #include?

--header.h--
#define AB(a, b)
AB(1,

--main.c--
#include "header.h"
2)

//unexpected end of file in macro expansion (VC++)

So the token stream does not continue across the EOF of an #include'd file, but it does continue across #if blocks (with added \n).
bartc
2017-05-15 14:05:28 UTC
Post by Thiago Adams
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
It either calls printf with arguments 10,20,30 or 100,200,30. But it
splits the macro invocation with conditionals, that causes compile
errors with pelles c, lccwin, dmc, and MSVC. It compiles with gcc and
(surprisingly) tiny C. Also (using online compilers) with clang.
It's working in my preprocessor, not because I understood the standard, but because the way I did it happened to work.
Yes, there is that too. Sometimes it works by chance. I thought doing
multiple passes, to first deal with #include and #if, would be better
behaved. But thinking again, that won't work for #if because the
expression will depend on macros, so they need to be expanded first
before #if knows if its expression is true or false. And presumably you
can't do it for #includes because they rely a lot on #ifs too.

And what happens here:

#if M(
#if C
A
#else
B
#endif
)
#endif

It seems that this isn't allowed because the #if expression must be on
one line. That seems a good rule to me; why can't it apply everywhere!

I assumed that the scanner will see
Post by Thiago Adams
'M' '(' '10' ',' '20', '\n' '\n' '\n' '30' ')'
The extra '\n' are for #else and #endif
How about split macro call with #include?
--header.h--
#define AB(a, b)
AB(1,
--main.c--
#include "header.h"
2)
//unexpected end of file in macro expansion (VC++)
So the token stream does not continue across the EOF of an #include'd file, but it does continue across #if blocks (with added \n).
No, I just tried splitting a macro call across include files, and it
didn't work.
--
bartc
Tim Rentsch
2017-05-15 13:43:11 UTC
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
It either calls printf with arguments 10,20,30 or 100,200,30. But it
splits the macro invocation with conditionals, that causes compile
errors with pelles c, lccwin, dmc, and MSVC. It compiles with gcc and
(surprisingly) tiny C. [...]
Did you give gcc the -pedantic-errors option, like several
people have suggested?
fir
2017-05-15 16:47:49 UTC
Post by bartc
#include <stdio.h>
#define M(a,b,c) printf("%d %d %d\n",(a),(b),(c));
#define C 2
int main(void) {
#if C>=3
M(10,20,
#else
M(100,200,
#endif
30)
}
It either calls printf with arguments 10,20,30 or 100,200,30. But it
splits the macro invocation with conditionals, that causes compile
errors with pelles c, lccwin, dmc, and MSVC. It compiles with gcc and
(surprisingly) tiny C. Also (using online compilers) with clang.
The thing is that every so often you come across troublesome macros like
this that only work on some compilers. But why should that be the case?
Are C's preprocessor and macro expansion rules really so poorly defined
that so many compilers get it wrong? (I certainly thought so when I
tried to implement a preprocessor earlier this year.)
Maybe, you can get more consistent behaviour by doing multiple passes,
so that all the #ifs are done first for example, then the macro expansion.
But then maybe someone will have a macro expansion that generates
#-directives, or other macro invocations created from parts joined
together with ## or that use #-stringifying, that gcc will somehow
manage to compile as expected! Then that becomes the benchmark for what
is expected to work.
So, does anyone actually know EXACTLY what the capabilities of the C
macro system are? Or do compilers just make them up as they go along?
With gcc in the lead. (I don't intend to make this work in my own
implementation. I believe there should be clearly-defined limits to what
is possible and what is considered reasonable.)
(This is not a made-up example; the following was posted in
comp.lang.python today in "How to install Python package from source on
If you're using 3.6, you'll have to build from source. The package has
a single C extension without external dependencies, so it should be a
straight-forward build if you have Visual Studio 2015+ installed with
the C/C++ compiler for x86. Ideally it should work straight from pip.
But I tried and it failed in 3.6.1 due to the new PySlice_GetIndicesEx
macro. Apparently MSVC doesn't like preprocessor code like this in
#if PY_MAJOR_VERSION >= 3
if (PySlice_GetIndicesEx(item, Py_SIZE(self),
#else
if (PySlice_GetIndicesEx((PySliceObject*)item, Py_SIZE(self),
#endif
&start, &stop, &step, &slicelength) < 0) {
It fails with a C1057 error (unexpected end of file in macro
expansion). The build will succeed if you copy the common line with
`&start` to each case and comment out the original line, such that the
macro invocation isn't split across an #if / #endif. This is an ugly
consequence of making PySlice_GetIndicesEx a macro. I wonder if it
could be written differently to avoid this problem.
A macro language layer such as the one used in C (does it have a name? "C macro language"?) may be useful [though I have NEVER used it (except for very simple tries); you will not see even one #define in all my code, from the beginning until now].

I have always believed that some alias language that takes C program semantics into account could work better than an abstract text-based macro layer, but I may be wrong [and if such an alias language would be useful, maybe it should just be built into core C? I don't know].