Discussion:
Tiny question
David Kleinecke
2017-05-04 03:45:57 UTC
Would there be any downside were I to realize character
constants in the preprocessor rather than in the parser?
That is, for example, the pre-processor changes the three
characters 'a' to 0x61, and so on for other characters
including the wise ones.
Ben Bacarisse
2017-05-04 19:37:50 UTC
Post by David Kleinecke
Would there be any downside were I to realize character
constants in the preprocessor rather than in the parser?
That is, for example, the pre-processor changes the three
characters 'a' to 0x61. and so on for other characters
including the wise ones.
(assuming s/wise/wide/)

For 'plain' characters it would depend on when in the PP the conversion
happens. For example

#define s(x) #x
s('a')

must generate "'a'" so you can't change 'a' to 0x61 when reading tokens.
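
A minimal sketch of the stringizing case, for anyone who wants to try it. The macro is the one from the post; the function names stringized_char and check_stringize are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* The macro from the post: # turns the spelling of its argument
   into a string literal. */
#define s(x) #x

/* Returns the string produced by stringizing the token 'a'. */
const char *stringized_char(void)
{
    return s('a');   /* expands to "'a'" */
}

/* The spelling of the token, not its numeric value, must survive. */
void check_stringize(void)
{
    assert(strcmp(stringized_char(), "'a'") == 0);
    assert(strcmp(stringized_char(), "0x61") != 0);
}
```

If the preprocessor had already rewritten 'a' as 0x61 while reading tokens, the first assertion would fail.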

For the wide characters you are likely to have more trouble. L'a' is an
expression of type wchar_t and u'a' and U'a' are of type char16_t and
char32_t.
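
A small sketch of the type point, assuming a C11 compiler (it uses _Generic). The macro HAS_TYPE and the function names are hypothetical. Only L'a' is checked here; u'a' and U'a' would also need <uchar.h> for the type names.

```c
#include <assert.h>
#include <stddef.h>   /* wchar_t */

/* 1 if the expression has a type compatible with T, else 0. */
#define HAS_TYPE(expr, T) _Generic((expr), T: 1, default: 0)

/* In C, a plain character constant has type int. */
int plain_constant_is_int(void)
{
    return HAS_TYPE('a', int);
}

/* A wide character constant has type wchar_t. */
int wide_constant_is_wchar(void)
{
    return HAS_TYPE(L'a', wchar_t);
}

/* sizeof sees the type of the constant as well as its value. */
void check_constant_types(void)
{
    assert(plain_constant_is_int());
    assert(wide_constant_is_wchar());
    assert(sizeof(L'a') == sizeof(wchar_t));
}
```

On an implementation where wchar_t is not int, replacing L'a' with a bare integer constant such as 0x61 would change the type of the expression, so the replacement has to carry the type along somehow.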
--
Ben.
David Kleinecke
2017-05-05 02:45:15 UTC
Post by Ben Bacarisse
Post by David Kleinecke
Would there be any downside were I to realize character
constants in the preprocessor rather than in the parser?
That is, for example, the pre-processor changes the three
characters 'a' to 0x61. and so on for other characters
including the wise ones.
(assuming s/wise/wide/)
For 'plain' characters it would depend on when in the PP the conversion
happens. For example
#define s(x) #x
s('a')
must generate "'a'" so you can't change 'a' to 0x61 when reading tokens.
For the wide characters you are likely to have more trouble. L'a' is an
expression of type wchar_t and u'a' and U'a' are of type char16_t and
char32_t.
Thank You. I missed that point. I can avoid it successfully
by reading the source token by token so that when I see the
'a' I already know it will be stringized so I avoid converting
it to an integer.

I am sticking to the C89 POV so unsigned wide characters don't
enter into the picture. But it seems possible to label the wide
character number as type wchar_t, because I can turn all constants
into values (that is, type plus address) in the preprocessor rather
than waiting for that in the parser. That complicates things somewhat
but not hopelessly.

The tiny practical gain is a small saving of effort in the parser
- but an increase in the preprocessor. Might not be worth the
effort.

PS: Of course, some integer constants might well be optimized out
of value form. I would just ignore the old value assignment leaving
a few holes in the final executable.
Barry Schwarz
2017-05-05 06:14:22 UTC
On Wed, 3 May 2017 20:45:57 -0700 (PDT), David Kleinecke
Post by David Kleinecke
Would there be any downside were I to realize character
constants in the preprocessor rather than in the parser?
That is, for example, the pre-processor changes the three
characters 'a' to 0x61. and so on for other characters
including the wise ones.
For one thing, 'a' is not 0x61 on all systems. It happens to be 0x81
on my system.
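
To illustrate the character-set point, here is a rough sketch of a helper that spells out a character's value by asking the compiler instead of hard-coding 0x61. The names char_to_hex_literal and hex_literal_value are hypothetical, and the sketch assumes the tool is built with the same execution character set as the code it processes (no cross-compilation).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write the value of a plain character as a hex
   literal, using whatever execution character set this compiler uses.
   Returns the number of characters written, as snprintf does. */
int char_to_hex_literal(char c, char *buf, size_t size)
{
    return snprintf(buf, size, "0x%02x", (unsigned)(unsigned char)c);
}

/* Parse such a literal back into a number (strtol accepts the 0x prefix
   when the base is 16). */
long hex_literal_value(const char *text)
{
    return strtol(text, NULL, 16);
}

/* The generated literal must denote the same value as 'a' itself,
   whichever character set is in use. */
void check_round_trip(void)
{
    char buf[16];
    char_to_hex_literal('a', buf, sizeof buf);
    assert(hex_literal_value(buf) == (unsigned char)'a');
}
```

The check passes on any character set because it never mentions 0x61 or 0x81 directly.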

For another, you cannot define a macro named 'a'. How would you tell
the preprocessor to accomplish this?

And finally, what is the advantage of what you plan to do? Do you
really think 0x61 conveys the intent of the code to a reader better
than 'a'? (Quick, what is the intent of 0x74?)
--
Remove del for email
m***@gmail.com
2017-05-05 08:08:00 UTC
Post by Barry Schwarz
On Wed, 3 May 2017 20:45:57 -0700 (PDT), David Kleinecke
Post by David Kleinecke
Would there be any downside were I to realize character
constants in the preprocessor rather than in the parser?
That is, for example, the pre-processor changes the three
characters 'a' to 0x61. and so on for other characters
including the wise ones.
For one thing, 'a' is not 0x61 on all systems. It happens to be 0x81
on my system.
"All the world's a Vax" ...
David Kleinecke
2017-05-05 19:36:40 UTC
Post by Barry Schwarz
On Wed, 3 May 2017 20:45:57 -0700 (PDT), David Kleinecke
Post by David Kleinecke
Would there be any downside were I to realize character
constants in the preprocessor rather than in the parser?
That is, for example, the pre-processor changes the three
characters 'a' to 0x61. and so on for other characters
including the wise ones.
For one thing, 'a' is not 0x61 on all systems. It happens to be 0x81
on my system.
For another, you cannot define a macro named 'a'. How would you tell
the preprocessor to accomplish this?
And finally, what is the advantage of what you plan to do? Do you
really think 0x61 conveys the intent of the code to a reader better
than 'a'? (Quick, what is the intent of 0x74?)
The difference here is in compiler architecture. The user sees
exactly the same thing either way.
