Discussion:
separate (static-like) namespaces for several modules that now need to be compiled together
John Forkosh
2017-07-08 10:48:26 UTC
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.

The standard and correct thing would be declaring the colliding
functions static in each module. But big pain in the neck doing
all the editing. So I'm looking for some tricky-and-easy solution.
There doesn't seem to be any cc -switch to the effect of
cc -treat-all-functions-as-static-within-their-own-modules, or
cc -upon-name-collision-prefer-function-in-same-module-as-caller, or
cc -anything-like-the-above

So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
GOTHIER Nathan
2017-07-08 11:12:42 UTC
On Sat, 8 Jul 2017 10:48:26 +0000 (UTC)
Post by John Forkosh
...
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
I'm afraid there's no magic in the compiler. :o)
David Kleinecke
2017-07-08 16:39:17 UTC
Post by GOTHIER Nathan
On Sat, 8 Jul 2017 10:48:26 +0000 (UTC)
Post by John Forkosh
...
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
I'm afraid there's no magic in the compiler. :o)
But there might be in the IDE.

Not that I know of any examples - but there are lots of IDE's
and I only know about a few
John Forkosh
2017-07-09 10:39:38 UTC
Post by GOTHIER Nathan
On Sat, 8 Jul 2017 10:48:26 +0000 (UTC)
Post by John Forkosh
...
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
I'm afraid there's no magic in the compiler. :o)
Thanks for the info.
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
Malcolm McLean
2017-07-09 12:24:48 UTC
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
The standard and correct thing would be declaring the colliding
functions static in each module. But big pain in the neck doing
all the editing. So I'm looking for some tricky-and-easy solution.
There doesn't seem to be any cc -switch to the effect of
cc -treat-all-functions-as-static-within-their-own-modules, or
cc -upon-name-collision-prefer-function-in-same-module-as-caller, or
cc -anything-like-the-above
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
It should take only minutes to add the keyword "static" to all your non-
exported functions.
It's something you should have done anyway and as a matter of course.

One mistake people sometimes make is to export non-core functions,
say you have a csv file loader. You obviously need to export the functions
to load the csv, to destroy it, and to query it for fields. However you'll
probably have "strdup" in there. The strdup should be static, it shouldn't
be exported, because the csv.c module contains csv functions, not
general-purpose string handling functions.
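
A minimal sketch of the point above, assuming a hypothetical csv.c with one exported function. The names dup_field() and csv_first_field() are invented for illustration and are not from any real library; the helper deliberately avoids a name beginning with "str".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Internal helper (hypothetical name). Because it is static, another
 * module can define its own dup_field() without a link-time clash. */
static char *dup_field(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Exported function (hypothetical name): returns a heap copy of the
 * first comma-separated field of line, or NULL if allocation fails.
 * The caller frees the result. */
char *csv_first_field(const char *line)
{
    char *copy;
    assert(line != NULL);
    copy = dup_field(line);
    if (copy != NULL)
        copy[strcspn(copy, ",")] = '\0';  /* cut at the first comma */
    return copy;
}
```

Only csv_first_field() is visible to the linker here; the duplication helper stays private to the file.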
GOTHIER Nathan
2017-07-09 13:04:01 UTC
On Sun, 9 Jul 2017 05:24:48 -0700 (PDT)
Post by Malcolm McLean
One mistake people sometimes make is to export non-core functions,
say you have a csv file loader...
Actually the worst mistake in this case is to choose the same names for
different structures and functions. It's worth taking more than a minute to
make the code clearer if it doesn't deserve to be thrown to the trash can.
Keith Thompson
2017-07-09 20:51:49 UTC
Malcolm McLean <***@gmail.com> writes:
[...]
Post by Malcolm McLean
One mistake people sometimes make is to export non-core functions,
say you have a csv file loader. You obviously need to export the functions
to load the csv, to destroy it, and to query it for fields. However you'll
probably have "strdup" in there. The strdup should be static, it shouldn't
be exported, because the csv.c module contains csv functions, not
general-purpose string handling functions.
Good point, bad example. Feel free to define your own function that
does what strdup does, but don't call it strdup if you want your code
to be portable. The name is reserved when <string.h> is included
(like all names starting with "str", "mem", or "wcs" followed by a
lowercase letter), and POSIX defines a function called "strdup".
Neither should be an issue if you declare it static, but I'd say
it's still a good idea to use a different name, just to avoid any
possible confusion.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
John Forkosh
2017-07-09 22:03:14 UTC
Post by Malcolm McLean
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
The standard and correct thing would be declaring the colliding
functions static in each module. But big pain in the neck doing
all the editing. So I'm looking for some tricky-and-easy solution.
There doesn't seem to be any cc -switch to the effect of
cc -treat-all-functions-as-static-within-their-own-modules, or
cc -upon-name-collision-prefer-function-in-same-module-as-caller, or
cc -anything-like-the-above
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
It should take only minutes to add the keyword "static" to all your non-
exported functions.
If "only minutes", I wouldn't have even bothered writing the post.:)
~10 hours and counting (maybe ~half done). 156 functions in 18K lines
of code. And big problem isn't adding static (actually, my #define'd
symbol FUNCSCOPE), but commenting out declarations within calling
functions, which typically explicitly declare each called function, e.g.,
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
/*int delete_raster();*/ /*in case rasterization fails*/
Post by Malcolm McLean
It's something you should have done anyway and as a matter of course.
In retrospect, yeah. But I've seen lots and lots of code (maybe >75% of
the stuff I look at) without "static" where it should "pedantically"
have been. Certainly most of mine. And are you throwing stones from
inside a glass house?:)
Post by Malcolm McLean
One mistake people sometimes make is to export non-core functions,
say you have a csv file loader. You obviously need to export the functions
to load the csv, to destroy it, and to query it for fields. However you'll
probably have "strdup" in there. The strdup should be static, it shouldn't
be exported, because the csv.c module contains csv functions, not
general-purpose string handling functions.
Yeah, a dozen or two of the namespace collision functions are indeed
for string_handling/expression_parsing. Mostly just duplicates in each
module, but sometimes "newer and improved'er". Never figured out how
to deal with evolving libraries. If kept in a separate module, any
changes may affect all programs using it, involving elaborate maintenance.
So library either tends to get frozen, or you have many different versions
floating around. Better to just put a copy (current version) in each
module using it, whereby it may eventually become somewhat stale, but
at least it reliably works in its context without continual maintenance.
And your library's free to be developed without affecting programs using
earlier versions.
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
James R. Kuyper
2017-07-10 15:32:32 UTC
Post by John Forkosh
Post by Malcolm McLean
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
...
Post by John Forkosh
Post by Malcolm McLean
It should take only minutes to add the keyword "static" to all your non-
exported functions.
If "only minutes", I wouldn't have even bothered writing the post.:)
~10 hours and counting (maybe ~half done). 156 functions in 18K lines
of code. ...
I wasn't the one who suggested that it would "take only minutes", but I
found that estimate entirely reasonable. You said there were only
"several dozen functions with name collisions". It shouldn't take very
long to add even several dozen "static" keywords to a program. Now
you're implying that most of the 156 functions needed modification.

Having that many identically named functions between two different
modules implies a huge amount of functional overlap between the modules.
I would suspect that much of the code in the second module was created
by copying it from the first, and then making modifications. If so, I
suspect you may have missed an opportunity for code re-use: many of the
identically named functions could probably, with a little re-design,
have been not merely identically named, but actually the same function.
If necessary, you could have made them behave differently when called
for the two different purposes, based upon the value of a parameter.
Post by John Forkosh
... And big problem isn't adding static (actually, my #define'd
symbol FUNCSCOPE), but commenting out declarations within calling
functions, which typically explicitly declare each called function, e.g.,
Well, that was a bad idea, and this is one of the reasons.
Post by John Forkosh
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
/*int delete_raster();*/ /*in case rasterization fails*/
Post by Malcolm McLean
It's something you should have done anyway and as a matter of course.
In retrospect, yeah. But I've seen lots and lots of code (maybe >75% of
the stuff I look at) without "static" where it should "pedantically"
have been. Certainly most of mine. And are you throwing stones from
inside a glass house?:)
A very large fraction of my code was originally designed and written by
other people, and has only one function per module, so opportunities for
declaring functions static don't come up very often. However, whenever
I've been free to design new code or make a major change to my existing
code, I usually define multiple functions in each module, but usually
only one of them has external linkage. So no, I'm not "throwing stones
from inside a glass house".

...
Post by John Forkosh
Yeah, a dozen or two of the namespace collision functions are indeed
for string_handling/expression_parsing. Mostly just duplicates in each
module, but sometimes "newer and improved'er". Never figured out how
to deal with evolving libraries. If kept in a separate module, any
changes may affect all programs using it, involving elaborate maintenance.
So library either tends to get frozen, or you have many different versions
floating around. ...
A library function that is shared between multiple different programs
should be changed only if the change is one that you want to occur in
all of those programs. If that's not the case, freezing the code is
entirely reasonable, and I don't see that as a problem: it's a normal
feature of any sufficiently generic function, like memcpy().

Nor do I see it as a problem to have multiple versions of a routine if
those versions need to do different things - but I'd recommend giving
them different names that reflect those different things that they're
doing. Again, this is a perfectly normal feature of the kinds of functions
that inexperienced programmers tend to give names like "process_data()"
or "make_object()".

It really sounds like your functional decomposition of your code is poor.
Post by John Forkosh
... Better to just put a copy (current version) in each
module using it, whereby it may eventually become somewhat stale, but
Well, that would tend to explain why your modules are so ridiculously
over-sized.
John Forkosh
2017-07-11 00:03:59 UTC
Post by James R. Kuyper
Post by John Forkosh
Post by Malcolm McLean
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
...
Post by John Forkosh
Post by Malcolm McLean
It should take only minutes to add the keyword "static" to all your non-
exported functions.
If "only minutes", I wouldn't have even bothered writing the post.:)
~10 hours and counting (maybe ~half done). 156 functions in 18K lines
of code. ...
I wasn't the one who suggested that it would "take only minutes", but I
found that estimate entirely reasonable. You said there were only
"several dozen functions with name collisions". It shouldn't take very
long to add even several dozen "static" keywords to a program. Now
you're implying that most of the 156 functions needed modification.
Having that many identically named functions between two different
modules implies a huge amount of functional overlap between the modules.
I would suspect that much of the code in the second module was created
by copying it from the first, and then making modifications. If so, I
suspect you may have missed an opportunity for code re-use: many of the
identically named functions could probably, with a little re-design,
have been not merely identically named, but actually the same function.
If necessary, you could have made them behave differently when called
for the two different purposes, based upon the value of a parameter.
Yeah, very often copied-and-modified. But "based on value of parameter"
assumes I knew what I wanted from the very beginning. Many are functions
developed over years (and years). For example,
/* ======================================================================
* Function: strpspn ( char *s, char *reject, char *segment )
* Purpose: finds the initial segment of s containing no chars
* in reject that are outside (), [] and {} parens, e.g.,
* strpspn("abc(---)def+++","+-",segment) returns
* segment="abc(---)def" and a pointer to the first + in s
* because the -'s are enclosed in () parens.
* ----------------------------------------------------------------------
* etc */
essentially spans ()parens, but initially forgot about internal quotes
like (abc")))"def) which should span the ")))", too. Didn't occur to me
for first version. In this case, wouldn't hurt if all programs using
it got that fix. But many function changes were more subtle, and couldn't
be transparently backwards applied to all programs using earlier versions.
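
For readers who want to see the behaviour described in that header comment, here is a rough sketch written only from the comment, not the poster's actual code. The name paren_span() is hypothetical (chosen to avoid the reserved "str" prefix), and like the first version described above it does not treat quoted strings specially.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies into segment the leading part of s that contains no character
 * from reject outside (), [] or {}, and returns a pointer to the
 * character in s where the scan stopped. segment must have room for
 * strlen(s)+1 bytes. Bracket kinds are not matched against each other;
 * only the nesting depth is tracked. */
const char *paren_span(const char *s, const char *reject, char *segment)
{
    int depth = 0;
    size_t i = 0;
    assert(s != NULL && reject != NULL && segment != NULL);
    for (; s[i] != '\0'; i++) {
        char c = s[i];
        if (c == '(' || c == '[' || c == '{')
            depth++;                       /* entering a bracketed run */
        else if ((c == ')' || c == ']' || c == '}') && depth > 0)
            depth--;                       /* leaving a bracketed run */
        else if (depth == 0 && strchr(reject, c) != NULL)
            break;                         /* reject char outside brackets */
    }
    memcpy(segment, s, i);
    segment[i] = '\0';
    return s + i;
}
```

Called as paren_span("abc(---)def+++", "+-", segment), it leaves "abc(---)def" in segment and returns a pointer to the first '+', matching the example in the comment.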
Post by James R. Kuyper
Post by John Forkosh
... And big problem isn't adding static (actually, my #define'd
symbol FUNCSCOPE), but commenting out declarations within calling
functions, which typically explicitly declare each called function, e.g.,
Well, that was a bad idea, and this is one of the reasons.
Post by John Forkosh
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
/*int delete_raster();*/ /*in case rasterization fails*/
Post by Malcolm McLean
It's something you should have done anyway and as a matter of course.
In retrospect, yeah. But I've seen lots and lots of code (maybe >75% of
the stuff I look at) without "static" where it should "pedantically"
have been. Certainly most of mine. And are you throwing stones from
inside a glass house?:)
A very large fraction of my code was originally designed and written by
other people, and has only one function per module, so opportunities for
declaring functions static don't come up very often.
My modules tend to get ginormous because they're typically gpl'ed
with a mostly end-user target audience not entirely comfortable
with building from source. So I don't want to give them distributions
containing hundreds of small files, which just gets intimidating to
people unfamiliar with configure/make/make_install (or similar).
A few large files results in much less email from confused end-users.
And for people actually reading them, I have extensive comments,
including a table of contents within a large top-of-file comment
block, of the form,
* Functions: o The following "table of contents" lists each function
* comprising gifscroll in the order it appears in this file.
* See individual function entry points for specific comments
* about purpose, calling sequence, side effects, etc.
* =============================================================
* +---
* | gifscroll functions
* +-----------------------
* main(argc,argv) cgi driver for gifscroll
* rasterize_this(msgorpbm,type) rasterize message or pbm file
* new_raster(width,height) raster allocation and constructor
* delete_raster(rp) destructor for raster
* rastcpy(rp,width,height,isfree) duplicate copy of rp
* rastput(target,source,tupleft,supleft,bg,isopaque) put s on t
* boxcols(rp,bp,irow,col1) column pixels of bp->shape at irow
So these kinds of extensive comments hopefully make the large files
more manageable. At least it hasn't become unmanageable or unreadable
for me, and no particular size-specific problems until this pesky
static thing.
Post by James R. Kuyper
However, whenever
I've been free to design new code or make a major change to my existing
code, I usually define multiple functions in each module, but usually
only one of them has external linkage. So no, I'm not "throwing stones
from inside a glass house".
Well, had it occurred to me I'd ever want to compile these modules
together, I'd have been more careful from the beginning.
So, if not "...from a glass house", then you (and several others)
are at least telling me I should have closed the barn door >>after<<
the horse has bolted. Believe me, as soon as I saw the horse bolting,
I immediately realized, all by myself, I should've closed the door.
Original question was asking for an easy way to re-corral the horse.
Post by James R. Kuyper
...
Post by John Forkosh
Yeah, a dozen or two of the namespace collision functions are indeed
for string_handling/expression_parsing. Mostly just duplicates in each
module, but sometimes "newer and improved'er". Never figured out how
to deal with evolving libraries. If kept in a separate module, any
changes may affect all programs using it, involving elaborate maintenance.
So library either tends to get frozen, or you have many different versions
floating around. ...
A library function that is shared between multiple different programs
should be changed only if the change is one that you want to occur in
all of those programs. If that's not the case, freezing the code is
entirely reasonable, and I don't see that as a problem: it's a normal
feature of any sufficiently generic function, like memcpy().
Nor do I see it as a problem to have multiple versions of a routine if
those versions need to do different things - but I'd recommend giving
them different names that reflect those different things that they're
doing. Again, this is a perfectly normal feature of the kinds of functions
that inexperienced programmers tend to give names like "process_data()"
or "make_object()".
It really sounds like your functional decomposition of your code is poor.
That's a whole other issue. And I've occasionally considered refactoring,
but it ain't worth the effort. Functional decomposition >>becomes<<
poor over time, as "bags-on-bags" development, necessitated by new
and unanticipated functional requirements, screws up what may have
been an originally reasonable design. You sound like people without
much real-world experience sound, who think you can write a
functional requirements document, and then detail design and code
from that, once and for all. But nothing ever stays unchanged.
Everything's best viewed as a prototype for its next version.
The most productive development style is "bags-on-bags" until the
whole thing becomes such a mess that you have to pretty much
chuck it and start from scratch, but hopefully with a better
understanding. Refactoring every time there's the slightest
possible improvement is almost always a big waste of time and money.
I'd rather add new features, visible to end users, to messy old code,
than try to clean it up with no other purpose. Only when it
gets to the point of unmaintainability is refactoring a justifiable
purpose in and of itself.
Post by James R. Kuyper
Post by John Forkosh
... Better to just put a copy (current version) in each
module using it, whereby it may eventually become somewhat stale, but
Well, that would tend to explain why your modules are so ridiculously
over-sized.
Well, that's your opinion which you're entitled to.
I'd personally disagree with most of what you're saying,
at least disagree with it as hard-and-fast rules.
Different situations call for different strategies.
For example, when working with a large (or even small) team,
I'd obviously never consider such large modules.
But when working solo, I consider what I can best handle.
And I also feel that I'm a better judge of that than you.
Now, you might maybe want to pick up my code, and then
maybe consider it over-sized for your working habits.
Conversely, I might pick up yours and consider it under-sized.
But I'd acknowledge that as a personal preference rather
than a universal truth.
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
James Kuyper
2017-07-11 02:19:29 UTC
...
Post by John Forkosh
Post by James R. Kuyper
It really sounds like your functional decomposition of your code is poor.
That's a whole other issue. And I've occasionally considered refactoring,
but it ain't worth the effort. Functional decomposition >>becomes<<
poor over time, as "bags-on-bags" development, necessitated by new
and unanticipated functional requirements, screws up what may have
been an originally reasonable design. You sound like people without
much real-world experience sound, who think you can write a
functional requirements document, and then detail design and code
from that, once and for all. But nothing ever stays unchanged.
I've got real-world experience, and that experience taught me that the
process you're talking about never occurs "once and for all" - it's an
ongoing cycle, with the requirements doc, the detailed design, and the
code (and the test plan, and the user's guide, and the data dictionary,
etc.) all subject to periodic review and revision.
Letting yourself get too caught up in the minutiae of a particular
revision of the code can lead to randomized design. The end point for
that process is an unmaintainable body of code that no one can ever
properly understand. Someone needs to periodically review the
requirements, to make sure that they are complete and current. Someone
needs to periodically review the design, looking for opportunities to
improve the code by fundamental major re-design. It's easy to come up
with excuses for not doing so, and I won't claim I've never used such
excuses, but if you always give in to that temptation, it will keep
getting harder and harder to make even minor changes without breaking
something else.
Post by John Forkosh
Everything's best viewed as a prototype for its next version.
The most productive development style is "bags-on-bags" until the
whole thing becomes such a mess that you have to pretty much
chuck it and start from scratch, but hopefully with a better
understanding. Refactoring every time there's the slightest
possible improvement is almost always a big waste of time and money.
True - the longer you wait before major design reviews, the more likely
it is that you will find a potential change that has a big enough
benefit to be worth making. But from your description of this body of
code, I suspect that you have long since passed the point where design
review is in order.
Post by John Forkosh
I'd rather add new features, visible to end users, to messy old code,
than try to clean it up with no other purpose. Only when it
gets to the point of unmaintainability is refactoring a justifiable
purpose in and of itself.
That's waiting a little too long, in my opinion.
Post by John Forkosh
Post by James R. Kuyper
Post by John Forkosh
... Better to just put a copy (current version) in each
module using it, whereby it may eventually become somewhat stale, but
Well, that would tend to explain why your modules are so ridiculously
over-sized.
Well, that's your opinion which you're entitled to.
I'd personally disagree with most of what you're saying,
at least disagree with it as hard-and-fast rules.
I didn't offer a hard-and-fast rule. However, when the cost of making a
simple change (such as declaring a bunch of functions static to avoid
name collisions) is so high that you're seriously considering looking
for a compiler feature to automate that process, I suspect that your
code has gotten too close to being unmaintainable.
Tim Rentsch
2017-07-10 19:32:38 UTC
Post by John Forkosh
Post by Malcolm McLean
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
The standard and correct thing would be declaring the colliding
functions static in each module. But big pain in the neck doing
all the editing. So I'm looking for some tricky-and-easy solution.
There doesn't seem to be any cc -switch to the effect of
cc -treat-all-functions-as-static-within-their-own-modules, or
cc -upon-name-collision-prefer-function-in-same-module-as-caller, or
cc -anything-like-the-above
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
It should take only minutes to add the keyword "static" to all your non-
exported functions.
If "only minutes", I wouldn't have even bothered writing the post.:)
~10 hours and counting (maybe ~half done). 156 functions in 18K lines
of code. And big problem isn't adding static (actually, my #define'd
symbol FUNCSCOPE), but commenting out declarations within calling
functions, which typically explicitly declare each called function, e.g.,
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
/*int delete_raster();*/ /*in case rasterization fails*/
It sounds like you are working much harder than you have to. To
give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
of each .c file:

static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();

Compile, then fix up the return types (this may mean moving
around some type declarations, or adding forward declarations
for struct tags). Parameter types are not needed.

After things are compiling and linking, then you can do
further fixups incrementally, as needed.

That all make sense?
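
A small sketch of that recipe, under stated assumptions: the struct layout and function bodies are invented for illustration, and the declarations carry parameter types (which the recipe says are optional) so that the file also compiles under C23, where an empty parameter list means "no parameters".

```c
#include <assert.h>
#include <stdlib.h>

/* Added at the top of the file: forward-declare the struct tag so the
 * return type can be named, then give each function internal linkage. */
struct raster;
static struct raster *new_raster(int width, int height);
static int delete_raster(struct raster *rp);

/* ... the rest of the file stays as it was ... */

struct raster { int width, height; unsigned char *pixels; };

/* No "static" on the definition: the earlier declaration is visible,
 * so this definition has internal linkage as well. */
struct raster *new_raster(int width, int height)
{
    struct raster *rp;
    assert(width > 0 && height > 0);
    rp = malloc(sizeof *rp);
    if (rp == NULL)
        return NULL;
    rp->width = width;
    rp->height = height;
    rp->pixels = calloc((size_t)width * (size_t)height, 1);
    if (rp->pixels == NULL) {
        free(rp);
        return NULL;
    }
    return rp;
}

int delete_raster(struct raster *rp)
{
    if (rp != NULL) {
        free(rp->pixels);
        free(rp);
    }
    return 1;
}
```

With the same block of declarations added to the other file (using that file's own struct), both translation units can define new_raster() and delete_raster() without colliding at link time.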
Kenny McCormack
2017-07-10 20:15:16 UTC
In article <***@x-alumni2.alumni.caltech.edu>,
Tim Rentsch <***@alumni.caltech.edu> wrote:
...
Post by Tim Rentsch
It sounds like you are working much harder than you have to. To
give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
I see what you're doing here - educating OP to the effect that the
following sequence:

static int foo();
...
int foo(int bar) { return bar; }

is equivalent to:

static int foo(int bar) { return bar; }

But I wonder if it is actually going to be any easier in practice.
I think the amount of editing is going to be about the same.

OP was asking whether there was any "systematic" way to solve the problem,
such as a magical compiler switch. Alas, it seems the answer to that
question is "No".

Alas, it seems source editing is inevitable. How one does that source
editing is basically irrelevant to the original question (which was "Is
there a way to do it that doesn't involve editing?" - to which, as noted,
the answer is "No.").
--
Kenny, I'll ask you to stop using quotes of mine as taglines.

- Rick C Hodgin -
bartc
2017-07-10 20:59:21 UTC
Post by Kenny McCormack
...
Post by Tim Rentsch
It sounds like you are working much harder than you have to. To
give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
I see what you're doing here - educating OP to the effect that the
static int foo();
...
int foo(int bar) { return bar; }
static int foo(int bar) { return bar; }
Two of my compilers say 'inconsistent linkage for foo'.

And they have a point, since the second one has external linkage and the
first has static, even if the C standard has some arcane rules about
what happens when different linkages are used across multiple declarations.
--
bartc
j***@verizon.net
2017-07-10 21:30:02 UTC
...
Post by bartc
Post by Kenny McCormack
I see what you're doing here - educating OP to the effect that the
static int foo();
...
int foo(int bar) { return bar; }
static int foo(int bar) { return bar; }
Two of my compilers say 'inconsistent linkage for foo'.
They're wrong. Both declarations declare foo to have internal linkage.
Post by bartc
And they have a point, since the second one has external linkage and the
first has static, even if the C standard has some arcane rules about
what happens when different linkages are used across multiple declarations.
"If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern. ..." (6.2.2p5). "For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible, if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration. ..." (6.2.2p4)

Therefore, while it's true that there's a somewhat esoteric rule at play here, that rule does not take the form of identifying the linkage as internal for the first declaration, and external for the second declaration, and then resolving the discrepancy in favor of the earlier declaration. That rule says that, from the very beginning, the second declaration declares the linkage to be the same as previously specified.
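
To make the rule concrete, here is a minimal sketch using the foo example under discussion (the body is only a placeholder). The first declaration gives foo internal linkage. The later definition has no storage-class specifier, so it is treated as if it were declared extern, and it therefore takes the linkage of the prior declaration that is visible at that point.

```c
/* Prior declaration at file scope: foo has internal linkage. */
static int foo(int bar);

/* No storage-class specifier, so this is handled as if it said extern.
   Because the static declaration above is visible here, the definition
   has the same (internal) linkage; there is no conflict to resolve. */
int foo(int bar) { return bar; }
```

Assuming a toolchain that provides nm, compiling this to an object file and listing its symbols is one way to check the result: a local symbol is usually shown with a lowercase letter.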
John Forkosh
2017-07-10 22:35:55 UTC
Post by Kenny McCormack
...
Post by Tim Rentsch
It sounds like you are working much harder than you have to. To
give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
I see what you're doing here - educating OP to the effect that the
static int foo();
...
int foo(int bar) { return bar; }
static int foo(int bar) { return bar; }
My big problem was actually all the declarations of called functions
from calling functions, e.g.,
int foocaller() { /*okay _before_ static keyword added below*/
int foo(); whatever; }
...
static int foo(int bar) { return bar; } /*_after_ adding static*/
So now the compiler has a hissy fit. Moreover, I can't fix it
by writing "static int foo();" inside foocaller() (which creates a
"storage class" hissy fit). And since foocaller() precedes foo()
in the module, the "static int foo();" is needed near the top.
Post by Kenny McCormack
But I wonder if it is actually going to be any easier in practice.
I think the amount of editing is going to be about the same.
OP was asking whether there was any "systematic" way to solve the problem,
such as a magical compiler switch. Alas, it seems the answer to that
question is "No".
Alas, it seems source editing is inevitable. How one does that source
editing is basically irrelevant to the original question (which was "Is
there a way to do it that doesn't involve editing?" - to which, as noted,
the answer is "No.").
Yeah, it seems like such a compiler -switch would be pretty easy for
the gcc maintainers to implement. And it might occasionally be pretty
handy, e.g., if you're trying to use two library modules from entirely
different places, and they haven't explicitly declared their internal
functions static, and accidentally have a function namespace
collision. (That's what happened to me, but with both modules from
the same place.)
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
David Brown
2017-07-11 20:50:02 UTC
Post by John Forkosh
Post by Kenny McCormack
...
Post by Tim Rentsch
It sounds like you are working much harder than you have to. To
give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
I see what you're doing here - educating OP to the effect that the
static int foo();
...
int foo(int bar) { return bar; }
static int foo(int bar) { return bar; }
My big problem was actually all the declarations of called functions
from calling functions, e.g.,
int foocaller() { /*okay _before_ static keyword added below*/
int foo(); whatever; }
...
static int foo(int bar) { return bar; } /*_after_ adding static*/
So now the compiler has a hissy fit. Moreover, I can't fix it
by writing "static int foo();" inside foocaller() (which creates a
"storage class" hissy fit). And since foocaller() precedes foo()
in the module, the "static int foo();" is needed near the top.
Declaring functions inside another function is almost never a good idea.
Why are you doing it? If you want to have "foocaller" earlier in the
file than "foo", then simply put a forward declaration (with "static") of
"foo" /once/ at file scope earlier than "foocaller". Common practice is
to put these forward declarations clumped together near the head of the
file, but it is not necessary.
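
A minimal sketch of that layout, reusing the foo and foocaller names from the example above (the argument value 42 is just an illustration). As far as I understand the rules, a block-scope "int foo();" with no earlier declaration in view gives foo external linkage, which then clashes with the later static definition, and a block-scope function declaration may not carry "static" at all. A single file-scope declaration avoids both problems.

```c
/* File-scope forward declaration, written once, before any caller. */
static int foo(int bar);

/* foocaller precedes foo in the file, yet it needs no declaration of
   foo inside its own body. */
int foocaller(void)
{
    return foo(42);
}

/* The definition can stay where it was, further down the file. */
static int foo(int bar) { return bar; }
```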
Post by John Forkosh
Post by Kenny McCormack
But I wonder if it is actually going to be any easier in practice.
I think the amount of editing is going to be about the same.
OP was asking whether there was any "systematic" way to solve the problem,
such as a magical compiler switch. Alas, it seems the answer to that
question is "No".
Alas, it seems source editing is inevitable. How one does that source
editing is basically irrelevant to the original question (which was "Is
there a way to do it that doesn't involve editing?" - to which, as noted,
the answer is "No.").
Yeah, it seems like such a compiler -switch would be pretty easy for
the gcc maintainers to implement. And it might occasionally be pretty
handy, e.g., if you're trying to use two library modules from entirely
different places, and they haven't explicitly declared their internal
functions static, and accidentally have a function namespace
collision. (That's what happened to me, but with both modules from
the same place.)
John Forkosh
2017-07-11 21:53:01 UTC
Post by David Brown
Post by John Forkosh
Post by Kenny McCormack
...
Post by Tim Rentsch
It sounds like you are working much harder than you have to. To
give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
I see what you're doing here - educating OP to the effect that the
static int foo();
...
int foo(int bar) { return bar; }
static int foo(int bar) { return bar; }
My big problem was actually all the declarations of called functions
from calling functions, e.g.,
int foocaller() { /*okay _before_ static keyword added below*/
int foo(); whatever; }
...
static int foo(int bar) { return bar; } /*_after_ adding static*/
So now the compiler has a hissy fit. Moreover, I can't fix it
by writing "static int foo();" inside foocaller() (which creates a
"storage class" hissy fit). And since foocaller() precedes foo()
in the module, the "static int foo();" is needed near the top.
Declaring functions inside another function is almost never a good idea.
Why are you doing it?
As mentioned earlier,
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example. It keeps me apprised of dependencies at a glance,
and the generic comments in the example are typically replaced by
more meaningful context-specific remarks. I'm a big fan of
internal comment documentation (having done my fair share of
assembly language programming), though never could get to like
Knuth's literate programming.
So it's just a personal preference. And I thought it was just personal,
but now realize it has some tangible bad side-effects.
Post by David Brown
If you want to have "foocaller" earlier in the
file than "foo", then simply put a forward declaration (with "static") of
"foo" /once/ at file scope earlier than "foocaller". Common practice is
to put these forward declarations clumped together near the head of the
file, but it is not necessary.
Post by John Forkosh
Post by Kenny McCormack
But I wonder if it is actually going to be any easier in practice.
I think the amount of editing is going to be about the same.
OP was asking whether there was any "systematic" way to solve the problem,
such as a magical compiler switch. Alas, it seems the answer to that
question is "No".
Alas, it seems source editing is inevitable. How one does that source
editing is basically irrelevant to the original question (which was "Is
there a way to do it that doesn't involve editing?" - to which, as noted,
the answer is "No.").
Yeah, it seems like such a compiler -switch would be pretty easy for
the gcc maintainers to implement. And it might occasionally be pretty
handy, e.g., if you're trying to use two library modules from entirely
different places, and they haven't explicitly declared their internal
functions static, and accidentally have a function namespace
collision. (That's what happened to me, but with both modules from
the same place.)
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
Keith Thompson
2017-07-11 21:57:59 UTC
John Forkosh <***@panix.com> writes:
[...]
Post by John Forkosh
As mentioned earlier,
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example. It keeps me apprised of dependencies at a glance,
and the generic comments in the example are typically replaced by
more meaningful context-specific remarks. I'm a big fan of
internal comment documentation (having done my fair share of
assembly language programming), though never could get to like
Knuth's literate programming.
So it's just a personal preference. And I thought it was just personal,
but now realize it has some tangible bad side-effects.
[...]

Why do you use old-style function declarations rather than prototypes?

With a prototype, the compiler will complain if you call a function with
the wrong number or types of arguments. With an old-style declaration
(with empty parentheses), it will just assume the call is correct.
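
As an illustration, here is a small sketch with a made-up function (scale and scale_demo are hypothetical names, not from your code). With the prototype in view, the compiler has the parameter types to check each call against.

```c
/* Prototype: the parameter types are part of the declaration. */
static int scale(int value, int factor);

static int scale(int value, int factor)
{
    return value * factor;
}

/* With the prototype visible, a call such as scale(3) or scale("3", 2)
   must be diagnosed at compile time. With an old-style declaration
   "static int scale();" (before C23) the same calls would be accepted
   and would misbehave at run time instead. */
static int scale_demo(void)
{
    return scale(3, 2);
}
```

Two side notes: gcc has a -Wstrict-prototypes option that warns about declarations without parameter types, and in C23 an empty parameter list means the function takes no arguments, so the old style no longer behaves the way it used to.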
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
John Forkosh
2017-07-11 23:32:58 UTC
Post by Keith Thompson
[...]
Post by John Forkosh
As mentioned earlier,
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example. It keeps me apprised of dependencies at a glance,
and the generic comments in the example are typically replaced by
more meaningful context-specific remarks. I'm a big fan of
internal comment documentation (having done my fair share of
assembly language programming), though never could get to like
Knuth's literate programming.
So it's just a personal preference. And I thought it was just personal,
but now realize it has some tangible bad side-effects.
[...]
Why do you use old-style function declarations rather than prototypes?
With a prototype, the compiler will complain if you call a function with
the wrong number or types of arguments. With an old-style declaration
(with empty parentheses), it will just assume the call is correct.
It's exactly as described above, i.e., something like
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
is more of a comment to myself (except for the necessary *rp part)
than a declaration, per se. It reminds me where this rp's coming from,
and where it's going to, and what it's doing there in the first place.
A full prototype would take up too much room on the line, leaving too
little room for the more important comment. Like I said, the declaration's
not there for programming purposes, but for comment purposes. But it
turned out to have some negative programming side-effects that I hadn't
realized. (As for the compiler catching bad args, I'll find that out
quick enough all by myself:)
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
Keith Thompson
2017-07-11 23:55:16 UTC
Post by John Forkosh
Post by Keith Thompson
[...]
Post by John Forkosh
As mentioned earlier,
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example.
[...]
Why do you use old-style function declarations rather than prototypes?
With a prototype, the compiler will complain if you call a function with
the wrong number or types of arguments. With an old-style declaration
(with empty parentheses), it will just assume the call is correct.
It's exactly as described above, i.e., something like
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
is more of a comment to myself (except for the necessary *rp part)
than a declaration, per se. It reminds me where this rp's coming from,
and where it's going to, and what it's doing there in the first place.
A full prototype would take up too much room on the line, leaving too
little room for the more important comment. Like I said, the declaration's
not there for programming purposes, but for comment purposes. But it
turned out to have some negative programming side-effects that I hadn't
realized. (As for the compiler catching bad args, I'll find that out
quick enough all by myself:)
If they're intended as comments, then I suggest making them comments.

For me, "soon enough" means catching bad arguments at compile time
whenever possible. It's too easy to miss errors if you go out of your
way to tell the compiler not to catch them for you.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
John Forkosh
2017-07-12 01:26:03 UTC
Post by Keith Thompson
Post by John Forkosh
Post by Keith Thompson
[...]
Post by John Forkosh
As mentioned earlier,
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example.
[...]
Why do you use old-style function declarations rather than prototypes?
With a prototype, the compiler will complain if you call a function with
the wrong number or types of arguments. With an old-style declaration
(with empty parentheses), it will just assume the call is correct.
It's exactly as described above, i.e., something like
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
is more of a comment to myself (except for the necessary *rp part)
than a declaration, per se. It reminds me where this rp's coming from,
and where it's going to, and what it's doing there in the first place.
A full prototype would take up too much room on the line, leaving too
little room for the more important comment. Like I said, the declaration's
not there for programming purposes, but for comment purposes. But it
turned out to have some negative programming side-effects that I hadn't
realized. (As for the compiler catching bad args, I'll find that out
quick enough all by myself:)
If they're intended as comments, then I suggest making them comments.
Yeah, as already described, the fix for
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
now reads
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
(and the new header prototypes address your other remark)
Post by Keith Thompson
For me, "soon enough" means catching bad arguments at compile time
whenever possible. It's too easy to miss errors if you go out of your
way to tell the compiler not to catch them for you.
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
David Brown
2017-07-12 22:07:08 UTC
Post by John Forkosh
Post by Keith Thompson
Post by John Forkosh
Post by Keith Thompson
[...]
Post by John Forkosh
As mentioned earlier,
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example.
[...]
Why do you use old-style function declarations rather than prototypes?
With a prototype, the compiler will complain if you call a function with
the wrong number or types of arguments. With an old-style declaration
(with empty parentheses), it will just assume the call is correct.
It's exactly as described above, i.e., something like
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
is more of a comment to myself (except for the necessary *rp part)
than a declaration, per se. It reminds me where this rp's coming from,
and where it's going to, and what it's doing there in the first place.
A full prototype would take up too much room on the line, leaving too
little room for the more important comment. Like I said, the declaration's
not there for programming purposes, but for comment purposes. But it
turned out to have some negative programming side-effects that I hadn't
realized. (As for the compiler catching bad args, I'll find that out
quick enough all by myself:)
If they're intended as comments, then I suggest making them comments.
Yeah, as already described, the fix for
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
now reads
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
(and the new header prototypes address your other remark)
Well, it is /your/ code - but to me, that is just hideous and confusing.
If you only intend your code to be read by yourself, then comments to
yourself in your own way are fine - if you intend it to be read by
others, then I would suggest you re-think things a little. (Unless you
think that I have an unusual view here, which is possible.)
John Forkosh
2017-07-12 23:08:37 UTC
Post by David Brown
Post by John Forkosh
Post by Keith Thompson
Post by John Forkosh
Post by Keith Thompson
[...]
Post by John Forkosh
As mentioned earlier,
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example.
[...]
Why do you use old-style function declarations rather than prototypes?
With a prototype, the compiler will complain if you call a function with
the wrong number or types of arguments. With an old-style declaration
(with empty parentheses), it will just assume the call is correct.
It's exactly as described above, i.e., something like
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
is more of a comment to myself (except for the necessary *rp part)
than a declaration, per se. It reminds me where this rp's coming from,
and where it's going to, and what it's doing there in the first place.
A full prototype would take up too much room on the line, leaving too
little room for the more important comment. Like I said, the declaration's
not there for programming purposes, but for comment purposes. But it
turned out to have some negative programming side-effects that I hadn't
realized. (As for the compiler catching bad args, I'll find that out
quick enough all by myself:)
If they're intended as comments, then I suggest making them comments.
Yeah, as already described, the fix for
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
now reads
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
(and the new header prototypes address your other remark)
Well, it is /your/ code - but to me, that is just hideous and confusing.
If you only intend your code to be read by yourself, then comments to
yourself in your own way are fine - if you intend it to be read by
others, then I would suggest you re-think things a little. (Unless you
think that I have an unusual view here, which is possible.)
"...read by yourself": well, yeah, I write the comments that
I'd personally like to see, without taking any survey about it.
And I don't think you "have an unusual view", per se, just your view,
which is perfectly fine by me. But I might suggest that characterizing
an alternative view as "hideous and confusing" (even when modified
by "but to me") may be assigning too great a weight to one particular
view.
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
Ian Collins
2017-07-13 04:22:45 UTC
Post by John Forkosh
Post by David Brown
Post by John Forkosh
Post by Keith Thompson
If they're intended as comments, then I suggest making them comments.
Yeah, as already described, the fix for
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
now reads
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
(and the new header prototypes address your other remark)
Well, it is /your/ code - but to me, that is just hideous and confusing.
If you only intend your code to be read by yourself, then comments to
yourself in your own way are fine - if you intend it to be read by
others, then I would suggest you re-think things a little. (Unless you
think that I have an unusual view here, which is possible.)
"...read by yourself": well, yeah, I write the comments that
I'd personally like to see, without taking any survey about it.
And I don't think you "have an unusual view", per se, just your view,
which is perfectly fine by me. But I might suggest that characterizing
an alternative view as "hideous and confusing" (even when modified
by "but to me") may be assigning too great a weight to one particular
view.
I think most programmers would find them "hideous and confusing"!
--
Ian
David Brown
2017-07-13 09:14:46 UTC
Post by John Forkosh
Post by David Brown
Post by John Forkosh
Post by Keith Thompson
Post by John Forkosh
Post by Keith Thompson
[...]
Post by John Forkosh
As mentioned earlier,
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example.
[...]
Why do you use old-style function declarations rather than prototypes?
With a prototype, the compiler will complain if you call a function with
the wrong number or types of arguments. With an old-style declaration
(with empty parentheses), it will just assume the call is correct.
It's exactly as described above, i.e., something like
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
is more of a comment to myself (except for the necessary *rp part)
than a declaration, per se. It reminds me where this rp's coming from,
and where it's going to, and what it's doing there in the first place.
A full prototype would take up too much room on the line, leaving too
little room for the more important comment. Like I said, the declaration's
not there for programming purposes, but for comment purposes. But it
turned out to have some negative programming side-effects that I hadn't
realized. (As for the compiler catching bad args, I'll find that out
quick enough all by myself:)
If they're intended as comments, then I suggest making them comments.
Yeah, as already described, the fix for
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
now reads
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
(and the new header prototypes address your other remark)
Well, it is /your/ code - but to me, that is just hideous and confusing.
If you only intend your code to be read by yourself, then comments to
yourself in your own way are fine - if you intend it to be read by
others, then I would suggest you re-think things a little. (Unless you
think that I have an unusual view here, which is possible.)
"...read by yourself": well, yeah, I write the comments that
I'd personally like to see, without taking any survey about it.
And I don't think you "have an unusual view", per se, just your view,
which is perfectly fine by me. But I might suggest that characterizing
an alternative view as "hideous and confusing" (even when modified
by "but to me") may be assigning too great a weight to one particular
view.
I am being clear and honest - your comment style, judging solely on the
very small snippets you have shown, is hideous and confusing to me. It
may be that I have an unusual viewpoint here - but I can only tell you
/my/ viewpoint, I can't tell you the views of others. I can certainly
say that if someone at my office showed me that code and asked for
comment or for help, he'd be sent back to his desk to re-write it (after
being told why, of course).

But don't take my viewpoint for more than it is. There are people in
this newsgroup whose long experience and expert knowledge I profoundly
respect, and yet I would reject most of the code they have shown here
over the years as being unusable for the kind of work I do.
Tim Rentsch
2017-07-10 23:20:24 UTC
Post by Kenny McCormack
...
Post by Tim Rentsch
It sounds like you are working much harder than you have to. To
give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
I see what you're doing here - educating OP to the effect that the
static int foo();
...
int foo(int bar) { return bar; }
static int foo(int bar) { return bar; }
But I wonder if it is actually going to be any easier in practice.
I think the amount of editing is going to be about the same.
It might be, if the redundant declarations were written by hand.
After doing that a dozen functions or so, I think most people
would figure out that the initial set of redundant declarations
can be produced automatically (eg, with a short shell pipeline
including an awk script) based on the linker output of multiply
defined symbols. That reduces the amount of work needed greatly.
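
The pipeline itself depends on the linker's message format, so I won't guess at it here. Purely as a sketch of the generating step, written in C to match the rest of the thread, and assuming the colliding names have already been collected into an array (format_static_decl and emit_static_decls are hypothetical helpers):

```c
#include <stdio.h>

/* Hypothetical helper: writes "static int NAME();" into out.
   Returns 0 on success, -1 if name is empty or out is too small. */
static int format_static_decl(const char *name, char *out, size_t outsize)
{
    int n;
    if (name == NULL || name[0] == '\0')
        return -1;
    n = snprintf(out, outsize, "static int %s();", name);
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}

/* Writes one declaration per usable name to out and returns how many
   were written. The names are assumed to come from the linker's list
   of multiply defined symbols. */
static int emit_static_decls(const char *const names[], size_t count, FILE *out)
{
    char decl[300];
    size_t i;
    int written = 0;
    for (i = 0; i < count; i++) {
        if (format_static_decl(names[i], decl, sizeof decl) == 0) {
            fprintf(out, "%s\n", decl);
            written++;
        }
    }
    return written;
}
```

The int return type is only the initial guess; the return types still have to be fixed up by hand after the first compile.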
Post by Kenny McCormack
OP was asking whether there was any "systematic" way to solve the problem,
such as a magical compiler switch. Alas, it seems the answer to that
question is "No".
Alas, it seems source editing is inevitable. How one does that source
editing is basically irrelevant to the original question (which was "Is
there a way to do it that doesn't involve editing?" - to which, as noted,
the answer is "No.").
First off, please notice I wasn't responding to the
original question. I was only trying to help lessen his editing
burden.

In the second place, since my comments have nothing to do
with the question you think is so important, apparently you
have decided to jump to a faulty conclusion based on no
new evidence. What help do you think that provides?
Keith Thompson
2017-07-10 20:29:23 UTC
Tim Rentsch <***@alumni.caltech.edu> writes:
[...]
Post by Tim Rentsch
It sounds like you are working much harder than you have to. To
give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
Compile, then fix up the return types (this may mean moving
around some type declarations, or adding forward declarations
for struct tags). Parameter types are not needed.
I would include the return and parameter types from the beginning, using
prototypes rather than old-style declarations. I'd copy-and-paste the
function declarations, with whatever minor fixup is needed for
declarations that are part of definitions (dropping "{", adding ";").
Post by Tim Rentsch
After things are compiling and linking, then you can do
further fixups incrementally, as needed.
That all make sense?
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
John Forkosh
2017-07-10 22:43:11 UTC
Post by Keith Thompson
[...]
Post by Tim Rentsch
It sounds like you are working much harder than you have to. To
give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
Compile, then fix up the return types (this may mean moving
around some type declarations, or adding forward declarations
for struct tags). Parameter types are not needed.
I would include the return and parameter types from the beginning, using
prototypes rather than old-style declarations. I'd copy-and-paste the
function declarations, with whatever minor fixup is needed for
declarations that are part of definitions (dropping "{", adding ";").
Yeah, I essentially did copy-and-paste. I first did
#define FUNCSCOPE static
then added FUNCSCOPE preceding all functions. Then a
grep FUNCSCOPE module.c > prototypes.txt
got a pretty good approximation, which I manually
tweaked with emacs.
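
Roughly like this, shown on the toy foo example from elsewhere in the thread rather than on the real module:

```c
/* One switch for the whole module. Defining FUNCSCOPE as empty would
   give the functions external linkage again. */
#define FUNCSCOPE static

/* Prototype, of the kind the grep output was tidied into. */
FUNCSCOPE int foo(int bar);

/* Every function definition gets the same prefix. */
FUNCSCOPE int foo(int bar)
{
    return bar;
}
```

Since grep only returns the lines that contain FUNCSCOPE, any definition whose parameter list runs over several lines comes out incomplete, which is part of why the result needed tweaking by hand.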
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
Tim Rentsch
2017-07-10 23:01:47 UTC
Post by Keith Thompson
[...]
Post by Tim Rentsch
It sounds like you are working much harder than you have to. To
give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
Compile, then fix up the return types (this may mean moving
around some type declarations, or adding forward declarations
for struct tags). Parameter types are not needed.
I would include the return and parameter types from the beginning, using
prototypes rather than old-style declarations. I'd copy-and-paste the
function declarations, with whatever minor fixup is needed for
declarations that are part of definitions (dropping "{", adding ";").
IME (and sadly I have a lot of it in this area) it's better to
get something compiling and linking with as little editing as
possible, and then go back afterwards and fix things up as
needed. This isn't some sort of theoretical argument, but one
born of practical experience and many, many, many hours doing
the actual editing. In real programs, as opposed to toy
examples, the "fixups" are never as minor as one hopes they would
be, because of C's compilation model. Furthermore, in this case
there is no incentive to provide a full prototype in the added
set of function declarations. The individual translation units
work separately as they are without problem (or at least it
sounds that way based on the OP), so giving the /redundant/
leading declarations prototypes serves no purpose. Later on
it's a good idea to go fix that, but later on we probably will
be putting the 'static'-ness in other parts of the source file,
and the redundant declarations will be moved or eliminated.
There's no reason to make extra work now just to throw it away
later.
John Forkosh
2017-07-10 22:10:12 UTC
Post by Tim Rentsch
Post by John Forkosh
Post by Malcolm McLean
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
The standard and correct thing would be declaring the colliding
functions static in each module. But big pain in the neck doing
all the editing. So I'm looking for some tricky-and-easy solution.
There doesn't seem to be any cc -switch to the effect of
cc -treat-all-functions-as-static-within-their-own-modules, or
cc -upon-name-collision-prefer-function-in-same-module-as-caller, or
cc -anything-like-the-above
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
It should take only minutes to add the keyword "static" to all your non-
exported functions.
If "only minutes", I wouldn't have even bothered writing the post.:)
~10 hours and counting (maybe ~half done). 156 functions in 18K lines
of code. And big problem isn't adding static (actually, my #define'd
symbol FUNCSCOPE), but commenting out declarations within calling
functions, which typically explicitly declare each called function, e.g.,
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
/*int delete_raster();*/ /*in case rasterization fails*/
It sounds like you are working much harder than you have to.
Feels that way, too:)
Post by Tim Rentsch
To give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
TU = translation unit (how long has it been since I've heard
that terminology???)
Post by Tim Rentsch
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
And I like your animals. Is that an intentional double entendre with
language zoo? (given TU above, I'm guessing .75 probability "yes")
Post by Tim Rentsch
Compile, then fix up the return types (this may mean moving
around some type declarations, or adding forward declarations
for struct tags). Parameter types are not needed.
Yeah, "fix up the return types" would've been yet another mess.
So rather than "at the _start_" (your emphasis above), I placed
static declarations after all typedef's, struct definitions, etc,
whereby stuff like static raster *aardvark(); would be understood.
Post by Tim Rentsch
After things are compiling and linking, then you can do
further fixups incrementally, as needed.
Well, all done (first cut, anyway) now, with modules compiling
(-pedantically) and running as before (modulo moderate re-testing)
both individually and together.
Post by Tim Rentsch
That all make sense?
Yes, but can it make sense without being entirely liked?
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example above. It keeps me apprised of dependencies at a glance,
and the generic comments in the example are typically replaced by
more meaningful context-specific remarks. I'm a _big_ fan of
internal comment documentation (having done my fair share of
assembly language programming), though never could get to like
Knuth's literate programming.
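
For what it's worth, here is a minimal sketch suggesting that the in-function declarations may not need commenting out at all. As I read the linkage rules in the C standard (section 6.2.2), a block-scope function declaration with no storage class behaves like an extern declaration, and so takes on the internal linkage of a static declaration that is already visible at file scope. The raster type, its fields, and the caller raster_area() are made up for illustration. Full prototypes are used so the sketch also compiles as C23; the empty-parentheses form quoted in the thread should behave the same way under earlier standards.

```c
#include <stdlib.h>

typedef struct { int width, height; } raster;   /* illustrative type only */

/* File-scope static declarations, placed after the typedefs. */
static raster *new_raster(int width, int height);
static int delete_raster(raster *rp);

/* A caller in the style quoted above (hypothetical function). The
   block-scope declarations carry no storage class, so they refer to
   the internal-linkage functions declared at file scope. */
int raster_area(int width, int height)
{
    raster *new_raster(int, int),   /* allocates the image raster */
           *rp = NULL;              /* raster being measured */
    int delete_raster(raster *);    /* cleanup once the area is known */
    int area = -1;                  /* -1 signals allocation failure */

    rp = new_raster(width, height);
    if (rp != NULL) {
        area = rp->width * rp->height;
        delete_raster(rp);
    }
    return area;
}

/* Definitions left without the static keyword; the earlier static
   declarations already give them internal linkage. */
raster *new_raster(int width, int height)
{
    raster *rp = malloc(sizeof *rp);
    if (rp != NULL) {
        rp->width = width;
        rp->height = height;
    }
    return rp;
}

int delete_raster(raster *rp)
{
    free(rp);
    return 1;
}
```

Note that the block-scope declarations cannot themselves be marked static; C only permits extern (or no storage class) on a function declared inside a block, which is why the file-scope declaration has to come first.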
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
Tim Rentsch
2017-07-11 01:16:11 UTC
Post by John Forkosh
Post by Tim Rentsch
Post by John Forkosh
Post by Malcolm McLean
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
The standard and correct thing would be declaring the colliding
functions static in each module. But big pain in the neck doing
all the editing. So I'm looking for some tricky-and-easy solution.
There doesn't seem to be any cc -switch to the effect of
cc -treat-all-functions-as-static-within-their-own-modules, or
cc -upon-name-collision-prefer-function-in-same-module-as-caller, or
cc -anything-like-the-above
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
It should take only minutes to add the keyword "static" to all your non-
exported functions.
If "only minutes", I wouldn't have even bothered writing the post.:)
~10 hours and counting (maybe ~half done). 156 functions in 18K lines
of code. And big problem isn't adding static (actually, my #define'd
symbol FUNCSCOPE), but commenting out declarations within calling
functions, which typically explicitly declare each called function, e.g.,
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
/*int delete_raster();*/ /*in case rasterization fails*/
It sounds like you are working much harder than you have to.
Feels that way, too:)
Post by Tim Rentsch
To give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
TU = translation unit (how long has it been since I've heard
that terminology???)
Yes, I'm just used to that, especially in the newsgroup here.
Post by John Forkosh
Post by Tim Rentsch
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
And I like your animals. Is that an intentional double entendre with
language zoo? (given TU above, I'm guessing .75 probability "yes")
I can neither confirm nor deny an intentional double entendre.
Post by John Forkosh
Post by Tim Rentsch
Compile, then fix up the return types (this may mean moving
around some type declarations, or adding forward declarations
for struct tags). Parameter types are not needed.
Yeah, "fix up the return types" would've been yet another mess.
So rather than "at the _start_" (your emphasis above), I placed
static declarations after all typedef's, struct definitions, etc,
whereby stuff like static raster *aardvark(); would be understood.
I've seen it go both ways - cases where it was easier to
put the function declarations up front, and add extra
declarations for the struct tags, or cases where it was
easier to put the function declarations after the typedefs,
provided of course the source is suitably structured so
that there is such a place. Usually I assume people are
smart enough to make these adjustments without my having
to explain them (which is why I didn't).
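
A minimal sketch of the first layout described above (declarations up front, plus a forward declaration of the struct tag), under assumed names: the tag raster_struct and the fields are hypothetical, not taken from the actual sources. The second layout would simply move the two static declarations below the typedef.

```c
#include <stdlib.h>

/* Layout A: static declarations at the very start of the file.
   The struct tag is only forward-declared here, which is enough
   for a function that returns or accepts a pointer to it. */
struct raster_struct;                      /* hypothetical tag name */
static struct raster_struct *new_raster(int width, int height);
static int delete_raster(struct raster_struct *rp);

/* The file's existing type definitions follow, unchanged.
   Layout B would put the two static declarations after this
   typedef instead, spelled "static raster *new_raster(...)". */
typedef struct raster_struct {
    int width, height;                     /* illustrative fields only */
    unsigned char *pixmap;
} raster;

/* Existing definitions, still without the static keyword. Because a
   static declaration is already visible, they get internal linkage. */
raster *new_raster(int width, int height)
{
    raster *rp = malloc(sizeof *rp);
    if (rp == NULL)
        return NULL;
    rp->width = width;
    rp->height = height;
    rp->pixmap = calloc((size_t)width * (size_t)height, 1);
    if (rp->pixmap == NULL) {
        free(rp);
        return NULL;
    }
    return rp;
}

int delete_raster(raster *rp)
{
    if (rp != NULL) {
        free(rp->pixmap);
        free(rp);
    }
    return 1;
}
```

Either layout leaves the function definitions themselves untouched, which is the point of the approach: the only new text is the block of static declarations.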
Post by John Forkosh
Post by Tim Rentsch
After things are compiling and linking, then you can do
further fixups incrementally, as needed.
Well, all done (first cut, anyway) now, with modules compiling
(-pedantically) and running as before (modulo moderate re-testing)
both individually and together.
Good, glad to hear it.
Post by John Forkosh
Post by Tim Rentsch
That all make sense?
Yes, but can it make sense without being entirely liked?
Oh yeah. Been there, done that, didn't enjoy doing it but
happy it got done.
Post by John Forkosh
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example above. It keeps me apprised of dependencies at a glance,
and the generic comments in the example are typically replaced by
more meaningful context-specific remarks.
I think I would go crazy if I tried to do that. Let me hazard a
guess - do your functions tend to be on the long side? Mine
are pretty much never very long. Adding unnecessary declarations
like this would make a huge difference. And it sounds like a
maintenance nightmare (ie, to me - I acknowledge other people
may have different reactions).
Post by John Forkosh
I'm a _big_ fan of
internal comment documentation (having done my fair share of
assembly language programming), though never could get to like
Knuth's literate programming.
Probably I'll be burned at the stake for saying this, but IMO
the whole "literate programming" idea from Knuth is a big step
in the wrong direction. I have great respect for Don Knuth,
who is an amazingly smart guy, but this idea isn't one of his
better ones.

As for internal comment documentation - there was a time in the
distant past when I had done more programming in assembly language
(or languages plural) than all other languages combined. And of
course in those days it was common to write comments more or less
on every line. But I grew out of that phase, and at pretty much
the same time grew out of the habit of writing "internal comments"
as they might be called. In most cases I find them more hindrance
than help. For my money I would much rather see each .c file
have two or three well-written prose paragraphs saying what the
module is for and how it hangs together. In nearly all cases I
can figure out the low-level stuff without any help (especially
if the function composition style is good), but figuring out the
high-level stuff is not nearly as accessible.

Anyway FWIW those are my reactions. I hope my other comments
helped (and I'm glad you enjoyed my animal names :).
John Forkosh
2017-07-11 04:12:33 UTC
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
Post by John Forkosh
Post by Malcolm McLean
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
The standard and correct thing would be declaring the colliding
functions static in each module. But big pain in the neck doing
all the editing. So I'm looking for some tricky-and-easy solution.
There doesn't seem to be any cc -switch to the effect of
cc -treat-all-functions-as-static-within-their-own-modules, or
cc -upon-name-collision-prefer-function-in-same-module-as-caller, or
cc -anything-like-the-above
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
It should take only minutes to add the keyword "static" to all your non-
exported functions.
If "only minutes", I wouldn't have even bothered writing the post.:)
~10 hours and counting (maybe ~half done). 156 functions in 18K lines
of code. And big problem isn't adding static (actually, my #define'd
symbol FUNCSCOPE), but commenting out declarations within calling
functions, which typically explicitly declare each called function, e.g.,
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
/*int delete_raster();*/ /*in case rasterization fails*/
It sounds like you are working much harder than you have to.
Feels that way, too:)
Post by Tim Rentsch
To give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
TU = translation unit (how long has it been since I've heard
that terminology???)
Yes, I'm just used to that, especially in the newsgroup here.
Post by John Forkosh
Post by Tim Rentsch
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
And I like your animals. Is that an intentional double entendre with
language zoo? (given TU above, I'm guessing .75 probability "yes")
I can neither confirm nor deny an intentional double entendre.
Post by John Forkosh
Post by Tim Rentsch
Compile, then fix up the return types (this may mean moving
around some type declarations, or adding forward declarations
for struct tags). Parameter types are not needed.
Yeah, "fix up the return types" would've been yet another mess.
So rather than "at the _start_" (your emphasis above), I placed
static declarations after all typedef's, struct definitions, etc,
whereby stuff like static raster *aardvark(); would be understood.
I've seen it go both ways - cases where it was easier to
put the function declarations up front, and add extra
declarations for the struct tags, or cases where it was
easier to put the function declarations after the typedefs,
provided of course the source is suitably structured so
that there is such a place. Usually I assume people are
smart enough to make these adjustments without my having
to explain them (which is why I didn't).
Yeah, I wouldn't have mentioned it except for your original
emphasis on "_start_" which I didn't quite understand (i.e.,
understood its denotation but not your possible connotation).
So wanted to see if you'd elaborate any potential problem
I might be overlooking doing it my way. (This whole thread
was me overlooking a potential problem.:)
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
After things are compiling and linking, then you can do
further fixups incrementally, as needed.
Well, all done (first cut, anyway) now, with modules compiling
(-pedantically) and running as before (modulo moderate re-testing)
both individually and together.
Good, glad to hear it.
Post by John Forkosh
Post by Tim Rentsch
That all make sense?
Yes, but can it make sense without being entirely liked?
Oh yeah. Been there, done that, didn't enjoy doing it but
happy it got done.
Post by John Forkosh
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example above. It keeps me apprised of dependencies at a glance,
and the generic comments in the example are typically replaced by
more meaningful context-specific remarks.
I think I would go crazy if I tried to do that. Let me hazard a
guess - do your functions tend to be on the long side?
Absolutely happens, but I try to start out with a well-encapsulated
idea of the functionality to be implemented. Then #lines-of-code
just is what it is. However, many tactical problems can arise with
that nice-sounding strategy. The "bags-on-bags" problem happens
pretty much 100% of the time over a long enough period.
Just one small thing after another, each requiring a few tweaks
here and there. Never enough to justify a refactor/rewrite/etc,
but accumulating over time until you're looking at an unmaintainable
mess. But (if I do say so myself:), I've become pretty talented
at maintaining messes (both my own and other people's), so that
my messes can get exceptionally large and messy while still doing
their intended job reliably.

And I think one important dimension (among many) of a programmer's
skill is indeed the level of mess he can reliably maintain.
Because maintaining large messes inevitably goes with the territory.
And I further think you can often take a good guess about someone's
level of experience by how high a standard of perfection they apply
to the judgement of other people's code. They just haven't been
around long enough to understand how the real world works.
My very, very favorite quote (and I'm betting now one of yours)
regarding the unreal perfection they're expecting is from the
Italian painter (who knew nothing of programming -- he was referring
to women) Elio Carlotti,
Beauty is a summation of parts working together in such a way
that nothing needs to be added or taken away or altered.
And that would indeed be a beautiful function (or woman),
but is rarely achievable.
Post by Tim Rentsch
Mine are pretty much never very long. Adding unnecessary
declarations like this would make a huge difference.
And it sounds like a maintenance nightmare (ie, to me -
I acknowledge other people may have different reactions).
Post by John Forkosh
I'm a _big_ fan of
internal comment documentation (having done my fair share of
assembly language programming), though never could get to like
Knuth's literate programming.
Probably I'll be burned at the stake for saying this, but IMO
the whole "literate programming" idea from Knuth is a big step
in the wrong direction. I have great respect for Don Knuth,
who is an amazingly smart guy, but this idea isn't one of his
better ones.
For me, his specific idea was wrong. You shouldn't (in my opinion)
be writing internal documentation that's intended to be separated
from the code it's documenting. But I guess one can forgive Knuth's
preoccupation with TeX. Nevertheless, well-conceived and elaborate
internal documentation can (my opinion again) significantly extend
the maintainability and useful life of well-designed and -written
code.

I have no Carlotti-beautiful example of my own, but I continue
to try to strive towards that unachievable impossible dream.
So, to provide a concrete example rather than just blowing smoke,
visit http://www.forkosh.com/gifsave89.html and click the
"gifsave90 Listing" link near the top-left underneath Related Pages
(do >>not<< try to deep link to that, or a "deny from" rule will be
automatically added to .htaccess -- I had some DoS problems from
some bleepity-bleeps that I eventually defended against).
So you'd probably feel that's too much of a good (or not-so-good)
thing. Well, that would be some of Carlotti's "taken away" part,
I suppose. To wit, beauty is in the eye of the beholder.
Post by Tim Rentsch
As for internal comment documentation - there was a time in the
distant past when I had done more programming in assembly language
(or languages plural) than all other languages combined.
And of course in those days it was common to write comments more or
less on every line.
Yeah, for me it was several years of System/360 BAL, and several more
on Data General Nova/Eclipse, and a little bit on DECSystem-10.
Post by Tim Rentsch
But I grew out of that phase, and at pretty much
the same time grew out of the habit of writing "internal comments"
as they might be called. In most cases I find them more hindrance
than help. For my money I would much rather see each .c file
have two or three well-written prose paragraphs saying what the
module is for and how it hangs together.
As above (in gifsave89) I try to do both -- paragraphs of prose as
well as line-by-line. Internal documentation is the only documentation
that never (simply can't) get lost or separated. And well-written
documentation needs maintenance along with the code it's documenting.
That's way easier to do if it's staring you in the face as you're
editing the code (and you don't even have to open a separate window).
Otherwise, the documentation's way more likely to get staler and staler
over time. Likewise for readers, internal documentation is always
adjacent to the code it's documenting. No searching/alignment required.
Post by Tim Rentsch
In nearly all cases I
can figure out the low-level stuff without any help (especially
if the function composition style is good), but figuring out the
high-level stuff is not nearly as accessible.
Sure, then skip the stuff you don't need. But those comments
you don't need may be useful to less experienced programmers.
Or not. But let me put it this way -- it's pretty easy to
write a "comment stripper" program; way harder to write a program
that meaningfully comments other programs.
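
To illustrate the "pretty easy" half of that claim, here is a rough sketch of a comment stripper as a small state machine. It is an assumption-laden toy rather than a finished tool: strip_comments() is a hypothetical name, only /* ... */ comments are handled (not //), and trigraphs and line splicing are ignored. String and character literals are tracked so that a "/*" inside a literal is left alone.

```c
#include <stddef.h>

/* Copy src to dst, replacing each block comment with a single space.
   dst must have room for at least strlen(src) + 1 bytes.
   Returns the number of characters written, excluding the final NUL. */
size_t strip_comments(const char *src, char *dst)
{
    enum { CODE, COMMENT, STRING, CHARLIT } state = CODE;
    size_t n = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];

        switch (state) {
        case CODE:
            if (c == '/' && src[i + 1] == '*') {
                state = COMMENT;
                i++;                     /* skip the '*' */
                dst[n++] = ' ';          /* a comment acts as one space */
            } else {
                if (c == '"')
                    state = STRING;
                else if (c == '\'')
                    state = CHARLIT;
                dst[n++] = c;
            }
            break;
        case COMMENT:
            if (c == '*' && src[i + 1] == '/') {
                state = CODE;
                i++;                     /* skip the '/' */
            }
            break;
        case STRING:
        case CHARLIT:
            dst[n++] = c;
            if (c == '\\' && src[i + 1] != '\0') {
                dst[n++] = src[++i];     /* copy the escaped character as-is */
            } else if ((state == STRING && c == '"') ||
                       (state == CHARLIT && c == '\'')) {
                state = CODE;            /* closing quote ends the literal */
            }
            break;
        }
    }
    dst[n] = '\0';
    return n;
}
```

The reverse direction, generating meaningful comments, has no comparably mechanical formulation, which is the asymmetry being pointed out.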
Post by Tim Rentsch
Anyway FWIW those are my reactions. I hope my other comments
helped (and I'm glad you enjoyed my animal names :).
Absolutely. Thanks for your (and others') help, Tim.
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
David Kleinecke
2017-07-11 17:56:49 UTC
Post by John Forkosh
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
Post by John Forkosh
Post by Malcolm McLean
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
The standard and correct thing would be declaring the colliding
functions static in each module. But big pain in the neck doing
all the editing. So I'm looking for some tricky-and-easy solution.
There doesn't seem to be any cc -switch to the effect of
cc -treat-all-functions-as-static-within-their-own-modules, or
cc -upon-name-collision-prefer-function-in-same-module-as-caller, or
cc -anything-like-the-above
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
It should take only minutes to add the keyword "static" to all your non-
exported functions.
If "only minutes", I wouldn't have even bothered writing the post.:)
~10 hours and counting (maybe ~half done). 156 functions in 18K lines
of code. And big problem isn't adding static (actually, my #define'd
symbol FUNCSCOPE), but commenting out declarations within calling
functions, which typically explicitly declare each called function, e.g.,
raster /**new_raster(),*/ *rp=NULL; /*image raster returned to caller*/
/*int delete_raster();*/ /*in case rasterization fails*/
It sounds like you are working much harder than you have to.
Feels that way, too:)
Post by Tim Rentsch
To give the desired functions internal linkage, simply add forward
declarations for each TU's relevant function set, at the _start_
TU = translation unit (how long has it been since I've heard
that terminology???)
Yes, I'm just used to that, especially in the newsgroup here.
Post by John Forkosh
Post by Tim Rentsch
static int aardvark();
static int bobcat();
static int chimpanzee();
...
static int zebra();
And I like your animals. Is that an intentional double entendre with
language zoo? (given TU above, I'm guessing .75 probability "yes")
I can neither confirm nor deny an intentional double entendre.
Post by John Forkosh
Post by Tim Rentsch
Compile, then fix up the return types (this may mean moving
around some type declarations, or adding forward declarations
for struct tags). Parameter types are not needed.
Yeah, "fix up the return types" would've been yet another mess.
So rather than "at the _start_" (your emphasis above), I placed
static declarations after all typedef's, struct definitions, etc,
whereby stuff like static raster *aardvark(); would be understood.
I've seen it go both ways - cases where it was easier to
put the function declarations up front, and add extra
declarations for the struct tags, or cases where it was
easier to put the function declarations after the typedefs,
provided of course the source is suitably structured so
that there is such a place. Usually I assume people are
smart enough to make these adjustments without my having
to explain them (which is why I didn't).
Yeah, I wouldn't have mentioned it except for your original
emphasis on "_start_" which I didn't quite understand (i.e.,
understood its denotation but not your possible connotation).
So wanted to see if you'd elaborate any potential problem
I might be overlooking doing it my way. (This whole thread
was me overlooking a potential problem.:)
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
After things are compiling and linking, then you can do
further fixups incrementally, as needed.
Well, all done (first cut, anyway) now, with modules compiling
(-pedantically) and running as before (modulo moderate re-testing)
both individually and together.
Good, glad to hear it.
Post by John Forkosh
Post by Tim Rentsch
That all make sense?
Yes, but can it make sense without being entirely liked?
Oh yeah. Been there, done that, didn't enjoy doing it but
happy it got done.
Post by John Forkosh
I really like declaring all called functions within each
calling function, as per
raster *new_raster(), *rp=NULL; /*image raster returned to caller*/
int delete_raster(); /*in case rasterization fails*/
example above. It keeps me apprised of dependencies at a glance,
and the generic comments in the example are typically replaced by
more meaningful context-specific remarks.
I think I would go crazy if I tried to do that. Let me hazard a
guess - do your functions tend to be on the long side?
Absolutely happens, but I try to start out with a well-encapsulated
idea of the functionality to be implemented. Then #lines-of-code
just is what it is. However, many tactical problems can arise with
that nice-sounding strategy. The "bags-on-bags" problem happens
pretty much 100% of the time over a long enough period.
Just one small thing after another, each requiring a few tweaks
here and there. Never enough to justify a refactor/rewrite/etc,
but accumulating over time until you're looking at an unmaintainable
mess. But (if I do say so myself:), I've become pretty talented
at maintaining messes (both my own and other people's), so that
my messes can get exceptionally large and messy while still doing
their intended job reliably.
And I think one important dimension (among many) of a programmer's
skill is indeed the level of mess he can reliably maintain.
Because maintaining large messes inevitably goes with the territory.
And I further think you can often take a good guess about someone's
level of experience by how high a standard of perfection they apply
to the judgement of other people's code. They just haven't been
around long enough to understand how the real world works.
My very, very favorite quote (and I'm betting now one of yours)
regarding the unreal perfection they're expecting is from the
Italian painter (who knew nothing of programming -- he was referring
to women) Elio Carlotti,
Beauty is a summation of parts working together in such a way
that nothing needs to be added or taken away or altered.
And that would indeed be a beautiful function (or woman),
but is rarely achievable.
Post by Tim Rentsch
Mine are pretty much never very long. Adding unnecessary
declarations like this would make a huge difference.
And it sounds like a maintenance nightmare (ie, to me -
I acknowledge other people may have different reactions).
Post by John Forkosh
I'm a _big_ fan of
internal comment documentation (having done my fair share of
assembly language programming), though never could get to like
Knuth's literate programming.
Probably I'll be burned at the stake for saying this, but IMO
the whole "literate programming" idea from Knuth is a big step
in the wrong direction. I have great respect for Don Knuth,
who is an amazingly smart guy, but this idea isn't one of his
better ones.
For me, his specific idea was wrong. You shouldn't (in my opinion)
be writing internal documentation that's intended to be separated
from the code it's documenting. But I guess one can forgive Knuth's
preoccupation with TeX. Nevertheless, well-conceived and elaborate
internal documentation can (my opinion again) significantly extend
the maintainability and useful life of well-designed and -written
code.
I have no Carlotti-beautiful example of my own, but I continue
to try to strive towards that unachievable impossible dream.
So, to provide a concrete example rather than just blowing smoke,
visit http://www.forkosh.com/gifsave89.html and click the
"gifsave90 Listing" link near the top-left underneath Related Pages
(do >>not<< try to deep link to that, or a "deny from" rule will be
automatically added to .htaccess -- I had some DoS problems from
some bleepity-bleeps that I eventually defended against).
So you'd probably feel that's too much of a good (or not-so-good)
thing. Well, that would be some of Carlotti's "taken away" part,
I suppose. To wit, beauty is in the eye of the beholder.
Post by Tim Rentsch
As for internal comment documentation - there was a time in the
distant past when I had done more programming in assembly language
(or languages plural) than all other languages combined.
And of course in those days it was common to write comments more or
less on every line.
Yeah, for me it was several years of System/360 BAL, and several more
on Data General Nova/Eclipse, and a little bit on DECSystem-10.
Post by Tim Rentsch
But I grew out of that phase, and at pretty much
the same time grew out of the habit of writing "internal comments"
as they might be called. In most cases I find them more hindrance
than help. For my money I would much rather see each .c file
have two or three well-written prose paragraphs saying what the
module is for and how it hangs together.
As above (in gifsave89) I try to do both -- paragraphs of prose as
well as line-by-line. Internal documentation is the only documentation
that never (simply can't) get lost or separated. And well-written
documentation needs maintenance along with the code it's documenting.
That's way easier to do if it's staring you in the face as you're
editing the code (and you don't even have to open a separate window).
Otherwise, the documentation's way more likely to get staler and staler
over time. Likewise for readers, internal documentation is always
adjacent to the code it's documenting. No searching/alignment required.
Post by Tim Rentsch
In nearly all cases I
can figure out the low-level stuff without any help (especially
if the function composition style is good), but figuring out the
high-level stuff is not nearly as accessible.
Sure, then skip the stuff you don't need. But those comments
you don't need may be useful to less experienced programmers.
Or not. But let me put it this way -- it's pretty easy to
write a "comment stripper" program; way harder to write a program
that meaningfully comments other programs.
Post by Tim Rentsch
Anyway FWIW those are my reactions. I hope my other comments
helped (and I'm glad you enjoyed my animal names :).
Absolutely. Thanks for your (and others') help, Tim.
We shouldn't forget the simple dumb advantage inline
commentary has - it's visible at the place it is relevant.

I favor a function-initial comment covering everything that
isn't "local" supplemented by inline comments whenever
something less than obvious occurs.
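
One possible reading of that style, sketched on a made-up function (find_sorted() is a hypothetical example, not code from the thread): the leading comment states the contract, and the only inline comments mark the spots a reader might stumble over.

```c
#include <stddef.h>

/*
 * find_sorted: binary search over a sorted int array.
 * Returns the index of key in a[0..n-1], or -1 if key is absent.
 * The array must be sorted in ascending order; n may be zero.
 */
long find_sorted(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;            /* search the half-open range [lo, hi) */

    while (lo < hi) {
        /* lo + (hi - lo) / 2 rather than (lo + hi) / 2: the plain sum
           can wrap around for very large arrays. */
        size_t mid = lo + (hi - lo) / 2;

        if (a[mid] == key)
            return (long)mid;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```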

I don't like the strategy of writing comments that are later
stripped out and "published" as the documentation.
Tim Rentsch
2017-07-12 10:00:19 UTC
I'm cutting madly and will mostly give just short responses.
Post by John Forkosh
Post by Tim Rentsch
I've seen it go both ways - cases where it was easier to
put the function declarations up front, and add extra
declarations for the struct tags, or cases where it was
easier to put the function declarations after the typedefs,
provided of course the source is suitably structured so
that there is such a place. Usually I assume people are
smart enough to make these adjustments without my having
to explain them (which is why I didn't).
Yeah, I wouldn't have mentioned it except for your original
emphasis on "_start_" which I didn't quite understand (i.e.,
understood its denotation but not your possible connotation).
So wanted to see if you'd elaborate any potential problem
I might be overlooking doing it my way. (This whole thread
was me overlooking a potential problem.:)
Problems happen when (a) the code is not well-structured enough
to start with, and/or (b) ripple effects cause more and more code
needing to be moved or edited. Apparently not a problem in your
case.
Post by John Forkosh
Post by Tim Rentsch
I think I would go crazy if I tried to do that. Let me hazard a
guess - do your functions tend to be on the long side?
Absolutely happens, but I try to start out with a well-encapsulated
idea of the functionality to be implemented. Then #lines-of-code
just is what it is. However, many tactical problems can arise with
that nice-sounding strategy. The "bags-on-bags" problem happens
pretty much 100% of the time over a long enough period.
Just one small thing after another, each requiring a few tweaks
here and there. Never enough to justify a refactor/rewrite/etc,
but accumulating over time until you're looking at an unmaintainable
mess. But (if I do say so myself:), I've become pretty talented
at maintaining messes (both my own and other people's), so that
my messes can get exceptionally large and messy while still doing
their intended job reliably.
I follow a simple rule: don't admit long function bodies. Sometimes
there are exceptions but they are few and far between. Following
this rule forces cleanup automatically, and incrementally.
Post by John Forkosh
And I think one important dimension (among many) of a programmer's
skill is indeed the level of mess he can reliably maintain.
Because maintaining large messes inevitably goes with the territory.
And I further think you can often take a good guess about someone's
level of experience by how high a standard of perfection they apply
to the judgement of other people's code. They just haven't been
around long enough to understand how the real world works.
IMO that is a sign of the immaturity of the profession. But that
is a topic for another day.
Post by John Forkosh
Post by Tim Rentsch
Post by John Forkosh
I'm a _big_ fan of
internal comment documentation (having done my fair share of
assembly language programming),
[...] well-conceived and elaborate
internal documentation can (my opinion again) significantly extend
the maintainability and useful life of well-designed and -written
code.
Here I think you mean "extensive" more than "elaborate".
Post by John Forkosh
Post by Tim Rentsch
But I grew out of that phase, and at pretty much
the same time grew out of the habit of writing "internal comments"
as they might be called. In most cases I find them more hindrance
than help. For my money I would much rather see each .c file
have two or three well-written prose paragraphs saying what the
module is for and how it hangs together.
As above (in gifsave89) I try to do both -- paragraphs of prose as
well as line-by-line. Internal documentation is the only documentation
that never (simply can't) get lost or separated. And well-written
documentation needs maintenance along with the code it's documenting.
That's way easier to do if it's staring you in the face as you're
editing the code (and you don't even have to open a separate window).
Otherwise, the documentation's way more likely to get staler and staler
over time. Likewise for readers, internal documentation is always
adjacent to the code it's documenting. No searching/alignment required.
Here I think you are using "internal" in several different
senses. For me documentation that happens to be in the same
source file doesn't automatically make it "internal". The
difference (or differences plural) is important.
Post by John Forkosh
Post by Tim Rentsch
In nearly all cases I
can figure out the low-level stuff without any help (especially
if the function composition style is good), but figuring out the
high-level stuff is not nearly as accessible.
Sure, then skip the stuff you don't need. But those comments
you don't need may be useful to less experienced programmers.
Or not. But let me put it this way -- it's pretty easy to
write a "comment stripper" program; way harder to write a program
that meaningfully comments other programs.
I don't buy this, for several different reasons. The most
important are these: One, line-by-line comments like the ones
you use interfere with the code - just taking out the comments
isn't at all the same as writing the code (at the level of
individual function bodies) without comments to start with.
Two, the basic premise is contradicted by research studies on
different psychological modes. Having the comments there all
the time not only doesn't help, it actually slows people down.
I understand that you are used to the style of commenting that
you use, but I believe it's a poor choice just for you, and
even moreso for other people.
John Forkosh
2017-07-12 22:53:29 UTC
Permalink
Post by Tim Rentsch
I'm cutting madly and will mostly give just short responses.
Good idea.
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
I've seen it go both ways - cases where it was easier to
put the function declarations up front, and add extra
declarations for the struct tags, or cases where it was
easier to put the function declarations after the typedefs,
provided of course the source is suitably structured so
that there is such a place. Usually I assume people are
smart enough to make these adjustments without my having
to explain them (which is why I didn't).
Yeah, I wouldn't have mentioned it except for your original
emphasis on "_start_" which I didn't quite understand (i.e.,
understood its denotation but not your possible connotation).
So wanted to see if you'd elaborate any potential problem
I might be overlooking doing it my way. (This whole thread
was me overlooking a potential problem.:)
Problems happen when (a) the code is not well-structured enough
to start with, and/or (b) ripple effects cause more and more code
needing to be moved or edited. Apparently not a problem in your
case.
"Apparently not a problem...": Isn't that what they told the
Captain of the Titanic? (But, yeah, seems okay in this case,
with hopefully less disastrous consequences if not.)
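For readers following along, the two orderings mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the function names new_raster() and delete_raster() come from the start of the thread, but the struct fields and the raster_area() helper are hypothetical, and in the thread's actual scenario each prototype and definition would also carry the static keyword.

```c
#include <assert.h>
#include <stdlib.h>

/* Ordering A: prototypes up front. A bare struct tag declaration is
   enough for prototypes that only pass or return pointers. */
struct raster;                                   /* incomplete type */
struct raster *new_raster(int width, int height);
int raster_area(const struct raster *rp);        /* hypothetical helper */
void delete_raster(struct raster *rp);

/* The full definition and typedef can appear later in the file. */
typedef struct raster {
    int width;                                   /* hypothetical fields */
    int height;
    unsigned char *pixels;
} raster;

/* Ordering B would put the prototypes here instead, after the typedef,
   where they could be written with the typedef name `raster`. */

struct raster *new_raster(int width, int height)
{
    assert(width > 0 && height > 0);
    raster *rp = malloc(sizeof *rp);
    if (rp == NULL)
        return NULL;
    rp->pixels = calloc((size_t)width * (size_t)height, 1);
    if (rp->pixels == NULL) {
        free(rp);
        return NULL;
    }
    rp->width = width;
    rp->height = height;
    return rp;
}

int raster_area(const struct raster *rp)
{
    assert(rp != NULL);
    return rp->width * rp->height;
}

void delete_raster(struct raster *rp)
{
    if (rp != NULL) {
        free(rp->pixels);
        free(rp);
    }
}
```

Ordering A needs the one extra `struct raster;` line, but the prototypes can then sit at the very start of the file regardless of where the typedefs end up.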
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
I think I would go crazy if I tried to do that. Let me hazard a
guess - do your functions tend to be on the long side?
Absolutely happens, but I try to start out with a well-encapsulated
idea of the functionality to be implemented. Then #lines-of-code
just is what it is. However, many tactical problems can arise with
that nice-sounding strategy. The "bags-on-bags" problem happens
pretty much 100% of the time over a long enough period.
Just one small thing after another, each requiring a few tweaks
here and there. Never enough to justify a refactor/rewrite/etc,
but accumulating over time until you're looking at an unmaintainable
mess. But (if I do say so myself:), I've become pretty talented
at maintaining messes (both my own and other people's), so that
my messes can get exceptionally large and messy while still doing
their intended job reliably.
I follow a simple rule: don't admit long function bodies. Sometimes
there are exceptions but they are few and far between. Following
this rule forces cleanup automatically, and incrementally.
Well, yeah, "well-encapsulated idea of functionality" typically
translates to ~50-100 lines of code, or thereabouts, in my case.
Occasionally not. But the subsequent "bags-on-bags" mess more than
occasionally wreaks havoc with that. But I don't "cleanup automatically
and incrementally". Too much else to do, whereby the cost-benefit of
cleanup is usually too cost-heavy until it becomes really necessary.
Post by Tim Rentsch
Post by John Forkosh
And I think one important dimension (among many) of a programmer's
skill is indeed the level of mess he can reliably maintain.
Because maintaining large messes inevitably goes with the territory.
And I further think you can often take a good guess about someone's
level of experience by how high a standard of perfection they apply
to the judgement of other people's code. They just haven't been
around long enough to understand how the real world works.
IMO that is a sign of the immaturity of the profession. But that
is a topic for another day.
Yeah, I think I've heard that sentiment for decades, re languages
and the development process/lifecycle. But in all that time,
no (n+1)^th-generation language seems to have come along to
successfully introduce some new development paradigm.
Lisp-like functional stuff is certainly a very different model
than procedural/object/etc, and provides a very straightforward
representation for some class of problems (but not for the kinds
of problems that typically need to be solved, i.e., it's Turing
complete, but ridiculously cumbersome for most everyday problems).
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
I'm a big fan of internal comment documentation (having done
my fair share of assembly language programming),
[...] well-conceived and elaborate
internal documentation can (my opinion again) significantly extend
the maintainability and useful life of well-designed and -written
code.
Here I think you mean "extensive" more than "elaborate".
Post by John Forkosh
Post by Tim Rentsch
But I grew out of that phase, and at pretty much
the same time grew out of the habit of writing "internal comments"
as they might be called. In most cases I find them more hindrance
than help. For my money I would much rather see each .c file
have two or three well-written prose paragraphs saying what the
module is for and how it hangs together.
As above (in gifsave89) I try to do both -- paragraphs of prose as
well as line-by-line. Internal documentation is the only documentation
that never (simply can't) get lost or separated. And well-written
documentation needs maintenance along with the code it's documenting.
That's way easier to do if it's staring you in the face as you're
editing the code (and you don't even have to open a separate window).
Otherwise, the documentation's way more likely to get staler and staler
over time. Likewise for readers, internal documentation is always
adjacent to the code it's documenting. No searching/alignment required.
Here I think you are using "internal" in several different
senses. For me documentation that happens to be in the same
source file doesn't automatically make it "internal". The
difference (or differences plural) is important.
In the ongoing context of this thread, "internal" (unless
otherwise decorated/modified) just means "same source file".
The large block of comments I put above the entry point of each
function documents its "external view" to the caller: purpose,
arguments, return value(s), etc. And a Notes section, e.g.,
"caller should free() the returned pointer when finished using it",
and sometimes some "internals" notes (i.e., plural denoting the
algorithm/implementation). The notes sometimes include an
elaborate/extensive discussion of any not-well-known algorithm.
It's pretty much a boilerplate format that I write with every function.
Inline comments in the function body are typically about
its "internals".
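As a rough illustration of the kind of header block described above, here is a minimal sketch. The layout, the function dup_reversed(), and its wording are all assumptions made up for this example; this is not the actual boilerplate used in gifsave89 or any other posted code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ==========================================================================
 * Function:   dup_reversed ( s )        (hypothetical example function)
 * Purpose:    Returns a newly allocated copy of s with its
 *             characters in reverse order.
 * --------------------------------------------------------------------------
 * Arguments:  s (I)   const char * to the null-terminated string
 *                     to be copied; must not be NULL.
 * Returns:    ( char * )  pointer to the reversed copy,
 *                     or NULL if memory could not be allocated.
 * --------------------------------------------------------------------------
 * Notes:    o Caller should free() the returned pointer
 *             when finished using it.
 *           o Internals: single pass, copying from the end of s
 *             to the start of the new buffer.
 * ======================================================================= */
char *dup_reversed(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);            /* #chars, excluding the '\0' */
    char *copy = malloc(len + 1);      /* +1 for the terminator */
    if (copy == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)   /* copy[i] gets the mirror char of s */
        copy[i] = s[len - 1 - i];
    copy[len] = '\0';
    return copy;
}
```

The block above the function is the "external view" (purpose, arguments, return value, notes); the comments inside the body are the "internals" kind.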
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
In nearly all cases I
can figure out the low-level stuff without any help (especially
if the function composition style is good), but figuring out the
high-level stuff is not nearly as accessible.
Sure, then skip the stuff you don't need. But those comments
you don't need may be useful to less experienced programmers.
Or not. But let me put it this way -- it's pretty easy to
write a "comment stripper" program; way harder to write a program
that meaningfully comments other programs.
I don't buy this, for several different reasons. The most
important are these: One, line-by-line comments like the ones
you use interfere with the code - just taking out the comments
isn't at all the same as writing the code (at the level of
individual function bodies) without comments to start with.
Two, the basic premise is contradicted by research studies on
different psychological modes. Having the comments there all
the time not only doesn't help, it actually slows people down.
I understand that you are used to the style of commenting that
you use, but I believe it's a poor choice just for you, and
even moreso for other people.
In my view, "slows people down" can be a very good thing.
Too fast is very bad, what with all the extra debugging,
poor decomposition/design/etc if you try to work faster
than the situation (and your skill) allows. Writing comments
indeed slows me down, and I find that a great help.
I always, always write that large "external view" comment block
first, before a single line of code, just to be absolutely clear
in my own mind what it is I'm doing, and why (and sometimes how).
Even when there already exists an extensive detail design
document, new stuff inevitably occurs to you as you sit down
to actually write lines of code.
And I'd imagine that any kind of psych study focuses on the
person being studied. Let's instead focus on the long-term
effects of comments on the program/system that's developed.
Like how maintainable is it many years after its original
developers have left the company and are no longer contactable?
Or, what's the average useful life of a system comprised of, say,
1000 functions in 100K lines of code, with and without comments?
Of course, "external documentation" also greatly impacts
maintainability/lifetime/etc, but let's hold that constant
and study the single-variable effect of "internal documentation".
I obviously don't know these answers, but I suppose you can
guess my opinion. And I think I can likewise guess yours.
But until such studies are actually available (might they
already be?), I guess we'll have no better benchmarks than
our own opinions.
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
Tim Rentsch
2017-07-15 07:59:41 UTC
Permalink
Post by John Forkosh
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
[...]
Now cutting to just a couple key items.
Post by John Forkosh
Post by Tim Rentsch
I follow a simple rule: don't admit long function bodies.
Sometimes there are exceptions but they are few and far
between. Following this rule forces cleanup automatically,
and incrementally.
Well, yeah, "well-encapsulated idea of functionality" typically
translates to ~50-100 lines of code, or thereabouts, in my case.
Occasionally not. But the subsequent "bags-on-bags" mess more
than occasionally wreaks havoc with that. But I don't "cleanup
automatically and incrementally". Too much else to do, whereby
the cost-benefit of cleanup is usually too cost-heavy until it
becomes really necessary.
Here is a data point. The low end (50 lines) of your range there
is close to what I'm used to seeing as a 90th percentile in other
code - 90% of functions are 50 lines or less, with an average
usually in the 20's or 30's. For your gif code, 90th percentile
is about 115 lines, with an average in the low 40's.
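For anyone wanting to reproduce this kind of figure on their own code, a minimal sketch follows. It assumes the per-function line counts have already been collected into an array, and it uses the nearest-rank definition of a percentile; both the function names and that choice of definition are assumptions, not necessarily how the numbers above were obtained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort() comparison callback: ascending order of int values. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Nearest-rank percentile: the smallest sample such that at least
   pct percent of all samples are at or below it.
   Note: sorts lengths[] in place. */
int length_percentile(int *lengths, size_t n, int pct)
{
    assert(lengths != NULL && n > 0 && pct > 0 && pct <= 100);
    qsort(lengths, n, sizeof lengths[0], cmp_int);
    size_t rank = (n * (size_t)pct + 99) / 100;   /* ceil(n * pct / 100) */
    return lengths[rank - 1];
}

/* Arithmetic mean of the function lengths. */
double length_mean(const int *lengths, size_t n)
{
    assert(lengths != NULL && n > 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += lengths[i];
    return (double)sum / (double)n;
}
```

Calling length_percentile(lengths, n, 90) and length_mean(lengths, n) on the same array gives the two numbers being compared here.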
Post by John Forkosh
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
In nearly all cases I
can figure out the low-level stuff without any help (especially
if the function composition style is good), but figuring out the
high-level stuff is not nearly as accessible.
Sure, then skip the stuff you don't need. But those comments
you don't need may be useful to less experienced programmers.
Or not. But let me put it this way -- it's pretty easy to write
a "comment stripper" program; way harder to write a program
that meaningfully comments other programs.
I don't buy this, for several different reasons. The most
important are these: One, line-by-line comments like the ones
you use interfere with the code - just taking out the comments
isn't at all the same as writing the code (at the level of
individual function bodies) without comments to start with.
Two, the basic premise is contradicted by research studies on
different psychological modes. Having the comments there all
the time not only doesn't help, it actually slows people down.
I understand that you are used to the style of commenting that
you use, but I believe it's a poor choice just for you, and
even moreso for other people.
In my view, "slows people down" can be a very good thing.
Too fast is very bad, what with all the extra debugging,
poor decomposition/design/etc if you try to work faster
than the situation (and your skill) allows. Writing comments
indeed slows me down, and I find that a great help.
You have a strange combination of views. On the one hand, you
think being slowed down is a great help. On the other hand,
you're too busy to do code cleanup that would help the code be
both easier to understand and easier to work on. That seems a
rather discordant mixture.
Post by John Forkosh
And I'd imagine that any kind of psych study focuses on the
person being studied. Let's instead focus on the long-term
effects of comments on the program/system that's developed.
Like how maintainable is it many years after its original
developers have left the company and are no longer contactable?
Or, what's the average useful life of a system comprised of, say,
1000 functions in 100K lines of code, with and without comments?
No one is suggesting that code have no comments. The question is
what comments, what kinds of comments, and where should different
kinds of comments go.

Incidentally, re: the word "comprise" - see "The Elements of
Style", by Strunk and White. The system comprises its
functions, not the other way around.
Post by John Forkosh
Of course, "external documentation" also greatly impacts
maintainability/lifetime/etc, but let's hold that constant
and study the single-variable effect of "internal documentation".
It would help if you could be specific about what kinds of
comments you identify as being "internal documentation", and
talk only about those. I had assumed you meant comments inside
function bodies (and only those), but now I'm not sure.
Post by John Forkosh
I obviously don't know these answers, but I suppose you can
guess my opinion. And I think I can likewise guess yours.
But until such studies are actually available (might they
already be?), I guess we'll have no better benchmarks than
our own opinions.
One difference between our two approaches is I always try to
ground my conclusions in empirical results. Another is I am in
the habit of exploring alternatives based on actual experience
rather than thought experiments. If all someone has to offer
is an opinion it usually isn't worth much.
Ben Bacarisse
2017-07-15 21:04:24 UTC
Permalink
<snip>
Post by Tim Rentsch
Post by John Forkosh
Or, what's the average useful life of a system comprised of, say,
1000 functions in 100K lines of code, with and without comments?
<snip>
Post by Tim Rentsch
Incidentally, re: the word "comprise" - see "The Elements of
Style", by Strunk and White. The system comprises its
functions, not the other way around.
Strunk and White do not appear to discuss "comprised of".

The form "comprised of" is very widely used and is considered standard
by many authorities. For example, Oxford dictionaries online says:

"When this sense is used in the passive (as in the country is
comprised of twenty states), it is more or less synonymous with the
first sense (the country comprises twenty states). This usage is part
of standard English, [...]"

Anyway, unless I've missed the crucial part, Strunk and White has
nothing to say about this particular usage.

<snip>
--
Ben.
Tim Rentsch
2017-07-27 11:21:52 UTC
Permalink
Post by Ben Bacarisse
<snip>
Post by Tim Rentsch
Post by John Forkosh
Or, what's the average useful life of a system comprised of, say,
1000 functions in 100K lines of code, with and without comments?
<snip>
Post by Tim Rentsch
Incidentally, re: the word "comprise" - see "The Elements of
Style", by Strunk and White. The system comprises its
functions, not the other way around.
Strunk and White do not appear to discuss "comprised of".
S&W lists "comprise" as a commonly misused word. The entry says
(using /'s to mean italics)

/Comprise/. Literally, /embrace/. A zoo /comprises/
mammals, reptiles, and birds (because it embraces, or
includes, them). But animals do not comprise (embrace)
a zoo -- they /constitute/ a zoo.

It doesn't make sense to say "a system embraced of 1000 functions
in 100k lines of code". The system /embraces/ 1000 functions in
100k lines of code, not is embraced by or embraced of them. We
could use the word compose: "a system composed of 1000 functions
in 100k lines of code". To me it seems clear that the S&W entry
is meant to apply to all forms of comprise: comprise, comprises,
comprising, comprised by, ....
Post by Ben Bacarisse
The form "comprised of" is very widely used and is considered standard
by many authorities. For example, Oxford dictionaries online says:
"When this sense is used in the passive (as in the country is
comprised of twenty states), it is more or less synonymous with the
first sense (the country comprises twenty states). This usage is part
of standard English, [...]"
"Comprised of" is in common usage in much the same way that "I
could care less" is in common usage. The usage has become common
through years of misuse.
Post by Ben Bacarisse
Anyway, unless I've missed the crucial part, Strunk and White has
nothing to say about this particular usage.
They don't list the phrase because they list the word as a
misused word. Do you at least agree that the phrase doesn't make
sense given what they say about the meaning of /comprise/?
Ben Bacarisse
2017-07-27 15:48:32 UTC
Permalink
Post by Tim Rentsch
Post by Ben Bacarisse
<snip>
Post by Tim Rentsch
Post by John Forkosh
Or, what's the average useful life of a system comprised of, say,
1000 functions in 100K lines of code, with and without comments?
<snip>
Post by Tim Rentsch
Incidentally, re: the word "comprise" - see "The Elements of
Style", by Strunk and White. The system comprises its
functions, not the other way around.
Strunk and White do not appear to discuss "comprised of".
S&W lists "comprise" as a commonly misused word. The entry says
(using /'s to mean italics)
/Comprise/. Literally, /embrace/. A zoo /comprises/
mammals, reptiles, and birds (because it embraces, or
includes, them). But animals do not comprise (embrace)
a zoo -- they /constitute/ a zoo.
Some authorities approve of "comprised of" and some don't. S&W does not
explicitly address the matter so I could not see why you cited it.
Every dictionary will give the same definition of the verb and most will
explain it can't be used "the other way round".

The dictionary I quoted gives the meaning *and* S&W's erroneous usage,
but then goes on to talk about the specific form "comprised of". Such
explicit sources are better countered by other equally explicit ones.
Post by Tim Rentsch
It doesn't make sense to say "a system embraced of 1000 functions
in 100k lines of code". The system /embraces/ 1000 functions in
100k lines of code, not is embraced by or embraced of them. We
could use the word compose: "a system composed of 1000 functions
in 100k lines of code". To me it seems clear that the S&W entry
is meant to apply to all forms of comprise: comprise, comprises,
comprising, comprised by, ....
I'm quite sure that they *would* object to it! but the fact is they
don't in any clear and unambiguous way, so it appeared to be a very odd
citation. Surely it would have been better to cite an authority that
does explicitly address the form in question?
Post by Tim Rentsch
Post by Ben Bacarisse
The form "comprised of" is very widely used and is considered standard
by many authorities. For example, Oxford dictionaries online says:
"When this sense is used in the passive (as in the country is
comprised of twenty states), it is more or less synonymous with the
first sense (the country comprises twenty states). This usage is part
of standard English, [...]"
"Comprised of" is in common usage in much the same way that "I
could care less" is in common usage. The usage has become common
through years of misuse.
I'd say it has become correct through many years of common usage.

(But it's an interesting idea that a usage would become common *through*
misuse. Despite misuse, yes, but through it? Are you suggesting that
an erroneous use is often particularly appealing?)
Post by Tim Rentsch
Post by Ben Bacarisse
Anyway, unless I've missed the crucial part, Strunk and White has
nothing to say about this particular usage.
They don't list the phrase because they list the word as a
misused word.
But they do list a misuse quite explicitly. One could be forgiven for
thinking that that is the misuse which concerns them.
Post by Tim Rentsch
Do you at least agree that the phrase doesn't make
sense given what they say about the meaning of /comprise/?
No, I would not be so nice about it :-)
--
Ben.
Tim Rentsch
2017-07-28 13:59:21 UTC
Permalink
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
<snip>
Post by Tim Rentsch
Post by John Forkosh
Or, what's the average useful life of a system comprised of, say,
1000 functions in 100K lines of code, with and without comments?
<snip>
Post by Tim Rentsch
Incidentally, re: the word "comprise" - see "The Elements of
Style", by Strunk and White. The system comprises its
functions, not the other way around.
Strunk and White do not appear to discuss "comprised of".
S&W lists "comprise" as a commonly misused word. The entry says
(using /'s to mean italics)
/Comprise/. Literally, /embrace/. A zoo /comprises/
mammals, reptiles, and birds (because it embraces, or
includes, them). But animals do not comprise (embrace)
a zoo -- they /constitute/ a zoo.
Some authorities approve of "comprised of" and some don't. S&W
does not explicitly address the matter so I could not see why you
cited it. Every dictionary will give the same definition of the
verb and most will explain it can't be used "the other way round".
I cited S&W because it's a book I expect every serious author
will have read, or at least should have read. Considering what
all else the book says (eg, Rule 13: Omit needless words), I took
the entry for "comprise" to apply equally to "is comprised of"
(ie, as a misusage). I still do (though I see now that other
people may reach a different conclusion).
Post by Ben Bacarisse
The dictionary I quoted gives the meaning *and* S&W's erroneous usage,
but then goes on to talk about the specific form "comprised of". Such
explicit sources are better countered by other equally explicit ones.
I don't disagree, but there is more to it (see below).
Post by Ben Bacarisse
Post by Tim Rentsch
It doesn't make sense to say "a system embraced of 1000 functions
in 100k lines of code". The system /embraces/ 1000 functions in
100k lines of code, not is embraced by or embraced of them. We
could use the word compose: "a system composed of 1000 functions
in 100k lines of code". To me it seems clear that the S&W entry
is meant to apply to all forms of comprise: comprise, comprises,
comprising, comprised by, ....
I'm quite sure that they *would* object to it! but the fact is they
don't in any clear and unambiguous way, so it appeared to be a very odd
citation. Surely it would have been better to cite an authority that
does explicitly address the form in question?
Yes it would, but I don't know of any authority comparable to S&W
for the point being made. Even if "comprised of" is common usage,
and whether or not it is "correct" usage, it is an inadvisable
usage, according to Messieurs Strunk and White.
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
The form "comprised of" is very widely used and is considered standard
by many authorities. For example, Oxford dictionaries online says:
"When this sense is used in the passive (as in the country is
comprised of twenty states), it is more or less synonymous with the
first sense (the country comprises twenty states). This usage is part
of standard English, [...]"
"Comprised of" is in common usage in much the same way that "I
could care less" is in common usage. The usage has become common
through years of misuse.
I'd say it has become correct through many years of common usage.
(But it's an interesting idea that a usage would become common *through*
misuse. Despite misuse, yes, but through it? Are you suggesting that
an erroneous use is often particularly appealing?)
What I meant was it has become common because of being repeatedly
misused. I do think this particular misuse is appealing (and I
must confess I have been guilty of it in the past), and probably
that has contributed to its change in status. But all I meant was
it has become common following many years of misusage.
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
Anyway, unless I've missed the crucial part, Strunk and White has
nothing to say about this particular usage.
They don't list the phrase because they list the word as a
misused word.
But they do list a misuse quite explicitly. One could be forgiven
for thinking that that is the misuse which concerns them.
Due to the nature of the book it seems obvious that their
counter-examples are not meant to be exhaustive. I agree
that someone could think that what S&W says doesn't apply
to "comprised of"; I hope though they would reach a different
conclusion after further reflection.
Post by Ben Bacarisse
Post by Tim Rentsch
Do you at least agree that the phrase doesn't make
sense given what they say about the meaning of /comprise/?
No, I would not be so nice about it :-)
That at least is good to hear. :)
Ben Bacarisse
2017-07-29 01:59:54 UTC
Permalink
Tim Rentsch <***@alumni.caltech.edu> writes:
<snip>
Post by Tim Rentsch
I cited S&W because it's a book I expect every serious author
will have read, or at least should have read.
I am amazed (and a tad offended) by that. My favourite texts are "The
Complete Plain Words" (Gowers), "Usage and Abusage" (Partridge) and the
delightfully acerbic Fowler (but I can't find my copy right now). I
think they do well enough.
Post by Tim Rentsch
Considering what
all else the book says (eg, Rule 13: Omit needless words), I took
the entry for "comprise" to apply equally to "is comprised of"
(ie, as a misusage). I still do (though I see now that other
people may reach a different conclusion).
So is the wordiness the problem? Surely not.

<snip>
--
Ben.
s***@casperkitty.com
2017-07-29 17:41:41 UTC
Permalink
Post by Ben Bacarisse
Post by Tim Rentsch
Considering what
all else the book says (eg, Rule 13: Omit needless words), I took
the entry for "comprise" to apply equally to "is comprised of"
(ie, as a misusage). I still do (though I see now that other
people may reach a different conclusion).
So is the wordiness the problem? Surely not.
In the absence of other clues as to whether P is made up of Q's, or Q
is made up of P's, some readers encountering "P comprises Q" will
interpret it each way. On the other hand, "X is comprised of Y" will
be interpreted as saying that X is made up of Y's, regardless of how
they would interpret "P comprises Q". As such, despite the presence
of four extra letters and two word spaces, I would consider the
"comprised of" usage superior.

I would view the active transitive form of "comprise" as deprecated,
even though the "parts comprise the whole" sense could sometimes be useful
in cases where sentence structure would require listing the parts
first, and other substitute constructs wouldn't work as well. Consider
the meaning of:

Bob inspected the machined blocks that will comprise the main storage
vessel.

If the sentence were

Bob inspected the machined blocks that will be in the main storage
vessel

the main storage vessel might be a container *into which the blocks would
be placed*. If it had been

Bob inspected the machined blocks that will form the main storage
vessel

the blocks might be used as tooling (perhaps in a press) to shape (form)
a storage vessel. One could use some alternate structure like:

The main storage vessel will consist of machined blocks that Bob has
inspected.

but that would totally shift the focus. If "comprise" were consistently
recognized in the "parts comprise the whole" sense, the sentence using
that verb would be clean and clear. Perhaps the same meaning and focus
could be achieved other ways, but the common substitutions don't quite
work.
Tim Rentsch
2017-07-31 05:37:42 UTC
Permalink
Post by Ben Bacarisse
<snip>
Post by Tim Rentsch
I cited S&W because it's a book I expect every serious author
will have read, or at least should have read.
I am amazed (and a tad offended) by that. My favourite texts are "The
Complete Plain Words" (Gowers), "Usage and Abusage" (Partridge) and the
delightfully acerbic Fowler (but I can't find my copy right now). I
think they do well enough.
Apparently I gave a wrong impression. I don't mean S&W is the
only such book worth reading, or that it is better than other
books like the ones you mention, or more authoritative, or
anything like that. I meant only that it is definitely worth
reading, and short enough and recommended often enough so anyone
serious about writing should want to read it if they haven't
already. I think it's more common, not better.

(Giving in to my compulsive desire for empirical data, I looked
up all four books on Amazon. Here is a summary - perhaps
unfairly selective, but not deliberately so:

Complete Plain Words Paperback: 320 pages 11 reviews
Usage and Abusage Paperback: 400 pages 7 reviews
Modern English Usage Hardcover: 928 pages 15 reviews
Elements of Style Paperback: 105 pages 2691 reviews

No conclusion from that, I'm just reporting what I found.)

Also the books you mention seem to me to be in a different
category. These books are ones to keep handy on the shelf, to go
back to when questions come up. S&W is short enough to read in
half an afternoon, and more the kind of book one absorbs than
goes back to as a reference. The difference is somewhat like
that between The Mythical Man-Month -- which IMO every serious
software developer should read if they haven't already -- and
Software Engineering Economics -- which IMO is one they should
be familiar with and have on their shelf for reference. (Don't
take that analogy for more than it is - I am not likening the
writing in Gowers, Partridge, or Fowler to the writing in SEE,
just conjecturing a similarity in likely patterns of access.)
Post by Ben Bacarisse
Post by Tim Rentsch
Considering what
all else the book says (eg, Rule 13: Omit needless words), I took
the entry for "comprise" to apply equally to "is comprised of"
(ie, as a misusage). I still do (though I see now that other
people may reach a different conclusion).
So is the wordiness the problem? Surely not.
No, that isn't what I meant. The style of writing in S&W is
deliberately spare, which explains (or could explain) why they
gave only the one example, and didn't feel it was necessary to
give a separate example for "comprised of".
Ben Bacarisse
2017-07-31 12:10:34 UTC
Post by Tim Rentsch
Post by Ben Bacarisse
<snip>
Post by Tim Rentsch
I cited S&W because it's a book I expect every serious author
will have read, or at least should have read.
I am amazed (and a tad offended) by that. My favourite texts are "The
Complete Plain Words" (Gowers), "Usage and Abusage" (Partridge) and the
delightfully acerbic Fowler (but I can't find my copy right now). I
think they do well enough.
Apparently I gave a wrong impression. I don't mean S&W is the
only such book worth reading, or that it is better than other
books like the ones you mention, or more authoritative, or
anything like that. I meant only that it is definitely worth
reading, and short enough and recommended often enough so anyone
serious about writing should want to read it if they haven't
already. I think it's more common, not better.
You said that you expected every serious author will have read it, or at
least should have read it. Maybe you were using should to mean would
(as in "I should say so"), but then either the two clauses mean very
similar things or the whole is ambiguous.

Anyway, now it seems the deficiency is simply that I don't *want* to read
it. I can live with that!

<snip>
--
Ben.
Tim Rentsch
2017-07-31 17:05:59 UTC
Post by Ben Bacarisse
Post by Tim Rentsch
Post by Ben Bacarisse
<snip>
Post by Tim Rentsch
I cited S&W because it's a book I expect every serious author
will have read, or at least should have read.
I am amazed (and a tad offended) by that. My favourite texts are "The
Complete Plain Words" (Gowers), "Usage and Abusage" (Partridge) and the
delightfully acerbic Fowler (but I can't find my copy right now). I
think they do well enough.
Apparently I gave a wrong impression. I don't mean S&W is the
only such book worth reading, or that it is better than other
books like the ones you mention, or more authoritative, or
anything like that. I meant only that it is definitely worth
reading, and short enough and recommended often enough so anyone
serious about writing should want to read it if they haven't
already. I think it's more common, not better.
You said that you expected every serious author will have read it, or at
least should have read it. Maybe you were using should to mean would
(as in "I should say so"), but then either the two clauses mean very
similar things or the whole is ambiguous.
Yes, I didn't phrase that very well. I hope you will
excuse me if I am too lazy just now to try to rephrase it.
Post by Ben Bacarisse
Anyway, now it seems the deficiency is simply that I don't *want* to read
it. I can live with that!
Oh, that's curious. Do you mind if I ask why you don't
want to read it?
Ben Bacarisse
2017-08-02 03:52:52 UTC
<snip>
Post by Tim Rentsch
Post by Ben Bacarisse
Anyway, now it seems the deficiency is simply that I don't *want* to read
it. I can live with that!
Oh, that's curious. Do you mind if I ask why you don't
want to read it?
Sorry, my posting is sparse at the moment...

It's hard to pin down why. The competition is fierce: I have a big pile
of books I really, really want to read but have no time to read, and an
even longer list of books that might one day get on that pile. So
what's stopping S&W from climbing to the top of that list?

My first reaction is that it's about American usage and it's short, so
will it have anything useful to say that is new to me? I suspect it
says all the usual things about the roles of sentences and paragraphs,
about being concise and clear, and about avoiding cliches and
meaningless "fad" words. If so, are these so well-described as to be
worth the time and money? It seems unlikely.

But mostly I don't want another bossy, prescriptive book about usage
which is what I suspect S&W to be. I once cherished such books and
(shame on me) did my share of tutting at so-called split infinitives,
but I am happy to say that liberality in language has, in me, triumphed
over arbitrary authority.
--
Ben.
Tim Rentsch
2017-08-03 01:14:45 UTC
Post by Ben Bacarisse
<snip>
Post by Tim Rentsch
Post by Ben Bacarisse
Anyway, now it seems the deficiency is simply that I don't *want* to read
it. I can live with that!
Oh, that's curious. Do you mind if I ask why you don't
want to read it?
Sorry, my posting is sparse at the moment...
It's hard to pin down why. The competition is fierce: I have a big pile
of books I really, really want to read but have no time to read, and an
even longer list of books that might one day get on that pile. So
what's stopping S&W from climbing to the top of that list?
My first reaction is that it's about American usage and it's short, so
will it have anything useful to say that is new to me? I suspect it
says all the usual things about the roles of sentences and paragraphs,
about being concise and clear, and about avoiding cliches and
meaningless "fad" words. If so, are these so well-described as to be
worth the time and money? It seems unlikely.
But mostly I don't want another bossy, prescriptive book about usage
which is what I suspect S&W to be. I once cherished such books and
(shame on me) did my share of tutting at so-called split infinitives,
but I am happy to say that liberality in language has, in me, triumphed
over arbitrary authority.
An excellent response. I understand your earlier comment now.

My impression of the book is that it is quite different from what
you expect (or more accurately, suspect). It's unfortunate that
the example that came up here (about comprise) reinforced a
misleading image. For what that might be worth.

I will make a recommendation though, since you clearly are
someone who is interested in language and in good writing,
for another book. Maybe you have read this already but
in case you haven't, it is "Genius", by James Gleick.

(We now return this channel to our regular programming.)
Gareth Owen
2017-07-27 18:15:37 UTC
Post by Tim Rentsch
Post by Ben Bacarisse
Strunk and White do not appear to discuss "comprised of".
S&W lists "comprise" as a commonly misused word. The entry says
(using /'s to mean italics)
/Comprise/. Literally, /embrace/. A zoo /comprises/
mammals, reptiles, and birds (because it embraces, or
includes, them). But animals do not comprise (embrace)
a zoo -- they /constitute/ a zoo.
Maybe this is a US thing, but I don't think I've ever seen this usage in
UK English. "Comprise" in both my Cambridge and Oxford English
dictionary has primary meaning of "to be the constituent parts of".
s***@casperkitty.com
2017-07-27 18:35:17 UTC
Post by Gareth Owen
Maybe this is a US thing, but I don't think I've ever seen this usage in
UK English. "Comprise" in both my Cambridge and Oxford English
dictionary has primary meaning of "to be the constituent parts of".
I remember reading the same thing in at least one grammar book in my
youth, and it would make sense given that the passive construction
"X is comprised of Ys" would be at best redundant if it meant the same
thing as "X comprises Ys" rather than "Ys comprise X". Further, there
would be no real need for the transitive verb "comprise" if it meant the
same thing as "consist of". Cases where sentences of the form "whole
verb parts" would make sense are best served by another verb or by "is
comprised of", but not all usages are amenable to that form. It would
be awkward, for example, to talk about the widgets of which a woozle
is comprised, for example, rather than the widgets comprising a woozle.
Ben Bacarisse
2017-07-27 22:03:45 UTC
Post by Gareth Owen
Post by Tim Rentsch
Post by Ben Bacarisse
Strunk and White do not appear to discuss "comprised of".
S&W lists "comprise" as a commonly misused word. The entry says
(using /'s to mean italics)
/Comprise/. Literally, /embrace/. A zoo /comprises/
mammals, reptiles, and birds (because it embraces, or
includes, them). But animals do not comprise (embrace)
a zoo -- they /constitute/ a zoo.
Maybe this is a US thing, but I don't think I've ever seen this usage in
UK English. "Comprise" in both my Cambridge and Oxford English
dictionary has primary meaning of "to be the constituent parts of".
I have looked at both, and they both have the above (to my mind
uncontentious) definition as the first: "to have as parts or members"
(Cambridge English Dictionary online). It's the universal modern
meaning and is not, as far as I can see, in doubt. I'm surprised you've
never come across it.
--
Ben.
Gareth Owen
2017-07-29 08:28:52 UTC
Post by Ben Bacarisse
I have looked at both, and they both have the above (to my mind
uncontentious) definition as the first: "to have as parts or members"
(Cambridge English Dictionary online). It's the universal modern
meaning and is not, as far as I can see, in doubt. I'm surprised you've
never come across it.
We appear to be at cross purposes. It's the "embrace" definition I've
never heard.
Ben Bacarisse
2017-07-29 11:43:30 UTC
Post by Gareth Owen
Post by Ben Bacarisse
I have looked at both, and they both have the above (to my mind
uncontentious) definition as the first: "to have as parts or members"
(Cambridge English Dictionary online). It's the universal modern
meaning and is not, as far as I can see, in doubt. I'm surprised you've
never come across it.
We appear to be at cross purposes. It's the "embrace" definition I've
never heard.
Ah, I see. I don't think that's a definition of the word. When a book
like S&W says "literally /embrace/" it is likely to be referring to the
origin of the word, usually long gone.

Comprise comes to English from Latin (comprendere/comprehendere) via
French (comprendre). Even so, I am not entirely sure why they say
"literally /embrace/" but I'm no good at Latin.
--
Ben.
Tim Rentsch
2017-07-31 05:51:25 UTC
Post by Gareth Owen
Post by Ben Bacarisse
I have looked at both, and they both have the above (to my mind
uncontentious) definition as the first: "to have as parts or members"
(Cambridge English Dictionary online). It's the universal modern
meaning and is not, as far as I can see, in doubt. I'm surprised you've
never come across it.
We appear to be at cross purposes. It's the "embrace" definition I've
never heard.
Note that "to have as parts or members" means more or less the
same thing as "embrace". Here are examples from three online
dictionaries that give "have as parts or members" (or very
similar) for a definition.

The accommodation comprises six bedrooms and three living rooms.

The special cabinet committee comprises Mr Brown, Mr Mandelson,
and Mr Straw.

The collection comprises 327 paintings.
Tim Rentsch
2017-07-28 14:16:11 UTC
Post by Gareth Owen
Post by Tim Rentsch
Post by Ben Bacarisse
Strunk and White do not appear to discuss "comprised of".
S&W lists "comprise" as a commonly misused word. The entry says
(using /'s to mean italics)
/Comprise/. Literally, /embrace/. A zoo /comprises/
mammals, reptiles, and birds (because it embraces, or
includes, them). But animals do not comprise (embrace)
a zoo -- they /constitute/ a zoo.
Maybe this is a US thing, but I don't think I've ever seen this usage in
UK English. "Comprise" in both my Cambridge and Oxford English
dictionary has primary meaning of "to be the constituent parts of".
Interesting. Now I shall have to go find an OED and see what it
says.

Looking online, I found these links for pages that I think are
meant to address British English (at least in part):

http://dictionary.cambridge.org/dictionary/english/comprise
https://www.collinsdictionary.com/dictionary/english/comprise
John Forkosh
2017-07-16 00:11:11 UTC
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
[...]
Now cutting to just a couple key items.
(Yeah, speaking of "too long"/"too many lines"/etc,
that pretty much characterizes this whole thread.
All I did was ask, "Is there a tricky/easy way to
declare all functions in a module static?" The
answer's apparently "No." Three ascii characters,
period included.)
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
I follow a simple rule: don't admit long function bodies.
Sometimes there are exceptions but they are few and far
between. Following this rule forces cleanup automatically,
and incrementally.
Well, yeah, "well-encapsulated idea of functionality" typically
translates to ~50-100 lines of code, or thereabouts, in my case.
Occasionally not. But the subsequent "bags-on-bags" mess more
than occasionally wreaks havoc with that. But I don't "cleanup
automatically and incrementally". Too much else to do, whereby
the cost-benefit of cleanup is usually too cost-heavy until it
becomes really necessary.
Here is a data point. The low end (50 lines) of your range there
is close to what I'm used to seeing as a 90th percentile in other
code - 90% of functions are 50 lines or less, with an average
usually in the 20's or 30's. For your gif code, 90th percentile
is about 115 lines, with an average in the low 40's.
Did you actually count that??? Or, hopefully, get some lint-like
program to do it for you? What program is that? I wouldn't mind
applying it to my other stuff, just to see what those numbers are.
C's often described (at least it used to be) as a "portable
assembly language", whereby "in the 20's and 30's" seems too short
to typically accomplish much useful. Of course, there are some
prng's and crc's, etc, etc, that can be brilliantly accomplished
in that length. But the decomposition of the "business logic"
of a program won't typically lend itself to such crisp functionality,
thereby translating to more lines of code.
I'd stand by my previous remark: you decompose the overall
functionality as best you can (which frequently has to take into
account the skill and experience of the people on your team to
whom you're assigning decomposed tasks), and then #lines-of-code
just is what it is.
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
In nearly all cases I
can figure out the low-level stuff without any help (especially
if the function composition style is good), but figuring out the
high-level stuff is not nearly as accessible.
Sure, then skip the stuff you don't need. But those comments
you don't need may be useful to less experienced programmers.
Or not. But let me put it this way -- it's pretty easy to write
a "comment stripper" program; way harder to write a program
that meaningfully comments other programs.
I don't buy this, for several different reasons. The most
important are these: One, line-by-line comments like the ones
you use interfere with the code - just taking out the comments
isn't at all the same as writing the code (at the level of
individual function bodies) without comments to start with.
Two, the basic premise is contradicted by research studies on
different psychological modes. Having the comments there all
the time not only doesn't help, it actually slows people down.
I understand that you are used to the style of commenting that
you use, but I believe it's a poor choice just for you, and
even more so for other people.
In my view, "slows people down" can be a very good thing.
Too fast is very bad, what with all the extra debugging,
poor decomposition/design/etc if you try to work faster
than the situation (and your skill) allows. Writing comments
indeed slows me down, and I find that a great help.
You have a strange combination of views. On the one hand, you
think being slowed down is a great help. On the other hand,
you're too busy to do code cleanup that would help the code be
both easier to understand and easier to work on. That seems a
rather discordant mixture.
No, that's not what I said. I don't think being slowed down
is a great help >>in and of itself<<. I think getting things
right is what's great. And what I said is that "working faster
than the situation allows" defeats that, whereby slowing down
is necessary and helpful. For me, comments help me get things
right. I also happen to think they help other people who need
to maintain the code. You apparently don't. Fine. Either way,
I don't really mind unmaintainable code -- I just call it
"job security".
Post by Tim Rentsch
Post by John Forkosh
And I'd imagine that any kind of psych study focuses on the
person being studied. Let's instead focus on the long-term
effects of comments on the program/system that's developed.
Like how maintainable is it many years after its original
developers have left the company and are no longer contactable?
Or, what's the average useful life of a system comprised of, say,
1000 functions in 100K lines of code, with and without comments?
No one is suggesting that code have no comments. The question is
what comments, what kinds of comments, and where should different
kinds of comments go.
As above, we apparently (very apparently) have different opinions
about that, and are both pretty much set in our ways about that.
So let's just agree to disagree (since if we disagreed to disagree,
we'd have to agree, which we don't).
Post by Tim Rentsch
Incidentally, re: the word "comprise" - see "The Elements of
Style", by Strunk and White. The system comprises its
functions, not the other way around.
And what do Strunk&White have to say about "-pedantic"? :)
By the way, I actually do have a copy of that, but find it
almost always useless -- the toc and index hardly ever help me
actually locate whatever I happen to be looking for help with
(but note that it is okay to end a sentence with a preposition).
However, contrary to BenB's assertion, I do see "comprise 43"
in the index of my 1979 edition (but no mention of pedantic).
Post by Tim Rentsch
Post by John Forkosh
Of course, "external documentation" also greatly impacts
maintainability/lifetime/etc, but let's hold that constant
and study the single-variable effect of "internal documentation".
It would help if you could be specific about what kinds of
comments you identify as being "internal documentation", and
talk only about those. I had assumed you meant comments inside
function bodies (and only those), but now I'm not sure.
You mentioned that previously, and I replied that in the context
of this thread I just meant "in the same source file". And I added
that I always write large comment blocks just above the
entry point of each function documenting its "external view".
Indeed, I always write that before a single line of code, just to
get it crystal clear in my own mind what the code should accomplish.
Of course, I often subsequently have to come back and edit it
to accommodate necessary (or preferable) changes that emerge while
actually writing the code.
"Inside function bodies" is what I previously called inline
comments, which typically (typical for me, anyway) document the
function's internals, pretty much like we previously discussed
regarding typical assembly language programming style.
Post by Tim Rentsch
Post by John Forkosh
I obviously don't know these answers, but I suppose you can
guess my opinion. And I think I can likewise guess yours.
But until such studies are actually available (might they
already be?), I guess we'll have no better benchmarks than
our own opinions.
One difference between our two approaches is I always try to
ground my conclusions in empirical results. Another is I am in
the habit of exploring alternatives based on actual experience
rather than thought experiments. If all someone has to offer
is an opinion it usually isn't worth much.
Then agreed. Nothing I've said is worth much.
Sorry to have bothered you all.
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
GOTHIER Nathan
2017-07-16 00:35:10 UTC
On Sun, 16 Jul 2017 00:11:11 +0000 (UTC)
Post by John Forkosh
C's often described (at least it used to be) as a "portable
assembly language", whereby "in the 20's and 30's" seems too short
to typically accomplish much useful. Of course, there are some
prng's and crc's, etc, etc, that can be brilliantly accomplished
in that length. But the decomposition of the "business logic"
of a program won't typically lend itself to such crisp functionality,
thereby translating to more lines of code.
If you are unable to implement a function in less than 25 lines then you should
realize that your function isn't reduced to basic functionalities such as those
provided by the C standard library.
Öö Tiib
2017-07-16 13:15:57 UTC
Post by GOTHIER Nathan
On Sun, 16 Jul 2017 00:11:11 +0000 (UTC)
Post by John Forkosh
C's often described (at least it used to be) as a "portable
assembly language", whereby "in the 20's and 30's" seems too short
to typically accomplish much useful. Of course, there are some
prng's and crc's, etc, etc, that can be brilliantly accomplished
in that length. But the decomposition of the "business logic"
of a program won't typically lend itself to such crisp functionality,
thereby translating to more lines of code.
If you are unable to implement a function in less than 25 lines then
you should realize that your function isn't reduced to basic
functionalities such as those provided by the C standard library.
Function containing a switch with 10 cases is typically longer
than 25 lines.
GOTHIER Nathan
2017-07-16 14:17:04 UTC
On Sun, 16 Jul 2017 06:15:57 -0700 (PDT)
Post by Öö Tiib
Function containing a switch with 10 cases is typically longer
than 25 lines.
A switch of 10 cases (or more) should be split in subroutines or refactored to
a more generic code to be clear.
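
One possible reading of "more generic code" is a table of function
pointers indexed by the case value, so that each case becomes a small
function of its own. What follows is only a minimal sketch under that
assumption; the opcodes, the handler names and dispatch() are made up
for illustration and come from nobody's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes; OP_COUNT is the number of real entries. */
enum op { OP_ADD, OP_SUB, OP_MUL, OP_COUNT };

typedef int (*handler_fn)(int, int);

/* One small function per former "case" label. */
static int do_add(int a, int b) { return a + b; }
static int do_sub(int a, int b) { return a - b; }
static int do_mul(int a, int b) { return a * b; }

/* C99 designated initializers keep each slot tied to its opcode. */
static const handler_fn handlers[OP_COUNT] = {
    [OP_ADD] = do_add,
    [OP_SUB] = do_sub,
    [OP_MUL] = do_mul,
};

/* Replaces the switch: look the handler up and call it. */
int dispatch(enum op op, int a, int b)
{
    assert((unsigned)op < OP_COUNT && handlers[op] != NULL);
    return handlers[op](a, b);
}
```

This only fits when every case takes the same arguments and returns
the same type; a switch whose cases share local state does not map
onto a table this cleanly.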
Öö Tiib
2017-07-16 20:06:12 UTC
Post by GOTHIER Nathan
On Sun, 16 Jul 2017 06:15:57 -0700 (PDT)
Post by Öö Tiib
Function containing a switch with 10 cases is typically longer
than 25 lines.
A switch of 10 cases (or more) should be split in subroutines or refactored to
a more generic code to be clear.
You suggest to split long switch into several functions with partial switches?
GOTHIER Nathan
2017-07-16 23:36:19 UTC
On Sun, 16 Jul 2017 13:06:12 -0700 (PDT)
Post by Öö Tiib
You suggest to split long switch into several functions with partial switches?
Indeed.
Öö Tiib
2017-07-17 17:57:20 UTC
Post by GOTHIER Nathan
On Sun, 16 Jul 2017 13:06:12 -0700 (PDT)
Post by Öö Tiib
You suggest to split long switch into several functions with partial switches?
Indeed.
Sounds like suggestion that makes code harder to follow and to maintain.
GOTHIER Nathan
2017-07-18 01:50:24 UTC
On Mon, 17 Jul 2017 10:57:20 -0700 (PDT)
Post by Öö Tiib
Sounds like suggestion that makes code harder to follow and to maintain.
Do you suggest modular programming is harder to maintain?
Melzzzzz
2017-07-18 02:06:10 UTC
Post by GOTHIER Nathan
On Mon, 17 Jul 2017 10:57:20 -0700 (PDT)
Post by Öö Tiib
Sounds like suggestion that makes code harder to follow and to maintain.
Do you suggest modular programming is harder to maintain?
Is that modular programming?
--
press any key to continue or any other to quit...
GOTHIER Nathan
2017-07-18 02:35:31 UTC
On Tue, 18 Jul 2017 02:06:10 +0000 (UTC)
Post by Melzzzzz
Is that modular programming?
Indeed. Modular programming is literally grouping the code in modules which
could be any kind of files (sources, binaries) or functions.
Chris M. Thomasson
2017-07-19 05:41:11 UTC
Post by GOTHIER Nathan
On Tue, 18 Jul 2017 02:06:10 +0000 (UTC)
Post by Melzzzzz
Is that modular programming?
Indeed. Modular programming is literally grouping the code in modules which
could be any kind of files (sources, binaries) or functions.
Fractal programming ! ;^)
Öö Tiib
2017-07-18 02:27:20 UTC
Post by GOTHIER Nathan
On Mon, 17 Jul 2017 10:57:20 -0700 (PDT)
Post by Öö Tiib
Sounds like suggestion that makes code harder to follow and to maintain.
Do you suggest modular programming is harder to maintain?
No. Why? C programming language lacks support of modules. So modular
programming in C has to be done on meta level outside of features of
the language.

You did snip and forgot? We discussed how to deal with long switch-case.
I said that your suggestion to split what is logically one long
switch-case artificially into partial switch-cases makes the code
harder to read and to maintain.
Ian Collins
2017-07-18 05:42:54 UTC
Post by Öö Tiib
Post by GOTHIER Nathan
On Mon, 17 Jul 2017 10:57:20 -0700 (PDT)
Post by Öö Tiib
Sounds like suggestion that makes code harder to follow and to maintain.
Do you suggest modular programming is harder to maintain?
No. Why? C programming language lacks support of modules. So modular
programming in C has to be done on meta level outside of features of
the language.
You did snip and forgot? We discussed how to deal with long switch-case.
I said that your suggestion to split what is logically one long
switch-case artificially into partial switch-cases makes the code
harder to read and to maintain.
Agreed. The main problem with splitting a big case is the need to add
another layer to do the splitting which ends up being yet another switch
or an ugly if/else chain.
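
To make the extra layer concrete, here is a minimal sketch of a
ten-case switch split into two partial switches. The command names
and the file/edit grouping are invented for the example and are not
taken from any code in this thread.

```c
#include <assert.h>

/* Hypothetical commands: five "file" ones, five "edit" ones. */
enum cmd { CMD_OPEN, CMD_CLOSE, CMD_SAVE, CMD_PRINT, CMD_QUIT,
           CMD_CUT, CMD_COPY, CMD_PASTE, CMD_UNDO, CMD_REDO };

/* First partial switch. */
static const char *handle_file_cmd(enum cmd c)
{
    switch (c) {
    case CMD_OPEN:  return "open";
    case CMD_CLOSE: return "close";
    case CMD_SAVE:  return "save";
    case CMD_PRINT: return "print";
    case CMD_QUIT:  return "quit";
    default: assert(!"not a file command"); return "unknown";
    }
}

/* Second partial switch. */
static const char *handle_edit_cmd(enum cmd c)
{
    switch (c) {
    case CMD_CUT:   return "cut";
    case CMD_COPY:  return "copy";
    case CMD_PASTE: return "paste";
    case CMD_UNDO:  return "undo";
    case CMD_REDO:  return "redo";
    default: assert(!"not an edit command"); return "unknown";
    }
}

/* The added layer: every label is listed again just to route it. */
const char *handle_cmd(enum cmd c)
{
    switch (c) {
    case CMD_OPEN: case CMD_CLOSE: case CMD_SAVE:
    case CMD_PRINT: case CMD_QUIT:
        return handle_file_cmd(c);
    case CMD_CUT: case CMD_COPY: case CMD_PASTE:
    case CMD_UNDO: case CMD_REDO:
        return handle_edit_cmd(c);
    }
    return "unknown";   /* value outside the enum */
}
```

Each function is now well under 25 lines, but every case label
appears twice and a new command has to be added in two places.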
--
Ian
h***@gmail.com
2017-07-18 07:35:52 UTC
On Monday, July 17, 2017 at 10:43:10 PM UTC-7, Ian Collins wrote:

(snip on keeping functions small, and large switch/case)
Post by Ian Collins
Agreed. The main problem with splitting a big case is the need to add
another layer to do the splitting which ends up being yet another switch
or an ugly if/else chain.
If the question is readability, then you have to consider each
case (haha) separately.

It might be that there is a natural separation, in which case
split it as appropriate. If there isn't, then consider how else
to make it readable.

I don't believe that there is always a general rule that shorter
is better.
h***@gmail.com
2017-07-16 18:11:07 UTC
On Saturday, July 15, 2017 at 5:39:16 PM UTC-7, GOTHIER Nathan wrote:

(snip)
Post by GOTHIER Nathan
If you are unable to implement a function in less than 25 lines
then you should realize that your function isn't reduced to basic
functionalities such as those provided by the C standard library.
Yes, but sometimes we need to implement them, anyway.

There are some large scientific systems, for computational physics
or computational chemistry, in the millions, or tens of millions
of lines.

Often enough, there isn't a good place to divide them, and so they
get larger than 25 lines.

In the IBM S/360 (and s/370) days, it was nice to get each
function below about 8K bytes of object code, as it made addressing
easier, and more registers available for the compiler to use.
That might be 200 or so lines.

For 16 bit x86 code, and I believe Java/JVM, a function/method
should be less than 64K bytes of object code. I hope that limit
isn't a big restriction, which might be 1000 lines or so.
Ben Bacarisse
2017-07-16 10:52:45 UTC
<snip>
Post by John Forkosh
Post by Tim Rentsch
Incidentally, re: the word "comprise" - see "The Elements of
Style", by Strunk and White. The system comprises its
functions, not the other way around.
<snip>
Post by John Forkosh
However, contrary to BenB's assertion, I do see "comprise 43"
That's not contrary to any assertion I made. The book simply explains
the meaning of comprise and takes no position on "comprised of".

<snip>
--
Ben.
James R. Kuyper
2017-07-17 15:15:31 UTC
On 07/15/2017 08:11 PM, John Forkosh wrote:
...
Post by John Forkosh
(Yeah, speaking of "too long"/"too many lines"/etc,
that pretty much characterizes this whole thread.
All I did was ask, "Is there a tricky/easy way to
declare all functions in a module static?" The
answer's apparently "No." Three ascii characters,
period included.)
True - but the fact that you felt a desire for such a feature was a
strong hint that you might be using bad coding practices, and that
you're suffering from precisely those problems that good coding
practices are intended to minimize. Virtually everything you've written
subsequently in this thread has added support to that first impression.
People have given you advice beyond the simple answer that you looked
for, because they're trying to be helpful. You've made it pretty clear
that you have no interest in avoiding those problems by adopting better
coding practices, but they couldn't anticipate that in advance.
Tim Rentsch
2017-07-27 12:02:06 UTC
Post by John Forkosh
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
[...]
Now cutting to just a couple key items.
(Yeah, speaking of "too long"/"too many lines"/etc,
that pretty much characterizes this whole thread.
All I did was ask, "Is there a tricky/easy way to
declare all functions in a module static?" The
answer's apparently "No." Three ascii characters,
period included.)
A more accurate answer is that it depends on what the constraints
and target environments are.
Post by John Forkosh
Post by Tim Rentsch
Post by John Forkosh
Post by Tim Rentsch
I follow a simple rule: don't admit long function bodies.
Sometimes there are exceptions but they are few and far
between. Following this rule forces cleanup automatically,
and incrementally.
Well, yeah, "well-encapsulated idea of functionality" typically
translates to ~50-100 lines of code, or thereabouts, in my case.
Occasionally not. But the subsequent "bags-on-bags" mess more
than occasionally wreaks havoc with that. But I don't "cleanup
automatically and incrementally". Too much else to do, whereby
the cost-benefit of cleanup is usually too cost-heavy until it
becomes really necessary.
Here is a data point. The low end (50 lines) of your range there
is close to what I'm used to seeing as a 90th percentile in other
code - 90% of functions are 50 lines or less, with an average
usually in the 20's or 30's. For your gif code, 90th percentile
is about 115 lines, with an average in the low 40's.
Did you actually count that???
Yes, of course.
Post by John Forkosh
Or, hopefully, get some lint-like
program to do it for you? What program is that?
It was a combination of source metrics program (there are at
least several of these publicly available, I think), and a small
amount of massaging after the fact - a few lines of shell script,
and a short awk program.
Post by John Forkosh
I wouldn't mind
applying it to my other stuff, just to see what those numbers are.
Once you have the function line counts, it's easy:

line-counts *.c | sort -n | uniq -c | percentiles

with 'percentiles' being, eg, a 20-line awk script that shows
percentages along with running totals.
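
For anyone who would rather do the last step in C, here is a rough
sketch of the percentile arithmetic. It is not the awk script, and it
works on a plain array of per-function line counts rather than on
'uniq -c' output. The nearest-rank definition and the names
percentile() and mean() are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator: ascending ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Nearest-rank percentile: the smallest count such that at least
   p percent of the functions are that long or shorter.
   Sorts counts[] in place. */
int percentile(int *counts, size_t n, unsigned p)
{
    assert(n > 0 && p >= 1 && p <= 100);
    qsort(counts, n, sizeof counts[0], cmp_int);
    size_t rank = (n * p + 99) / 100;   /* ceil(n*p/100), 1-based */
    return counts[rank - 1];
}

/* Average lines per function. */
double mean(const int *counts, size_t n)
{
    assert(n > 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += counts[i];
    return (double)sum / (double)n;
}
```

Feed it one line count per function, however those were obtained, and
call percentile() with 50, 60, ... 90 to get a table like the one
below.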
Post by John Forkosh
C's often described (at least it used to be) as a "portable
assembly language", whereby "in the 20's and 30's" seems too short
to typically accomplish much useful. Of course, there are some
prng's and crc's, etc, etc, that can be brilliantly accomplished
in that length. But the decomposition of the "business logic"
of a program won't typically lend itself to such crisp functionality,
thereby translating to more lines of code.
To get a comparison I got statistics for samba, which is open
source and good sized: over 2800 .c files, over 35,000 functions,
including some real monsters (several functions over 1000 lines
each, with the largest almost 4000 lines) - about 1.2 million
lines of function bodies in about 1.7 million lines of .c files.
Here is the breakdown:

50-th percentile 18 lines
60-th percentile 25 lines
70-th percentile 35 lines
80-th percentile 49 lines
90-th percentile 78 lines

with an average of 33.32 lines per function, which includes an
average of 5.00 blank lines per function.
Post by John Forkosh
I'd stand by my previous remark: you decompose the overall
functionality as best you can (which frequently has to take into
account the skill and experience of the people on your team to
whom you're assigning decomposed tasks), and then #lines-of-code
just is what it is.
Okay, but if your averages are significantly above industry norms
(for some value of "industry", for which you might or might not
want to include open-source projects like samba), you might want
to re-examine your habits and processes, and see if they might be
improved.
Post by John Forkosh
[...]
Post by Tim Rentsch
Post by John Forkosh
I obviously don't know these answers, but I suppose you can
guess my opinion. And I think I can likewise guess yours.
But until such studies are actually available (might they
already be?), I guess we'll have no better benchmarks than
our own opinions.
One difference between our two approaches is I always try to
ground my conclusions in empirical results. Another is I am in
the habit of exploring alternatives based on actual experience
rather than thought experiments. If all someone has to offer
is an opinion it usually isn't worth much.
Then agreed. Nothing I've said is worth much.
Sorry to have bothered you all.
Rather than just giving up and going away, why not try collecting
some data and statistics, and see where it leads you? Who knows,
you might end up changing some opinions.
h***@gmail.com
2017-07-13 19:41:10 UTC
Permalink
On Monday, July 10, 2017 at 6:16:30 PM UTC-7, Tim Rentsch wrote:

(snip someone wrote)
Post by Tim Rentsch
Post by John Forkosh
I'm a _big_ fan of
internal comment documentation (having done my fair share of
assembly language programming), though never could get to like
Knuth's literate programming.
Probably I'll be burned at the stake for saying this, but IMO
the whole "literate programming" idea from Knuth is a big step
in the wrong direction. I have great respect for Don Knuth,
who is an amazingly smart guy, but this idea isn't one of his
better ones.
Seems to me that Web is useful if writing a book about the
actual program (see TeX: The Program and Metafont: The Program)
makes sense, such that an actual book publisher might publish it.
(Or maybe self published, but that some might read as a book.)

But in the more common case, I suspect you are closer to correct.

(I have read at least parts of those books, though not sequential
and cover to cover.)

Some years ago, I had to get Metafont running using the SunOS
Pascal compiler (that was before Web2C). Having the book helped.
Post by Tim Rentsch
As for internal comment documentation - there was a time in the
distant past when I had done more programming in assembly language
(or languages plural) than all other languages combined. And of
course in those days it was common to write comments more or less
on every line. But I grew out of that phase, and at pretty much
the same time grew out of the habit of writing "internal comments"
as they might be called. In most cases I find them more hindrance
than help. For my money I would much rather see each .c file
have two or three well-written prose paragraphs saying what the
module is for and how it hangs together. In nearly all cases I
can figure out the low-level stuff without any help (especially
if the function composition style is good), but figuring out the
high-level stuff is not nearly as accessible.
Well, supposedly comments are to help other readers of your
programs, especially if done as part of a group project, or that
you are paid to write for someone else.

But otherwise, there are some tricky, and not so obvious even if
you read your own code somewhat later, things that should be
directly commented.
Post by Tim Rentsch
Anyway FWIW those are my reactions. I hope my other comments
helped (and I'm glad you enjoyed my animal names :).
Tim Rentsch
2017-07-14 20:56:59 UTC
Permalink
Post by h***@gmail.com
(snip someone wrote)
Post by Tim Rentsch
Post by John Forkosh
I'm a _big_ fan of
internal comment documentation (having done my fair share of
assembly language programming), [...]
As for internal comment documentation - there was a time in the
distant past when I had done more programming in assembly language
(or languages plural) than all other languages combined. And of
course in those days it was common to write comments more or less
on every line. But I grew out of that phase, and at pretty much
the same time grew out of the habit of writing "internal comments"
as they might be called. In most cases I find them more hindrance
than help. For my money I would much rather see each .c file
have two or three well-written prose paragraphs saying what the
module is for and how it hangs together. In nearly all cases I
can figure out the low-level stuff without any help (especially
if the function composition style is good), but figuring out the
high-level stuff is not nearly as accessible.
Well, supposedly comments are to help other readers of your
programs, especially if done as part of a group project, or that
you are paid to write for someone else.
But otherwise, there are some tricky, and not so obvious even if
you read your own code somewhat later, things that should be
directly commented.
I'm not saying programs shouldn't have comments, or even that
they shouldn't have comments in source files. The question is
what comments, what kinds of comments, and where should different
kinds of comments go.
David Brown
2017-07-11 20:40:17 UTC
Permalink
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
The standard and correct thing would be declaring the colliding
functions static in each module. But big pain in the neck doing
all the editing. So I'm looking for some tricky-and-easy solution.
There doesn't seem to be any cc -switch to the effect of
cc -treat-all-functions-as-static-within-there-own-modules, or
cc -upon-name-collision-prefer-function-in-same-module-as-caller, or
cc -anything-like-the-above
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
No magic - just add the "static" modifiers.

Of course, now you have learned your lesson - all functions, and all
file-level variables, should be declared "static" unless you are
actively choosing to export them from the module. It makes your code
more robust when using it in different circumstances (as you are seeing
here), as well as giving the compiler more opportunities for static
error checking and optimisation.

(It was a mistake in the original design of C, IMHO, not to make
"static" the default. Exported symbols should have been explicitly
exported, such as by using "extern", and everything else should have
been "static". But unless you are a Time Lord, you have to live with
this state and get used to putting "static" everywhere.)
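
A minimal sketch of that habit, with made-up names (raster_area and this_raster_bytes are hypothetical, not taken from the original modules): helpers get internal linkage, and only the function the module really means to export keeps external linkage.

```c
#include <assert.h>

/* Internal linkage: visible only inside this translation unit, so another
 * .c file may define its own, different raster_area() without a clash. */
static int raster_area(int width, int height)
{
    return width * height;
}

/* External linkage: the one symbol this module deliberately exports. */
int this_raster_bytes(int width, int height, int bytes_per_pixel)
{
    assert(width >= 0 && height >= 0 && bytes_per_pixel > 0);
    return raster_area(width, height) * bytes_per_pixel;
}
```

With every helper declared this way, the linker only ever sees this_raster_bytes from this file, which is what makes "cc this.c that.c" safe.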
John Forkosh
2017-07-11 21:57:05 UTC
Permalink
Post by David Brown
[...]
(It was a mistake in the original design of C, IMHO, not to make
"static" the default. Exported symbols should have been explicitly
exported, such as by using "extern", and everything else should have
been "static". But unless you are a Time Lord, you have to live with
this state and get used to putting "static" everywhere.)
Seems like the better default alternative, now that you mention it.
So is there some reason it isn't?, i.e., any reason they did what
they did?
--
John Forkosh ( mailto: ***@f.com where j=john and f=forkosh )
David Brown
2017-07-12 22:09:48 UTC
Permalink
Post by John Forkosh
Post by David Brown
[...]
(It was a mistake in the original design of C, IMHO, not to make
"static" the default. Exported symbols should have been explicitly
exported, such as by using "extern", and everything else should have
been "static". But unless you are a Time Lord, you have to live with
this state and get used to putting "static" everywhere.)
Seems like the better default alternative, now that you mention it.
So is there some reason it isn't?, i.e., any reason they did what
they did?
There are several "defaults" in C that I disagree with - default "int
foo();" when a function has not been declared, default "int" in various
places, default "fall-through" behaviour in switch statements, to name a
few. Some of these were fixed in C99, others remain - consistency and
backwards compatibility is vital to C. I don't know the reasoning
behind such decisions.
David Kleinecke
2017-07-13 04:50:25 UTC
Permalink
Post by David Brown
Post by John Forkosh
Post by David Brown
[...]
(It was a mistake in the original design of C, IMHO, not to make
"static" the default. Exported symbols should have been explicitly
exported, such as by using "extern", and everything else should have
been "static". But unless you are a Time Lord, you have to live with
this state and get used to putting "static" everywhere.)
Seems like the better default alternative, now that you mention it.
So is there some reason it isn't?, i.e., any reason they did what
they did?
There are several "defaults" in C that I disagree with - default "int
foo();" when a function has not been declared, default "int" in various
places, default "fall-through" behaviour in switch statements, to name a
few. Some of these were fixed in C99, others remain - consistency and
backwards compatibility is vital to C. I don't know the reasoning
behind such decisions.
The switch fall through was preferred because that way
more than one case could share the same code.

If I had been doing it I would have an implicit break
at the end of each case unless the last statement is a
continue. That, of course, prevents one from making a
jump out over several nested (usually switches) with a
continue. But I think I could live without that.
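
For reference, the sharing in question usually looks like the sketch below, where several labels are stacked on one body. The function days_in_month() is just an illustrative example, not code from anyone in the thread.

```c
#include <assert.h>

/* Several case labels share one body: control "falls through" the
 * stacked labels until it reaches the first statement. */
int days_in_month(int month, int leap)
{
    assert(month >= 1 && month <= 12);
    switch (month) {
    case 4: case 6: case 9: case 11:
        return 30;
    case 2:
        return leap ? 29 : 28;
    default:
        return 31;
    }
}
```

Note that an implicit break would only need to be suppressed between adjacent labels to keep this particular use working.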
s***@casperkitty.com
2017-07-13 16:26:47 UTC
Permalink
Post by David Kleinecke
The switch fall through was preferred because that way
more than one case could share the same code.
More notably, the switch labels behave as goto targets, rather than as
executable statements. Perhaps it would have been helpful if a convention
had emerged to put "break" on a line immediately preceding each case label
[including the first] for which fall through wasn't required. If compilers
silently ignored unreachable breaks, that would make any missing break
statements visually obvious [if compiler squawks about redundant breaks
make it necessary to omit them even in some non-fall-through cases, such
cases would be visually indistinguishable from deliberate fall-through
ones].
Ike Naar
2017-07-13 22:11:52 UTC
Permalink
Post by s***@casperkitty.com
Post by David Kleinecke
The switch fall through was preferred because that way
more than one case could share the same code.
More notably, the switch labels behave as goto targets, rather than as
executable statements. Perhaps it would have been helpful if a convention
had emerged to put "break" on a line immediately preceding each case label
[including the first] for which fall through wasn't required. If compilers
silently ignored unreachable breaks, that would make any missing break
statements visually obvious [if compiler squawks about redundant breaks
make it necessary to omit them even in some non-fall-through cases, such
cases would be visually indistinguishable from deliberate fall-through
ones].
The amount of weasel words makes it unclear what you're trying to say.

Can we take a simple example?

#include <stdio.h>

static void example(int tag)
{
    switch (tag)
    {
    case 1:
        puts("one");
    case 2: /* deliberate fallthrough */
        puts("one or two");
        break; /* required break */
    default:
        puts("not one nor two");
    }
}

Where would it have been helpful to put extra breaks, and how would the
compiler diagnostics differ if those extra breaks were added?
s***@casperkitty.com
2017-07-13 23:12:55 UTC
Permalink
Post by Ike Naar
Post by s***@casperkitty.com
Post by David Kleinecke
The switch fall through was preferred because that way
more than one case could share the same code.
More notably, the switch labels behave as goto targets, rather than as
executable statements. Perhaps it would have been helpful if a convention
had emerged to put "break" on a line immediately preceding each case label
[including the first] for which fall through wasn't required. If compilers
silently ignored unreachable breaks, that would make any missing break
statements visually obvious [if compiler squawks about redundant breaks
make it necessary to omit them even in some non-fall-through cases, such
cases would be visually indistinguishable from deliberate fall-through
ones].
The amount of weasel words makes it unclear what you're trying to say.
Can we take a simple example?
#include <stdio.h>
static void example(int tag)
{
    switch (tag)
    {
    case 1:
        puts("one");
    case 2: /* deliberate fallthrough */
        puts("one or two");
        break; /* required break */
    default:
        puts("not one nor two");
    }
}
Where would it have been helpful to put extra breaks, and how would the
compiler diagnostics differ if those extra breaks were added?
int example(int tag, int mode1, int mode2)
{
    switch (tag)
    {
    break;
    case 1:
        puts("one");
    case 2: /* deliberate fallthrough */
        puts("one or two");
    break;
    case 3:
        puts("three");
        if (mode1)
            return 1;
        else
            return 2;
    break;
    case 4:
        puts("four");
        if (mode1)
        {
            if (mode2)
                return 1;
        }
        else
            return 2;
    break;
    case 5:
        puts("five");
    break;
    default:
        puts("not one nor two");
    }
    puts("Switch exited normally");
    return 0;
}

Many compilers would either squawk at the breaks before cases 1 and 4, or
generate wasteful code for them, but omitting those breaks (especially
for case 4) would make the following case labels look as though they
could be fallen into from above. Note that there is nothing visually
obvious that would distinguish the code before case 4 (which can't fall
through) from the code before case 5 (which can).

One could decide to either place "break" and "case" on the same line or
on separate lines with matching indents (as opposed to indenting "break"
to match the code above), but the key point would be that every case
that isn't expected to fall through should have a break whether or not
it would be reachable.
David Thompson
2017-08-21 18:48:18 UTC
Permalink
On Thu, 13 Jul 2017 00:09:48 +0200, David Brown
Post by David Brown
Post by John Forkosh
Post by David Brown
[...]
(It was a mistake in the original design of C, IMHO, not to make
"static" the default. Exported symbols should have been explicitly
<snip>
Post by David Brown
Post by John Forkosh
Seems like the better default alternative, now that you mention it.
<snip>
Post by David Brown
There are several "defaults" in C that I disagree with - default "int
foo();" when a function has not been declared, default "int" in various
places, default "fall-through" behaviour in switch statements, to name a
few. Some of these were fixed in C99, others remain - consistency and
backwards compatibility is vital to C. I don't know the reasoning
behind such decisions.
Default int and default func-return-int were clearly remnants from
'typeless' BCPL via B; if you haven't seen Ritchie's HOPL 2 paper --
(still !) available free at https://www.bell-labs.com/usr/dmr/www/
-- you should probably have a look.

switch/case fallthrough also appears to have come from BCPL, at least
the version of BCPL dmr/kmt/bwk started from, documented on a subpage
(there were several at the time) but I don't know why BCPL did that.
It's not at all fundamental the way typelessness was.
s***@casperkitty.com
2017-08-21 19:35:54 UTC
Permalink
Post by David Thompson
switch/case fallthrough also appears to have come from BCPL, at least
the version of BCPL dmr/kmt/bwk started from, documented on a subpage
(there were several at the time) but I don't know why BCPL did that.
It's not at all fundamental the way typelessness was.
Switch/case is not a "conditional execution statement" construct in the
style of if/else, but is instead a form of computed goto. It is commonly
used [with the aid of "break"] as an "execute one of the following blocks"
construct, but the ability to include a case label within a while loop
within a case statement [as famously illustrated by Tom Duff] shows that
it's really just a form of "goto".
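
For anyone who hasn't seen it, here is a sketch in the style of Duff's device. It is not his original code: his version wrote to a single memory-mapped register without incrementing the destination, whereas this variant (duff_copy is a made-up name) copies memory to memory.

```c
#include <assert.h>
#include <stddef.h>

/* Copy count bytes from src to dst with the loop unrolled eight times.
 * The switch jumps into the middle of the do-while body to handle the
 * remainder, which only works because case labels are plain jump targets. */
void duff_copy(char *dst, const char *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    if (count == 0)
        return;
    size_t passes = (count + 7) / 8;    /* number of trips through the loop */
    switch (count % 8) {
    case 0: do { *dst++ = *src++;
    case 7:      *dst++ = *src++;
    case 6:      *dst++ = *src++;
    case 5:      *dst++ = *src++;
    case 4:      *dst++ = *src++;
    case 3:      *dst++ = *src++;
    case 2:      *dst++ = *src++;
    case 1:      *dst++ = *src++;
            } while (--passes > 0);
    }
}
```

With count == 10, the switch enters at case 2, copies two bytes, and then the loop makes one more full pass of eight.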
David Kleinecke
2017-08-22 01:37:17 UTC
Permalink
Post by s***@casperkitty.com
Post by David Thompson
switch/case fallthrough also appears to have come from BCPL, at least
the version of BCPL dmr/kmt/bwk started from, documented on a subpage
(there were several at the time) but I don't know why BCPL did that.
It's not at all fundamental the way typelessness was.
Switch/case is not a "conditional execution statement" construct in the
style of if/else, but is instead a form of computed goto. It is commonly
used [with the aid of "break"] as an "execute one of the following blocks"
construct, but the ability to include a case label within a while loop
within a case statement [as famously illustrated by Tom Duff] shows that
it's really just a form of "goto".
An intelligent compiler should be able to decide whether what
is presented as switch/case is best implemented as if/else or a
jump table. I have seen lots of switch/case coding with only
two or three cases - such, it seems to me, are best turned into
if/else logic.
David Brown
2017-08-22 09:37:24 UTC
Permalink
Post by David Kleinecke
Post by s***@casperkitty.com
Post by David Thompson
switch/case fallthrough also appears to have come from BCPL, at least
the version of BCPL dmr/kmt/bwk started from, documented on a subpage
(there were several at the time) but I don't know why BCPL did that.
It's not at all fundamental the way typelessness was.
Switch/case is not a "conditional execution statement" construct in the
style of if/else, but is instead a form of computed goto. It is commonly
used [with the aid of "break"] as an "execute one of the following blocks"
construct, but the ability to include a case label within a while loop
within a case statement [as famously illustrated by Tom Duff] shows that
it's really just a form of "goto".
An intelligent compiler should be able to decide whether what
is presented as switch/case is best implemented as if/else or a
jump table. I have seen lots of switch/case coding with only
two or three cases - such, it seems to me, are best turned into
if/else logic.
You are mixing up the logical meaning of the statement in C, and its
implementation. Compilers can (and do) implement it in a variety of
ways. But that is a matter of optimisation and quality of
implementation - it has nothing to do with what the "switch" statement
does in C.
bartc
2017-08-22 11:05:05 UTC
Permalink
Post by David Kleinecke
Post by s***@casperkitty.com
Switch/case is not a "conditional execution statement" construct in the
style of if/else, but is instead a form of computed goto. It is commonly
used [with the aid of "break"] as an "execute one of the following blocks"
construct, but the ability to include a case label within a while loop
within a case statement [as famously illustrated by Tom Duff] shows that
it's really just a form of "goto".
An intelligent compiler should be able to decide whether what
is presented as switch/case is best implemented as if/else or a
jump table. I have seen lots of switch/case coding with only
two or three cases - such, it seems to me, are best turned into
if/else logic.
No. If you are doing a series of unrelated tests, for example whether X==A,
or Y==B, or Z==C, then if-else is suited to that.

But if the pattern is whether X==A or X==B or X==C, then switch is a
better way of expressing it. (Because of C limitations, only when X has
an integer type (whose value is not changed by the test), and only when A, B
and C are constants.)


It's not really to do with how many tests there are, unless there's only
one. Then you might still use switch if you expect the number to increase.
--
bartc
Gordon Burditt
2017-08-25 02:47:27 UTC
Permalink
Post by David Kleinecke
An intelligent compiler should be able to decide whether what
is presented as switch/case is best implemented as if/else or a
jump table. I have seen lots of switch/case coding with only
two or three cases - such, it seems to me, are best turned into
if/else logic.
Some (many?) of them use *BOTH*. If for some reason you use a
switch with cases 1, 10, 1000, 10000, 100000, 1000000, and 10000000,
it won't use a jump table because it would take a *lot* of space.
If that switch also had cases for 'a' - 'z' and 'A' - 'Z', with
only a few letters missing, it might use one jump table for 'a' -
'z', one jump table for 'A' - 'Z', and if/else for the rest and to
decide which jump table to enter, if any. Or maybe it would decide
that 'a' - 'z' and 'A' - 'Z' are close enough to put them in one
jump table. Jump tables generally need an if/else around them
anyway to deal with the out-of-range values that go to the default:
case, but sometimes the compiler can be smart about this:

switch (x & 3) {
case 0: ......; break;
case 1: ......; break;
case 2: ......; break;
case 3: ......; break;
}
doesn't need code for a default case.

The code generated by this compiler was sometimes confusing (if you
needed to read the generated code) because it treated all case label
values as unsigned, whether the variable involved was unsigned or
not, and did only unsigned comparisons. It still worked fine,
though. It did mean that a switch with values -40 thru 40 would
use TWO jump tables, though, (with if/else to determine which to
use) although I have yet to see code that really did that.

I needed to read it because someone was trying to make the compiler
smarter about code generation and sometimes he overdid it and
generated buggy code. I was helping to test it. Generally I wanted
to point at a section of code that was *WRONG* before reporting a
bug. Sometimes the compiler had really checked for a special case
and generated code that looked really wrong but actually, in this
particular situation, wasn't.

I believe the break-even point between if/else and a jump table was
determined (for this machine) to be about 5 or 6 cases (assuming
the case values are in a block of near-consecutive values). I don't
remember whether that was based on code size or execution time.
Actually, I think the two break-evens were fairly close together.

The triple-compile test proved very valuable in initial testing.

1. Compile the source code with whatever known-good compiler you've
got. If this fails, the source code has been botched. This almost
never happened because the guy working on it would at least compile
it once before giving copies to anyone else. Either that, or your
"known-good compiler" wasn't good. Back up to an older version.
Another possibility is that you added a feature to the compiler, and
a later version of the source started using that feature. In this case,
using a new feature before it's stable may require backing it out.

2. Compile the (same) source code with the compiler generated in
step 1. If this failed, with the compiler dumping core or the
assembler barfing on syntax errors in the generated assembly code,
some fairly serious errors in code generation got in.

3. Compile the (same) source code with the compiler generated in
step 2. The output should be byte-for-byte identical to the compiler
generated in step 2, with the possible exception of time stamps in
the executable, if those are present (Avoid using __DATE__ and
__TIME__. Linkers might also put in such a time stamp). If there's
a difference, start comparing individual object files to figure out
which one(s) differ. Then start running diffs against the
generated-code assembly output to find out where the differences
are.

GOTHIER Nathan
2017-07-11 23:34:55 UTC
Permalink
On Tue, 11 Jul 2017 22:40:17 +0200
Post by David Brown
(It was a mistake in the original design of C, IMHO, not to make
"static" the default. Exported symbols should have been explicitly
exported, such as by using "extern", and everything else should have
been "static". But unless you are a Time Lord, you have to live with
this state and get used to putting "static" everywhere.)
I don't write "static" anywhere, however I usually choose clear names to avoid
collisions. I'm fine with the original C design providing an external
linkage as default since it is what any programmer would expect by writing more
than once in a sequence the same identifier.
h***@gmail.com
2017-07-13 19:53:42 UTC
Permalink
On Tuesday, July 11, 2017 at 1:40:28 PM UTC-7, David Brown wrote:

(snip)
Post by David Brown
(It was a mistake in the original design of C, IMHO, not to make
"static" the default. Exported symbols should have been explicitly
exported, such as by using "extern", and everything else should have
been "static". But unless you are a Time Lord, you have to live with
this state and get used to putting "static" everywhere.)
There are questions about how much of C was inherited from PL/I.
I suspect it is more than generally known.

PL/I variables can be STATIC, AUTOMATIC, or CONTROLLED, and
the first and last of those can be either INTERNAL or EXTERNAL,
with INTERNAL the default.

But there is nothing like C's file scope, and internal procedures
are actually inside an external procedure.

The file scope idea seems relatively new to C, and it was likely
not as well understood at the time. The default of external made
it easier for those used to many other languages.

Some time ago, I had to take a single large C file, and split
it into two. (There was a reason, but I forget now.) Many
variables were static, and so no longer connected to ones in
the other file. It didn't take so long, but was a little tricky
to get right.
h***@gmail.com
2017-07-13 20:10:01 UTC
Permalink
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
Seems to me that the not-mentioned solution is to rename one set
(or both) such that they don't conflict.

There is (or was) an IBM convention where each project (such as
each compiler, library, or OS subsystem) has a three letter code,
and all external names (and error messages) start with that code.

If the names aren't too common, a quick sed script should go through
one and change all the names such that they don't conflict.
(If you have function names that are english words, the change
might change those in printed messages, or other places that
you didn't want them changed.)
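
A variation on the same renaming idea, sketched here with a made-up prefix "tht_" and a made-up helper raster_label(): let the preprocessor do the renaming instead of sed. Macros are not expanded inside string literals, so printed messages stay untouched. The defines could also be passed on the command line as -Dnew_raster=tht_new_raster, at the cost of also applying to any system header that happens to use the same identifier.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical renames, placed after the system includes of that.c. */
#define new_raster   tht_new_raster
#define raster_label tht_raster_label

int new_raster(int width, int height)     /* compiled as tht_new_raster */
{
    assert(width > 0 && height > 0);
    return width * height;
}

const char *raster_label(void)            /* compiled as tht_raster_label */
{
    return "new_raster failed";           /* literal text is not renamed */
}
```

Callers inside the same file keep writing new_raster(); only the symbol the linker sees changes. Any other identifier spelled the same way (a struct member, say) gets renamed too, which is harmless as long as it happens consistently within the file.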

Otherwise, grep and diff are very useful when working on programs.
Many years ago, I ported gnu grep and diff to OS/2, when I was
doing development on that system. (And before home unix systems
were easily affordable.)
a***@yahoo.com
2017-07-17 15:59:23 UTC
Permalink
Post by John Forkosh
I have two large modules, let's say this.c and that.c, that
unexpectedly need to be compiled together,
cc this.c that.c -o thisandthat
Everything would work okay (just one main() function, etc),
except that each module contains several dozen functions with
name collisions with the other module, e.g., new_raster() and
delete_raster() in both modules, but the raster structs are a bit
different in this.c and that.c, so the functions are different.
The standard and correct thing would be declaring the colliding
functions static in each module. But big pain in the neck doing
all the editing. So I'm looking for some tricky-and-easy solution.
There doesn't seem to be any cc -switch to the effect of
cc -treat-all-functions-as-static-within-there-own-modules, or
cc -upon-name-collision-prefer-function-in-same-module-as-caller, or
cc -anything-like-the-above
So, is there any such -switch, or any other way to accomplish this
without lots of pesky editing to put the static keyword all over
the place? Thanks.
--
I'm obviously a week too late, but yes, I know how to reduce manual editing in a case similar to yours (but not in your specific case), assuming that the number of functions that you want exported is much lower than the number of functions with conflicting names.
My trick would work on the vast majority of C code bases. Unfortunately, a couple of the posts in the middle of the thread suggest that it wouldn't work on your specific code base, because you are using one of the weirdest features of the C language, a feature that very few C coders have used in the last 20-25 years.

I'd still explain my trick, just for fun.

The idea is to exploit a similarity between C and C++.

At the beginning of your file, just after all includes, add 'namespace that {'
At the end of your file add '}'
Below that add code that exports all functions that you want to be visible to other C modules. Like:

extern "C" int func1(int a, double b) { return that::func1(a, b); }
extern "C" void func2(char *buf, int len) { that::func2(buf, len); }
etc...

Then compile your module as C++.