Post by Patrick.Schluter Post by Ben Bacarisse Post by bartc Post by Ben Bacarisse
Not me. I prefer declarations all tucked away in a corner so
they don't clutter up the code.
I prefer to have my scopes all tucked away so the names don't
clutter up the rest of the code. That way I have smaller regions
of code to verify.
Then it means having to analyse every line back to the start of
the function to find the declaration of some variable. Just in
case it might have been declared somewhere in between.
When you just put all the declarations at the top of a function you
have to check every line for occurrences of every variable (if you
are doing any sort of semi-formal reasoning about your code), but
both this and your remark, whilst being genuine problems with each
approach, are largely trivial in that there are simple tools to
help in both cases.
It's hard to put into words why I find it so very much more
comfortable to do it my way. I suspect a lot of it comes down to
the fact that we think about code in very different ways, but here
I'm currently in the process of systematically replacing, across our
project, the style of declaration, from the standard C way to the
new C99 in-code declarations. What I have noticed during that
process is that
- the code always gets shorter (1 line of declaration + 1 line of first
usage gets replaced by 1 line of declaration with initialisation).
- this often leads to some optimisations as very often the old
declarations initialised the variables for no reason.
- I've found several bugs in the code where variables got mixed up
in big functions.
- variables declared in the for loop make it easier to copy and paste code.
- refactoring long functions into several smaller ones becomes an
order of magnitude easier if you declare the variable where it is
used. I was surprised to see how often a variable was declared at
the beginning and only used twice at the end of the function in a
very unlikely error path.
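
As a rough sketch of the first and fourth points in the list above, here is the same small function written both ways. The function names and the summing task are made up for illustration; they are not taken from the project being discussed.

```c
#include <stddef.h>

/* C90 style: declarations at the top of the block, first assignment later. */
long sum_c90(const int *values, size_t count)
{
    size_t i;
    long total;

    total = 0;
    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* C99 style: each variable is declared and initialised where it is first used. */
long sum_c99(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)  /* 'i' exists only inside the loop */
        total += values[i];
    return total;
}
```

Because 'i' in the second version is scoped to the loop, the loop can be copied into another function without checking whether an 'i' is already declared at the top.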
The funny thing is that doing that has brought the need for the if(
with declaration as the pattern of
type var = call();
if(var == something) callsomething(var);
is really very frequent.
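
C99 does not allow a declaration inside the if( condition itself, so one way to approximate the scoping described above is a bare compound statement around the pattern. This is only a sketch: lookup_status(), handle_ready(), STATUS_READY and handled_count are hypothetical stand-ins for call(), callsomething() and "something".

```c
/* Hypothetical stand-in for call(): yields a status for a key. */
int lookup_status(int key) { return key % 3; }

/* Hypothetical stand-in for callsomething(): counts how often it ran. */
int handled_count = 0;
void handle_ready(int status) { (void)status; handled_count++; }

enum { STATUS_READY = 0 };

void process(int key)
{
    /* The extra braces limit 'status' to the two lines that need it. */
    {
        int status = lookup_status(key);
        if (status == STATUS_READY)
            handle_ready(status);
    }
    /* 'status' is out of scope here, so later code cannot pick it up
       by mistake. */
}
```

The cost is one extra level of nesting, which is presumably why an if( with declaration looks attractive for such a frequent pattern.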
I have read through the comments in this thread with much
interest. Sometime after seeing this posting I thought I would
go through some code on a similar venture to the one described
above. I have enough experience with both styles so that I
didn't feel a need to make changes, but I wanted to see where and
how big the differences would be. I looked at sources from two
different projects: one, a fair number of functions that had
been written under C90 rules; and two, to add some perspective,
a single, rather complex, function that had been written under
C99 rules.
The first case had 36 functions, averaging about 22 lines each
but with a standard deviation of about 23 lines. (Those counts
include only the function body, not the function declaration
lines or the trailing close brace line.) The median length was
16 lines. The ten longest functions had 30, 34, 34, 37, 40, 49,
59, 83, and 98 lines respectively.
I found a few instances where "within statements" declarations
looked like an improvement: one regular declaration, two as
part of for() loops (and two other plausible ones). There were
three instances where C99 initializer rules would have helped
(ie, but keeping C90 declaration placement).
Balancing that, there were just as many instances (I didn't
count these exactly) where a "within statements" declaration
was possible but looked like it gave a worse result, either
because of line length overflow or adaptability of code to
potential future changes, or circumstances that related to
those conditions. (Note added in editing: in most cases the
question didn't come up, either because the declaration already
had an initializer, or because it couldn't have one (ie, that
did anything useful) for some reason.)
What mattered much more was not where declarations were put but
how long the functions are. All the functions longer than about
30 or 35 lines clearly are in need of some recomposing (which in
fact looks fairly straightforward to do in this case). So that
seems like the high-order bit here.
The second case was interesting in a different way, partly
because it was written under C99 rules and already made use of a
"within-statements" initializing declaration. The code under
consideration is a single function, somewhat complicated but not
so large that breaking into smaller pieces is called for.
What makes the function interesting is four related variables,
let me call them a, b, x, and y. In the main flow of the
algorithm, x and y are set based on the value of a parameter, and
a and b are then set based on the values of x and y. However,
for a small subset of input values, there is a fast path that only
sets 'a' (and then falls into some subsequent code). Practically
speaking the variable a must be declared before being given its
initial value. The question is what to do about the other three
related variables. Should they be declared parallel to a,
reflecting their relationship to a in the main line of the
algorithm? Or should they be declared separately, in only the
"slow path" of the algorithm, even though that is the predominant
case? (Note: the fast path is solely about performance - the
slow path always works, it just runs slower than the fast path.)
The code could be written either way. Ultimately I decided that
it made more sense to declare a, b, x, and y all together, even
though b, x, and y could have been declared more locally (and
clearly the function being fairly small played a significant
part in that decision). I think either decision is plausible, as
they both have their plusses and minuses, but on balance grouping
them all together seems like a better fit. I expect other people
would make a different decision.
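
To make the two options concrete, here is a minimal sketch of both layouts. The arithmetic is a placeholder invented for illustration (the real function is not shown in this post); only the shape matters: x and y come from the parameter, a and b come from x and y, and a fast path sets only a before falling into shared code.

```c
/* Option 1: a, b, x, and y declared together, mirroring the main line
   of the algorithm. */
int compute_grouped(int n)
{
    int a, b, x, y;

    if (n >= 0 && n < 10) {
        a = n;              /* fast path: sets only 'a' */
    } else {
        x = n / 10;         /* slow path: x and y from the parameter */
        y = n % 10;
        a = 10 * x;         /* then a and b from x and y */
        b = y;
        a += b;
    }
    return a * 2;           /* subsequent code shared by both paths */
}

/* Option 2: only 'a' declared up front; b, x, and y are local to the
   slow path. */
int compute_local(int n)
{
    int a;

    if (n >= 0 && n < 10) {
        a = n;              /* fast path: sets only 'a' */
    } else {
        int x = n / 10;
        int y = n % 10;
        int b = y;
        a = 10 * x + b;
    }
    return a * 2;           /* subsequent code shared by both paths */
}
```

In this placeholder the slow path gives the same answer as the fast path for every input, matching the note that the fast path is purely about performance.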
After all of that, what are my conclusions? I think the main
conclusions are these:
One: I normally read function bodies "all at once" rather
than a line at a time. For me reading a function means
getting the whole function body in my head so I can think
about all of it at the same time.
Two: As both a cause and effect of the above, I very strongly
prefer functions that are short, and that have been kept short.
Three: If function bodies are short, declaration placement
doesn't matter so much.
Four: There are several competing factors for where to put
declarations. Some favor putting declarations in one place,
some favor a different placement. Neither choice is right
in all cases; nor does one seem to predominate.
Five: As a corollary to (three) and (four), if refactoring
needs to be done, focusing on function length is more likely
to give a better ROI than moving declarations around.
And now I look forward to hearing all the kind responses and