Discussion:
Allocate variables within code blocks.
itsme.susila
2018-04-08 10:56:43 UTC
Permalink
Allocate variables within code blocks

Within a function, I will have many for loops:
[code]
void myfn(void) {
    int x, y, z;
    for (int i = 0; ...; ...) {
        u64 u, v, w;

        u = 3;
        x = u * 6;
        ......
    }
    return;
}
[/code]

Which is the better practice (giving faster code):
Should I declare u, v, w at the start of myfn()? I know beforehand they
would be used many times over.

Thanks.
Ben Bacarisse
2018-04-08 11:29:44 UTC
Permalink
Post by itsme.susila
Allocate variables within code blocks
I'd reconsider that design if I were you. I suppose it depends on what
the program is, but short single-purpose functions generally make for
better code.
Post by itsme.susila
[code]
void myfn(){
int, x, y, z;
for (i = 0,...){
u64 u, v, w;
u = 3;
x = u * 6;
......
}
return;
}
[/code]
Should I allocate u,v,w at the start of myfn(); I know beforehand they
would be used many times over.
Modern compilers will generate good code in both cases. It's almost
always better to write for clarity rather than for the generated code.
Once you have a clear program, measure its performance to see which
parts are taking up the time.
--
Ben.
Stefan Ram
2018-04-08 12:20:01 UTC
Permalink
Post by Ben Bacarisse
Modern compilers will generate good code in both cases. It's almost
always better to write for clarity rather than generated code.
But what is clarity (in this case)?

I have the impression that one of the most general and fundamental
laws of software engineering (in most languages) is:

| The scope of a name should be as small as possible.

However, I do not know everything about this law and its
application in C:

Does it have a name?

Was it written down somewhere by a C authority
such as Ritchie, Kernighan, or Pike?
(Maybe one should add Plauger to the list?)

In any case, this law would dictate declaring the
variables /within the loop/.

Well, I /do/ have some quotations, but for C++ or Java.

"Keep scopes small"
C++ Core Guidelines; Bjarne Stroustrup and Herb Sutter

"Minimize the scope"
Effective Java; Joshua Bloch.

Ok, I can't read this site because of its (ab)use
of JavaScript, so just from the URI, there is

wiki.c2.com/?ReduceScopeOfVariable

. This gives the name "Reduce Scope Of Variable"
leading to

refactoring.com/catalog/reduceScopeOfVariable.html

. (Actually, "variable" is too special, because we
would like to reduce the scope of other names, too.)

WRT the subject line: "allocating variables" expresses
too low-level thinking (as if one were writing in
assembler). In C, we define variables with block
scope.
Spiros Bousbouras
2018-04-08 13:00:49 UTC
Permalink
On 8 Apr 2018 12:20:01 GMT
Post by Stefan Ram
Post by Ben Bacarisse
Modern compilers will generate good code in both cases. It's almost
always better to write for clarity rather than generated code.
But what is clarity (in this case)?
I have the impression that one of the most general and fundamental
| The scope of a name should be as small as possible.
The concept is familiar to me but not any specific phrasing, and therefore
I can't quote anyone. The idea that global variables should be avoided is
a special case. Like all software laws, it's a rule of thumb rather than
something which should always be followed. Sometimes global variables
(which by definition violate the rule as much as possible) clarify the
code. Also, if one worries about making every variable have as small a
scope as possible, then this actually creates a bigger distraction than
choosing from the beginning a scope which is clearly large enough for the
intended purpose (say, function scope) but not "too large", and then
focusing one's attention on what the code is supposed to do.
Post by Stefan Ram
However, I do not know everything about this law and it's
Does it have a name?
Was it written down somewhere by a C authority
such as Richie, Kernighan, or Pike?
(Maybe one should add Plauger to the list?)
So, in any way, this law would dictate to declare the
variables /within the loop/.
Well, I /do/ have some quotations, but for C++ or Java.
"Keep scopes small"
C++ Core Guidelines; Bjarne Stroustrup and Herb Sutter
"Minimize the scope"
Effective Java; Joshua Bloch.
[...]
GOTHIER Nathan
2018-04-08 14:37:08 UTC
Permalink
On 8 Apr 2018 12:20:01 GMT
Post by Stefan Ram
Does it have a name?
It's called "logic" or "common sense". ;-)
--
GOTHIER Nathan
Joe Pfeiffer
2018-04-08 14:53:25 UTC
Permalink
Post by Stefan Ram
Post by Ben Bacarisse
Modern compilers will generate good code in both cases. It's almost
always better to write for clarity rather than generated code.
But what is clarity (in this case)?
I have the impression that one of the most general and fundamental
| The scope of a name should be as small as possible.
However, I do not know everything about this law and it's
Does it have a name?
Was it written down somewhere by a C authority
such as Richie, Kernighan, or Pike?
(Maybe one should add Plauger to the list?)
So, in any way, this law would dictate to declare the
variables /within the loop/.
Well, I /do/ have some quotations, but for C++ or Java.
"Keep scopes small"
C++ Core Guidelines; Bjarne Stroustrup and Herb Sutter
"Minimize the scope"
Effective Java; Joshua Bloch.
Ok, I can't read this site because of its (ab)use
of JavaScript, so just from the URI, there is
wiki.c2.com/?ReduceScopeOfVariable
. This gives the name "Reduce Scope Of Variable"
leading to
refactoring.com/catalog/reduceScopeOfVariable.html
. (Actually, "variable" is too special, because we
would like to reduce the scope of other names, too.)
WRT the subject: "allocating variables" expresses
too low-level thinking (as if one would write in
assembler). In C, we define variables with block
scope.
What this demonstrates is: yes, you should in general keep the scope of a
variable as small as possible, no matter what language you're writing
in.
bartc
2018-04-08 15:32:14 UTC
Permalink
Post by Joe Pfeiffer
Post by Stefan Ram
| The scope of a name should be as small as possible.
"Keep scopes small"
"Minimize the scope"
What this demonstrates is yes, you should in general keep the scope of a
variable as small as possible no matter what language you're writing
in.
Imagine a tool which did that refactoring for you.

So it would look at the span of each variable - from its first use to
its last access - extract its declaration and move it to the first use,
or change the first assignment into an initialisation.

It would also look at the last access, and perhaps enclose the span
within its own block.

And, also, perhaps, it would recognise that uses of the same variable
within separate blocks are distinct, and split it into multiple scopes.

/I/ can imagine what a mess it would make of most of my programs, and
how hard it would be to do subsequent editing when the spans of some
variables need to change.

My suggestion: keep functions small (not ridiculously so, like 5 lines;
small enough to see all at once would be a good rule). Then the scope of
the variables used in the function will be just as small.
--
bartc
Joe Pfeiffer
2018-04-08 21:19:29 UTC
Permalink
Post by bartc
Post by Joe Pfeiffer
Post by Stefan Ram
| The scope of a name should be as small as possible.
"Keep scopes small"
"Minimize the scope"
What this demonstrates is yes, you should in general keep the scope of a
variable as small as possible no matter what language you're writing
in.
Imagine a tool which did that refactoring for you.
So it would look at the span of any variable: where it is first used
to its last access, and extracting its declaration and moving it to
the first use, or changing the first assignment to an initialisation.
Also looking at the last access, and perhaps enclosing the span within
its own block.
And, also, perhaps, recognising that uses of the same variable within
separate blocks are distinct, and splitting it into multiple scopes.
/I/ can imagine what a mess it would make of most of my programs, and
how hard it would be to do subsequent editing when the spans of some
variables need to change.
My suggestion: keep functions small (not ridiculously so like 5 lines;
enough to see all at once would be a good rule). Then the scope of the
variables used in the function will be just as small.
That's pretty much the rule I use.

I would be interested in seeing what a tool such as you describe would
do with my code; creating new blocks as defined by first/last use of a
variable would pretty obviously do more harm than good, but a tool that
moved variables within the smallest possible predefined block wouldn't
be a bad idea at all.
Tim Rentsch
2018-04-09 14:59:41 UTC
Permalink
Post by Joe Pfeiffer
[...]
My suggestion: keep functions small (not ridiculously so like 5 lines;
enough to see all at once would be a good rule). Then the scope of the
variables used in the function will be just as small.
That's pretty much the rule I use.
I don't have a rule about keeping functions small; it simply
falls out from the drive to keep each function understandable as a
single unit. Also I don't think 5 lines or less is ridiculously
small. My most recent active project has an average function
length of somewhat over 7 lines, and a median function length
of 3 lines.
Tim Rentsch
2018-04-11 13:29:50 UTC
Permalink
Post by Tim Rentsch
Post by Joe Pfeiffer
[...]
My suggestion: keep functions small (not ridiculously so like 5 lines;
enough to see all at once would be a good rule). Then the scope of the
variables used in the function will be just as small.
That's pretty much the rule I use.
I don't have a rule about keeping functions small; it simply
falls out from the drive to keep each function understandable as a
single unit. Also I don't think 5 lines or less is ridiculously
small. My most recent active project has an average function
length of somewhat over 7 lines, and a median function length
of 3 lines.
For comparison, here are some numbers from my currently active
project:

median length: 14 lines
60th percentile: 20 lines
70th percentile: 30 lines
average: 34 lines (to two digits, 33.91)
80th percentile: 46 lines
90th percentile: 87 lines
95th percentile: 133 lines
99th percentile: 311 lines (with just over 600 functions total)

Not the kind of numbers I like to see. But even in this scary
environment, 24% of functions are five lines or fewer.
bartc
2018-04-11 14:58:34 UTC
Permalink
Post by Tim Rentsch
Post by Tim Rentsch
Post by Joe Pfeiffer
[...]
My suggestion: keep functions small (not ridiculously so like 5 lines;
enough to see all at once would be a good rule). Then the scope of the
variables used in the function will be just as small.
That's pretty much the rule I use.
I don't have a rule about keeping functions small; it simply
falls out from the drive to keep each function understandable as a
single unit. Also I don't think 5 lines or less is ridiculously
small. My most recent active project has an average function
length of somewhat over 7 lines, and a median function length
of 3 lines.
For comparison, here are some numbers from my currently active
median length: 14 lines
60th percentile: 20 lines
70th percentile: 30 lines
average: 34 lines (to two digits, 33.91)
80th percentile: 46 lines
90th percentile: 87 lines
95th percentile: 133 lines
99th percentile: 311 lines (with just over 600 functions total)
Not the kind of numbers I like to see. But even in this scary
environment, 24% of functions are five lines or fewer.
I got these figures for some of my code:

             Lines per function
             Average   Min   Max

Project #1      31      3    304
Project #2      41      4    586
Project #3      22      3    412
Project #4      27      3    410

* Average 30 lines/function across all 3400 functions

* These are not C sources, as it is not easy to reliably find the start
of each function in that language, at least not with a simple script

* Number of lines is the /pitch/ between the start of one function and
the next. The body of the function is usually 3 lines smaller, and will
include any blank lines and comments.

* 3-line functions are usually empty bodies. 4-line ones, often the body
just calls another function.

* Large functions are usually associated with large switch statements
(eg. decoding x64 binary code by looking at the 256 possible values of
the next byte) which can't be easily decomposed
--
bartc
Tim Rentsch
2018-04-11 18:28:45 UTC
Permalink
Post by bartc
Post by Tim Rentsch
Post by Tim Rentsch
Post by Joe Pfeiffer
[...]
My suggestion: keep functions small (not ridiculously so like 5 lines;
enough to see all at once would be a good rule). Then the scope of the
variables used in the function will be just as small.
That's pretty much the rule I use.
I don't have a rule about keeping functions small; it simply
falls out from the drive to keep each function understandable as a
single unit. Also I don't think 5 lines or less is ridiculously
small. My most recent active project has an average function
length of somewhat over 7 lines, and a median function length
of 3 lines.
For comparison, here are some numbers from my currently active
median length: 14 lines
60th percentile: 20 lines
70th percentile: 30 lines
average: 34 lines (to two digits, 33.91)
80th percentile: 46 lines
90th percentile: 87 lines
95th percentile: 133 lines
99th percentile: 311 lines (with just over 600 functions total)
Not the kind of numbers I like to see. But even in this scary
environment, 24% of functions are five lines or fewer.
Average Min Max Lines per function
Project #1 31 3 304
Project #2 41 4 586
Project #3 22 3 412
Project #4 27 3 410
* Average 30 lines/function across all 3400 functions
* Not C sources as it is not easy to reliably find the start of each
function in that language, not using a simple script
* Number of lines is the /pitch/ between the start of one function and
the next. The body of the function is usually 3 lines smaller, and
will include any blank lines and comments.
My line counts are always lines just of the function body,
excluding the outermost braces but including any blank
lines and/or comments.
Jorgen Grahn
2018-04-11 10:25:20 UTC
Permalink
Post by bartc
Post by Joe Pfeiffer
Post by Stefan Ram
| The scope of a name should be as small as possible.
"Keep scopes small"
"Minimize the scope"
What this demonstrates is yes, you should in general keep the scope of a
variable as small as possible no matter what language you're writing
in.
Imagine a tool which did that refactoring for you.
...
Post by bartc
/I/ can imagine what a mess it would make of most of my programs, and
how hard it would be to do subsequent editing when the spans of some
variables need to change.
Well, no one mentioned such a tool; it seems to me you're switching
subjects.

As for tools: I'm all for small scopes, but you lose half the benefit
if you write the code the wrong way and then autoconvert it. I can't
think of a reason to do it that way.

/Jorgen
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Ben Bacarisse
2018-04-08 18:27:58 UTC
Permalink
Post by Stefan Ram
Post by Ben Bacarisse
Modern compilers will generate good code in both cases. It's almost
always better to write for clarity rather than generated code.
But what is clarity (in this case)?
In this case it's probably to not have "many for loops" "within a
function". Hard to be sure, but that sounds like a bad design.
Post by Stefan Ram
I have the impression that one of the most general and fundamental
| The scope of a name should be as small as possible.
Provided you don't take it literally, that's a suggestion I tend to
follow. In particular, I would not move (or place) a declaration just
so that I could re-use the variable.

It makes it simpler to reason about the code. I know that is no longer
fashionable, but I'm not going to stop doing it, even if only
informally.
--
Ben.
James Kuyper
2018-04-08 11:54:17 UTC
Permalink
Post by itsme.susila
Allocate variables within code blocks
[code]
void myfn(){
int, x, y, z;
for (i = 0,...){
u64 u, v, w;
u = 3;
x = u * 6;
......
}
return;
}
[/code]
Should I allocate u,v,w at the start of myfn(); I know beforehand they
would be used many
times over.
It's unlikely for that choice to have any significant impact on the
execution speed of the code. Therefore, you shouldn't be using speed as
the basis for making this decision.
If those variables have the same meaning in each of the for loops, then
it might be OK (but see below for other issues). However, if they have
different meanings, use different names. If every use (direct or
indirect) of a variable named foo is complete before the first use of a
variable named bar, and both defined in the same block, then most modern
compilers can be counted on to store both variables in the same
location, at least if you turn on at least minimal optimizations, so you
won't be wasting any memory by defining the extra variables.
If you define them separately inside each for() block, then they will
start out uninitialized at the start of each pass through the loop. If
the code inside the loop tries to use the value of one of those
variables before a value has been stored in that variable, it will find
an indeterminate value, which is generally not a good thing. However,
this is something that modern compilers are pretty good at noticing and
warning you about.
If you define them outside of all of the for() blocks, the value they
have at the end of each pass through the loop will be the value they
have at the start of the next pass through that loop; and the value they
have at the end of each loop will be the value they start with at the
beginning of the next loop. If your code is written to make use of this
fact, that can be a good thing. Otherwise, it can be the source of
puzzling bugs.
I recently uncovered a defect in code written by someone else, which
relied upon this carry-over feature, despite the fact that the variables
were defined inside the loop, so there's no guarantee that any
carry-over would actually occur. The code also depended upon those
variables being zero-initialized, despite the fact that it didn't
zero-initialize them. The code actually worked as intended, because the
compiler always allocated each variable in the same location in memory
during each pass through the loop, and it did zero-initialize that
location before the start of the loop - but that was VERY sloppy
programming. When I reported this defect to the person who wrote it, I
got a response which said "Well, as long as the code works, there's no
need to fix it" - which goes a long way toward explaining the very poor
quality of that code. Unfortunately, there's no one I can appeal to who
would insist that it be fixed - they write code like this because
their management doesn't mind them writing code like this.
Sorry - that last paragraph is only tangentially related to your
question, I just felt a need to complain to someone about what I found.
Spiros Bousbouras
2018-04-08 12:50:43 UTC
Permalink
On Sun, 8 Apr 2018 07:54:17 -0400
Post by James Kuyper
If you define them outside of all of the for() blocks, the value they
have at the end of each pass through the loop will be the value they
have at the start of the next pass through that loop; and the value they
have at the end of each loop will be the value they start with at the
beginning of the next loop. If your code is written to make use of this
fact, that can be a good thing. Otherwise, it can be the source of
puzzling bugs.
A common application is parsing a string where for or while loops, one
after the other, do successive parts of the parsing. In such a situation
it's convenient to have the same index variable for all the loops, which
tells you up to which position you have parsed.
For example

size_t i = 0 ;

while (isspace((unsigned char)s[i])) i++ ;
while (isdigit((unsigned char)s[i])) {
.....
}
......
Post by James Kuyper
I recently uncovered a defect in code written by someone else, which
relied upon this carry-over feature, despite the fact that the variables
were defined inside the loop, so there's no guarantee that any
carry-over would actually occur. The code also depended upon those
variables being zero-initialized, despite the fact that it didn't
zero-initialize them. The code actually worked as intended, because the
compiler always allocated each variable in the same location in memory
during each pass through the loop, and it did zero-initialize that
location before the start of the loop - but that was VERY sloppy
programming. When I reported this defect to the person who wrote it, I
got a response which said "Well, as long as the code works, there's no
need to fix it" - which goes a long way toward explaining the very poor
quality of that code.
And I bet that if the code stops working then he will start complaining that
the compiler writers did something wrong. One Anton Ertl comes to mind.
Post by James Kuyper
Unfortunately, there's no one I can appeal to to
insist that it be fixed - the reason they are writing code like this is
their management doesn't mind them writing code like this.
Actually, the reason they write code like this is that there's something
wrong with their thought processes, although I'm not sure what. Correct
code is both a deductive and an inductive matter: deductive, meaning it
has to follow from the relevant standards or specifications or
documentation or ... that the code does what it is supposed to; and
inductive, meaning that it has to be tested and found to work correctly.

The deductive part is necessary because the code may do what it is
supposed to simply by coincidence, as your example shows. The inductive
part is necessary because the compilers may have bugs, or the hardware
may have bugs, or the code accidentally used variable name "i" when "j"
was intended (and the proof of correctness works for "j"), or ...

I suspect that some people who write code don't realise that the
deductive requirement also applies, or they realise it somewhat but it
takes more mental effort than they're willing to invest to *reason*
about their code.
Post by James Kuyper
Sorry - that last paragraph is only tangentially related to your
question, I just felt a need to complain to someone about what I found.
bartc
2018-04-08 12:42:20 UTC
Permalink
Post by itsme.susila
Allocate variables within code blocks
[code]
void myfn(){
 int, x, y, z;
for (i = 0,...){
   u64 u, v, w;
   u = 3;
   x = u * 6;
   ......
}
return;
}
[/code]
Should I allocate u,v,w at the start of myfn(); I know beforehand they
would be used many
times over.
It's unlikely a compiler would generate allocation code at the start of
the loop for these simple variables.

So just declare them where you like.

Some people prefer them all declared at the start of the function, where
they are tidily out of the way (like me).

Others like them scattered throughout the function.
--
bartc
itsme.susila
2018-04-08 14:19:12 UTC
Permalink
Post by itsme.susila
Allocate variables within code blocks
[code]
void myfn(){
 int, x, y, z;
for (i = 0,...){
   u64 u, v, w;
   u = 3;
   x = u * 6;
   ......
}
return;
}
[/code]
Should I allocate u,v,w at the start of myfn(); I know beforehand they
would be used many
times over.
Thanks.
Actually, I have implemented a full chess playing program (not released
yet). A chess program needs to be optimized fully for speed. Its two
huge functions are:
1) evaluate().
2) a recursive search() - doing "massive work" in each recursive call (a
node); with my single-core Core 2 processor, it searches about 700,000
nodes per sec.

I was just pondering assembly-language memory addressing modes (I
don't know assembly, just trying to think from first principles).
1) avoid global variables - allocated on the heap; slow
2) use local variables - allocated on the stack - very fast.

a) I imagine that if we declare new variables (u, v, w are just
working variables) in deeply nested blocks, the asm must have more
difficulty moving the "stack frame" and adding an offset to identify a
variable; there must be a way to differentiate which memory I am
loading/retrieving content from. The more nesting levels, the greater
the overhead.
b) in my search(), the more variables declared at the function start,
the more the asm needs to know which_is_which (my amateurish
understanding). So the fewer the variables, the faster the code.

Thanks.
Stefan Ram
2018-04-08 14:24:05 UTC
Permalink
Post by itsme.susila
a) I imagine if we declare new variables (u, v, w are just needed
Don't imagine, measure.
itsme.susila
2018-04-08 14:27:09 UTC
Permalink
Post by Stefan Ram
Post by itsme.susila
a) I imagine if we declare new variables (u, v, w are just needed
Don't imagine, measure.
Correct, measurement is certain. But could we try to predict through "theory"?
Lew Pitcher
2018-04-08 14:42:06 UTC
Permalink
Post by itsme.susila
Post by Stefan Ram
Post by itsme.susila
a) I imagine if we declare new variables (u, v, w are just needed
Don't imagine, measure.
Correct, measure is certain. But could we try to predict through "theory"?
"In the year of our Lord 1432, there arose a grievous quarrel among the
brethren over the number of teeth in the mouth of a horse. For thirteen
days the disputation raged without ceasing. All the ancient books and
chronicles were fetched out, and wonderful and ponderous erudition such
as was never before heard of in this region was made manifest.

At the beginning of the fourteenth day, a youthful friar of goodly
bearing asked his learned superiors for permission to add a word, and
straightway, to the wonderment of the disputants, whose deep wisdom he
sore vexed, he beseeched them to unbend in a manner coarse and unheard-of
and to look in the open mouth of a horse and find answer to their
questionings. At this, their dignity being grievously hurt, they waxed
exceeding wroth; and, joining in a mighty uproar, they flew upon him and
smote him, hip and thigh, and cast him out forthwith. For, said they,
surely Satan hath tempted this bold neophyte to declare unholy and
unheard-of ways of finding truth, contrary to all the teachings of the
fathers.

After many days more of grievous strife, the dove of peace sat on the
assembly, and they as one man declaring the problem to be an everlasting
mystery because of a grievous dearth of historical and theological
evidence thereof, so ordered the same writ down."
--
Lew Pitcher
"In Skills, We Trust"
PGP public key available upon request
bartc
2018-04-08 14:37:36 UTC
Permalink
Post by itsme.susila
I was just pondering about assembly language memory addressing modes (I
dont know assembly, just try to think first principle).
1) avoid global variables - allocated in heap; slow
Simple globals are not stored on the heap. Not unless you specifically
allocate them using malloc().
Post by itsme.susila
2) use local variables - allocated in stack - very fast.
If your algorithm is recursive, then probably you will need locals.
Post by itsme.susila
a) I imagine if we declare new variables (u, v, w are just needed
working variables), in deeply nested blocks, asm must have more
difficulty moving the "stack frame" and to add an offset to identify a
variable; there must be a way to differentiate which memory I am
loading/retrieving content. The more nesting level, the greater the
overhead.
b) in my search(), the more variables declared at the function start,
the more asm need to know which_is_which (my amateurish understanding).
So the lesser the number of variables, the faster the codes.
Variables within nested blocks should have no effect at all, unless you
use a poor compiler. Or use VLAs (arrays with a length only known at
runtime).

You need to look at other details, such as the efficiency of your
algorithm and data structures, not at where variables are declared.

Note that an optimising compiler may not allocate space for those
variables at all, but keeps them in registers.
--
bartc
David Brown
2018-04-08 14:45:16 UTC
Permalink
Post by itsme.susila
Post by itsme.susila
Allocate variables within code blocks
[code]
void myfn(){
  int, x, y, z;
for (i = 0,...){
    u64 u, v, w;
    u = 3;
    x = u * 6;
    ......
}
return;
}
[/code]
Should I allocate u,v,w at the start of myfn(); I know beforehand they
would be used many
times over.
Thanks.
Actually, I have implemented a full chess playing program (not released
yet).  A chess program needs optimization fully for speed. Its two huge
1) evaluate().
2) recursive search() - doing "massive work" in each recursion call (a
node); with my core-2 single processor, it searches about 700,000 nodes
per sec.
I was just pondering about assembly language memory addressing modes (I
dont know assembly, just try to think first principle).
1) avoid global variables - allocated in heap; slow
2) use local variables - allocated in stack - very fast.
Globals are not allocated on the heap - they go in statically allocated
data sections. They may or may not be slow, depending on how they are
used, what the rest of the code is like and what the target cpu is.

There is no term "global variable" in C, but if you mean data with
external linkage then the main reason to avoid them is because they
reduce modularity in the code, and can easily lead to mistakes.
File-scope static data is a lot safer (and can be faster, due to more
optimisations).

Local variables are allocated on the stack, or in registers, or get
optimised away - they are faster in many cases.
Post by itsme.susila
a) I imagine if we declare new variables (u, v, w are just needed
working variables), in deeply nested blocks, asm must have more
difficulty moving the "stack frame" and to add an offset to identify a
variable; there must be a way to differentiate which memory I am
loading/retrieving content. The more nesting level, the greater the
overhead.
No. That is simply incorrect. Local variables defined in deeply nested
blocks are at least as fast as those defined further out, and it is
easier to optimise them.
Post by itsme.susila
b) in my search(), the more variables declared at the function start,
the more asm need to know which_is_which (my amateurish understanding).
So the lesser the number of variables, the faster the codes.
No (unless you have a very poor quality compiler, or fail to enable
optimisation on your compiler).

Don't try to second-guess your compiler here. Write good, clean, clear
code. Keep your variable scopes to a practical minimum (not an absolute
minimum). Keep any file-scope data or functions as "static" unless you
are actually exporting them from the module. Make sure you understand
your compiler - how to use optimisations, warning flags, etc.

And don't try to do "hand optimisation" - you will almost certainly be
worse than the compiler, and do more harm than good.
itsme.susila
2018-04-08 14:55:41 UTC
Permalink
Post by David Brown
Post by itsme.susila
Post by itsme.susila
Allocate variables within code blocks
[code]
void myfn(){
  int, x, y, z;
for (i = 0,...){
    u64 u, v, w;
    u = 3;
    x = u * 6;
    ......
}
return;
}
[/code]
Should I allocate u,v,w at the start of myfn(); I know beforehand
they would be used many
times over.
Thanks.
Actually, I have implemented a full chess playing program (not
released yet).  A chess program needs optimization fully for speed.
1) evaluate().
2) recursive search() - doing "massive work" in each recursion call (a
node); with my core-2 single processor, it searches about 700,000
nodes per sec.
I was just pondering about assembly language memory addressing modes
(I dont know assembly, just try to think first principle).
1) avoid global variables - allocated in heap; slow
2) use local variables - allocated in stack - very fast.
Globals are not allocated on the heap - they go in statically allocated
data sections.  They may or may not be slow, depending on how they are
used, what the rest of the code is like and what the target cpu is.
There is no term "global variable" in C, but if you mean data with
external linkage then the main reason to avoid them is because they
reduce modularity in the code, and can easily lead to mistakes.
File-scope static data is a lot safer (and can be faster, due to more
optimisations).
Local variables are allocated on the stack, or in registers, or get
optimised away - they are faster in many cases.
Post by itsme.susila
a) I imagine if we declare new variables (u, v, w are just needed
working variables), in deeply nested blocks, asm must have more
difficulty moving the "stack frame" and to add an offset to identify a
variable; there must be a way to differentiate which memory I am
loading/retrieving content. The more nesting level, the greater the
overhead.
No.  That is simply incorrect.  Local variables defined in deeply nested
blocks are at least as fast as those defined further out, and it is
easier to optimise them.
Post by itsme.susila
b) in my search(), the more variables declared at the function start,
the more asm need to know which_is_which (my amateurish
understanding). So the lesser the number of variables, the faster the
codes.
No (unless you have a very poor quality compiler, or fail to enable
optimisation on your compiler).
Don't try to second-guess your compiler here.  Write good, clean, clear
code.  Keep your variable scopes to a practical minimum (not an absolute
minimum).  Keep any file-scope data or functions as "static" unless you
are actually exporting them from the module.  Make sure you understand
your compiler - how to use optimisations, warning flags, etc.
And don't try to do "hand optimisation" - you will almost certainly be
worse than the compiler, and do more harm than good.
Thanks.

I ask because I only know how to make my code compile and run; no real
background.
GOTHIER Nathan
2018-04-08 15:52:51 UTC
Permalink
On Sun, 8 Apr 2018 22:55:41 +0800
Post by itsme.susila
Thanks.
I ask because I only know how to make my codes compile and run; no
real background.
I would recommend that you first define prototypes of simple functions
to build your program, then implement each in fewer than 25 lines,
splitting the code into simpler functions where necessary.

Prototypes are very important for making your mind clear about the
inputs and outputs required by your functions.
--
GOTHIER Nathan
Joe Pfeiffer
2018-04-08 15:00:53 UTC
Permalink
Post by itsme.susila
I was just pondering about assembly language memory addressing modes
(I dont know assembly, just try to think first principle).
1) avoid global variables - allocated in heap; slow
2) use local variables - allocated in stack - very fast.
a) I imagine if we declare new variables (u, v, w are just needed
working variables), in deeply nested blocks, asm must have more
difficulty moving the "stack frame" and to add an offset to identify a
variable; there must be a way to differentiate which memory I am
loading/retrieving content. The more nesting level, the greater the
overhead.
b) in my search(), the more variables declared at the function start,
the more asm need to know which_is_which (my amateurish
understanding). So the lesser the number of variables, the faster the
codes.
This is all stuff that's figured out at compile time, and won't have any
real effect on speed at run time. By the time your code is actually
running, your global variables are all just at known addresses, space is
reserved on the stack for all the local variables in one operation by
just subtracting from the stack pointer, and all the local variables for
the function are just at known offsets from the base of the activation
record. None of this will impact how fast your code is running.
Joe Pfeiffer
2018-04-08 14:51:48 UTC
Permalink
Post by itsme.susila
Allocate variables within code blocks
[code]
void myfn(){
int, x, y, z;
for (i = 0,...){
u64 u, v, w;
u = 3;
x = u * 6;
......
}
return;
}
[/code]
Should I allocate u,v,w at the start of myfn(); I know beforehand they
would be used many
times over.
Thanks.
The compiler, if even marginally sane, will analyze all the variables
you ever use in the function and reserve space for them when you enter
the function regardless of where they are declared. So declare them
wherever it makes the code most readable (in your example, inside the
for-loop would probably be my choice).
Tim Rentsch
2018-04-08 22:59:42 UTC
Permalink
Post by itsme.susila
Allocate variables within code blocks
[code]
void myfn(){
int, x, y, z;
for (i = 0,...){
u64 u, v, w;
u = 3;
x = u * 6;
......
}
return;
}
[/code]
Should I allocate u,v,w at the start of myfn(); I know beforehand they
would be used many
times over.
Your question doesn't have any simple answers. In fact, these
days most questions about improving performance don't have any
simple answers. What happens now inside compilers and inside
computers is so complicated it is nearly impossible to analyze it
or give any hard and fast rules. However, unless you are an
expert, any effect you get from moving variable declarations
around is likely to be much smaller than effects from things like
choice of data structure representation, algorithmic choices,
speeding up common cases possibly at the expense of rare cases,
etc. Start by asking where the existing code is spending its
time.  Whichever lines of code are the "hottest" is a good place
to start looking for alternatives. Code up two or three different
ideas for cooling down the hot spot, then run those and measure
what the effects of the different changes are. Pick what looks
like the best scheme and go around the cycle again, gradually
reaching a more uniform code temperature. Going through this
process, without ever getting to micro-optimization, is very
likely to give most of the performance gain that can be squeezed
out of the program. At some point you will reach a point of
diminishing returns, where the additional performance gain is
not worth the amount of effort involved; you have to decide
for yourself at what point that is.

(Disclaimer: I am not a performance expert. I have worked with
some people who are scary good at performance tuning, and some
of that has rubbed off, but I still wouldn't call myself an
expert.)
itsme.susila
2018-04-09 03:54:11 UTC
Permalink
Post by Tim Rentsch
Post by itsme.susila
Allocate variables within code blocks
[code]
void myfn(){
int, x, y, z;
for (i = 0,...){
u64 u, v, w;
u = 3;
x = u * 6;
......
}
return;
}
[/code]
Should I allocate u,v,w at the start of myfn(); I know beforehand they
would be used many
times over.
Your question doesn't have any simple answers. In fact, these
days most questions about improving performance don't have any
simple answers. What happens now inside compilers and inside
computers is so complicated it is nearly impossible to analyze it
or give any hard and fast rules. However, unless you are an
expert, any effect you get from moving variable declarations
around is likely to be much smaller than effects from things like
choice of data structure representation, algorithmic choices,
speeding up common cases possibly at the expense of rare cases,
etc. Start by asking where the existing code is spending its
time. Which ever lines of code are the "hottest" is a good place
to start looking for alternatives. Code up two or three different
ideas for cooling down the hot spot, then run those and measure
what the effects of the different changes are. Pick what looks
like the best scheme and go around the cycle again, gradually
reaching a more uniform code temperature. Going through this
process, without ever getting to micro-optimization, is very
likely to give most of the performance gain that can be squeezed
out of the program. At some point you will reach a point of
diminishing returns, where the additional performance gain is
not worth the amount of effort involved; you have to decide
for yourself at what point that is.
(Disclaimer: I am not a performance expert. I have worked with
some people who are scary good at performance tuning, and some
of that has rubbed off, but I still wouldn't call myself an
expert.)
I think the advice given earlier - "test! don't imagine" - is correct. But
I would still like some good practices from experts who know how our
computer CPUs actually operate.

In fact, chess programmers talk a lot about speed. I used to go to
talkchess.com (now no more), where there was much talk about coding
practice. A simple example a novice may not know: use integer (4-byte)
data rather than short (2-byte)... chess programming is almost all
integer operations, and there are overheads for a modern cpu operating
on char and short types.

In fact the experts would also do profiling (compilers have
profile-guided optimization too) and then optimize the particular
spots; but we already know it is the recursive search which is the
critical function. But I think it is correct to start off with what you
say: good choice of data structures, etc.
David Brown
2018-04-09 07:31:55 UTC
Permalink
Post by itsme.susila
I think the advice given earlier "test! don't imagine" is correct. But I
still like to have some good practice from the experts who know how our
computer cpu actually operate.
In fact, chess programmers are a lot who talk about speed. I did go to
talkchess.com (now no more) where there were much talk about coding
practice. A simple example that novice may not know: use integer(4
bytes) data rather than short(2 bytes)...chess programming is almost all
with integer operations and there are overheads for modern cpu to
operate on char and short types.
Beware about such advice from places like that. They /might/ give good
advice - but they might not. Their ideas could be specific to the type
of coding they use, the processors they use, the tools they use. It can
also change dramatically in different situations. Often it is badly out
of date, with advice that made sense for cpus of a decade ago, or for
ancient compilers. (If they give you "tricks" for your C code, like
using pointers instead of arrays or "bit hacks", they are probably
wrong. Most "hand optimised C" hinders the modern compiler, rather than
helping it.)

For the theory in this particular case, if you have numbers that will
fit in a 2-byte integer then a "short" (or, more precisely, an int16_t,
uint16_t) will sometimes be faster than an "int" (int32_t or uint32_t),
and sometimes slower. On some cpus, a "long long int" (int64_t or
uint64_t) will sometimes be faster, but that is less common.

In particular, on an x86-64, ARM, or a number of other modern cpus,
using 16-bit types will mean the occasional use of masking operations.
However, those operations are fast and easily scheduled. 16-bit types
have big advantages if you have large arrays of them, as they will be
twice as efficient for caches and memory bandwidth. And if you can work
with SIMD vector instructions, you can handle twice as many in a clock tick.

The theory can help you make sensible guesses on what to try - but only
measurement will give you the real figures. And though the particular
choices might make the code a touch faster on /your/ system, it might
make it a touch slower on a slightly different system.

Most of the time, the best advice is to write code clearly, and
understand how to let the compiler generate the best code. Messing
around with choosing 16-bit or 32-bit types will make an insignificant
difference compared to not using good optimisation flags, or organising
your code in a compiler-unfriendly manner.

And none of that is the slightest bit relevant in comparison to the
effect your choice of algorithms have. A good algorithm in an
interpreted language like Python will run circles round a poorer
algorithm implemented in top-speed C, given a large enough problem sample.
Post by itsme.susila
In fact the experts would also do profiling (there is compiler auto
profiling optimizations too) and then to optimize at the particular
spots; but we already know it i recursive search which is the critical
function. But I think it is correct to start off with what you say, good
choice of data structures, etc...
Ben Bacarisse
2018-04-09 11:17:18 UTC
Permalink
"itsme.susila" <***@gmx.com> writes:
<snip>
Post by itsme.susila
In fact, chess programmers are a lot who talk about speed. I did go to
talkchess.com (now no more) where there were much talk about coding
practice. A simple example that novice may not know: use integer(4
bytes) data rather than short(2 bytes)...chess programming is almost
all with integer operations and there are overheads for modern cpu to
operate on char and short types.
I don't think that's good advice. It may be true in some cases (it
certainly used to be on some CPUs) but it may not be true in others.

I used to be quite good at guessing what code changes would improve
performance, but that was 25 years ago. With every new generation of
CPU, and every new compiler release, I have become used to being surprised
and I've thrown out almost every "rule of thumb" that I used to use.

For example, I just tried a nested loop calculation using short, int and
long (long being 64-bit here) and the short version was faster than the
(32-bit) int version. But that was un-optimised. With gcc -O2 I can't
detect any difference between short and int. This was designed to
avoid cache issues so it's possible that short might even be faster in
some cases where the memory saving turns out to be helpful.

And if someone now posts that they've also done a quick test and int was
faster than short I would not be at all surprised. I have, in the past,
been so frequently surprised by what clever chips and compilers do that
I am rarely surprised any more.
Post by itsme.susila
In fact the experts would also do profiling (there is compiler auto
profiling optimizations too) and then to optimize at the particular
spots; but we already know it i recursive search which is the critical
function.
That sounds odd. To me, a chess playing program /is/ recursive search
with very little else! It sounds like saying that the program is the
critical part of the program.

When I wrote a chess program, it was the function that evaluated
positions that was the bottle-neck. If I remember correctly, there was
some measurable component from the function that generated valid moves,
but I don't remember it being worth working on. And despite knowing the
most expensive function, the best improvements came from re-ordering the
move list because of the way the tree pruning worked.

(I suspect the design of chess programs is more sophisticated these
days. Even for the time, mine was crude, having no book of openings and
no special end-game mode.)
Post by itsme.susila
But I think it is correct to start off with what you say,
good choice of data structures, etc...
Yes. For macro-optimisation, think data structures and algorithms (and
measure). For micro-optimisation, measure (don't try to guess).
--
Ben.
Stefan Ram
2018-04-09 11:46:35 UTC
Permalink
Post by Ben Bacarisse
For micro-optimisation, measure (don't try to guess).
When one measures, one optimizes for the specific execution
environment that was chosen for the measurements, usually
this would be the execution environment of the customer.
But in the case of software to be distributed to several
different (possibly, unknown) execution environments
(which might not even exist at the time of writing),
it is not obvious which execution environment(s) to choose
for measurements.

"There are so many people, and I can't please them all,
so I might better please nobody at all." - Bob Dylan
(from my memory, can't find that quotation in the Web)
itsme.susila
2018-04-09 16:27:49 UTC
Permalink
Post by Ben Bacarisse
<snip>
Post by itsme.susila
In fact, chess programmers are a lot who talk about speed. I did go to
talkchess.com (now no more) where there were much talk about coding
practice. A simple example that novice may not know: use integer(4
bytes) data rather than short(2 bytes)...chess programming is almost
all with integer operations and there are overheads for modern cpu to
operate on char and short types.
I don't think that's good advice. It may be true in some cases (it
certainly used to be on some CPUs) but it may not be true in others.
I used to be quite good at guessing what code changes would improve
performance, but that was 25 years ago. With every new generation of
CPU, and every new compiler release, I have become used being surprised
and I've thrown out almost every "rule of thumb" that I used to use.
For example, I just tried a nested loop calculation using short, int and
long (long being 64-bit here) and the short version was faster than the
(32-bit) int version. But that was un-optimised. With gcc -O2 I can't
detect any difference between short and int. This was designed to
avoid cache issues so it's possible that short might even be faster in
some cases where the memory saving turns out to be helpful.
And if someone now posts that they've also done a quick test and int was
faster than short I would not be at all surprised. I have, in the past,
been so frequently surprised by what clever chips and compilers do that
I am rarely surprised any more.
Post by itsme.susila
In fact the experts would also do profiling (there is compiler auto
profiling optimizations too) and then to optimize at the particular
spots; but we already know it i recursive search which is the critical
function.
That sounds odd. To me, a chess playing program /is/ recursive search
with very little else! It sounds like saying that the program is the
critical part of the program.
When I wrote a chess program, it was the function that evaluated
positions that was the bottle-neck. If I remember correctly, there was
some measurable component from the function that generated valid moves,
but I don't remember it being worth working on. And despite knowing the
most expensive function, the best improvements came from re-ordering the
move list because of the way the tree pruning worked.
reordering is difficult; there are not many ways to order search()
Post by Ben Bacarisse
(I suspect the design of chess programs is more sophisticated these
days. Even for the time, mine was crude, having no book of openings and
no special end-game mode.)
The problem with me is that I have never looked into the code of the top
open-source programs - Stockfish, Fruit, Crafty; I think they are by
programming experts. I did just look into Crafty, in C; its search() has
about 1000 lines in one function. On the other hand, Fruit, in C++ (by
the famous Fabien Letouzey), is a collection of small functions.

Before AlphaGo destroyed the world's top Go players, chess programming
was still popular; now it seems interest has dropped a lot. talkchess
was very active and much of the advice was good; the site is now closed.
Post by Ben Bacarisse
Post by itsme.susila
But I think it is correct to start off with what you say,
good choice of data structures, etc...
Yes. For macro-optimisation, think data structures and algorithms (and
measure). For micro-optimisation, measure (don't try to guess).
Spiros Bousbouras
2018-04-09 16:52:10 UTC
Permalink
On Tue, 10 Apr 2018 00:27:49 +0800
Post by itsme.susila
Post by Ben Bacarisse
That sounds odd. To me, a chess playing program /is/ recursive search
with very little else! It sounds like saying that the program is the
critical part of the program.
When I wrote a chess program, it was the function that evaluated
positions that was the bottle-neck. If I remember correctly, there was
some measurable component from the function that generated valid moves,
but I don't remember it being worth working on. And despite knowing the
most expensive function, the best improvements came from re-ordering the
move list because of the way the tree pruning worked.
reordering is difficult; there is not many ways to order search()
He said reordering the move list. If there are N legal moves then there are
N! (N factorial) ways to sort the order, and this can be quite large. I think
the general idea is that you search the moves in order from highest to
lowest score, according to the scores calculated by the previous, shallower
search.
Post by itsme.susila
Post by Ben Bacarisse
(I suspect the design of chess programs is more sophisticated these
days. Even for the time, mine was crude, having no book of openings and
no special end-game mode.)
The problem with me is I never ever looked into the codes of the top
open source programs - stockfish, fruit, crafty; I think they are by
programming experts. I just looked into crafty in C; its search() has
about 1000 lines in one function. On the other hand Fruit C++(by famous
Fabien Letousy) has collections of small functions.
Before AlphaGo destroyed the world's top Go players, chess programming
was still popular; now it seems the interest has dropped a lot.
It will go up again because of en.wikipedia.org/wiki/AlphaZero . So
in the coming years I expect a lot of activity in chess programming,
experimenting with the new algorithm(s).
Post by itsme.susila
talkchess was very active and much of the advice were good; the site is
now closed.
--
vlaho.ninja/prog
itsme.susila
2018-04-09 17:17:55 UTC
Permalink
Post by Spiros Bousbouras
On Tue, 10 Apr 2018 00:27:49 +0800
Post by itsme.susila
Post by Ben Bacarisse
That sounds odd. To me, a chess playing program /is/ recursive search
with very little else! It sounds like saying that the program is the
critical part of the program.
When I wrote a chess program, it was the function that evaluated
positions that was the bottle-neck. If I remember correctly, there was
some measurable component from the function that generated valid moves,
but I don't remember it being worth working on. And despite knowing the
most expensive function, the best improvements came from re-ordering the
move list because of the way the tree pruning worked.
reordering is difficult; there are not many ways to order search()
He said reordering the move list. If there are N legal moves then there are
N! (N factorial) ways to sort the order and this can be quite large. I think
the general idea is that you search the moves in the order of highest to
lowest score according to the scores calculated by the previous shallower
search.
I did not make myself clear. Ordering moves at internal nodes is
already a standard practice:
1) captures first, with losing captures sorted to the back;
2) other quiet moves ordered by a history table of [piece-type][to-square]
counters of moves that caused a fail-high cutoff.
I think there are no known alternative methods of move ordering for
internal nodes; root search ordering is different.
Post by Spiros Bousbouras
Post by itsme.susila
Post by Ben Bacarisse
(I suspect the design of chess programs is more sophisticated these
days. Even for the time, mine was crude, having no book of openings and
no special end-game mode.)
The problem with me is I have never looked into the code of the top
open-source programs - Stockfish, Fruit, Crafty; I think they are by
programming experts. I just looked into Crafty, in C; its search() is
about 1000 lines in one function. On the other hand, Fruit, in C++ (by
the famous Fabien Letousy), is a collection of small functions.
Before AlphaGo destroyed the world's top Go players, chess programming
was still popular; now it seems the interest has dropped a lot.
It will go up again because of en.wikipedia.org/wiki/AlphaZero , so
in the coming years I expect a lot of activity in chess programming,
experimenting with the new algorithm(s).
https://www.chess.com/news/view/google-s-alphazero-destroys-stockfish-in-100-game-match

I have much to say against the behind-closed-doors match between AlphaZero
and Stockfish - the current top program. As some others have said, there is
a lot of politics in the announcement - Google is doing PR work about their
technological edge in AI.

Deep Blue played against a real player. Yes, AlphaGo also played real
players. But this chess match against Stockfish was somewhat disingenuous -
the hardware was, I think, vastly different. That AI can play chess is now
obvious, but I doubt it is possible to do it on a desktop system - I think
not within the next 20 years.
Post by Spiros Bousbouras
Post by itsme.susila
talkchess was very active and much of the advice was good; the site is
now closed.
Scott Lurndal
2018-04-09 18:03:47 UTC
Permalink
Post by Ben Bacarisse
For example, I just tried a nested loop calculation using short, int and
long (long being 64-bit here) and the short version was faster than the
(32-bit) int version. But that was un-optimised. With gcc -O2 I can't
detect any difference between short and int. This was designed to
avoid cache issues so it's possible that short might even be faster in
some cases where the memory saving turns out to be helpful.
On all modern processors, assuming the accesses are naturally aligned,
there should be no difference in performance between 8-bit, 16-bit, 32-bit
and (on 64-bit architectures) 64-bit integer arithmetic. Once the
line is present in the L1 cache, it makes no difference to the ALU how wide
the data is. May take a bit (probably unmeasurable) more power.

Unaligned accesses of any size, on the other hand, will be more expensive
due to the barrel shifter.

A more compelling case for 16-bit operations comes with the SIMD instructions,
where one can fit twice as many calculations into a cycle using 16-bit elements
for a given vector size.
s***@casperkitty.com
2018-04-09 20:39:27 UTC
Permalink
Post by Scott Lurndal
On all modern processors, assuming the accesses are naturally aligned,
there should be no difference in performance between 8-bit, 16-bit, 32-bit
and (on 64-bit architectures) 64-bit integer arithmetic. Once the
line is present in the L1 cache, it makes no difference to the ALU how wide
the data is. May take a bit (probably unmeasurable) more power.
On many modern processors other than those derived from the 8086, operations
like "add" always operate on full register-sized quantities. Allowing such
instructions to operate on smaller quantities means using extra bits in the
opcode to select the operand size, and in many cases freeing up opcode bits
for other uses would be more helpful.

On something like the ARM, if "x" needs to be loaded and stored from/to
memory, the performance of "x+=23;" won't be affected by whether x is 8,
16, or 32 bits, since the "store" instruction will ignore the upper bits
produced by the addition. If, however, x is in a register, and if the
upper bits of x would affect the next operation, it will be necessary to
add an instruction to truncate x.

Note that the way the C Standard is defined, this requirement holds even
if x is a short signed type. On a conforming implementation, given e.g.

int test(int16_t x)
{
x *= x;
return x;
}

calling test(320) may raise an implementation-defined signal, or it may
return a particular value the implementation defines as the result of
coercing the value 102400 to type "int16_t", but the Standard would not
allow an implementation to select among different possible values or
behaviors in Unspecified fashion. As a practical matter, this would mean
that even with signed type "int16_t", an implementation would have no
choice but to include a truncate-and-sign-extend instruction.
Scott Lurndal
2018-04-09 20:53:15 UTC
Permalink
Post by s***@casperkitty.com
Post by Scott Lurndal
On all modern processors, assuming the accesses are naturally aligned,
there should be no difference in performance between 8-bit, 16-bit, 32-bit
and (on 64-bit architectures) 64-bit integer arithmetic. Once the
line is present in the L1 cache, it makes no difference to the ALU how wide
the data is. May take a bit (probably unmeasurable) more power.
On many modern processors other than those derived from the 8086, operations
like "add" always operate on full register-sized quantities. Allowing such
instructions to operate on smaller quantities means using extra bits in the
opcode to select the operand size, and in many cases freeing up opcode bits
for other uses would be more helpful.
Those are generally RISC architectures, in which case the size of the
datum is encoded into the load instruction(s), including whether to
sign or zero-extend to the register size.
Post by s***@casperkitty.com
On something like the ARM, if "x" needs to be loaded and stored from/to
memory, the performance of "x+=23;" won't be affected by whether x is 8,
16, or 32 bits, since the "store" instruction will ignore the upper bits
produced by the addition.
If, however, x is in a register, and if the
upper bits of x would affect the next operation, it will be necessary to
add an instruction to truncate x.
Not necessarily. See the ARM ADD and SUB instructions (where the
instruction encoding allows specification of the size and sign-extension
properties of the operand in the addend register).

<extend> For the 32-bit variant: is the extension to be applied to
the second source operand, encoded in the "option" field. It
can have the following values:
UXTB when option = 000
UXTH when option = 001
LSL|UXTW when option = 010
UXTX when option = 011
SXTB when option = 100
SXTH when option = 101
SXTW when option = 110
SXTX when option = 111

(UXTB - Unsigned (zero) extended from an 8-bit operand.)
(SXTW - Sign extended from a 32-bit (word) operand.)

(Source: ARM DDI0487C revision a.)
bartc
2018-04-09 11:23:39 UTC
Permalink
Post by itsme.susila
In fact the experts would also do profiling (there are compiler
auto-profiling optimizations too) and then optimize at the particular
spots; but we already know it is recursive search which is the critical
function. But I think it is correct to start off with what you say: good
choice of data structures, etc...
Do you have some code that you post or link to, so that people here can
try and spot anything that might help? Just what is believed to be the
bottleneck might do.

Or is this some kind of contest and the program used is a secret?
--
bartc
itsme.susila
2018-04-09 16:39:00 UTC
Permalink
Post by bartc
Post by itsme.susila
In fact the experts would also do profiling (there are compiler
auto-profiling optimizations too) and then optimize at the particular
spots; but we already know it is recursive search which is the critical
function. But I think it is correct to start off with what you say:
good choice of data structures, etc...
Do you have some code that you post or link to, so that people here can
try and spot anything that might help? Just what is believed to be the
bottleneck might do.
Or is this some kind of contest and the program used is a secret?
I actually was a little dumb when I started this thread (true, I don't
really know computer science and architecture). I was imagining things
about nesting levels and some vague idea about "memory addressing". From
the comments now, I could laugh at my ..sss. At compile time, all the
variables are just memory addresses, so the compiler has resolved all the
memory usage issues. I think it doesn't matter much when or where we
declare variables.

Chess programs are no longer secret nowadays. There is nothing special
about my code except that I keep a large single search() and evaluate().
Tim Rentsch
2018-04-09 15:09:58 UTC
Permalink
I think the advice given earlier "test! don't imagine" is correct. But
I would still like to have some good-practice advice from experts who
know how our computer CPUs actually operate.
In fact, chess programmers talk a lot about speed. I did go to
talkchess.com (now no more) where there was much talk about coding
practice. A simple example that a novice may not know: use integer
(4-byte) data rather than short (2-byte)... chess programming is almost
all integer operations, and there are overheads for modern CPUs to
operate on char and short types.
In fact the experts would also do profiling (there are compiler
auto-profiling optimizations too) and then optimize at the particular
spots; but we already know it is recursive search which is the critical
function. But I think it is correct to start off with what you say:
good choice of data structures, etc...
Like I said it's a complicated problem. Looking for simple
answers is likely to get in the way of real progress. I
recommend starting up the learning curve and preparing
yourself to be in it for the long haul.
Joe Pfeiffer
2018-04-09 23:07:24 UTC
Permalink
Post by Tim Rentsch
Post by itsme.susila
Allocate variables within code blocks
[code]
void myfn(){
int, x, y, z;
for (i = 0,...){
u64 u, v, w;
u = 3;
x = u * 6;
......
}
return;
}
[/code]
Should I allocate u,v,w at the start of myfn(); I know beforehand they
would be used many
times over.
Your question doesn't have any simple answers. In fact, these
days most questions about improving performance don't have any
simple answers. What happens now inside compilers and inside
computers is so complicated it is nearly impossible to analyze it
or give any hard and fast rules. However, unless you are an
expert, any effect you get from moving variable declarations
around is likely to be much smaller than effects from things like
choice of data structure representation, algorithmic choices,
speeding up common cases possibly at the expense of rare cases,
etc. Start by asking where the existing code is spending its
time. Whichever lines of code are the "hottest" is a good place
to start looking for alternatives. Code up two or three different
ideas for cooling down the hot spot, then run those and measure
what the effects of the different changes are. Pick what looks
like the best scheme and go around the cycle again, gradually
reaching a more uniform code temperature. Going through this
process, without ever getting to micro-optimization, is very
likely to give most of the performance gain that can be squeezed
out of the program. At some point you will reach a point of
diminishing returns, where the additional performance gain is
not worth the amount of effort involved; you have to decide
for yourself at what point that is.
(Disclaimer: I am not a performance expert. I have worked with
some people who are scary good at performance tuning, and some
of that has rubbed off, but I still wouldn't call myself an
expert.)
While you're right that almost all performance optimization questions
are hard, the particular ones he was asking about really aren't. If we
assume a sane compiler, it won't make any difference.
s***@casperkitty.com
2018-04-09 23:28:28 UTC
Permalink
Post by Joe Pfeiffer
While you're right that almost all performance optimization questions
are hard, the particular ones he was asking about really aren't. If we
assume a sane compiler, it won't make any difference.
If the function calls other functions at a time when some automatic
objects are either out of scope or hold values that will never be
observed, there are two sane ways a compiler could handle such code:

1. Reserve space for those objects at some point before the nested function
is called [e.g. at the start of the function] and leave it reserved while
calling that function.

2. Reserve space for those objects when they are needed, and release that
space before calling a nested function.

The former approach will often be faster, but the latter approach will use
less stack space. Which approach is better will often depend upon how much
stack space the nested function will need, which is something the compiler
won't generally have any way of knowing.

Given that either approach may be better in certain circumstances, I would
not fault a compiler which kept variables allocated if they were in scope
across a nested function call, even if the values they hold during the call
will never be observed, but which released stack space used by variables
that were out of scope at the point of a nested call. Such an approach may
not be common nowadays, but I'd consider it "sane".
Tim Rentsch
2018-04-11 14:10:56 UTC
Permalink
Post by Joe Pfeiffer
Post by Tim Rentsch
Post by itsme.susila
Allocate variables within code blocks
[code]
void myfn(){
int, x, y, z;
for (i = 0,...){
u64 u, v, w;
u = 3;
x = u * 6;
......
}
return;
}
[/code]
Should I allocate u,v,w at the start of myfn(); I know beforehand they
would be used many
times over.
Your question doesn't have any simple answers. In fact, these
days most questions about improving performance don't have any
simple answers. What happens now inside compilers and inside
computers is so complicated it is nearly impossible to analyze it
or give any hard and fast rules. However, unless you are an
expert, any effect you get from moving variable declarations
around is likely to be much smaller than effects from things like
choice of data structure representation, algorithmic choices,
speeding up common cases possibly at the expense of rare cases,
etc. Start by asking where the existing code is spending its
time. Whichever lines of code are the "hottest" is a good place
to start looking for alternatives. Code up two or three different
ideas for cooling down the hot spot, then run those and measure
what the effects of the different changes are. Pick what looks
like the best scheme and go around the cycle again, gradually
reaching a more uniform code temperature. Going through this
process, without ever getting to micro-optimization, is very
likely to give most of the performance gain that can be squeezed
out of the program. At some point you will reach a point of
diminishing returns, where the additional performance gain is
not worth the amount of effort involved; you have to decide
for yourself at what point that is.
(Disclaimer: I am not a performance expert. I have worked with
some people who are scary good at performance tuning, and some
of that has rubbed off, but I still wouldn't call myself an
expert.)
While you're right that almost all performance optimization questions
are hard, the particular ones he was asking about really aren't. If we
assume a sane compiler, it won't make any difference.
Sadly I believe that is not necessarily true. Optimizing compilers,
and the workings of what goes on during program execution, are now
so complicated that statements like this cannot be made reliably. I
have seen first hand repeatable observations where a change to
source code that could not possibly make a difference to program
performance still did make a difference. For this case I agree that
any difference, if there is one, is likely to be small. So if what
you meant is that it won't make any significant difference, for what
this questioner is asking I think that's reasonable advice. But I
still think it's important to steer him away from worrying about
micro-optimizations in general, not just this one.