Discussion:
main()
bartc
2017-04-13 23:53:59 UTC
I think this is not strictly part of the language but someone here may
happen to know how it's done.

The arguments to a program's entry point such as main(int n, char** args)
are not set up by the OS, but by some function the C implementation
arranges.

On Windows at least, such a function might use GetCommandLine() or
__getmainargs() to obtain the command-line arguments and pass them to
main().

What's the equivalent on Linux? Or will those args just magically be on
the stack anyway? (Because after all Linux and C are very.. close.)

(I played around with code like this which possibly suggests the latter,
but I need something definite. It seemed to work though!

[32-bit ARM]

#include <stdio.h>

int main(void) {
int i;
int n;
int *p=&i;
char** s;

for (i=0;i<10;++i) printf("%d: %d\n",i,*(p+i));

n=*(p+5);
s=(char**)*(p+4);

printf("N=%d\n",n);
for (i=0;i<n;++i) printf("%d: %s\n",i,s[i]);

}

)
--
bartc
Robert Wessel
2017-04-14 00:10:08 UTC
Post by bartc
I think this is not strictly part of the language but someone here may
happen to know how it's done.
The arguments to a programs entry point such as main(int n, char** args)
are not set up by the OS, but by some function the C implementation
arranges.
On Windows at least, where such a function might use GetCommandLine() or
__getmainargs() to obtain the command-line arguments and pass them to
main().
What's the equivalent on Linux? Or will those args just magically be on
the stack anyway? (Because after all Linux and C are very.. close.)
(I played around with code like this which possibly suggests the latter,
but I need something definite. It seemed to work though!
[32-bit ARM]
#include <stdio.h>
int main(void) {
int i;
int n;
int *p=&i;
char** s;
for (i=0;i<10;++i) printf("%d: %d\n",i,*(p+i));
n=*(p+5);
s=(char**)*(p+4);
printf("N=%d\n",n);
for (i=0;i<n;++i) printf("%d: %s\n",i,s[i]);
}
)
I can't tell you how it works for ARM, but in x86-32 Linux, the entry
point to the loaded module (as specified in the ELF header) finds three
things on the stack: argc, the argv pointers and the env pointers. Remember
that in *nix, the shell expands the command line arguments, not the
application (as in Windows). That entry point is somewhere in the crt
(crtbegin.o, crtend.o, gcrt1.o), which does some stuff needed for
startup, and then eventually calls main().
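
As a rough illustration of the stack picture described above (not how any real crt is written, since actual _start routines are in assembly), the sketch below decodes the words found at the initial stack pointer under the common Linux layout: argc, then the argv pointers, a NULL, then the environment pointers and another NULL. The struct and function names are made up for this example, and the layout is an assumption that should be checked against the ABI document for the target architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what a startup routine recovers from the stack. */
struct start_info {
    int    argc;
    char **argv;   /* argv[0..argc-1], followed by a NULL */
    char **envp;   /* "NAME=value" pointers, followed by a NULL */
};

/* Decode an initial-stack image laid out as:
   argc, argv[0], ..., argv[argc-1], NULL, envp[0], ..., NULL
   (assumed layout; argc is stored in a pointer-sized slot). */
struct start_info parse_initial_stack(char **sp)
{
    struct start_info info;

    assert(sp != NULL);
    info.argc = (int)(intptr_t)sp[0];        /* first word is the count */
    info.argv = sp + 1;                      /* argument pointers follow */
    info.envp = info.argv + info.argc + 1;   /* skip argv and its NULL */
    return info;
}
```

Given a hand-built array such as { (char *)(intptr_t)2, "prog", "arg", NULL, "HOME=/x", NULL }, the function reports argc as 2, with argv starting at "prog" and envp starting at "HOME=/x".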
a***@math.uni.wroc.pl
2017-04-14 02:51:27 UTC
Post by bartc
I think this is not strictly part of the language but someone here may
happen to know how it's done.
The arguments to a programs entry point such as main(int n, char** args)
are not set up by the OS, but by some function the C implementation
arranges.
On Windows at least, where such a function might use GetCommandLine() or
__getmainargs() to obtain the command-line arguments and pass them to
main().
What's the equivalent on Linux? Or will those args just magically be on
the stack anyway? (Because after all Linux and C are very.. close.)
In Linux, program initialization (startup) is the shared responsibility
of the compiler, kernel, libc and dynamic linker. A Linux executable has a
field pointing to the "ELF interpreter", that is, the program responsible
for initializing dynamic linking. libc may also need some initialization.
If you properly plug your object files into the framework, then
your main will get called with correct arguments. Otherwise you
risk crashes due to wrong initialization. To the point: on
Debian x86_64, libc provides 'crt1.o', which contains the startup
routine called '_start'. This routine may look trivial (it
just does a little register manipulation and then calls 'main',
in glibc indirectly through '__libc_start_main').
But you need to do all the required voodoo. Case in point: I was
using a Modula-2 compiler, and on a new version of Linux, executables
from this compiler began to crash. I looked at the problem:
the Modula-2 compiler provided its own startup routine, which
turned out to be incompatible with the newer version of Linux.

The arguments to main are provided by the kernel; I did not check whether
they are passed on the stack or in registers (details depend
on the architecture). But if you want to use libc, you need to
initialize things in the way libc requires.
--
Waldek Hebisch
Scott Lurndal
2017-04-14 12:58:49 UTC
Post by bartc
I think this is not strictly part of the language but someone here may
happen to know how it's done.
The arguments to a programs entry point such as main(int n, char** args)
are not set up by the OS, but by some function the C implementation
arranges.
On Windows at least, where such a function might use GetCommandLine() or
__getmainargs() to obtain the command-line arguments and pass them to
main().
What's the equivalent on Linux? Or will those args just magically be on
the stack anyway? (Because after all Linux and C are very.. close.)
Unix and unix-like systems get the arguments from the system
call used to create the process. Typically this is one of
the exec-family of system calls.

http://pubs.opengroup.org/onlinepubs/9699919799/functions/execv.html

The path argument provides the cwd-relative (or absolute) path to
the ELF (or COFF in the olden days) binary. The ELF header has
a field which records the virtual address of the entry point. That
virtual address refers (generally) to the label _start in the
c-runtime initialization code, which handles various pre-main setup
activities then invokes main with the following signature:

int main(int argc, const char **argv, const char **envp, Elf32_auxv_t **auxv)

envp contains a list of all exported environment variables and their
values. auxv conveys information from the run-time ELF loader.

http://articles.manugarg.com/aboutelfauxiliaryvectors
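
For the auxiliary vector, glibc (2.16 and later) and musl expose a lookup function, getauxval(), so a program does not need an extra main() parameter to read it. A minimal Linux-only sketch, assuming <sys/auxv.h> is available; the wrapper name auxv_page_size is invented here:

```c
#include <sys/auxv.h>   /* getauxval(), AT_PAGESZ; Linux-specific header */

/* Return the page size the kernel recorded in the auxiliary vector,
   or 0 if no AT_PAGESZ entry was supplied. */
unsigned long auxv_page_size(void)
{
    return getauxval(AT_PAGESZ);
}
```

Other AT_* keys (for example AT_PHDR or AT_ENTRY) can be queried the same way; which ones are present depends on the kernel and architecture.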
bartc
2017-04-14 13:11:50 UTC
Post by Scott Lurndal
Post by bartc
What's the equivalent on Linux? Or will those args just magically be on
the stack anyway? (Because after all Linux and C are very.. close.)
Unix and unix-like systems get the arguments from the system
call used to create the process. Typically this is one of
the exec-family of system calls.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/execv.html
The path argument provides the cwd-relative (or absolute) path to
the ELF (or COFF in the olden days) binary. The ELF header has
a field which records the virtual address of the entry point. That
virtual address refers (generally) to the label _start in in the
c-runtime initialization code, which handles various pre-main setup
int main(int argc, const char **argv, const char **envp, Elf32_auxv_t **auxv)
envp contains a list of all exported environment variables and their
values. auxv conveys information from the run-time ELF loader.
Interesting, I didn't know main() could take extra parameters. On
Windows (as I don't have a handy Linux at the minute) this works:

#include <stdio.h>
#include <stdlib.h>

int main(int n, char** args, char** env) {
    int i;
    char** e;

    /* args[0..n-1] are the command-line arguments */
    for (i=0; i<n; ++i)
        printf("%d: %s\n",i,args[i]);

    /* env is a NULL-terminated array of "NAME=value" strings */
    for (e=env; *e; ++e)
        printf("Env str: %s\n",*e);

    return 0;
}
--
bartc
Malcolm McLean
2017-04-15 14:53:03 UTC
Post by bartc
Post by Scott Lurndal
envp contains a list of all exported environment variables and their
values. auxv conveys information from the run-time ELF loader.
Interesting, I didn't know main() could take extra parameters. On
#include <stdio.h>
#include <stdlib.h>
int main(int n, char** args, char** env) {
int i;
for (i=0; i<n; ++i)
| printf("%d: %s\n",i,args[i]);
while (**env) {
| printf("Env str: %s\n",*env);
| *env += strlen(*env)+1;
}
}
Exactly.
envp is one of those things which probably should be in the standard but
isn't, as many programs need the environment variables.
Having said that, setting environment variables is a horrid thing to
require of a user.
s***@casperkitty.com
2017-04-15 15:43:23 UTC
Post by Malcolm McLean
Exactly.
envp is one of those things which probably should be in the standard but
isn't, as many programs need the environment variables.
Having said that, setting environment variables is a horrid thing to
require of a user.
Many programs are invoked using commands typed in a shell or shell-escape
prompt, and requiring users to re-specify common options every time they
type commands is apt to be a nuisance. While command shells could offer
macro facilities to minimize such retyping, it would be hard to make such
macros effective in programs that have a "shell-escape" feature to run
outside programs.
Ben Bacarisse
2017-04-15 15:50:32 UTC
Post by Malcolm McLean
Post by bartc
Post by Scott Lurndal
envp contains a list of all exported environment variables and their
values. auxv conveys information from the run-time ELF loader.
Interesting, I didn't know main() could take extra parameters. On
#include <stdio.h>
#include <stdlib.h>
int main(int n, char** args, char** env) {
int i;
for (i=0; i<n; ++i)
| printf("%d: %s\n",i,args[i]);
while (**env) {
| printf("Env str: %s\n",*env);
| *env += strlen(*env)+1;
}
}
Exactly.
envp is one of those things which probably should be in the standard but
isn't, as many programs need the environment variables.
The standard includes getenv for that.

<snip>
--
Ben.
Keith Thompson
2017-04-15 22:31:03 UTC
Malcolm McLean <***@btinternet.com> writes:
[...]
Post by Malcolm McLean
Exactly.
envp is one of those things which probably should be in the standard but
isn't, as many programs need the environment variables.
POSIX has `extern char **environ;`. And of course standard C has
getenv(), which isn't as flexible (you can only get the value of an
environment variable if you already know its name, but that's usually
enough).
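
To make the difference concrete, here is a small sketch of both routes: walking POSIX's environ to see every entry, and using standard getenv() when the name is already known. The helper names env_count and env_is_set are invented for illustration.

```c
#include <stddef.h>
#include <stdlib.h>   /* getenv(): standard C */

extern char **environ;   /* POSIX, not standard C */

/* Count the entries in the NULL-terminated environ array. */
size_t env_count(void)
{
    size_t n = 0;

    for (char **e = environ; e != NULL && *e != NULL; ++e)
        ++n;
    return n;
}

/* Return 1 if the named variable is set, 0 otherwise.
   Needs nothing beyond standard C. */
int env_is_set(const char *name)
{
    return getenv(name) != NULL;
}
```

A program that only cares about, say, HOME can stay with env_is_set("HOME") and getenv(); only something like env(1) needs the full walk.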
Post by Malcolm McLean
Having said that, setting environment variables is a horrid thing to
require of a user.
Balderdash. (And in many cases, environment variables are set by the
environment, not explicitly by the user.)
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Gordon Burditt
2017-04-16 05:19:08 UTC
Post by Malcolm McLean
envp is one of those things which probably should be in the standard but
isn't, as many programs need the environment variables.
Many programs need specific environment variables (they know the
names, such as "HOME" or "PATH", ahead of time).

There are only a few that need to be able to look at *ALL* of the
environment variables, without knowing names ahead of time. One
of them is the UNIX "env" program which among other functions, lists
all of them.

Mostly, these are programs that invoke other programs, and want to
pass on all the environment variables they were given, perhaps with
a few changes. That want often proves to be a security hole.
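
One defensive pattern for such programs, sketched here as an assumption rather than anything prescribed in this thread, is to pass on only an allowlist of variables instead of the whole environment. filter_env and entry_has_name are hypothetical helpers; the resulting array could be given to execve() as its envp argument.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if ENTRY ("NAME=value") has exactly the name NAME. */
static int entry_has_name(const char *entry, const char *name)
{
    size_t len = strlen(name);

    return strncmp(entry, name, len) == 0 && entry[len] == '=';
}

/* Copy into DST the pointers from SRC whose names appear in ALLOW.
   SRC and ALLOW are NULL-terminated; DST has room for DST_CAP pointers
   including its terminating NULL. Returns the number of entries kept. */
size_t filter_env(char *const src[], const char *const allow[],
                  char *dst[], size_t dst_cap)
{
    size_t n = 0;

    if (dst_cap == 0)
        return 0;
    for (size_t i = 0; src[i] != NULL && n + 1 < dst_cap; ++i) {
        for (size_t j = 0; allow[j] != NULL; ++j) {
            if (entry_has_name(src[i], allow[j])) {
                dst[n++] = src[i];   /* share the string, do not copy it */
                break;
            }
        }
    }
    dst[n] = NULL;
    return n;
}
```

With an allowlist of PATH and HOME, an inherited LD_PRELOAD entry is simply dropped, and a variable called PATHX is not mistaken for PATH.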
Post by Malcolm McLean
Having said that, setting environment variables is a horrid thing to
require of a user.
Environment variables are often used to communicate between parts
of a collection of programs. The user doesn't have to mess with
these.

Kenny McCormack
2017-04-15 17:01:33 UTC
In article <dI3IA.10330$***@fx44.iad>,
Scott Lurndal <***@pacbell.net> wrote:
...
Post by Scott Lurndal
Unix and unix-like systems get the arguments from the system
call used to create the process. Typically this is one of
the exec-family of system calls.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/execv.html
Keep in mind that, on Linux at least (and also OSX, but I don't know about
other Unixes), there is only one exec*() system call - namely, execve(2).
All the others (which are all in section 3, not section 2) are "front ends"
to execve().
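
In other words, a library routine like execv() can be little more than a wrapper that supplies the current environment. The following is a sketch of that idea, not the actual libc source; my_execv is a made-up name.

```c
#include <unistd.h>   /* execve() */

extern char **environ;   /* the caller's current environment (POSIX) */

/* execv-style front end: same path and argv, environment taken from
   environ. Like execve(), it returns only on failure, with -1. */
int my_execv(const char *path, char *const argv[])
{
    return execve(path, argv, environ);
}
```

On success the call never returns, because the process image has been replaced; a return value of -1 means the exec failed and errno says why.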
--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/TedCruz
Lew Pitcher
2017-04-14 14:04:52 UTC
Post by bartc
I think this is not strictly part of the language but someone here may
happen to know how it's done.
The arguments to a programs entry point such as main(int n, char** args)
are not set up by the OS, but by some function the C implementation
arranges.
On Windows at least, where such a function might use GetCommandLine() or
__getmainargs() to obtain the command-line arguments and pass them to
main().
What's the equivalent on Linux? Or will those args just magically be on
the stack anyway? (Because after all Linux and C are very.. close.)
In Linux, as in all Unix-like operating systems, the arguments ultimately
passed to main() as argv come from the invocation of one of the exec()
system functions.

Those functions accept (in one form or another)
- a pointer to a string that provides a filesystem path to the executable, and
- a list of pointers to strings, that each provide a value for the
corresponding argv[] element (including argv[0]).

The OS ensures that the strings pointed to by the exec() call are copied to
known, standard locations in the memory map of the new process, and builds
the corresponding argc and argv[] arrays /before/ loading and invoking the
program code and C runtime startup.

So, yes, those args "just magically" appear, as far as the new process is
concerned, just as the code and data of that new process "just magically"
appear in memory.
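
A sketch of the calling side may help: the parent chooses the path, the argv strings (argv[0] included) and the environment strings, and the kernel copies them into the new process image. run_with_args is a hypothetical helper, and the /bin/sh used in the note below is assumed to exist.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run PATH in a child process with exactly the given argv and envp
   (both NULL-terminated). Returns the child's exit status, or -1 if
   the child could not be created or did not exit normally. */
int run_with_args(const char *path, char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, envp);   /* returns only on failure */
        _exit(127);                 /* conventional "could not exec" status */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with path "/bin/sh" and argv = {"sh", "-c", "exit 3", NULL}, the function returns 3; the child sees only the variables listed in envp.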
--
Lew Pitcher
"In Skills, We Trust"
PGP public key available upon request
bartc
2017-04-14 14:52:18 UTC
Post by Lew Pitcher
Post by bartc
I think this is not strictly part of the language but someone here may
happen to know how it's done.
The arguments to a programs entry point such as main(int n, char** args)
are not set up by the OS, but by some function the C implementation
arranges.
On Windows at least, where such a function might use GetCommandLine() or
__getmainargs() to obtain the command-line arguments and pass them to
main().
What's the equivalent on Linux? Or will those args just magically be on
the stack anyway? (Because after all Linux and C are very.. close.)
In Linux, as in all Unix-like operating systems, the arguments ultimately
passed to main() as argv come from the invocation of one of the exec()
system functions.
Those functions accept (in one form or another)
- a pointer to string that provides a filesystem path to the executable, and
- a list of pointers to strings, that each provide a value for the
corresponding argv[] element (including argv[0]).
The OS ensures that the strings pointed to by the exec() call are copied to
known, standard locations in the memory map of the new process, and builds
the corresponding argc and argv[] arrays /before/ loading and invoking the
program code and C runtime startup.
So, yes, those args "just magically" appear, as far as the new process is
concerned, just as the code and data of that new process "just magically"
appear in memory.
I said it was magical because the order of those parameters seems to
match the parameters as used in C's main() function.

Either it's an amazing coincidence or one came first and the other
copied. That may or may not have been a sensible thing to do, but it makes
it harder to disentangle what is C, and what is OS.
--
bartc
Scott Lurndal
2017-04-14 15:00:14 UTC
Post by bartc
Either it's an amazing coincidence or one came first and the other
copied. That may or may not been a sensible thing to do, but it makes it
harder to disentangle what is C, and what is OS.
Which is hardly surprising, given the history of the development
of C (which was developed as the language for Unix kernel
and operating system utility development
a decade or more before Windows existed).
Lew Pitcher
2017-04-14 17:49:51 UTC
Post by bartc
Post by Lew Pitcher
Post by bartc
I think this is not strictly part of the language but someone here may
happen to know how it's done.
The arguments to a programs entry point such as main(int n, char** args)
are not set up by the OS, but by some function the C implementation
arranges.
On Windows at least, where such a function might use GetCommandLine() or
__getmainargs() to obtain the command-line arguments and pass them to
main().
What's the equivalent on Linux? Or will those args just magically be on
the stack anyway? (Because after all Linux and C are very.. close.)
In Linux, as in all Unix-like operating systems, the arguments ultimately
passed to main() as argv come from the invocation of one of the exec()
system functions.
Those functions accept (in one form or another)
- a pointer to string that provides a filesystem path to the executable,
and - a list of pointers to strings, that each provide a value for the
corresponding argv[] element (including argv[0]).
The OS ensures that the strings pointed to by the exec() call are copied
to known, standard locations in the memory map of the new process, and
builds the corresponding argc and argv[] arrays /before/ loading and
invoking the program code and C runtime startup.
So, yes, those args "just magically" appear, as far as the new process is
concerned, just as the code and data of that new process "just magically"
appear in memory.
I said is was magical because the order of those parameters seems to
match the parameters as used in C's main() function.
Either it's an amazing coincidence
It's not a coincidence
Post by bartc
or one came first and the other copied.
And, that's not what happened either.

C was developed as the high-level language to implement Unix with. Unix was
originally developed in assembly language, then rewritten in C. It is by
design that the C language, the C runtime support, the Unix runtime support,
the Unix OS system calls, and the underlying Unix OS all work together to
"magically" place arguments seeded by an exec() syscall into known locations
in memory so that the executable initiated by the exec() syscall can locate
them and populate the C main() argc and argv (and envp) parameters.
Post by bartc
That may or may not been a sensible thing to do, but it makes it
harder to disentangle what is C, and what is OS.
I guess so, but only in that those OS that do not provide the requisite
features natively must do so by emulation.
--
Lew Pitcher
"In Skills, We Trust"
PGP public key available upon request