Discussion:
Could you please give me some advice on GC in C?
byang
2009-04-22 13:46:19 UTC
Permalink
Hi,
I am designing a library in C that dynamically allocates a lot of
memory, and I currently use reference counting to manage allocation
and freeing. That means the client of this library has to call unref()
on many pointers to the library's data structures, and this ref/unref
interface imposes extra work on client programmers.
Is there a better way? I googled for "gc for C" and found a mark-sweep
collector (http://www.hpl.hp.com/personal/Hans_Boehm/gc/). Could anybody
here please give me some advice on using GC for the library? Thanks a lot!


Regards!
Bo
n***@hotmail.com
2009-04-22 15:10:13 UTC
Permalink
Post by byang
Hi,
I am designing a library in C that dynamically allocates a lot of
memory, and I currently use reference counting to manage allocation
and freeing. That means the client of this library has to call unref()
on many pointers to the library's data structures, and this ref/unref
interface imposes extra work on client programmers.
Is there a better way? I googled for "gc for C" and found a mark-sweep
collector (http://www.hpl.hp.com/personal/Hans_Boehm/gc/). Could anybody
here please give me some advice on using GC for the library? Thanks a lot!
GC isn't part of the standard, but the Boehm collector is the best-known
one. lcc-win, which is implemented by a regular on clc, has a GC. It may
also use the Boehm collector; I don't know. Ask on an lcc-win-specific
newsgroup or look at their website for more details.
jacob navia
2009-04-22 16:02:32 UTC
Permalink
Post by byang
Hi,
I am designing a library in C that dynamically allocates a lot of
memory, and I currently use reference counting to manage allocation
and freeing. That means the client of this library has to call unref()
on many pointers to the library's data structures, and this ref/unref
interface imposes extra work on client programmers.
Is there a better way? I googled for "gc for C" and found a mark-sweep
collector (http://www.hpl.hp.com/personal/Hans_Boehm/gc/). Could anybody
here please give me some advice on using GC for the library? Thanks a lot!
Regards!
Bo
The lcc-win C compiler ships the GC in its standard distribution.
It runs in the 32-bit version and has been ported to the 64-bit
version.

Under Linux a gc library is easily available. Boehm's software
also runs in a wide variety of environments.

Actually it is VERY EASY to use. Just replace all malloc calls with
GC_malloc, and drop all calls to free().
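
A minimal sketch of that substitution, kept behind a switch so the same
source builds with or without the collector. USE_BOEHM_GC, XINIT, XMALLOC,
XFREE and dup_string are names invented for this illustration; GC_INIT and
GC_MALLOC are the collector's own macros from <gc.h>. When the flag is
defined the program has to be linked against the collector (commonly
-lgc, though the header and library names can differ between systems).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: USE_BOEHM_GC is a hypothetical build flag for this sketch. */
#ifdef USE_BOEHM_GC
#include <gc.h>
#define XINIT()     GC_INIT()
#define XMALLOC(n)  GC_MALLOC(n)
#define XFREE(p)    ((void)(p))   /* the collector reclaims the block later */
#else
#include <stdlib.h>
#define XINIT()     ((void)0)
#define XMALLOC(n)  malloc(n)
#define XFREE(p)    free(p)
#endif

/* Return a newly allocated copy of s, or NULL if allocation fails. */
char *dup_string(const char *s)
{
    size_t n;
    char *copy;

    assert(s != NULL);
    n = strlen(s) + 1;
    copy = XMALLOC(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}
```

A caller runs XINIT() once at startup and treats XFREE() as optional: in
the collected build it expands to nothing, in the plain build it is free().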

You can download the lcc-win compiler at the URL below
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
George Peter Staplin
2009-04-22 22:26:18 UTC
Permalink
Post by byang
Hi,
I am designing a library in C that dynamically allocates a lot of
memory, and I currently use reference counting to manage allocation
and freeing. That means the client of this library has to call unref()
on many pointers to the library's data structures, and this ref/unref
interface imposes extra work on client programmers.
Is there a better way? I googled for "gc for C" and found a mark-sweep
collector (http://www.hpl.hp.com/personal/Hans_Boehm/gc/). Could anybody
here please give me some advice on using GC for the library? Thanks a lot!
Regards!
Bo
From what I recall, Boehm also mentions problems with his GC in some papers.
There are certain structures that can lead to memory being retained. Boehm
has a paper that suggests some changes he would make to C to improve the
accuracy of a GC for C, but these changes require compiler support.

Most C environments aren't designed for GC, and a conservative collector can
easily mistake a random integer, or part of an integer as the case may be,
for a pointer. Moreover, this can happen randomly, so you might have a user
complaining about your program or library retaining a 1 GB allocation, and
you might never be able to duplicate that, unless you duplicate the exact
usage patterns of the user. That means pressing a button or key at a
specific time, and so on. It's really not the way to go for reliable
software, unless you have the entire environment set up to help the GC (some
systems do this though).

If you also work with memory mapped files (with mmap and munmap used to work
with segments of the file), or send pointers over sockets, you would find
those objects can lose all references and are thus freed for reuse, or back
to the system.

GC doesn't eliminate all unbounded growth patterns. You can still have
unbounded growth from errors in array growth algorithms, and abstract data
structures not having their elements pruned. In other words: GC doesn't
fix the flaws in your program, and sometimes it makes them more complicated
to track down.

If you're concerned about leaks I suggest you do what I do: create a series
of torture tests for constructors and destructors.

In other words:
    while (1) {
        bar = construct(foo);
        destroy(bar);
    }

Then watch the memory usage in a process monitoring program, and observe and
verify that the growth stabilizes over a period of time. If it doesn't
stabilize, you have a leak due to a missing free or something like that.
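
A bounded variant of the same loop can check itself instead of relying on
a process monitor. The sketch below is only an illustration:
counted_malloc, counted_free, struct bar and torture are hypothetical
names, and the counter only sees allocations that go through the wrappers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Number of blocks handed out and not yet returned. */
static long live_blocks = 0;

static void *counted_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        live_blocks++;
    return p;
}

static void counted_free(void *p)
{
    if (p != NULL) {
        live_blocks--;
        free(p);
    }
}

/* Hypothetical object under test: a named sample buffer. */
struct bar {
    char *name;
    double *samples;
};

struct bar *construct(const char *name)
{
    struct bar *b = counted_malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->name = counted_malloc(strlen(name) + 1);
    b->samples = counted_malloc(64 * sizeof *b->samples);
    if (b->name == NULL || b->samples == NULL) {
        /* Partial failure: release whatever was obtained. */
        counted_free(b->name);
        counted_free(b->samples);
        counted_free(b);
        return NULL;
    }
    strcpy(b->name, name);
    return b;
}

void destroy(struct bar *b)
{
    if (b == NULL)
        return;
    counted_free(b->name);
    counted_free(b->samples);
    counted_free(b);
}

/* Run construct/destroy `rounds` times and return the number of blocks
   still live afterwards; 0 means the wrappers saw no leak. */
long torture(long rounds)
{
    long i;

    assert(rounds >= 0);
    for (i = 0; i < rounds; i++)
        destroy(construct("foo"));
    return live_blocks;
}
```

Dropping one counted_free() call from destroy() makes torture() return a
value that grows with the number of rounds, which is the same signal as
memory usage that never stabilizes.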

Then I generally produce more tests over time to refine the constructor
options, and I may mix in functions that could possibly change state
related to "bar" in this case, to verify the behavior with those (if they
might change allocated memory or state that could have an effect on it).

I have found leaks with these methods in a lot of code... It sometimes
helps to step through the malloc and free calls with a debugger to see what
is going on. Some systems also have better tools, and there are some
decent free tools like Valgrind.

-George
CBFalconer
2009-04-23 01:48:12 UTC
Permalink
George Peter Staplin wrote:
... snip ...
I have found leaks with these methods in a lot of code. It
sometimes helps to step through the malloc and free calls with a
debugger to see what is going on. Some systems also have better
tools, and there are some decent free tools like Valgrind.
Take a look at nmalloc, which is built for use with DJGPP. It
includes a full set of debuggery, which will show you faults,
unfreed areas, etc. If you can adapt it to your system go ahead.
See:

<http://cbfalconer.home.att.net/download/nmalloc.zip>
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Richard Heathfield
2009-04-23 08:51:00 UTC
Permalink
Post by CBFalconer
... snip ...
I have found leaks with these methods in a lot of code. It
sometimes helps to step through the malloc and free calls with a
debugger to see what is going on. Some systems also have better
tools, and there are some decent free tools like Valgrind.
Take a look at nmalloc, which is built for use with DJGPP. It
includes a full set of debuggery, which will show you faults,
unfreed areas, etc. If you can adapt it to your system go ahead.
<http://cbfalconer.home.att.net/download/nmalloc.zip>
This uses sbrk, right?

So by your own argument, it doesn't work..
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
CBFalconer
2009-04-24 00:56:40 UTC
Permalink
Post by Richard Heathfield
Post by CBFalconer
... snip ...
I have found leaks with these methods in a lot of code. It
sometimes helps to step through the malloc and free calls with a
debugger to see what is going on. Some systems also have better
tools, and there are some decent free tools like Valgrind.
Take a look at nmalloc, which is built for use with DJGPP. It
includes a full set of debuggery, which will show you faults,
unfreed areas, etc. If you can adapt it to your system go ahead.
See: ^^^^^
<http://cbfalconer.home.att.net/download/nmalloc.zip>
This uses sbrk, right?
So by your own argument, it doesn't work..
Do you have any idea how big an imbecile that makes you appear?
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Richard
2009-04-24 01:56:30 UTC
Permalink
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
... snip ...
I have found leaks with these methods in a lot of code. It
sometimes helps to step through the malloc and free calls with a
debugger to see what is going on. Some systems also have better
tools, and there are some decent free tools like Valgrind.
Take a look at nmalloc, which is built for use with DJGPP. It
includes a full set of debuggery, which will show you faults,
unfreed areas, etc. If you can adapt it to your system go ahead.
See: ^^^^^
<http://cbfalconer.home.att.net/download/nmalloc.zip>
This uses sbrk, right?
So by your own argument, it doesn't work..
Do you have any idea how big an imbecile that makes you appear?
Do you have any idea how imbecilic you looked when you did indeed state
that non ISO compliant C "does not work"?

Please. Do everyone a favour and go away.
--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c
Rosario
2009-04-28 19:04:21 UTC
Permalink
Post by Richard
Do you have any idea how imbecilic you looked when you did indeed state
We are *all* in an "imbecilic" state, even if we don't know it.
I like *some* of what "CBFalconer" says; why do you want to get angry at him?
Richard Heathfield
2009-04-24 07:36:54 UTC
Permalink
<snip>
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
<http://cbfalconer.home.att.net/download/nmalloc.zip>
This uses sbrk, right?
So by your own argument, it doesn't work..
Do you have any idea how big an imbecile that makes you appear?
Do you mean that you are no longer of the opinion that non-standard
library calls stop code from working?
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Mark McIntyre
2009-04-24 22:37:39 UTC
Permalink
Post by Ben Bacarisse
<snip>
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
<http://cbfalconer.home.att.net/download/nmalloc.zip>
This uses sbrk, right?
So by your own argument, it doesn't work..
Do you have any idea how big an imbecile that makes you appear?
Do you mean that you are no longer of the opinion that non-standard
library calls stop code from working?
I suspect that CBF has never said that. I do recall a stupid exchange
between you and him where you trolled each other foolishly about the
meaning of the words "illegal" and "broken". Either way, it's childish of
you both to perpetuate the discussion, and disingenuous to mention it in
this context since CBF hasn't said nmalloc is ISO C.
Richard Heathfield
2009-04-25 01:15:30 UTC
Permalink
Post by Mark McIntyre
Post by Ben Bacarisse
<snip>
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
<http://cbfalconer.home.att.net/download/nmalloc.zip>
This uses sbrk, right?
So by your own argument, it doesn't work..
Do you have any idea how big an imbecile that makes you appear?
Do you mean that you are no longer of the opinion that
non-standard library calls stop code from working?
I suspect that CBF has never said that.
He's said something of the kind quite a few times recently. Here's
one recent example where he says that non-standard code "won't
work": <***@yahoo.com> Not "might not work", or "will
only work on some platforms", or "requires adapting", but "won't
work".
Post by Mark McIntyre
I do recall a stupid exchange between you and him
Recently, practically all exchanges with Chuck Falconer and *anyone*
else have been stupid.

Post by Mark McIntyre
where you trolled each other foolishly about the meaning of the words
"illegal" and "broken". Either way, its childish of you both to
perpetuate the discussion, and disingenuous to mention it in this
context since CBF hasn't said nmalloc is ISO C.
Hardly disingenuous. Chuck Falconer's position is that non-ISO C
code is atopical and non-working. nmalloc is non-ISO C code.
Therefore, by his own argument, it's off-topic and doesn't work.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Mark McIntyre
2009-04-25 09:13:36 UTC
Permalink
Post by Richard Heathfield
Hardly disingenuous. Chuck Falconer's position is that non-ISO C
code is atopical and non-working.
I disagree that this was the case in the thread to which I referred,
because I clearly remember the OP was invoking his compiler in strictly
conforming mode, and yet relying on an undeclared POSIX function.

But frankly life's too short to argue with argumentative people. You
note, correctly, that CBF has been especially cantankerous recently. The
same can increasingly be said about yourself.

<flame bait>
You're both beginning to sound like Victor Meldrew.
</>
--
Mark McIntyre

CLC FAQ <http://c-faq.com/>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>
Richard Heathfield
2009-04-25 09:25:37 UTC
Permalink
Post by Mark McIntyre
Post by Richard Heathfield
Hardly disingenuous. Chuck Falconer's position is that non-ISO C
code is atopical and non-working.
I disagree that this was the case in the thread to which I
referred,
I describe his position. Is it your belief that he changes his
position depending on which thread he's in? (Hey, you might even be
right.)

<snip>
Post by Mark McIntyre
You note, correctly, that CBF has been especially cantankerous
recently. The same can increasingly be said about yourself.
<shrug> I post corrections. If people don't like it, that's their
problem, not mine. You can always killfile me if you don't want to
read my stuff.
Post by Mark McIntyre
<flame bait>
Youre both beginning to sound like Victor Meldrew.
</>
Who's he?
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Golden California Girls
2009-04-26 04:38:30 UTC
Permalink
Post by Richard Heathfield
Post by Mark McIntyre
<flame bait>
Youre both beginning to sound like Victor Meldrew.
</>
Who's he?
http://en.wikipedia.org/wiki/Victor_Meldrew
Richard Harter
2009-04-26 13:57:09 UTC
Permalink
On Sun, 26 Apr 2009 04:38:30 GMT, Golden California Girls
Post by Golden California Girls
Post by Richard Heathfield
Post by Mark McIntyre
<flame bait>
Youre both beginning to sound like Victor Meldrew.
</>
Who's he?
http://en.wikipedia.org/wiki/Victor_Meldrew
Chortle, thank you for the reference.



Richard Harter, ***@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
If I do not see as far as others, it is because
I stand in the footprints of giants.

CBFalconer
2009-04-26 01:57:56 UTC
Permalink
Post by Mark McIntyre
Post by Richard Heathfield
Hardly disingenuous. Chuck Falconer's position is that non-ISO C
code is atopical and non-working.
... snip ...
Post by Mark McIntyre
But frankly life's too short to argue with argumentative people.
You note, correctly, that CBF has been especially cantankerous
recently. The same can increasingly be said about yourself.
Probably true enough. I recently got sufficiently sick of RH's
foolish carping and responded. I think I have largely ignored his
silly postings in the past.
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Richard
2009-04-26 03:45:35 UTC
Permalink
Post by CBFalconer
Post by Mark McIntyre
Post by Richard Heathfield
Hardly disingenuous. Chuck Falconer's position is that non-ISO C
code is atopical and non-working.
... snip ...
Post by Mark McIntyre
But frankly life's too short to argue with argumentative people.
You note, correctly, that CBF has been especially cantankerous
recently. The same can increasingly be said about yourself.
Probably true enough. I recently got sufficienly sick of RHs
foolish carping and responded. I think I have largely ignored his
silly postings in the past.
"Chuck"fight! (as they would pronounce "chickfight" in NZ).
--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c
Richard
2009-04-25 04:53:13 UTC
Permalink
Post by Mark McIntyre
Post by Ben Bacarisse
<snip>
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
<http://cbfalconer.home.att.net/download/nmalloc.zip>
This uses sbrk, right?
So by your own argument, it doesn't work..
Do you have any idea how big an imbecile that makes you appear?
Do you mean that you are no longer of the opinion that non-standard
library calls stop code from working?
I suspect that CBF has never said that. I do recall a stupid exchange
between you and him where you trolled each other foolishly about the
meaning of the words "illegal" and "broken". Either way, its childish of
you both to perpetuate the discussion, and disingenuous to mention it in
this context since CBF hasn't said nmalloc is ISO C.
I rarely stick up for Dicky, but CBF is an idiot and Heathfield has done
a very good job of proving that recently.
--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c
Golden California Girls
2009-04-26 04:40:17 UTC
Permalink
Post by Richard
I rarely stick up for Dicky, but CBF is an idiot and Heathfield has done
a very good job of proving that recently.
Actually if someone here knows him, they should get him to a doctor to be
checked out.
James Kuyper
2009-04-24 10:07:49 UTC
Permalink
...
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
<http://cbfalconer.home.att.net/download/nmalloc.zip>
This uses sbrk, right?
So by your own argument, it doesn't work..
Do you have any idea how big an imbecile that makes you appear?
I agree that it's an imbecilic argument that he's referring to; but it's
your argument, not his. All RH is trying to do is help you understand
why that is the case. That makes him look foolish only insofar as that
seems to be a doomed cause.
CBFalconer
2009-04-24 22:06:53 UTC
Permalink
Post by James Kuyper
...
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
<http://cbfalconer.home.att.net/download/nmalloc.zip>
This uses sbrk, right?
So by your own argument, it doesn't work..
Do you have any idea how big an imbecile that makes you appear?
I agree that it's an imbecilic argument that he's referring to; but it's
your argument, not his. All RH is trying to do is help you understand
why that is the case. That makes him look foolish only insofar as that
seems to be a doomed cause.
No, he deliberately snipped the portion that shows his imbecility,
Post by James Kuyper
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
If you can adapt it to your system ...
^^^^^

Which pointed out that the code would require modification.
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
jameskuyper
2009-04-24 22:47:46 UTC
Permalink
Post by CBFalconer
Post by James Kuyper
...
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
<http://cbfalconer.home.att.net/download/nmalloc.zip>
This uses sbrk, right?
So by your own argument, it doesn't work..
Do you have any idea how big an imbecile that makes you appear?
I agree that it's an imbecilic argument that he's referring to; but it's
your argument, not his. All RH is trying to do is help you understand
why that is the case. That makes him look foolish only insofar as that
seems to be a doomed cause.
No, he deliberately snipped the portion that shows his imbecility,
Post by James Kuyper
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
If you can adapt it to your system ...
^^^^^
Which pointed out that the code would require modification.
Comments like that were never effective at getting you to recognize
the existence of non-standard things; why should someone parodying
your attitudes in order to demonstrate how ridiculous they are be any
more reasonable about such things than you are?
Flash Gordon
2009-04-25 08:40:45 UTC
Permalink
Post by CBFalconer
Post by James Kuyper
...
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
<http://cbfalconer.home.att.net/download/nmalloc.zip>
This uses sbrk, right?
So by your own argument, it doesn't work..
Do you have any idea how big an imbecile that makes you appear?
I agree that it's an imbecilic argument that he's referring to; but it's
your argument, not his. All RH is trying to do is help you understand
why that is the case. That makes him look foolish only insofar as that
seems to be a doomed cause.
No, he deliberately snipped the portion that shows his imbecility,
Post by James Kuyper
Post by CBFalconer
Post by Richard Heathfield
Post by CBFalconer
If you can adapt it to your system ...
^^^^^
Which pointed out that the code would require modification.
You have recently claimed that code which uses implementation *defined*
behaviour "does not work". Are you now retracting that claim, since
obviously it works on the implementations where it worked by
definition of those implementations, just like your nmalloc package works
on implementations which provide sbrck and where your method of dealing
with alignment works?

Either your flat statement that code which uses implementation defined
behaviour does not work is false, or your nmalloc package does not work.

Oh, and your nmalign function assumes that a pointer can be converted to
a long with no loss of information, something the standard does not
guarantee. In fact, the standard does not guarantee there is ANY integer
type which would work. It then assumes things about the converted value
which are not guaranteed to be true.
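
For illustration only, here is how an alignment test is often written with
those assumptions stated rather than hidden. This is not nmalloc's code;
is_aligned is a hypothetical helper. It relies on the optional type
uintptr_t and on the usual flat mapping from pointers to integers, neither
of which the standard promises.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef UINTPTR_MAX
#error "this sketch needs the optional type uintptr_t"
#endif

/* Return 1 if p, viewed as an integer, is a multiple of align.
   Assumptions: align is a nonzero power of two, and converting a pointer
   to uintptr_t yields its address in a flat address space. The result of
   that conversion is implementation-defined. */
int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

On an implementation without uintptr_t the #error fires at compile time,
which is at least more honest than silently truncating through long.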

It has at least one identifier (_sysquery) at file scope which starts
with an underscore, which is reserved for the implementation and so not
available for user code.

Now, have you posted all these other limitations every time you have
said it will work on any implementation which provides sbrck? I don't
remember you pointing them out, so your code does not work by your own
arguments.
--
Flash Gordon
CBFalconer
2009-04-26 02:03:24 UTC
Permalink
Flash Gordon wrote:
... snip ...
Post by Flash Gordon
Now, have you posted all these other limitations every time you
have said it will work on any implementation which provides
sbrck? I don't remember you pointing them out, so your code does
not work by your own arguments.
By the way, it is "sbrk". What I believe I have done is include
the information that nmalloc was written to run on DJGPP. I have
not made a point of being specific about the variations involved.
Actually, there is no need to specify this, since it is impossible
to write a generic portable malloc package.
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
jacob navia
2009-04-23 09:02:01 UTC
Permalink
Post by George Peter Staplin
From what I recall, Boehm also mentions problems with his GC in some papers.
This is utterly vague. Which papers please?
Post by George Peter Staplin
There are certain structures that can lead to memory being retained. Boehm
has a paper that suggests some changes he would make to C to improve the
accuracy of a GC for C, but these changes require compiler support.
What paper?
Post by George Peter Staplin
Most C environments aren't designed for GC, and a conservative collector can
easily mistake a random integer or part of an integer as the case may be,
as a pointer.
That will be collected at the next GC if the integer is no longer on the stack.
Post by George Peter Staplin
Moreover, this can happen randomly, so you might have a user
complaining about your program or library retaining a 1 GB allocation, and
you might never be able to duplicate that, unless you duplicate the exact
usage patterns of the user. That means pressing a button or key at a
specific time, and so on. It's really not the way to go for reliable
software, unless you have the entire environment setup to help the GC (some
systems do this though).
The gc software has been running with lcc-win for at least 4-5 years.
Your vague allegations don't cut it. Any specific example?
Post by George Peter Staplin
If you also work with memory mapped files (with mmap and munmap used to work
with segments of the file), or send pointers over sockets, you would find
those objects can lose all references and are thus freed for reuse, or back
to the system.
Yes, this is described in the documentation.

Why use the example of the pointer being written to a file and
then retrieved, an example that the regulars here have used for ages?
Post by George Peter Staplin
GC doesn't eliminate all unbounded growth patterns. You can still have
unbounded growth from errors in array growth algorithms, and abstract data
structures not having their elements pruned. In other words: GC doesn't
fix the flaws in your program, and sometimes it makes them more complicated
to track down.
GC doesn't fix the flaws of your program. True. And doesn't make you the
coffee. You STILL have to walk to the coffee machine (shudder).
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Ian Collins
2009-04-23 10:30:58 UTC
Permalink
Post by jacob navia
Post by George Peter Staplin
Moreover, this can happen randomly, so you might have a user
complaining about your program or library retaining a 1 GB allocation, and
you might never be able to duplicate that, unless you duplicate the exact
usage patterns of the user. That means pressing a button or key at a
specific time, and so on. It's really not the way to go for reliable
software, unless you have the entire environment setup to help the GC
(some systems do this though).
The gc software has been running with lcc-win since 4-5 years at least.
Your vague allegations doesn't cut it. Any specific example?
Speaking as one who has used a GC with C (to keep a leaky binary only
application running and to check for memory leaks in better ones), I
agree with Jacob. I've never seen any problems.
--
Ian Collins
George Peter Staplin
2009-04-23 16:11:56 UTC
Permalink
Post by jacob navia
Post by George Peter Staplin
From what I recall, Boehm also mentions problems with his GC in some papers.
This is utterly vague. Which papers please?
http://www.hpl.hp.com/personal/Hans_Boehm/gc/papers/boecha.ps.gz
Post by jacob navia
Post by George Peter Staplin
There are certain structures that can lead to memory being retained.
Boehm has a paper that suggests some changes he would make to C to
improve the accuracy of a GC for C, but these changes require compiler
support.
What paper?
See above. There are others I believe, but you may need an ACM account.
Post by jacob navia
Post by George Peter Staplin
Most C environments aren't designed for GC, and a conservative collector
can easily mistake a random integer or part of an integer as the case may
be, as a pointer.
That will be collected next gc if the integer is no longer in the stack
That is not necessarily true. If you have a random global integer that has
the same size and alignment offset as a pointer, or even a series of char
or short that randomly have the same value as a large allocation, that
allocation will be retained. This is why the Boehm collector is
called "conservative." It's conservative about releasing or reusing an
allocation. If it sees something that could be a reference, it assumes
that series of bytes is a reference.
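
A toy model of that rule, to make the failure mode concrete. It is not the
Boehm collector's code: struct block, may_reference and is_retained are
hypothetical names, and a real collector scans stacks, registers and data
segments rather than a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one heap block the collector manages. */
struct block {
    uintptr_t start;   /* address of the first byte */
    size_t    size;    /* length in bytes */
};

/* Conservative test: any word whose value falls inside the block counts
   as a reference, whether or not it was ever stored as a pointer. */
int may_reference(uintptr_t word, const struct block *b)
{
    return word >= b->start && word - b->start < b->size;
}

/* Scan n words of root memory; return 1 if the block must be retained. */
int is_retained(const uintptr_t *roots, size_t n, const struct block *b)
{
    size_t i;

    assert(b != NULL);
    for (i = 0; i < n; i++)
        if (may_reference(roots[i], b))
            return 1;
    return 0;
}
```

With a 1 GB block, roughly a quarter of all possible 32-bit values land
inside it, so a single stray integer among the roots is enough to keep the
whole allocation alive.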
Post by jacob navia
Post by George Peter Staplin
Moreover, this can happen randomly, so you might have a user
complaining about your program or library retaining a 1 GB allocation,
and you might never be able to duplicate that, unless you duplicate the
exact
usage patterns of the user. That means pressing a button or key at a
specific time, and so on. It's really not the way to go for reliable
software, unless you have the entire environment setup to help the GC
(some systems do this though).
The gc software has been running with lcc-win since 4-5 years at least.
Your vague allegations doesn't cut it. Any specific example?
Yes, there have been some over the years. While it generally works fine, I
think you would find that some data sets don't work well with conservative
collectors. The larger your data set the more likely it is that you will
have a false reference that potentially keeps a large allocation retained,
rather than resulting in that large allocation being reused.

Here is a specific example:
http://www.cs.brown.edu/pipermail/plt-scheme/2006-June/013840.html

Towards the end of this thread there is more information:
http://groups.google.com/group/plt-scheme/browse_thread/thread/8f6ceaf436cfa2aa
Post by jacob navia
Post by George Peter Staplin
If you also work with memory mapped files (with mmap and munmap used to
work with segments of the file), or send pointers over sockets, you would
find those objects can lose all references and are thus freed for reuse,
or back to the system.
Yes, this is described in the documentation.
So, there is no way around that AFAIK. The collector will release those
perfectly valid objects.
Post by jacob navia
Why should use the example of the pointer being written to a file
then retrieved,an example that the regulars here use since ages.
OK, imagine you're working with a large data set, and you happen to have a
restriction on the amount of RAM you can use. The system in question
offers 64-bit file offsets. You might work with segments of the data
structure stored in a file, and several small object references/pointers
that are usually persistent, and related to the data. So if you munmap
(Windows and some other systems have memory-mapped files as well) a few
pages for the allocation to save some RAM, that removes those vital
references. You might find unexpected faults when that occurs (depending
on the collector's state) when you try to use those objects the next time
the pages are mapped.
Post by jacob navia
Post by George Peter Staplin
GC doesn't eliminate all unbounded growth patterns. You can still have
unbounded growth from errors in array growth algorithms, and abstract data
structures not having their elements pruned. In other words: GC doesn't
fix the flaws in your program, and sometimes it makes them more
complicated to track down.
GC doesn't fix the flaws of your program. True. And doesn't make you the
coffee. You STILL have to walk to the coffee machine (shudder).
Please don't take it personally Jacob. I think GC has its place, but a
conservative GC is a bit too dangerous for my tastes at times, and implies
restrictions on the whole program and its possible progressions that I
don't agree with. Why should I remove potential capabilities from the
future of my code for the GC?

My thoughts are that reference counting is often superior, because it's much
finer grained, though much more difficult to manage. When the reference
counting is correct there is no need to scan the entire set of references
in order to collect some unused objects. I happen to believe that the idea
that reference counting is more costly than GC is largely a myth. It has
some associated costs, such as a slightly larger object (for the reference
count), but even with a typical accurate GC you have tag bits, or similar
things, and those have costs in terms of CPU instructions.
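
To make the bookkeeping cost concrete, here is a minimal sketch of the kind of reference counting described above. The names rc_obj, rc_new, rc_ref and rc_unref are hypothetical and not taken from any library mentioned in this thread; the sketch is single-threaded and ignores cycles.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; all names are illustrative. */
typedef struct rc_obj {
    size_t refs;     /* the extra per-object word for the count */
    int    payload;  /* stand-in for the real data */
} rc_obj;

/* Allocate an object holding one reference owned by the caller.
   Returns NULL if malloc fails. */
rc_obj *rc_new(int payload) {
    rc_obj *o = malloc(sizeof *o);
    if (o != NULL) {
        o->refs = 1;
        o->payload = payload;
    }
    return o;
}

/* Take an additional reference and return the same object. */
rc_obj *rc_ref(rc_obj *o) {
    assert(o != NULL && o->refs > 0);
    o->refs++;
    return o;
}

/* Drop one reference. Returns 1 if this was the last reference and
   the object was freed, 0 if other references remain. */
int rc_unref(rc_obj *o) {
    assert(o != NULL && o->refs > 0);
    if (--o->refs == 0) {
        free(o);
        return 1;
    }
    return 0;
}
```

The object is released at the moment the last rc_unref runs, with no scan of other memory, which is the fine-grained behaviour argued for above. The price is that every owner has to pair its rc_ref and rc_unref calls correctly.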

GC doesn't cure all memory leaks. It's still easy to have 1 random pointer
retained in some far off object that doesn't get reached, especially with a
language like C that wasn't designed for it. If that pointer has valid or
invalid references to other allocations, then those allocations will be
retained too. I recall a few cases of Java memory leaks relating to
patterns like this.

If you don't trust or test your C memory allocation patterns, why should you
trust the program to be any better with a conservative GC?

-George
CBFalconer
2009-04-24 01:18:23 UTC
Permalink
... snip ...
Post by jacob navia
Post by George Peter Staplin
If you also work with memory mapped files (with mmap and munmap
used to work with segments of the file), or send pointers over
sockets, you would find those objects can lose all references
and are thus freed for reuse, or back to the system.
Yes, this is described in the documentation.
Why should you use the example of the pointer being written to a file
and then retrieved, an example that the regulars here have used for ages?
Post by George Peter Staplin
GC doesn't eliminate all unbounded growth patterns. You can
still have unbounded growth from errors in array growth
algorithms, and abstract data structures not having their
elements pruned. In other words: GC doesn't fix the flaws in
your program, and sometimes it makes them more complicated to
track down.
GC doesn't fix the flaws of your program. True. And doesn't make
you the coffee. You STILL have to walk to the coffee machine
(shudder).
#include <stdlib.h>

void longtime(void) {
    /* this takes 1/2 hour to execute */
    return;
}

int main(void) {
    char *ptr;

    if (!(ptr = malloc(128))) exit(EXIT_FAILURE);
    ptr += 128;
    longtime();
    ptr -= 128;
    *ptr = 'A';
    free(ptr);
    return 0;
}

So, after coding longtime() to agree with the comment, do you claim
this program is flawed? Admitted, I had to work at it.
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Ben Bacarisse
2009-04-24 02:41:49 UTC
Permalink
<snip>
Post by CBFalconer
Post by jacob navia
Post by George Peter Staplin
GC doesn't eliminate all unbounded growth patterns. You can
still have unbounded growth from errors in array growth
algorithms, and abstract data structures not having their
elements pruned. In other words: GC doesn't fix the flaws in
your program, and sometimes it makes them more complicated to
track down.
GC doesn't fix the flaws of your program. True. And doesn't make
you the coffee. You STILL have to walk to the coffee machine
(shudder).
#include <stdlib.h>
void longtime(void) {
/* this takes 1/2 hour to execute */
return;
}
int main(void) {
char *ptr;
if (!(ptr = malloc(128))) exit(EXIT_FAILURE);
ptr += 128;
longtime();
ptr -= 128;
*ptr = 'A';
free(ptr);
return 0;
}
So, after coding longtime() to agree with the comment, do you claim
this program is flawed? Admitted, I had to work at it.
What is this code meant to show? You might want to pick an example
where all the ptr stuff can't be removed by the optimiser (or was that
part of the point?).

I have a feeling it shows you don't know how GC would work in C.
--
Ben.
CBFalconer
2009-04-24 03:23:15 UTC
Permalink
Post by Ben Bacarisse
<snip>
Post by CBFalconer
Post by jacob navia
Post by George Peter Staplin
GC doesn't eliminate all unbounded growth patterns. You can
still have unbounded growth from errors in array growth
algorithms, and abstract data structures not having their
elements pruned. In other words: GC doesn't fix the flaws in
your program, and sometimes it makes them more complicated to
track down.
GC doesn't fix the flaws of your program. True. And doesn't make
you the coffee. You STILL have to walk to the coffee machine
(shudder).
#include <stdlib.h>
void longtime(void) {
/* this takes 1/2 hour to execute */
return;
}
int main(void) {
char *ptr;
if (!(ptr = malloc(128))) exit(EXIT_FAILURE);
ptr += 128;
longtime();
ptr -= 128;
*ptr = 'A';
free(ptr);
return 0;
}
So, after coding longtime() to agree with the comment, do you claim
this program is flawed? Admitted, I had to work at it.
What is this code meant to show? You might want to pick an example
where all the ptr stuff can't be removed by the optimiser (or was that
part of the point?).
I have a feeling it shows you don't know how GC would work in C.
Possibly true. I am assuming the GC system scans memory for a
pointer to the memory area. If it doesn't find one, it frees that
memory block. ptr+128 points just past the block, so is
legitimate. It could well point to the next block assigned. It is
not dereferenced, so that does not create an error. So I expect
that during the execution of longtime() the allocated block will be
freed. The access by *ptr (which is valid) will fail. The free
(which is legitimate) will fail.

Am I wrong in this analysis? If so, where.
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
George Peter Staplin
2009-04-24 03:52:01 UTC
Permalink
Post by CBFalconer
Post by Ben Bacarisse
<snip>
Post by CBFalconer
Post by jacob navia
Post by George Peter Staplin
GC doesn't eliminate all unbounded growth patterns. You can
still have unbounded growth from errors in array growth
algorithms, and abstract data structures not having their
elements pruned. In other words: GC doesn't fix the flaws in
your program, and sometimes it makes them more complicated to
track down.
GC doesn't fix the flaws of your program. True. And doesn't make
you the coffee. You STILL have to walk to the coffee machine
(shudder).
#include <stdlib.h>
void longtime(void) {
/* this takes 1/2 hour to execute */
return;
}
int main(void) {
char *ptr;
if (!(ptr = malloc(128))) exit(EXIT_FAILURE);
ptr += 128;
longtime();
ptr -= 128;
*ptr = 'A';
free(ptr);
return 0;
}
So, after coding longtime() to agree with the comment, do you claim
this program is flawed? Admitted, I had to work at it.
What is this code meant to show? You might want to pick an example
where all the ptr stuff can't be removed by the optimiser (or was that
part of the point?).
I have a feeling it shows you don't know how GC would work in C.
Possibly true. I am assuming the GC system scans memory for a
pointer to the memory area. If it doesn't find one, it frees that
memory block. ptr+128 points just past the block, so is
legitimate. It could well point to the next block assigned. It is
not dereferenced, so that does not create an error. So I expect
that during the execution of longtime() the allocated block will be
freed. The access by *ptr (which is valid) will fail. The free
(which is legitimate) will fail.
Am I wrong in this analysis? If so, where.
I believe you're right with a conservative GC. You might need to run some
other allocations to cause the GC to collect. As far as the GC is
concerned though that + 128 is beyond the allocation, so the memory
reference stored by char *ptr would no longer refer to the memory from the
malloc call, or even a part of it. It depends on the implementation of the
GC, and any fudge factors they might use.

-George
Flash Gordon
2009-04-24 06:11:14 UTC
Permalink
Post by George Peter Staplin
Post by CBFalconer
Post by Ben Bacarisse
<snip>
Post by CBFalconer
Post by jacob navia
Post by George Peter Staplin
GC doesn't eliminate all unbounded growth patterns. You can
still have unbounded growth from errors in array growth
algorithms, and abstract data structures not having their
elements pruned. In other words: GC doesn't fix the flaws in
your program, and sometimes it makes them more complicated to
track down.
GC doesn't fix the flaws of your program. True. And doesn't make
you the coffee. You STILL have to walk to the coffee machine
(shudder).
#include <stdlib.h>
void longtime(void) {
/* this takes 1/2 hour to execute */
return;
}
int main(void) {
char *ptr;
if (!(ptr = malloc(128))) exit(EXIT_FAILURE);
ptr += 128;
longtime();
ptr -= 128;
*ptr = 'A';
free(ptr);
return 0;
}
So, after coding longtime() to agree with the comment, do you claim
this program is flawed? Admitted, I had to work at it.
What is this code meant to show? You might want to pick an example
where all the ptr stuff can't be removed by the optimiser (or was that
part of the point?).
I have a feeling it shows you don't know how GC would work in C.
Possibly true. I am assuming the GC system scans memory for a
pointer to the memory area. If it doesn't find one, it frees that
memory block. ptr+128 points just past the block, so is
legitimate. It could well point to the next block assigned. It is
not dereferenced, so that does not create an error. So I expect
that during the execution of longtime() the allocated block will be
freed. The access by *ptr (which is valid) will fail. The free
(which is legitimate) will fail.
Am I wrong in this analysis? If so, where.
I believe you're right with a conservative GC. You might need to run some
other allocations to cause the GC to collect. As far as the GC is
concerned though that + 128 is beyond the allocation, so the memory
reference stored by char *ptr would no longer refer to the memory from the
malloc call, or even a part of it. It depends on the implementation of the
GC, and any fudge factors they might use.
I would expect a conservative GC to NOT free a block with a pointer to
one past it for the simple reason that it CAN be a valid pointer for
that block. It may indeed cause it to not free another block. That is why
conservative GCs are called "conservative"!
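
A rough sketch of the test being described, under the assumption of a flat address space where pointers convert to uintptr_t in address order (uintptr_t itself is optional in C99). The type gc_block and the function gc_word_keeps_alive are hypothetical names for illustration, not how any particular collector is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record a collector might keep for each allocation. */
typedef struct gc_block {
    uintptr_t base;  /* address of the first byte */
    size_t    size;  /* size of the block in bytes */
} gc_block;

/* Returns 1 if 'word', found while scanning roots, could be a pointer
   that refers to block 'b'. The upper bound is inclusive, so a
   one-past-the-end pointer (base + size) keeps the block alive. */
int gc_word_keeps_alive(const gc_block *b, uintptr_t word) {
    assert(b != NULL);
    return word >= b->base && word - b->base <= b->size;
}
```

With a 128-byte block, the word holding ptr + 128 passes this test, so the block survives. The same word may also equal the base of the next block, which is then retained as well: that is the conservative part.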
--
Flash Gordon
Ben Bacarisse
2009-04-24 14:20:52 UTC
Permalink
<snip>
Post by CBFalconer
Post by Ben Bacarisse
Post by CBFalconer
#include <stdlib.h>
void longtime(void) {
/* this takes 1/2 hour to execute */
return;
}
int main(void) {
char *ptr;
if (!(ptr = malloc(128))) exit(EXIT_FAILURE);
ptr += 128;
longtime();
ptr -= 128;
*ptr = 'A';
free(ptr);
return 0;
}
So, after coding longtime() to agree with the comment, do you claim
this program is flawed? Admitted, I had to work at it.
What is this code meant to show? You might want to pick an example
where all the ptr stuff can't be removed by the optimiser (or was that
part of the point?).
I have a feeling it shows you don't know how GC would work in C.
Possibly true. I am assuming the GC system scans memory for a
pointer to the memory area. If it doesn't find one, it frees that
memory block. ptr+128 points just past the block, so is
legitimate. It could well point to the next block assigned. It is
not dereferenced, so that does not create an error. So I expect
that during the execution of longtime() the allocated block will be
freed. The access by *ptr (which is valid) will fail. The free
(which is legitimate) will fail.
Am I wrong in this analysis? If so, where.
I think it is wrong to criticise the technique by assuming an
incorrect implementation. I can't imagine a real GC getting this
wrong. It must consider the one-past-the-end pointer to be a valid
pointer and must not free the 128 byte block. Do you have any evidence
that any GC makes such a basic error?

The Boehm GC does not even touch malloc'd memory, BTW.
--
Ben.
Chris Dollin
2009-04-24 14:27:07 UTC
Permalink
Post by Ben Bacarisse
The Boehm GC does not even touch malloc'd memory, BTW.
Are you sure? Because, what else is there to GC?
--
"He gave the matter some thought." - Mermaid Kiss, /The City of Clouds/

Hewlett-Packard Limited registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN
Ben Bacarisse
2009-04-24 14:36:20 UTC
Permalink
Post by Chris Dollin
Post by Ben Bacarisse
The Boehm GC does not even touch malloc'd memory, BTW.
Are you sure? Because, what else is there to GC?
You allocate GC memory using GC_malloc (or, better, the macro
GC_MALLOC) and, of course, one can arrange that this be a replacement
for malloc but I understood the two heaps were usually separate.

From http://www.hpl.hp.com/personal/Hans_Boehm/gc/simple_example.html

Interaction with the system malloc

It is usually best not to mix garbage-collected allocation with the
system malloc-free. If you do, you need to be careful not to store
pointers to the garbage-collected heap in memory allocated with the
system malloc.
--
Ben.
CBFalconer
2009-04-24 22:17:18 UTC
Permalink
Post by Ben Bacarisse
Post by Chris Dollin
Post by Ben Bacarisse
The Boehm GC does not even touch malloc'd memory, BTW.
Are you sure? Because, what else is there to GC?
You allocate GC memory using GC_malloc (or, better, the macro
GC_MALLOC) and, of course, one can arrange that this be a
replacement for malloc but I understood the two heaps were
usually separate.
You can't do that, because malloc has to be system sensitive. It
has to know things not directly available, such as where to get
memory, alignment requirements, etc.
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Keith Thompson
2009-04-24 23:04:27 UTC
Permalink
Post by CBFalconer
Post by Ben Bacarisse
Post by Chris Dollin
Post by Ben Bacarisse
The Boehm GC does not even touch malloc'd memory, BTW.
Are you sure? Because, what else is there to GC?
You allocate GC memory using GC_malloc (or, better, the macro
GC_MALLOC) and, of course, one can arrange that this be a
replacement for malloc but I understood the two heaps were
usually separate.
You can't do that, because malloc has to be system sensitive. It
has to know things not directly available, such as where to get
memory, alignment requirements, etc.
Did you, by any chance, actually mean to say that you *can* do that,
but only in a system-sensitive manner?
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Ben Bacarisse
2009-04-24 23:21:03 UTC
Permalink
Post by CBFalconer
Post by Ben Bacarisse
Post by Chris Dollin
Post by Ben Bacarisse
The Boehm GC does not even touch malloc'd memory, BTW.
Are you sure? Because, what else is there to GC?
You allocate GC memory using GC_malloc (or, better, the macro
GC_MALLOC) and, of course, one can arrange that this be a
replacement for malloc but I understood the two heaps were
usually separate.
You can't do that, because malloc has to be system sensitive. It
has to know things not directly available, such as where to get
memory, alignment requirements, etc.
Er, yes. Why does that mean one can't have a C implementation where
malloc calls the Boehm allocator?
--
Ben.
Chris Dollin
2009-04-27 08:01:08 UTC
Permalink
Post by Ben Bacarisse
Post by Chris Dollin
Post by Ben Bacarisse
The Boehm GC does not even touch malloc'd memory, BTW.
Are you sure? Because, what else is there to GC?
You allocate GC memory using GC_malloc (or, better, the macro
GC_MALLOC) and, of course, one can arrange that this be a replacement
for malloc but I understood the two heaps were usually separate.
Ah! Sorry, I had an over-general interpretation of `malloced` in
my head at that time. I /thought/ something was iffy, and I was
right -- it was me.
--
"He gave the matter some thought." - Mermaid Kiss, /The City of Clouds/

Hewlett-Packard Limited registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN
Chris Dollin
2009-04-27 08:02:45 UTC
Permalink
Post by Ben Bacarisse
Post by Chris Dollin
Post by Ben Bacarisse
The Boehm GC does not even touch malloc'd memory, BTW.
Are you sure? Because, what else is there to GC?
You allocate GC memory using GC_malloc (or, better, the macro
GC_MALLOC) and, of course, one can arrange that this be a replacement
for malloc but I understood the two heaps were usually separate.
Ah, I was being an over-generalising idiot. Where's my cow?
--
"Don't pluck it as you pass." - Trees, /The Garden of Jane Delawney/

Hewlett-Packard Limited registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN
CBFalconer
2009-04-24 22:13:53 UTC
Permalink
Post by Ben Bacarisse
<snip>
Post by CBFalconer
Post by Ben Bacarisse
Post by CBFalconer
#include <stdlib.h>
void longtime(void) {
/* this takes 1/2 hour to execute */
return;
}
int main(void) {
char *ptr;
if (!(ptr = malloc(128))) exit(EXIT_FAILURE);
ptr += 128;
longtime();
ptr -= 128;
*ptr = 'A';
free(ptr);
return 0;
}
So, after coding longtime() to agree with the comment, do you claim
this program is flawed? Admitted, I had to work at it.
What is this code meant to show? You might want to pick an example
where all the ptr stuff can't be removed by the optimiser (or was that
part of the point?).
I have a feeling it shows you don't know how GC would work in C.
Possibly true. I am assuming the GC system scans memory for a
pointer to the memory area. If it doesn't find one, it frees that
memory block. ptr+128 points just past the block, so is
legitimate. It could well point to the next block assigned. It is
not dereferenced, so that does not create an error. So I expect
that during the execution of longtime() the allocated block will be
freed. The access by *ptr (which is valid) will fail. The free
(which is legitimate) will fail.
Am I wrong in this analysis? If so, where.
I think it is wrong to criticise the technique by assuming an
incorrect implementation. I can't imagine a real GC getting this
wrong. It must consider the one-past-the-end pointer to be a valid
pointer and must not free the 128 byte block. Do you have any evidence
that any GC makes such a basic error?
The Boehm GC does not even touch malloc'd memory, BTW.
Where am I assuming even an incorrect implementation? All I am
assuming is a method of deciding whether a pointer to anywhere
within an allocated block exists. I could probably replace the
+128 with writing a hex version of the pointer value, and restoring
it later, which avoids all arguments about how close the possible
pointer must be.
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Ben Bacarisse
2009-04-24 23:14:56 UTC
Permalink
Post by CBFalconer
Post by Ben Bacarisse
<snip>
Post by CBFalconer
Post by Ben Bacarisse
Post by CBFalconer
#include <stdlib.h>
void longtime(void) {
/* this takes 1/2 hour to execute */
return;
}
int main(void) {
char *ptr;
if (!(ptr = malloc(128))) exit(EXIT_FAILURE);
ptr += 128;
longtime();
ptr -= 128;
*ptr = 'A';
free(ptr);
return 0;
}
So, after coding longtime() to agree with the comment, do you claim
this program is flawed? Admitted, I had to work at it.
What is this code meant to show? You might want to pick an example
where all the ptr stuff can't be removed by the optimiser (or was that
part of the point?).
I have a feeling it shows you don't know how GC would work in C.
Possibly true. I am assuming the GC system scans memory for a
pointer to the memory area. If it doesn't find one, it frees that
memory block. ptr+128 points just past the block, so is
legitimate. It could well point to the next block assigned. It is
not dereferenced, so that does not create an error. So I expect
that during the execution of longtime() the allocated block will be
freed. The access by *ptr (which is valid) will fail. The free
(which is legitimate) will fail.
Am I wrong in this analysis? If so, where.
I think it is wrong to criticise the technique by assuming an
incorrect implementation. I can't imagine a real GC getting this
wrong. It must consider the one-past-the-end pointer to be a valid
pointer and must not free the 128 byte block. Do you have any evidence
that any GC makes such a basic error?
The Boehm GC does not even touch malloc'd memory, BTW.
Where am I assuming even an incorrect implementation? All I am
assuming is a method of deciding whether a pointer to anywhere
within an allocated block exists.
Yes. You assume that adding a valid offset to a pointer will confuse
a garbage collector. Such a collector would be an incorrect
implementation of the technique. That's not a fair critique unless you
know (and you might do) of such brain-dead implementations.
Post by CBFalconer
I could probably replace the
+128 with writing a hex version of the pointer value, and restoring
it later, which avoids all arguments about how close the possible
pointer must be.
No, not by using hex you can't -- your program would be in the realms
of undefined behaviour even without a garbage collector. I know what
you mean though -- you might write the pointer using %p and read it
back later. That *would* have been fair and I would not have
commented. It would show the lengths you have to go to to confuse a
good collector. Instead you implied that a simple and valid pointer
manipulation might cause trouble.
--
Ben.
Keith Thompson
2009-04-24 23:41:24 UTC
Permalink
[...]
Post by Ben Bacarisse
Post by CBFalconer
I could probably replace the
+128 with writing a hex version of the pointer value, and restoring
it later, which avoids all arguments about how close the possible
pointer must be.
No, not by using hex you can't -- your program would be in the realms
of undefined behaviour even without a garbage collector. I know what
you mean though -- you might write the pointer using %p and read it
back later. That *would* have been fair and I would not have
commented. It would show the lengths you have to go to to confuse a
good collector. Instead you implied that a simple and valid pointer
manipulation might cause trouble.
Yes, I think you could fool Boehm GC by saving a pointer value in hex.

Treat the pointer object as an array of unsigned char, and use sprintf
or something similar to save the representation in hexadecimal in a
string somewhere (or even write it to a file). Then clobber the
actual pointer, wait a while, and restore the pointer from the hex
string. C semantics require this to restore the valid pointer value;
Boehm GC is likely to assume that no copy of the pointer is stored
anywhere in memory and release the pointed-to memory. It's really the
same idea as using "%p".

(This is not intended to be an argument against GC; it's just an
obscure corner case that you have to watch out for when using it, and
that would have to be taken into account if GC were ever added to the
C standard (presumably as an optional feature).)
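
A minimal sketch of the corner case, assuming CHAR_BIT is 8 so that each byte prints as two hex digits. The helper names ptr_rep_to_hex and ptr_rep_from_hex are made up for illustration. While only the hex string exists, no copy of the pointer's bit pattern is stored anywhere, which is what a scanning collector would look for.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the object representation of *pp as hex
   digits into 'out', which must hold 2 * sizeof(void *) + 1 chars.
   Assumes CHAR_BIT == 8, so every byte prints as exactly two digits. */
void ptr_rep_to_hex(void *const *pp, char *out) {
    unsigned char bytes[sizeof *pp];
    size_t i;
    assert(pp != NULL && out != NULL);
    memcpy(bytes, pp, sizeof bytes);
    for (i = 0; i < sizeof bytes; i++)
        sprintf(out + 2 * i, "%02x", (unsigned)bytes[i]);
}

/* Hypothetical helper: rebuild the pointer object from the hex string.
   Returns 1 on success, 0 if the string has the wrong length or
   contains a non-hex character. */
int ptr_rep_from_hex(const char *hex, void **pp) {
    unsigned char bytes[sizeof *pp];
    size_t i;
    assert(hex != NULL && pp != NULL);
    if (strlen(hex) != 2 * sizeof bytes)
        return 0;
    for (i = 0; i < sizeof bytes; i++) {
        unsigned v;
        if (sscanf(hex + 2 * i, "%2x", &v) != 1)
            return 0;
        bytes[i] = (unsigned char)v;
    }
    memcpy(pp, bytes, sizeof bytes);
    return 1;
}
```

A caller would convert the pointer to hex, set the pointer to NULL, and later call ptr_rep_from_hex to get the original value back. Under plain malloc/free this is well defined; under a collector that scans for bit patterns, the block may have been reclaimed in between.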
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Ben Bacarisse
2009-04-25 00:31:06 UTC
Permalink
Post by Keith Thompson
[...]
Post by Ben Bacarisse
Post by CBFalconer
I could probably replace the
+128 with writing a hex version of the pointer value, and restoring
it later, which avoids all arguments about how close the possible
pointer must be.
No, not by using hex you can't -- your program would be in the realms
of undefined behaviour even without a garbage collector. I know what
you mean though -- you might write the pointer using %p and read it
back later. That *would* have been fair and I would not have
commented. It would show the lengths you have to go to to confuse a
good collector. Instead you implied that a simple and valid pointer
manipulation might cause trouble.
Yes, I think you could fool Boehm GC by saving a pointer value in hex.
Treat the pointer object as an array of unsigned char, and use sprintf
or something similar to save the representation in hexadecimal in a
string somewhere (or even write it to a file). Then clobber the
actual pointer, wait a while, and restore the pointer from the hex
string. C semantics require this to restore the valid pointer value;
Boehm GC is likely to assume that no copy of the pointer is stored
anywhere in memory and release the pointed-to memory. It's really the
same idea as using "%p".
Yes, that would fool most GCs but it's not what CBFalconer was talking
about and thus not what I said was impossible without UB. CBF was
talking about "writing a hex version of the pointer value" and your
example writes out the representation.

I know this is a pedantic point, but it seems to matter in this case.
Valid manipulations of the pointer's value don't usually fool
collectors, but lots of manipulations of the representation do.

The writing and reading of the pointer using %p is defined in terms of
what it does to the value.
--
Ben.
Keith Thompson
2009-04-25 01:42:17 UTC
Permalink
[...]
Post by Ben Bacarisse
Post by Keith Thompson
Yes, I think you could fool Boehm GC by saving a pointer value in hex.
Treat the pointer object as an array of unsigned char, and use sprintf
or something similar to save the representation in hexadecimal in a
string somewhere (or even write it to a file). Then clobber the
actual pointer, wait a while, and restore the pointer from the hex
string. C semantics require this to restore the valid pointer value;
Boehm GC is likely to assume that no copy of the pointer is stored
anywhere in memory and release the pointed-to memory. It's really the
same idea as using "%p".
Yes, that would fool most GCs but it's not what CBFalconer was talking
about and thus not what I said was impossible without UB. CBF was
talking about "writing a hex version of the pointer value" and your
example writes out the representation.
I know this is a pedantic point, but it seems to matter in this case.
Valid manipulations of the pointer's value don't usually fool
collectors, but lots of manipulations of the representation do.
The writing and reading of the pointer using %p is defined in terms of
what it does to the value.
Ok, I see what you mean. I didn't take "writing a hex version of the
pointer value" quite that literally. (Note that sprintf with "%p"
uses hex on many systems.)

You could also fool a garbage collector by something as simple as
byte-swapping a pointer in place (or shuffling the bits if a pointer
is 1 byte).
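
For illustration, a small sketch of the in-place byte swap. The function name ptr_reverse_bytes is hypothetical. Between the two calls the pointer object is only ever touched as unsigned char, so the program stays within the rules even if the swapped pattern is not a valid pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the bytes of a pointer object in place. Applying it twice
   restores the original representation. The object is accessed only
   through unsigned char, which is always permitted. */
void ptr_reverse_bytes(void **pp) {
    unsigned char *b = (unsigned char *)pp;
    size_t i, n = sizeof *pp;
    assert(pp != NULL);
    for (i = 0; i < n / 2; i++) {
        unsigned char t = b[i];
        b[i] = b[n - 1 - i];
        b[n - 1 - i] = t;
    }
}
```

After one call, a collector that scans for the original bit pattern would most likely not find it; after the second call the pointer is usable again as far as the C language is concerned.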
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
James Kuyper
2009-04-25 10:51:42 UTC
Permalink
...
Post by Ben Bacarisse
Post by CBFalconer
I could probably replace the
+128 with writing a hex version of the pointer value, and restoring
it later, which avoids all arguments about how close the possible
pointer must be.
No, not by using hex you can't -- your program would be in the realms
of undefined behaviour even without a garbage collector. ...
Citation, please?
Ben Bacarisse
2009-04-25 11:17:10 UTC
Permalink
Post by James Kuyper
...
Post by Ben Bacarisse
Post by CBFalconer
I could probably replace the
+128 with writing a hex version of the pointer value, and restoring
it later, which avoids all arguments about how close the possible
pointer must be.
No, not by using hex you can't -- your program would be in the realms
of undefined behaviour even without a garbage collector. ...
Citation, please?
Sorry, none available since it is a "can't" rather than a "can"
argument. I just can't see any way to write the value of a pointer in
hex so that it can be recovered later.

%p might use hex, but that is not assured, and there is no int type
that can always be used (6.3.2.3 p6). Of course, I could simply be
missing the way to do it. I should have asked CBFalconer what he
means by writing the value in hex and left it at that.

If you take the phrase less literally and allow the representation of
the pointer to be used then, yes, it's possible. But then why go to
the bother of using IO? Almost any manipulation of the representation
can be used to exhibit a potential problem with a collector.
--
Ben.
James Kuyper
2009-04-25 11:50:41 UTC
Permalink
Post by Ben Bacarisse
Post by James Kuyper
...
Post by Ben Bacarisse
Post by CBFalconer
I could probably replace the
+128 with writing a hex version of the pointer value, and restoring
it later, which avoids all arguments about how close the possible
pointer must be.
No, not by using hex you can't -- your program would be in the realms
of undefined behaviour even without a garbage collector. ...
Citation, please?
Sorry, none available since it is a "can't" rather than a "can"
argument. I just can't see any way to write the value of a pointer in
hex so that it can be recovered later.
%p might use hex, but that is not assured, and there is no int type
that can always be used (6.3.2.3 p6). Of course, I could simply be
missing the way to do it. I should have asked CBFalconer what he
means by writing the value in hex and left it at that.
If you take the phrase less literally and allow the representation of
the pointer to be used then, yes, it's possible. ...
You can use "%p" to display and retrieve the value of a pointer, and
that presents a problem for GC, regardless of whether it actually prints
in hex format.

You can convert a pointer value to uintptr_t and print it in
hexadecimal, and then read it back in, and reverse the conversion; this
presents a problem for GC on any system which supports uintptr_t,
regardless of the fact that there are other systems which do not support it.

You can print out the representation of a pointer in hex, and read it
back in again, and that causes a problem for GC, even though Chuck
specified he was printing the value, not the representation.
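
The first two of these can be sketched as follows. The function names are hypothetical, the 64-byte buffer is an assumption about how long the printed form can be, and the second function only compiles where the optional uintptr_t exists.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Round-trip a pointer value through text using "%p". The text format
   is implementation-defined, but reading it back with "%p" during the
   same program execution yields a pointer equal to the original. */
void *roundtrip_percent_p(void *p) {
    char text[64];  /* assumed large enough for the "%p" output */
    void *back = NULL;
    int n = snprintf(text, sizeof text, "%p", p);
    assert(n > 0 && (size_t)n < sizeof text);
    if (sscanf(text, "%p", &back) != 1)
        return NULL;
    return back;
}

/* Round-trip a pointer value through uintptr_t printed in hexadecimal.
   Works only on implementations that provide uintptr_t. */
void *roundtrip_uintptr_hex(void *p) {
    char text[64];
    uintptr_t u = 0;
    int n = snprintf(text, sizeof text, "%" PRIxPTR, (uintptr_t)p);
    assert(n > 0 && (size_t)n < sizeof text);
    if (sscanf(text, "%" SCNxPTR, &u) != 1)
        return NULL;
    return (void *)u;
}
```

In both functions the only record of the address between the write and the read is a string of characters. If that string were written to a file and the original pointer variable overwritten, a collector scanning memory would have nothing to find.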
Post by Ben Bacarisse
... But then why go to
the bother of using IO? Almost any manipulation of the representation
can be used to exhibit a potential problem with a collector.
I believe that this is precisely the point he was trying to make.
CBFalconer
2009-04-26 02:08:56 UTC
Permalink
... snip ...
Post by James Kuyper
But then why go to the bother of using IO? Almost any
manipulation of the representation can be used to exhibit a
potential problem with a collector.
I believe that this is precisely the point he was trying to make.
Thank you.
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Ben Bacarisse
2009-04-26 15:43:19 UTC
Permalink
Post by CBFalconer
... snip ...
Post by James Kuyper
But then why go to the bother of using IO? Almost any
manipulation of the representation can be used to exhibit a
potential problem with a collector.
I believe that this is precisely the point he was trying to make.
Thank you.
The whole thread could have been circumvented had you posted a program
that manipulated the representation, then. Instead you posted one
that manipulated the value, and talked about doing IO on the value.
--
Ben.
CBFalconer
2009-04-26 02:07:08 UTC
Permalink
Post by Ben Bacarisse
Post by James Kuyper
...
Post by Ben Bacarisse
Post by CBFalconer
I could probably replace the
+128 with writing a hex version of the pointer value, and restoring
it later, which avoids all arguments about how close the possible
pointer must be.
No, not by using hex you can't -- your program would be in the realms
of undefined behaviour even without a garbage collector. ...
Citation, please?
Sorry, none available since it is a "can't" rather than a "can"
argument. I just can't see any way to write the value of a pointer in
hex so that it can be recovered later.
You can make a hex representation of any bit pattern. Since
pointers are stored in memory, they are made up of bits. Therefore
representable in hex. This is a silly argument.
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Richard
2009-04-26 03:46:24 UTC
Permalink
Post by CBFalconer
Post by Ben Bacarisse
Post by James Kuyper
...
Post by Ben Bacarisse
Post by CBFalconer
I could probably replace the
+128 with writing a hex version of the pointer value, and restoring
it later, which avoids all arguments about how close the possible
pointer must be.
No, not by using hex you can't -- your program would be in the realms
of undefined behaviour even without a garbage collector. ...
Citation, please?
Sorry, none available since it is a "can't" rather than a "can"
argument. I just can't see any way to write the value of a pointer in
hex so that it can be recovered later.
You can make a hex representation of any bit pattern. Since
pointers are stored in memory, they are made up of bits. Therefore
representable in hex. This is a silly argument.
What part of "recovered later" confuses you this time?
--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c
Keith Thompson
2009-04-26 05:23:07 UTC
Permalink
[...]
Post by CBFalconer
Post by Ben Bacarisse
Sorry, none available since it is a "can't" rather than a "can"
argument. I just can't see any way to write the value of a pointer in
hex so that it can be recovered later.
You can make a hex representation of any bit pattern. Since
pointers are stored in memory, they are made up of bits. Therefore
representable in hex. This is a silly argument.
I think the point is that "value" and "representation" are two
different things. For example, two different bit patterns might
represent the same pointer value; your approach then gives two
distinct hex representations for one value. Yes, it's a nitpick; in
any case, you can reconstruct the same pointer value from either
representation.

(I suppose you could use sprintf with "%p" to obtain a string
representation of a pointer value, and then construct a hexadecimal
string from that, but that's not what anybody meant.)
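
For illustration, here is a minimal sketch of the "hex version of the representation" idea discussed above. The helper names ptr_to_hex and hex_to_ptr are made up for this example. It copies the bytes of a pointer object into a hex string and later copies them back, which yields the same pointer value within one program run. It says nothing about whether a collector would notice a pointer that exists only as text; a conservative collector scanning memory for pointer-shaped words would presumably not.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the bytes of p's object representation as
   hex digits. out must have room for 2 * sizeof(void *) + 1 characters. */
static void ptr_to_hex(void *p, char *out)
{
    unsigned char bytes[sizeof p];
    memcpy(bytes, &p, sizeof p);
    for (size_t i = 0; i < sizeof p; i++)
        sprintf(out + 2 * i, "%02x", (unsigned)bytes[i]);
}

/* Hypothetical helper: rebuild a pointer from the hex string produced by
   ptr_to_hex. Returns NULL if the string is malformed. */
static void *hex_to_ptr(const char *hex)
{
    unsigned char bytes[sizeof(void *)];
    void *p;

    if (strlen(hex) != 2 * sizeof p)
        return NULL;
    for (size_t i = 0; i < sizeof p; i++) {
        unsigned v;
        if (sscanf(hex + 2 * i, "%2x", &v) != 1)
            return NULL;
        bytes[i] = (unsigned char)v;
    }
    memcpy(&p, bytes, sizeof p);   /* same bytes, so same pointer value */
    return p;
}
```

A caller would declare char hex[2 * sizeof(void *) + 1], call ptr_to_hex(&obj, hex), and later get the pointer back with hex_to_ptr(hex). The result is only meaningful while the pointed-to object is still alive.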
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Gene
2009-04-23 01:57:45 UTC
Permalink
Hi,
    I am now designing a library in C, and the library dynamically
allocates a lot of memory, and now I use reference counting to deal with
memory alloc/free. I mean, the client of this library should call unref()
for many pointers to the data structures of the library. And this
ref/unref interface imposes additional work on client programmers.
But I am wondering whether there is a better way. So I googled for "gc for C",
and found a mark-sweep collector
(http://www.hpl.hp.com/personal/Hans_Boehm/gc/). Could anybody here
please give me some advice on GC for the library? Thanks a lot!
Reference counting is probably the simplest way to gc library-
allocated objects. However, it has the worst performance among gc
algorithms. You could consider implementing your own gc within the
library. It's pretty easy to write a simple mark-and-sweep or even a
2-space copying collector on top of malloc and free. The ugly part is
that your library caller must register/unregister gc roots. This may
be no worse than reference counting, however. To see one way of doing
this, look at the code for emacs.
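
To make the idea concrete, here is a minimal sketch of a mark-and-sweep collector built on malloc and free with explicit root registration. Every name in it (Obj, gc_new, gc_add_root, gc_remove_root, gc_collect) is invented for this example, and it assumes each object has at most two outgoing references. It is not how emacs or any real library does it.

```c
#include <stdlib.h>

/* Hypothetical object type: every collectable object carries a mark bit
   and is linked into one global list so the sweep can find it. */
typedef struct Obj {
    struct Obj *next;       /* next object in the all-objects list */
    struct Obj *child[2];   /* outgoing references (assumed: at most two) */
    int marked;
    int value;
} Obj;

#define MAX_ROOTS 64
static Obj *all_objects = NULL;
static Obj **roots[MAX_ROOTS];  /* addresses of the caller's root variables */
static size_t nroots = 0;

Obj *gc_new(int value)
{
    Obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->child[0] = o->child[1] = NULL;
    o->marked = 0;
    o->value = value;
    o->next = all_objects;
    all_objects = o;
    return o;
}

/* The caller registers the address of each variable that holds a live Obj. */
int gc_add_root(Obj **slot)
{
    if (nroots == MAX_ROOTS)
        return -1;
    roots[nroots++] = slot;
    return 0;
}

void gc_remove_root(Obj **slot)
{
    for (size_t i = 0; i < nroots; i++) {
        if (roots[i] == slot) {
            roots[i] = roots[--nroots];
            return;
        }
    }
}

/* Mark phase: recurse on child[0], iterate on child[1]. */
static void mark(Obj *o)
{
    while (o != NULL && !o->marked) {
        o->marked = 1;
        mark(o->child[0]);
        o = o->child[1];
    }
}

/* Mark everything reachable from the roots, then free the rest.
   Returns the number of objects freed. */
size_t gc_collect(void)
{
    size_t freed = 0;
    Obj **link = &all_objects;

    for (size_t i = 0; i < nroots; i++)
        mark(*roots[i]);
    while (*link != NULL) {
        Obj *o = *link;
        if (o->marked) {
            o->marked = 0;          /* reset for the next collection */
            link = &o->next;
        } else {
            *link = o->next;        /* unlink and free unreachable object */
            free(o);
            freed++;
        }
    }
    return freed;
}
```

The caller's burden is the gc_add_root/gc_remove_root pair around each long-lived variable. Unlike plain reference counting, this sketch also reclaims cycles, since an unreachable cycle is never marked.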

Boehm's collector is a beautiful idea, but, as others have said, it
has significant practical problems. Code like

    char *s = malloc(1000);
    scanf("%s", s);
    int len = 0;
    while (*s++) {
        malloc(1 << 27); // induce a gc
        len++;
    }
    printf("%d\n", len);

will fail with Boehm because the induced gc will not find any pointer
to the start of the 1000 char string, and the string will be
collected. Other more subtle problems occur in less contrived code
due to aggressive optimizations by compilers.
George Peter Staplin
2009-04-23 03:34:26 UTC
Permalink
Post by Gene
Post by byang
Hi,
I am now designing a library in C, and the library dynamically
allocates a lot of memory, and now I use reference counting to deal with
memory alloc/free. I mean, the client of this library should call unref()
for many pointers to the data structures of the library. And this
ref/unref interface imposes additional work on client programmers.
But I am wondering whether there is a better way. So I googled for "gc for C",
and found a mark-sweep collector
(http://www.hpl.hp.com/personal/Hans_Boehm/gc/). Could anybody here
please give me some advice on GC for the library? Thanks a lot!
Reference counting is probably the simplest way to gc library-
allocated objects. However, it has the worst performance among gc
algorithms. You could consider implementing your own gc within the
library. It's pretty easy to write a simple mark-and-sweep or even a
2-space copying collector on top of malloc and free. The ugly part is
that your library caller must register/unregister gc roots. This may
be no worse than reference counting, however. To see one way of doing
this, look at the code for emacs.
Some GC research has shown that, on modern hardware with large data
sets, a full-blown GC using mark-and-sweep or a copying collector is often
more costly than reference counting.

I recall a possibly related article here:
http://lambda-the-ultimate.org/node/2552

Explicit malloc and free is generally much faster than any GC, unless you
can reuse and retain the allocations for longer with reference counting.
The reason behind this has mostly to do with the design of a full blown
non-reference-counting GC. It has to touch a lot more pages (potentially)
and do a lot more cache loading for each root that must be analyzed in a
program to determine if that root has a reference. With a large program or
data set that's a lot of potential references.

On modern systems the effects of a mark-and-sweep as LISP originally
implemented the algorithm would be unusable. Users would complain (as they
have before) about the blocking of the interface. However mark-and-sweep
is generally more efficient with that particular implementation, but it
would regrettably be unusable in some cases. Most modern GC are more
fine-grained, and attempt to use generations of objects to reduce the total
overall cost and delays.

-George
jacob navia
2009-04-23 08:57:44 UTC
Permalink
Post by Gene
Boehm's collector is a beautiful idea, but, as others have said, it
has significant practical problems. Code like
    char *s = malloc(1000);
    scanf("%s", s);
    int len = 0;
    while (*s++) {
        malloc(1 << 27); // induce a gc
        len++;
    }
    printf("%d\n", len);
will fail with Boehm because the induced gc will not find any pointer
to the start of the 1000 char string, and the string will be
collected.
This is completely wrong.

I compiled the following program:
#include <stdio.h>
#include <gc.h>
int main(void)
{
    char *s = GC_malloc(1000);
    scanf("%s", s);
    int len = 0;
    while (*s++) {
        GC_malloc(1 << 27); // induce a gc
        len++;
    }
    printf("%d\n", len);
}

And the following run happened:
d:\lcc\mc76\test>tgc
abc
3

Any pointer to the inside part of a block will avoid it being collected
by the gc.
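
For illustration only, and not Boehm's actual code: a sketch of how a conservative collector might decide that a pointer-sized word points into a known block. The Block type and block_containing function are invented for this example, and it assumes a flat address space where converting a pointer to uintptr_t gives comparable addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one allocated block: start address and size. */
typedef struct {
    uintptr_t base;
    size_t size;
} Block;

/* Return the index of the block whose range [base, base + size) contains
   the address p, or -1 if no block contains it. A pointer to the start
   and a pointer into the middle are treated the same way. */
static int block_containing(const Block *blocks, size_t n, const void *p)
{
    uintptr_t a = (uintptr_t)p;   /* assumed: flat, comparable addresses */

    for (size_t i = 0; i < n; i++) {
        if (a >= blocks[i].base && a - blocks[i].base < blocks[i].size)
            return (int)i;
    }
    return -1;
}
```

With a lookup like this, the incremented s in the loop above still falls inside the 1000-byte block, so the block counts as reachable.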



Post by Gene
Other more subtle problems occur in less contrived code
due to aggressive optimizations by compilers.
Yeah sure.

But maybe you can give an example?
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32