memcpy can't take null.

Discussion:

Malcolm McLean

2020-04-05 13:47:22 UTC

memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.

So what are the implications for passing empty buffers about? For example,
how would people write

/* struct employee has no embedded pointers */

Employee *duplicateEmployees(Employee *employees, int Nemployees)

Öö Tiib

2020-04-05 14:30:53 UTC

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.
So what are the implications for passing empty buffers about? For example,
how would people write
/* struct employee has no embedded pointers */
Employee *duplicateEmployees(Employee *employees, int Nemployees)

{
    if (!employees || Nemployees < 1) return NULL;

    Employee *ret = malloc(Nemployees * sizeof(Employee));
    if (ret) memcpy(ret, employees, Nemployees * sizeof(Employee));

    return ret;
}

Richard Damon

2020-04-05 18:39:56 UTC

Post by Öö Tiib

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.
So what are the implications for passing empty buffers about? For example,
how would people write
/* struct employee has no embedded pointers */
Employee *duplicateEmployees(Employee *employees, int Nemployees)

{
if (!employees || Nemployees < 1) return NULL;
Employee *ret = malloc(Nemployees * sizeof(Employee));
if (ret) memcpy(ret, employees, Nemployees * sizeof(Employee));
return ret;
}

Since malloc can return either a null pointer or a unique pointer for an
allocation of size 0, if I wanted to be able to use a null pointer for
an error return in functions like this, I would be tempted to create a
wrapper for malloc() that, if the size requested was 0, would allocate a
buffer of 1 byte, so null returns are always errors. If the wrapper KNOWS
that the implementation's malloc(0) returns unique pointers and not nulls
(except when out of memory), it could still call malloc(0).

This lack of standardization is a historical artifact: K&R didn't define
what to do here, and when the first Standard was being written, both
options were in common usage, so rather than break existing
implementations, the behavior was left implementation-defined. It is of
course fairly trivial to write a wrapper that provides the alternate
behavior if you really need it, as described above.
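
A minimal sketch of such a wrapper, for illustration only. The name malloc_nonnull is an assumption (it is not from this thread or from any library); the idea is simply to round a zero-byte request up to one byte so that a null return can only mean allocation failure.

```c
#include <stdlib.h>

/* Hypothetical helper: the name malloc_nonnull is an assumption made for
   this example. A request for 0 bytes is rounded up to 1 byte, so a null
   return can only mean that the allocation failed. */
static void *malloc_nonnull(size_t size)
{
    return malloc(size != 0 ? size : 1);
}
```

The result is released with free() as usual; the caller must still treat the buffer as having zero usable bytes when it asked for zero.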

Philipp Klaus Krause

2020-04-06 16:46:14 UTC

Post by Richard Damon
This lack of standardization is a historical artifact: K&R didn't define
what to do here, and when the first Standard was being written, both
options were in common usage, so rather than break existing
implementations, the behavior was left implementation-defined. It is of
course fairly trivial to write a wrapper that provides the alternate
behavior if you really need it, as described above.

The next C standard will even make realloc for zero size undefined
behaviour.
Too many implementations didn't conform with the old C11 standard for
zero size realloc. C17 thus allowed more implementation-defined
behaviour, but at that point it wasn't very useable for portable
programs anymore. So it will just be undefined behaviour in C2X.

Keith Thompson

2020-04-06 19:45:36 UTC

Philipp Klaus Krause <***@spth.de> writes:
[...]

Post by Philipp Klaus Krause
The next C standard will even make realloc for zero size undefined
behaviour.
Too many implementations didn't conform with the old C11 standard for
zero size realloc. C17 thus allowed more implementation-defined
behaviour, but at that point it wasn't very useable for portable
programs anymore. So it will just be undefined behaviour in C2X.

Do you have a citation for that?

The latest draft I have (n2455.pdf, November 18, 2019) does have this
new wording relative to n1570.pdf, but it is identical to the wording in
C17. There is no new undefined behavior as far as I can tell.

If size is nonzero and memory for the new object is not
allocated, the old object is not deallocated. If size is
zero and memory for the new object is not allocated, it is
implementation-defined whether the old object is deallocated.
If the old object is not deallocated, its value shall be
unchanged.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Philipp Klaus Krause

2020-04-07 05:28:58 UTC

Post by Keith Thompson
[...]

Post by Philipp Klaus Krause
The next C standard will even make realloc for zero size undefined
behaviour.
Too many implementations didn't conform with the old C11 standard for
zero size realloc. C17 thus allowed more implementation-defined
behaviour, but at that point it wasn't very useable for portable
programs anymore. So it will just be undefined behaviour in C2X.

Do you have a citation for that?

Not really. There is N2464, but you'll have to wait a bit for the
draft meeting minutes for the meeting last week to see the votes on it
(or until August for the decision on the final meeting minutes).

There is item 6.35 in the 2019 Ithaca meeting minutes regarding an
earlier version of the realloc paper.

Post by Keith Thompson
The latest draft I have (n2455.pdf, November 18, 2019)

N2478 / N2479, February 2020 is the latest draft (but naturally doesn't
have changes voted on last week).

Philipp

Keith Thompson

2020-04-05 23:00:28 UTC

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.
So what are the implications for passing empty buffers about? For example,
how would people write
/* struct employee has no embedded pointers */
Employee *duplicateEmployees(Employee *employees, int Nemployees)

C doesn't directly support empty arrays, but since arrays are usually
manipulated via pointers to the initial element, it's easy enough
to treat a pointer as pointing to an empty array.

A null pointer doesn't point to an empty array. It doesn't point
to *anything*. Similarly, a null pointer doesn't point to an empty
string (there have been problems in the past with code that assumed
it did, and that happened to "work" on some systems).

Just keep that consistently in mind, and write your code in
whatever way is necessary to maintain that. This might require
a little special case code for handling the result of malloc(0).
So write that code.

Employee *duplicateEmployees(Employee *employees, size_t Nemployees) {
    const size_t requestedCount = Nemployees + (Nemployees == 0);
    Employee *const new = malloc(requestedCount * sizeof *new);
    if (new != NULL) {
        memcpy(new, employees, Nemployees * sizeof *new);
    }
    return new;
}

An alternative would be to treat a null pointer as a pointer to an
empty array, but then you'd probably need more special case code on
top of that to keep things consistent (to avoid passing null pointers
to memcpy, for example). By *not* treating a null pointer as a
pointer to an empty array, you can (probably) isolate the special
case code to allocation functions. You'd also lose the ability to
use a null pointer as a special value that doesn't point to anything.

Joe Pfeiffer

2020-04-06 00:29:20 UTC

Post by Keith Thompson
A null pointer doesn't point to an empty array. It doesn't point
to *anything*. Similarly, a null pointer doesn't point to an empty
string (there have been problems in the past with code that assumed
it did, and that happened to "work" on some systems).

In particular, VAXen had page 0 user readable, and byte 0 contained a 0
by pure luck. I thought for years a null pointer could be used to
represent an empty string -- I learned my mistake when I moved my code
over to a Sun workstation.

Scott Lurndal

2020-04-06 18:08:08 UTC

Post by Joe Pfeiffer

Post by Keith Thompson
A null pointer doesn't point to an empty array. It doesn't point
to *anything*. Similarly, a null pointer doesn't point to an empty
string (there have been problems in the past with code that assumed
it did, and that happened to "work" on some systems).

In particular, VAXen had page 0 user readable, and byte 0 contained a 0
by pure luck. I thought for years a null pointer could be used to
represent an empty string -- I learned my mistake when I moved my code
over to a Sun workstation.

BSD unix supported a readonly page of zeros at virtual address zero (on the VAX),
which caused difficulties in porting BSD utilities to SVR4 when they
were merged in.

I don't remember that VMS did the same, but it's been forty years since.

Joe Pfeiffer

2020-04-06 21:44:12 UTC

Post by Scott Lurndal

Post by Joe Pfeiffer

Post by Keith Thompson
A null pointer doesn't point to an empty array. It doesn't point
to *anything*. Similarly, a null pointer doesn't point to an empty
string (there have been problems in the past with code that assumed
it did, and that happened to "work" on some systems).

In particular, VAXen had page 0 user readable, and byte 0 contained a 0
by pure luck. I thought for years a null pointer could be used to
represent an empty string -- I learned my mistake when I moved my code
over to a Sun workstation.

BSD unix supported a readonly page of zeros at virtual address zero (on the VAX),
which caused difficulties in porting BSD utilities to SVR4 when they
were merged in.
I don't remember that VMS did the same, but it's been forty years since.

Yes, I was referring to BSD. I didn't remember (if I ever knew) that
the whole first page was 0; the first byte was sufficient to create the
misunderstanding.

Malcolm McLean

2020-04-06 20:54:01 UTC

Post by Keith Thompson

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.
So what are the implications for passing empty buffers about? For example,
how would people write
/* struct employee has no embedded pointers */
Employee *duplicateEmployees(Employee *employees, int Nemployees)

C doesn't directly support empty arrays, but since arrays are usually
manipulated via pointers to the initial element, it's easy enough
to treat a pointer as pointing to an empty array.
A null pointer doesn't point to an empty array. It doesn't point
to *anything*. Similarly, a null pointer doesn't point to an empty
string (there have been problems in the past with code that assumed
it did, and that happened to "work" on some systems).
Just keep that consistently in mind, and write your code in
whatever way is necessary to maintain that. This might require
a little special case code for handling the result of malloc(0).
So write that code.
Employee *duplicateEmployees(Employee *employees, size_t Nemployees) {
const size_t requestedCount = Nemployees + (Nemployees == 0);
Employee *const new = malloc(requestedCount * sizeof *new);
if (new != NULL) {
memcpy(new, employees, Nemployees * sizeof *new);
}
return new;
}
An alternative would be to treat a null pointer as a pointer to an
empty array, but then you'd probably need more special case code on
top of that to keep things consistent (to avoid passing null pointers
to memcpy, for example). By *not* treating a null pointer as a
pointer to an empty array, you can (probably) isolate the special
case code to allocation functions. You'd also lose the ability to
use a null pointer as a special value that doesn't point to anything.

I think this is the best answer.

The standard library doesn't accept NULL as a valid pointer to the
empty array, so it's reasonable to say that user-defined functions
don't accept it either. That means that if you are passed a non-null
pointer (presumably valid) with a size of zero, you should pass
back a valid pointer. Otherwise the caller suddenly finds NULLs infesting
what he thought were valid empty arrays.

If you are passed NULL and non-zero for the size, it is clearly
an error, and it's best to pass it on to memcpy in the hope that
memcpy will crash out. Since we don't know what we are running on,
the implementation has a better idea how to crash the program
than we do. You shouldn't suppress the error and pass back NULL -
that's making it harder for the caller to find where the bug lies.

But if you are passed NULL and zero for the size, should you still
pass it on to memcpy()?

Keith Thompson

2020-04-06 21:30:13 UTC

Post by Malcolm McLean

Post by Keith Thompson

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.
So what are the implications for passing empty buffers about? For example,
how would people write
/* struct employee has no embedded pointers */
Employee *duplicateEmployees(Employee *employees, int Nemployees)

C doesn't directly support empty arrays, but since arrays are usually
manipulated via pointers to the initial element, it's easy enough
to treat a pointer as pointing to an empty array.
A null pointer doesn't point to an empty array. It doesn't point
to *anything*. Similarly, a null pointer doesn't point to an empty
string (there have been problems in the past with code that assumed
it did, and that happened to "work" on some systems).
Just keep that consistently in mind, and write your code in
whatever way is necessary to maintain that. This might require
a little special case code for handling the result of malloc(0).
So write that code.
Employee *duplicateEmployees(Employee *employees, size_t Nemployees) {
const size_t requestedCount = Nemployees + (Nemployees == 0);
Employee *const new = malloc(requestedCount * sizeof *new);
if (new != NULL) {
memcpy(new, employees, Nemployees * sizeof *new);
}
return new;
}
An alternative would be to treat a null pointer as a pointer to an
empty array, but then you'd probably need more special case code on
top of that to keep things consistent (to avoid passing null pointers
to memcpy, for example). By *not* treating a null pointer as a
pointer to an empty array, you can (probably) isolate the special
case code to allocation functions. You'd also lose the ability to
use a null pointer as a special value that doesn't point to anything.

I think this is the best answer.
The standard library doesn't accept NULL as a valid pointer to the
empty array, so it's reasonable to say that user-defined functions
don't accept it either. That means that if you are passed a non-
null pointer (presumably valid) with a size of zero, you should pass
back a valid pointer. Otherwise caller suddenly finds NULLs infesting
what he thought were valid empty arrays.
if you are passed NULL and non-zero for the size, it is clearly
an error, and it's best to pass it onto memcpy in the hope that
memcpy will crash out. Since we don't know what we are running on,
the implementation has a better idea how to crash the program
than we do. You shouldn't suppress the error and pass back NULL -
that's making it harder for caller to find where the bug lies.
But if you are passed NULL and zero for the size, should you still
pass it on to memcpy() ?

You should decide *and document* what argument values are valid, and
what values are invalid. If a function receives invalid values, either
it should behave in some documented manner, or the documentation should
say that it's the caller's responsibility to pass only valid arguments
(otherwise the behavior is undefined).

For some functions, a null pointer might be valid, indicating that there
is no array (distinct from an empty array). For others, a null pointer
might be invalid. Decide and document.

You might consider wrapping the address and count in a structure.
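
One way that suggestion might look, as a rough sketch. The Employee fields, the EmployeeArray type and the employee_array_copy function are all assumptions made up for this example; the point is only that the struct carries the count, and the error is reported separately from the pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder record type; the thread only says it has no embedded pointers. */
typedef struct {
    int id;
    double salary;
} Employee;

/* Hypothetical wrapper: items may be NULL only when count is 0. */
typedef struct {
    Employee *items;
    size_t count;
} EmployeeArray;

/* Copies src into dst. Returns 0 on success and -1 on failure, so the
   pointer value never has to double as an error indicator. */
static int employee_array_copy(EmployeeArray *dst, const EmployeeArray *src)
{
    dst->items = NULL;
    dst->count = 0;
    if (src->count == 0)
        return 0;                       /* empty array: nothing to allocate */
    if (src->count > SIZE_MAX / sizeof *src->items)
        return -1;                      /* byte count would overflow size_t */
    Employee *copy = malloc(src->count * sizeof *copy);
    if (copy == NULL)
        return -1;                      /* allocation failed */
    memcpy(copy, src->items, src->count * sizeof *copy);
    dst->items = copy;
    dst->count = src->count;
    return 0;
}
```

With this layout memcpy is only ever reached with a non-zero count, so the (NULL, 0) question is settled in one place.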

Ben Bacarisse

2020-04-06 00:20:58 UTC

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.
So what are the implications for passing empty buffers about? For example,
how would people write
/* struct employee has no embedded pointers */
Employee *duplicateEmployees(Employee *employees, int Nemployees)

Employee *duplicateEmployees(Employee *employees, int Nemployees)
{
    Employee *copy = 0;
    if (Nemployees > 0 && (copy = malloc(Nemployees * sizeof *copy)))
        memcpy(copy, employees, Nemployees * sizeof *copy);
    return copy;
}

though I'd rather it were

Employee *duplicateEmployees(const Employee *employees, size_t Nemployees);
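
A sketch of how the body might look with that prototype, keeping the same convention that a zero count yields a null pointer. The Employee fields are placeholders and the overflow check is an addition of this example, not something proposed in the thread.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder record type, assumed for illustration. */
typedef struct {
    int id;
    double salary;
} Employee;

Employee *duplicateEmployees(const Employee *employees, size_t Nemployees)
{
    Employee *copy = NULL;

    /* Reject a zero count and any count whose byte size would overflow. */
    if (Nemployees == 0 || Nemployees > SIZE_MAX / sizeof *copy)
        return NULL;

    copy = malloc(Nemployees * sizeof *copy);
    if (copy != NULL)
        memcpy(copy, employees, Nemployees * sizeof *copy);
    return copy;
}
```

Because the zero case returns early, memcpy is never called with a zero size here, whatever pointer the caller passed.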

--
Ben.

Siri Cruise

2020-04-06 02:09:10 UTC

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.

rmemcpy does.

#define rmemcpy(dd, ss, len) ({ \
void *_dd = (dd), *_ss = (ss); size_t _len = (len); \
if (_dd==0 || _len<=0) ; \
else if (_ss==0) memset(_dd, 0, _len); \
else memcpy(_dd, _ss, _len); \
_dd; \
})

Ma deese! Do you mean I am allowed to wrap other code to even out
interfaces to what I want? Epouvantable!

#define tmemcpy(Type, dd, ss, len) \
((Type*)rmemcpy(dd, ss, (len)*sizeof(Type)))

How is this allowed?

#define rstrcmp(a, b) ({char *_a=(a), *_b=(b); \
_a==_b ? 0 : !_a ? -1 : !_b ? 1 : strcmp(_a, _b) ; \
})
#define rstreq(a, b) (rstrcmp(a, b)==0)
#define rstrne(a, b) (rstrcmp(a, b)!=0)
#define rstrlt(a, b) (rstrcmp(a, b)<0)
#define rstrle(a, b) (rstrcmp(a, b)<=0)
#define rstrgt(a, b) (rstrcmp(a, b)>0)
#define rstrge(a, b) (rstrcmp(a, b)>=0)
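
The ({ ... }) form used in these macros is a GNU statement expression, which is a compiler extension and not standard C. A sketch of the same behaviour as ordinary inline functions follows; the names rmemcpy_fn and rstrcmp_fn are assumptions chosen for this example.

```c
#include <stddef.h>
#include <string.h>

/* Same behaviour as the rmemcpy macro: do nothing if dd is null or len is 0,
   zero-fill dd if ss is null, otherwise copy. Always returns dd. */
static inline void *rmemcpy_fn(void *dd, const void *ss, size_t len)
{
    if (dd != NULL && len != 0) {
        if (ss == NULL)
            memset(dd, 0, len);
        else
            memcpy(dd, ss, len);
    }
    return dd;
}

/* Same ordering as the rstrcmp macro: identical pointers (including two
   nulls) compare equal, and a null pointer sorts before any string. */
static inline int rstrcmp_fn(const char *a, const char *b)
{
    if (a == b)
        return 0;
    if (a == NULL)
        return -1;
    if (b == NULL)
        return 1;
    return strcmp(a, b);
}
```

Functions also evaluate each argument exactly once and give the compiler real parameter types to check, which the macro versions only approximate.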

Guillaume

2020-04-07 14:50:27 UTC

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.

Most of the string/memory functions of the C std lib don't do any kind
of error checking. You may consider that passing memcpy a null pointer
AND a size of 0 is not an error per se, but I wouldn't necessarily agree
(along with probably many others.)

The reason for no error checking is likely historical. Possibly for
performance reasons. Not quite sure, but the idea is pervasive in most
of the std lib.

BTW, what would you expect memcpy to do in such a case? I guess just do
nothing and return NULL?

In which case you'd just write:

void *mymemcpy(void *destination, const void *source, size_t size)
{
    if (destination == NULL)
        return NULL;

    return memcpy(destination, source, size);
}

Note that I can't see how you could usefully distinguish the case where
destination is NULL and size isn't zero, from the case where destination
is NULL and size is zero. In both cases, there's nothing to do, and the
function could return nothing else than NULL IMO, so the above
definition should be OK.

Post by Malcolm McLean
So what are the implications for passing empty buffers about?

You implement error checking yourself, which many find tedious in C.

But maybe you should tell us a bit more about what you call an "empty
buffer".

Typically, to be consistent IMO a NULL pointer would mean an unallocated
buffer, whereas an "empty" buffer may be more like an allocated buffer
with no data. Really depends on your requirements/implementation. In the
latter case, you'd need to define your buffers as structs with a bit more
info than just a pointer to the buffer memory.

Malcolm McLean

2020-04-07 16:19:57 UTC

Post by Guillaume

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.

Most of the string/memory functions of the C std lib don't do any kind
of error checking. You may consider that passing memcpy a null pointer
AND a size of 0 is not an error per se, but I wouldn't necessarily agree
(along with probably many others.)

It's not inherently a logical error to regard NULL as a pointer to no
bytes, and it's not enforced by the C language as such. However the
convention that the standard library uses is that NULL with a
size of zero is an error, rather than a valid call for the null case.

Post by Guillaume

Post by Malcolm McLean
So what are the implications for passing empty buffers about?

You implement error checking yourself, which many find tedious in C.
But maybe you should tell us a bit more about what you call an "empty
buffer".

"Empty buffer" is maybe the wrong term. I mean a buffer with no data
in it, and of zero capacity. C doesn't allow zero-sized arrays, but
it does allow malloc() to return a non-null, freeable pointer for
a request of zero bytes.

Keith Thompson

2020-04-07 16:21:39 UTC

Post by Guillaume

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.

Most of the string/memory functions of the C std lib don't do any kind
of error checking. You may consider that passing memcpy a null pointer
AND a size of 0 is not an error per se, but I wouldn't necessarily
agree (along with probably many others.)

It's not as ambiguous as you make it sound. Passing a null pointer to
memcpy has undefined behavior.

The memcpy function copies n characters from the object pointed to
by s2 into the object pointed to by s1.

If s1 or s2 doesn't point to an object, then that argument is invalid as
described in 7.1.4.

[...]

Guillaume

2020-04-07 23:41:50 UTC

Post by Keith Thompson

Post by Guillaume

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.

Most of the string/memory functions of the C std lib don't do any kind
of error checking. You may consider that passing memcpy a null pointer
AND a size of 0 is not an error per se, but I wouldn't necessarily
agree (along with probably many others.)

It's not as ambiguous as you make it sound. Passing a null pointer to
memcpy has undefined behavior.

None of what I said makes it ambiguous. I just said most of those
functions don't do any error checking for invalid parameters. Nothing
ambiguous about it.

Post by Keith Thompson
The memcpy function copies n characters from the object pointed to
by s2 into the object pointed to by s1.
If s1 or s2 doesn't point to an object, then that argument is invalid as
described in 7.1.4.

Exactly. To me an invalid argument is an error. Then the underlying
function is implemented with error checking (so the behavior is defined)
or with no error checking (so the behavior is obviously undefined.)

Error checking here meant "parameter validity checking".

Of course, note that the case exposed by Malcolm is just a particular
case that *could* have been handled by memcpy(). But the standard is
more general, as it generally talks about "pointers not pointing to an
object". Which of course means NULL, but also any other value that would
happen to point to a random location that would not have been allocated
properly, or that would have been free'd.

In that regard, the usefulness of checking for just NULL when any other
value could also be not pointing to an object is admittedly
questionable, and is probably the main reason null pointers are not
checked explicitly in those functions.

Scott Lurndal

2020-04-08 00:30:01 UTC

Post by Guillaume

Post by Keith Thompson

Post by Guillaume

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.

Most of the string/memory functions of the C std lib don't do any kind
of error checking. You may consider that passing memcpy a null pointer
AND a size of 0 is not an error per se, but I wouldn't necessarily
agree (along with probably many others.)

It's not as ambiguous as you make it sound. Passing a null pointer to
memcpy has undefined behavior.

None of what I said makes it ambiguous. I just said most of those
functions don't do any error checking for invalid parameters. Nothing
ambiguous about it.

Post by Keith Thompson
The memcpy function copies n characters from the object pointed to
by s2 into the object pointed to by s1.
If s1 or s2 doesn't point to an object, then that argument is invalid as
described in 7.1.4.

Exactly. To me an invalid argument is an error. Then the underlying
function is implemented with error checking (so the behavior is defined)
or with no error checking (so the behavior is obviously undefined.)
Error checking here meant "parameter validity checking".

How would you validate any of the parameters to memcpy? Particularly
considering that from a hardware point of view, zero is a legal virtual
address.

Back a quarter century ago, one of the standards committees I sat on
discussed this very topic (with options like an OS system call to validate
that the virtual addresses were valid for the process - huge overhead).

C is, after all, used in many contexts, often as bare metal code.

The final consensus was that there was no effective way to check
the parameters to mem* or str* functions without a serious performance
drop.

Malcolm McLean

2020-04-08 09:06:38 UTC

Post by Scott Lurndal

Post by Guillaume

Post by Keith Thompson

Post by Guillaume

Post by Malcolm McLean
memcpy() cannot take null as a pointer parameter, even if the size argument
is zero.

Most of the string/memory functions of the C std lib don't do any kind
of error checking. You may consider that passing memcpy a null pointer
AND a size of 0 is not an error per se, but I wouldn't necessarily
agree (along with probably many others.)

It's not as ambiguous as you make it sound. Passing a null pointer to
memcpy has undefined behavior.

None of what I said makes it ambiguous. I just said most of those
functions don't do any error checking for invalid parameters. Nothing
ambiguous about it.

Post by Keith Thompson
The memcpy function copies n characters from the object pointed to
by s2 into the object pointed to by s1.
If s1 or s2 doesn't point to an object, then that argument is invalid as
described in 7.1.4.

Exactly. To me an invalid argument is an error. Then the underlying
function is implemented with error checking (so the behavior is defined)
or with no error checking (so the behavior is obviously undefined.)
Error checking here meant "parameter validity checking".

How would you validate any of the parameters to memcpy? Particularly
considering that from a hardware point of view, zero is a legal virtual
address.
Back a quarter century ago, one of the standards committees I sat on
discussed this very topic (with options like an OS system call to validate
that the virtual addresses were valid for the process - huge overhead).
C is, after all, used in many contexts, often as bare metal code.
The final consensus was that there was no effective way to check
the parameters to mem* or str* functions without a serious performance
drop.

If Nemployees is non-zero whilst the pointer employees points to invalid
memory, it's likely that the bug will be discovered soon enough.
But if Nemployees is zero and employees is NULL, then the loops through
the employees array will all be no-ops and have defined, correct
behaviour. It's only if you call a library function like memcpy that
the code becomes incorrect.
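
To make the contrast concrete, here is a small sketch of a hand-written copy loop. The Employee fields and the name copy_employees are assumptions for this example. With a count of zero the loop body never runs, so no pointer is dereferenced; under C11/C17 the equivalent memcpy(NULL, NULL, 0) call is still outside what the library permits.

```c
#include <stddef.h>

/* Placeholder record type, assumed for illustration. */
typedef struct {
    int id;
    double salary;
} Employee;

/* Element-by-element copy. When n is 0 the loop body never executes, so
   neither dst nor src is dereferenced and copy_employees(NULL, NULL, 0)
   has well-defined behaviour. */
static void copy_employees(Employee *dst, const Employee *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```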

Richard Damon

2020-04-08 13:17:21 UTC

Post by Malcolm McLean
If Nemployees is non-zero whilst the pointer employees points to invalid
memory, it's likely that the bug will be discovered soon enough.
But if Nemployees is zero and employees is NULL, then the loops through
the employees array will all be no-ops and have defined, correct
behaviour. It's only if you call a library function like memcpy that
the code becomes incorrect.

The issue is that since memcpy is a very basic function, you want it as
efficient as possible. On some machines, it may be quicker to load the
pointers into registers that might object to having a null pointer
before processing the loop, and thus the machine might trap on a null
pointer.

Also, if you really want this safe version, you can make a memcpy_safe0
function defined as:

void* memcpy_safe0(void* dest, const void* src, size_t num) {
    if(num) return memcpy(dest, src, num);
    else return dest;
}

Note that if the library function includes the check, there is no way
to wrap it to remove that check, thus it makes sense to make the standard
version the quick one.

Note that in the above, with a decent optimizing compiler, the call to
memcpy above is tail-recursive, so becomes just a jump (or an inline
expansion of the memcpy code), and the above definition could be made
inline so perhaps the compiler could know if num was zero or not and
optimize the code.

Tim Rentsch

2020-04-09 14:16:02 UTC

Post by Richard Damon

If Nemployees is non-zero while the pointer employees points to invalid
memory, it's likely that the bug will be discovered soon enough.
But if Nemployees is zero and employees is NULL, then the loops through
the employees array will all be no-ops and have defined, correct
behaviour. It's only if you call a library function like memcpy that
the code becomes incorrect.

The issue is that since memcpy is a very basic function, you want it as
efficient as possible. On some machines, it may be quicker to load the
pointers into registers that might object to having a null pointer
before processing the loop, and thus the machine might trap on a null
pointer.
Also, if you really want this safe version, you can make a memcpy_safe0
void* memcpy_safe0(void* dest, const void* src, size_t num) {
if(num) return memcpy(dest, src, num);
else return dest;
}
Note that if the library function includes the check, there is no way
to wrap it to remove that check, thus it makes sense to make the standard
version the quick one.
Note that in the above, with a decent optimizing compiler, the call to
memcpy above is tail-recursive,

It is a tail call, but not a recursive tail call.

Keith Thompson

2020-04-08 20:33:23 UTC

Malcolm McLean <***@gmail.com> writes:
[...]

Post by Malcolm McLean
If Nemployees is non-zero whilst the pointer employees points to invalid
memory, it's likely that the bug will be discovered soon enough.
But if Nemployees is zero and employees is NULL, then the loops through
the employees array will all be no-ops and have defined, correct
behaviour. It's only if you call a library function like memcpy that
the code becomes incorrect.

Your function that takes a pointer to an employee structure and
a count indicating the number of elements in the array should be
accompanied by documentation (at least a comment) that tells the
user what combinations are valid. If you write a call that violates
those requirements, that call is incorrect -- even if the function
happens not to misbehave.

If that documentation says that func(NULL, 0) is valid and
func(NULL, 1) is not, that's fine -- but then you need to guarantee
that func(NULL, 0) won't blow up in future versions unless you want
to risk breaking existing code.

See memcpy for an example of this. Ideally, your function should
be documented as clearly. If it isn't, it might be impossible to
tell whether a given call is correct or not.
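
As an illustration of "decide and document", here is one possible contract written out as a comment, with a debug-build check. The contract itself (null never allowed, zero count allowed), the Employee fields and the overflow check are assumptions picked for this sketch; a different function could reasonably choose otherwise.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder record type, assumed for illustration. */
typedef struct {
    int id;
    double salary;
} Employee;

/*
 * duplicateEmployees: returns a malloc'd copy of employees[0..count-1].
 *
 * Valid calls (an assumed contract, chosen for this example):
 *   - employees != NULL, any count, including 0
 * Invalid calls (behaviour undefined):
 *   - employees == NULL, regardless of count
 *
 * Returns NULL only if the allocation fails or the byte count would
 * overflow. The caller frees the result, even when count is 0.
 */
Employee *duplicateEmployees(const Employee *employees, size_t count)
{
    assert(employees != NULL);  /* catches contract violations in debug builds */

    if (count > SIZE_MAX / sizeof(Employee))
        return NULL;

    /* Ask for at least one element so a null return always means failure. */
    Employee *copy = malloc((count != 0 ? count : 1) * sizeof *copy);
    if (copy != NULL)
        memcpy(copy, employees, count * sizeof *copy);
    return copy;
}
```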

Malcolm McLean

2020-04-09 16:05:50 UTC

Post by Keith Thompson
[...]

Post by Malcolm McLean
If Nemployees is non-zero whilst the pointer employees points to invalid
memory, it's likely that the bug will be discovered soon enough.
But if Nemployees is zero and employees is NULL, then the loops through
the employees array will all be no-ops and have defined, correct
behaviour. It's only if you call a library function like memcpy that
the code becomes incorrect.

Your function that takes a pointer to an employee structure and
a count indicating the number of elements in the array should be
accompanied by documentation (at least a comment) that tells the
user what combinations are valid. If you write a call that violates
those requirements, that call is incorrect -- even if the function
happens not to misbehave.
If that documentation says that func(NULL, 0) is valid and
func(NULL, 1) is not, that's fine -- but then you need to guarantee
that func(NULL, 0) won't blow up in future versions unless you want
to risk breaking existing code.
See memcpy for an example of this. Ideally, your function should
be documented as clearly. If it isn't, it might be impossible to
tell whether a given call is correct or not.

The standard is clear, though you have to refer back to the
preamble on library functions to read it. An array argument may
not be null unless otherwise specified, as happens for snprintf().

However if you look in the Apple man page for memcpy, the issue
is nowhere to be seen.

Keith Thompson

2020-04-09 20:28:10 UTC

Malcolm McLean <***@gmail.com> writes:
[...]

Post by Malcolm McLean
The standard is clear, though you have to refer back to the
preamble on library functions to read it. An array argument may
not be null unless otherwise specified, as happens for snprintf().
However if you look in the Apple man page for memcpy, the issue
is nowhere to be seen.

Perhaps not directly. The Apple man page for memcpy (at least the
one I found in a quick Google search) says:

The memcpy() function copies n bytes from memory area s2 to memory
area s1. If s1 and s2 overlap, behavior is undefined. Applications
in which s1 and s2 might overlap should use memmove(3) instead.

The phrase "memory area s2" seems a bit vague. Perhaps Apple's
documentation defines it elsewhere. But surely if s2 is a null pointer
there is no "memory area s2" (and likewise for s1, of course).

But it also says:

The memcpy() function conforms to ISO/IEC 9899:1990 (``ISO C90'').

which does completely describe it.

The Linux man page for memcpy is similar.

In any case the definitive documentation for memcpy is the standard.
And when you document your own functions, I don't suggest following
the example of those man pages.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Richard Tobin

2020-04-09 23:41:41 UTC

Post by Keith Thompson
Perhaps not directly. The Apple man page for memcpy (at least the
one I found in a quick Google search) says:
The memcpy() function copies n bytes from memory area s2 to memory
area s1. If s1 and s2 overlap, behavior is undefined. Applications
in which s1 and s2 might overlap should use memmove(3) instead.
The phrase "memory area s2" seems a bit vague.

That text, and the term "memory area", appear to derive ultimately
from the description of memcpy() in the System V manual page
MEMORY(3C), which starts:

These functions operate efficiently on memory areas (arrays of
characters bounded by a count, not terminated by a null character).

-- Richard

Guillaume

2020-04-08 14:17:57 UTC

Post by Scott Lurndal
How would you validate any of the parameters to memcpy? Particularly
considering that from a hardware point of view, zero is a legal virtual
address.

If you read my whole post, you'll see that I said exactly that. You can't
in the general case, but you could do it just for a null pointer. Whether
that would be useful is also questionable, which I also said.

As to zero (null pointers) being valid, even for memcpy(), it's a bit
confusing. Whereas 0 can be a "valid" address on some targets at a very
low level, according to the standard (at least C99):

"If a null pointer constant is converted to a pointer type, the
resulting pointer, called a null pointer, is guaranteed to compare
unequal to a pointer to any object or function."

So by that definition, a null pointer can't point to any valid object.

And whatever happens with non-valid objects, is, you guessed...
undefined behavior. What it all means is that the whole spirit of C (at
least for a large chunk of the standard) doesn't really make any
parameter invalid per se. It just states "undefined behavior" for some
parameter values, and the cases in which some values are illegal are
very few.

A corollary here is that you can perfectly well pass pointers to non-valid
objects to memcpy(). But this triggers undefined behavior. We're
running in circles. ;)

I'll have to admit that parameters triggering undefined behavior are not
necessarily "invalid" per se by the standard. This kind of makes sense,
but it is certainly confusing for many people.

Anyway, as I said in previous posts, this is really a non-issue: if you
need a specific behavior for undefined behavior cases, just implement
functions on top (or in replacement of) the existing ones. I'm fine with
this.

One wrong approach though IMO would be *assuming* any specific behavior
for undefined behavior cases.

Ike Naar

2020-04-07 22:01:06 UTC

Post by Guillaume
void * mymemcpy(void *destination, const void *source, size_t size)
{
if (destination == NULL)
return NULL;
memcpy(destination, source, size);

What does the function return here?

Post by Guillaume
}

Guillaume

2020-04-07 23:32:01 UTC

Post by Ike Naar

Post by Guillaume
void * mymemcpy(void *destination, const void *source, size_t size)
{
if (destination == NULL)
return NULL;
memcpy(destination, source, size);

What does the function return here?

Post by Guillaume
}

Sorry, it should have been: return memcpy(destination, source, size);
of course.
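
For reference, a sketch of the wrapper with that correction applied.
The extra check on source is an assumption added here and was not in
the original post. It is included because a null source passed to
memcpy() is undefined behavior in the same way as a null destination.

```c
#include <stddef.h>
#include <string.h>

/*
 * Corrected wrapper, as a sketch only.
 * Returns NULL without calling memcpy() if either pointer is null;
 * otherwise returns whatever memcpy() returns (the destination).
 * Assumption: rejecting a null source is an addition to the posted
 * version, which only tested destination.
 */
void *mymemcpy(void *destination, const void *source, size_t size)
{
    if (destination == NULL || source == NULL)
        return NULL;
    return memcpy(destination, source, size);
}
```

Note that a NULL return here cannot distinguish "nothing to copy" from
"bad argument", so a caller that needs that distinction would have to
check its pointers itself.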
