Discussion:
Syntax for coordinating value bound together by constraining logic
Rick C. Hodgin
2017-05-15 19:33:25 UTC
Suppose a function exists which will return either a valid value or
NULL, and populate a passed pointer-to-pointer parameter with another
value unrelated to the return value, but it is known that if either
of them is valid, then both will be.

Is there any convention in C to allow those two items to be associated
so the language parser and optimizer can recognize that a potentially
stale or uninitialized value is not being used?

void *p1;
void (*p2)(void*);

// Call some function to initialize them
p1 = getFunc(&p2);

// At this point, if p1 is valid, we know p2 is also.
// At this point, if p2 is valid, we know p1 is also.

Is there some way to tell the compiler that both p1 and p2 will
be valid, so that if I used the code:

if (p1)
    p2(p1);

... the compiler would know that I don't also need to test if p2 is
valid, but can go ahead and use it?

-----
If it exists, I'd like to know the syntax. If not, I'd like some
suggestions on syntax. My initial thought is a compiler directive,
along with an undo operation:

#bind p1, p2
#unbind p1, p2

Thank you,
Rick C. Hodgin
Thiago Adams
2017-05-15 19:55:47 UTC
This is static analysis.

getFunc has an argument that is an out parameter.
Not only that, but it points to uninitialized memory, so getFunc cannot
read it, only write it.

The static analysis needs to know that the out argument is valid
depending on the function result. (Microsoft SAL has annotations for this.)
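
To make the SAL idea concrete, here is a minimal sketch. It assumes a getFunc variant with an extra `want` parameter, plus a `release` callback and a `resource` object, all invented for illustration. Outside MSVC the annotations are defined away, so this only shows where they would go:

```c
#include <stddef.h>

/* SAL macros come from <sal.h> on MSVC. Elsewhere they are defined
   away here so the sketch still compiles (an assumption of this example). */
#ifdef _MSC_VER
#include <sal.h>
#else
#define _Success_(expr)
#define _Out_
#endif

/* Hypothetical callback and object; neither is from the original post. */
static void release(void *p) { (void)p; }
static int resource = 42;

/* _Success_ says the function succeeded when the return value is non-zero,
   and _Out_ post-conditions (here: *pfn was written) only hold on success. */
_Success_(return != 0)
void *getFunc(int want, _Out_ void (**pfn)(void *))
{
    if (!want)
        return NULL;        /* failure: *pfn is left untouched */
    *pfn = release;
    return &resource;
}

/* Returns 1 if the callback was invoked, 0 otherwise. */
int use(int want)
{
    void (*p2)(void *);
    void *p1 = getFunc(want, &p2);
    if (p1) {
        p2(p1);             /* p2 is only read on the success path */
        return 1;
    }
    return 0;
}
```

With annotations like these, an analyzer has what it needs to complain if `p2` is used on the path where `getFunc` returned NULL.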

I like this subject.
The type system needs more info. But what is interesting is that the
info changes depending on code paths.

For instance:

int *p = NULL;

This type carries hidden information that p can be null,
a qualifier something like const:

can_be_null int *p = NULL;

But when you do

if (p != NULL)
{
    // now p cannot be null here
}

then after the block it can be null again.

Unlike const, where the type is const on every path, the modifiers
"can_be_null", "initialized" and "uninitialized" change along code paths.

You could also have a modifier "never_null" that, like const, does not
need path checking.
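
Something close to these modifiers already exists as a Clang extension (`_Nullable` and `_Nonnull`). The sketch below is only an illustration: the macro names CAN_BE_NULL and NEVER_NULL and both functions are made up here, and how much path-sensitive checking you actually get depends on the compiler or analyzer in use.

```c
#include <stddef.h>

/* _Nullable/_Nonnull are Clang extensions. On other compilers the
   hypothetical macros below expand to nothing. */
#if defined(__clang__)
#define CAN_BE_NULL _Nullable
#define NEVER_NULL  _Nonnull
#else
#define CAN_BE_NULL
#define NEVER_NULL
#endif

/* The callee requires a valid pointer and does not test it. */
static int deref(int *NEVER_NULL p)
{
    return *p;
}

/* The caller may pass NULL; the test narrows p on one path only. */
int read_or_default(int *CAN_BE_NULL p, int fallback)
{
    if (p != NULL)
        return deref(p);    /* on this path p cannot be null */
    return fallback;        /* on this path p is null */
}
```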
Rick C. Hodgin
2017-05-15 19:56:59 UTC
I think this should actually be taken a step further, so that the
constraints are defined in the called function. The parameters which
are passed in should have in, out, inout definitions which identify
their intended use. And the constraints on values should be found
there where they're defined and used, either indirectly through code
analysis, or by explicit direction by the developer.

I think this binding together of related data items, which are not
naturally bound by membership in a parent container, should have a
presence in C-like source code.

It may even be desirable to move the definition away from directives
like #bind and #unbind and into an ad hoc container object indicating
that the items are bound together:

void* getFunc(... p2)
{
    bind BData {
        return,
        p2
    };
}

By creating this ad hoc structure, the data items for the return value
and the passed parameter could be referenced through the named parent:

BData.return = whatever;
BData.p2 = &someFunction;

And extending this out even further, the ability to expressly name
return parameters, so that you could have:

void* r = getFunc(... p2)
{
    bind BData { r, p2 };
}

In this way, the compiler could check and make sure that if r was
defined, then p2 must also be defined, and vice-versa. The return
value now becomes a variable that can be used at any point by name,
so that it doesn't actually require an explicit return(x) command
to populate the return value, but the first assignment sets its
value.

It would be slightly less efficient in binary, but it would have
the advantage of making more sense to developers, and it would
implement a constraint that the compiler can check at compile-
time as well as run-time with a switch enabling those features.
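
For comparison, standard C can approximate the run-time half of this today by returning an ordinary struct. This is only a sketch under assumptions: the `ok` parameter, `payload`, `calls` and `callCount` are invented for the example, and the constraint is checked with assert rather than by the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinary struct standing in for the proposed "bind BData { r, p2 }". */
struct BData {
    void *r;
    void (*p2)(void *);
};

/* Hypothetical helpers, not from the original post. */
static int calls;
static int payload;
static void someFunction(void *p) { (void)p; ++calls; }

struct BData getFunc(int ok)
{
    struct BData b = { NULL, NULL };
    if (ok) {
        b.r = &payload;
        b.p2 = someFunction;
    }
    /* Run-time check of the constraint: both set or both clear. */
    assert((b.r != NULL) == (b.p2 != NULL));
    return b;
}

/* Number of times someFunction has been invoked so far. */
int callCount(void)
{
    return calls;
}
```

A caller then tests one member and uses the other, for example `struct BData b = getFunc(1); if (b.r) b.p2(b.r);`.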

Thank you,
Rick C. Hodgin
David Brown
2017-05-15 21:26:35 UTC
Post by Rick C. Hodgin
if (p1)
p2();
... the compiler would know that I don't also need to test if p2 is
valid, but can go ahead and use it?
The compiler does not need to check if p2 is valid here - it assumes it
/is/ valid, otherwise your code would not make sense (it would have
undefined behaviour) - and it assumes that the programmer is smart
enough not to do that.

C by itself does not have any way of telling the compiler about
assumptions it can make. But it can be possible with compiler
extensions. This uses gcc's extensions, valid also for clang, but may
be adaptable to other compilers:

// Non-existent function called if it is known at compile-time
// that an assume will fail
extern void __attribute__((error("Assume failed"))) assumeFailed(void);

// The compiler can assume that "x" is true, and optimise or
// warn accordingly
// If the compiler can see that the assume will fail, it gives an error
#define assume(x) \
    do { \
        if (__builtin_constant_p(x)) { \
            if (!(x)) { \
                assumeFailed(); \
            } \
        } \
        if (!(x)) __builtin_unreachable(); \
    } while (0)

Then you can write:

assume((p1 && p2) || (!p1 && !p2));

That tells the compiler that either /both/ of p1 and p2 are non-zero, or
both of them /are/ zero. It generates no code, and if the compiler can
see at compile-time that the assumption does not hold, it will give an
error.
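
Applied to the original example, it might look like the sketch below. To keep it self-contained it uses a simplified assume() without the compile-time assumeFailed() check, and the `ok` parameter, `bump`, `hits` and `object` are invented stand-ins:

```c
#include <stddef.h>

/* Simplified variant of the macro: only the unreachable hint is kept.
   On compilers without the builtin it just evaluates the expression. */
#if defined(__GNUC__)
#define assume(x) do { if (!(x)) __builtin_unreachable(); } while (0)
#else
#define assume(x) ((void)(x))
#endif

/* Hypothetical callback and object for the demonstration. */
static int hits;
static int object;
static void bump(void *p) { (void)p; ++hits; }

/* Stand-in for getFunc: sets both values or neither. */
static void *getFunc(int ok, void (**pfn)(void *))
{
    *pfn = ok ? bump : NULL;
    return ok ? (void *)&object : NULL;
}

/* Returns the running count of callback invocations. */
int demo(int ok)
{
    void (*p2)(void *);
    void *p1 = getFunc(ok, &p2);

    /* Either both are non-null or both are null. */
    assume((p1 && p2) || (!p1 && !p2));

    if (p1)
        p2(p1);             /* no separate test of p2 */
    return hits;
}
```

Note that if the assumption is ever false at run-time, the behaviour is undefined, so it has to be something you really can guarantee.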


Is that any help?