Discussion:
Introducing the C_ Dialect
cHaR
2025-02-28 21:37:55 UTC
I started working on a preprocessing-based dialect of C a couple of
years ago for use in personal projects, and now that its documentation
is complete, I am pleased to share the reference implementation with
fellow programmers.

https://github.com/cHaR-shinigami/c_

The entire implementation rests on the C preprocessor, and the ellipsis
framework is its metaprogramming cornerstone, which can perform any kind
of mathematical and logical computation with function composition.
A new higher-order function named "omni" is introduced, which provides a
generalized syntax for operating with arrays and scalars; for example:

`op_(&arr0, +, &arr1)` adds elements at the same indices in arr0 and arr1
`op_(&arr, *, 10)` scales each element of arr by 10
`op_(sum, +, &arr)` adds all elements of arr to sum
`op_(price, -, discount)` is simply price - discount

The exact semantics are a tad detailed, and can be found in chapters 4
and 5 of the documentation.
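
As a rough mental model before opening the documentation, the first three forms above behave like the following loops in standard C. This is only a sketch of the behavior described here, fixed to `int` elements and assuming the result is stored into the first operand; the helper names (`add_arrays`, `scale_array`, `sum_into`) are hypothetical and are not part of C_, where `op_` is generic and built on the preprocessor.

```c
#include <assert.h>
#include <stddef.h>

/* Roughly op_(&arr0, +, &arr1): element-wise add, stored into arr0. */
static void add_arrays(int *arr0, const int *arr1, size_t n)
{
    assert(arr0 != NULL && arr1 != NULL);
    for (size_t i = 0; i < n; i++)
        arr0[i] += arr1[i];
}

/* Roughly op_(&arr, *, 10): scale every element by a scalar. */
static void scale_array(int *arr, size_t n, int factor)
{
    assert(arr != NULL);
    for (size_t i = 0; i < n; i++)
        arr[i] *= factor;
}

/* Roughly op_(sum, +, &arr): fold all elements of arr into *sum. */
static void sum_into(int *sum, const int *arr, size_t n)
{
    assert(sum != NULL && arr != NULL);
    for (size_t i = 0; i < n; i++)
        *sum += arr[i];
}
```

The fourth form, with two scalars, needs no helper at all: it is the ordinary expression `price - discount`.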

C_ establishes quite a few naming conventions: for example, type
synonyms are named with a leading uppercase letter, the notable aspect
being that they are non-modifiable by default; adding a trailing
underscore makes them modifiable. Thus an Int cannot be modified after
initialization, but an Int_ can be.
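
In plain C terms, the convention can be pictured with a pair of typedefs like the ones below. This is an assumption made purely for illustration: the real C_ definitions may be built differently, and `sum_to` is a hypothetical helper.

```c
#include <assert.h>

typedef const int Int;   /* assumed: non-modifiable by default */
typedef int       Int_;  /* assumed: trailing underscore = modifiable */

/* Sums 1..limit; only the accumulator and the counter are modifiable. */
static int sum_to(Int limit)
{
    assert(limit >= 0);
    Int_ total_ = 0;
    for (Int_ i_ = 1; i_ <= limit; i_++)
        total_ += i_;
    /* "limit = 0;" here would be rejected by the compiler,
       because limit is a read-only parameter. */
    return total_;
}
```

The parameter stays read-only inside the function, while the two locals that genuinely need updating carry the trailing underscore.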

The same convention is also followed for pointers: `Ptr (Char_) ptr`
means `ptr` cannot be modified but `*ptr` (type Char_) can be, whereas
`Ptr_(Char) ptr_` means something else: `ptr_` can be modified but
`*ptr_` (type Char) cannot be. `Ptr (Int [10]) p1, p2` says both are
non-modifiable pointers to a non-modifiable array of 10 integers; this
conveys intent more clearly than the conventional `const int (* const
p1)[10], p2` which ends up declaring something else: `p2` is not a
pointer, but a plain non-modifiable int.
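
The pitfall in the conventional declaration can be checked in standard C (C11 for `_Generic`). The sketch below is not how `Ptr` is implemented; it only contrasts the conventional declarator with a typedef, which is one standard way to make every declared name a pointer. The names `ArrPtr`, `table` and `IS_ARRAY_PTR` are hypothetical.

```c
#include <assert.h>

/* One name for "const pointer to const array of 10 int". */
typedef const int (*const ArrPtr)[10];

static const int table[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

/* Conventional form: only p1 is a pointer; p2 is a plain const int. */
static const int (*const p1)[10] = &table, p2 = 42;

/* Typedef form: both q1 and q2 are const pointers to the array. */
static ArrPtr q1 = &table, q2 = &table;

/* 1 if the expression is a pointer to const int[10], otherwise 0. */
#define IS_ARRAY_PTR(x) _Generic((x), const int (*)[10]: 1, default: 0)
```

With these declarations, `IS_ARRAY_PTR(p2)` selects the default branch because `p2` is an `int`, whereas `q2` is a pointer just like `q1`.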

C_ blends several ideas from object-oriented paradigms and functional
programming to facilitate abstraction-oriented designs with protocols,
procedures, classes and interfaces, which are explored from chapter 6.
For algorithm enthusiasts, I have also presented my designs on two
new(?) sorting strategies in the same chapter: "hourglass sort" uses
twin heaps with quick sort, and "burrow sort" uses a quasi-inplace merge
strategy. For the preprocessor sorting, I have used a custom-made
variant of adaptive bubble sort.
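
For readers unfamiliar with the term, the textbook form of adaptive bubble sort is sketched below as an ordinary runtime function. It is not the custom preprocessor variant mentioned above, whose details are in the documentation; it only shows the "adaptive" idea of stopping as soon as a pass performs no swap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Textbook adaptive bubble sort: stops early once a pass makes no swap,
   so already-sorted input costs a single pass. */
static void adaptive_bubble_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t end = n; end > 1; end--) {
        bool swapped = false;
        for (size_t i = 1; i < end; i++) {
            if (a[i - 1] > a[i]) {
                int tmp = a[i - 1];
                a[i - 1] = a[i];
                a[i] = tmp;
                swapped = true;
            }
        }
        if (!swapped)
            break;  /* no inversions left: the array is sorted */
    }
}
```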

The examples have been tested with gcc-14 and clang-19 on a
32-bit variant of Ubuntu having glibc 2.39; setting the path for header
files is shown in the README file, and other options are discussed in
the documentation. I should mention that due to the massive (read as
obsessive) use of preprocessing by yours truly, the transpilation to C
programs is slow enough to rival the speed of a tortoise. This is
currently a major bottleneck without an easy solution.

Midway through the development, I set an ambitious goal of achieving
full conformance with the C23 standard (back then in its draft stage),
and several features have evolved through a long cycle of changes to fix
language-lawyer(-esque) corner-cases that most programmers never worry
about. While the reference implementation may not have touched the
finish line of that goal, it is close enough, and at the very least, I
believe that the ellipsis framework fully conforms to C99 rules of the
preprocessor (if not, then it is probably a bug).
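
To give a flavor of what variadic-macro metaprogramming under C99 preprocessor rules looks like, here is a well-known argument-counting and dispatch idiom. It is a generic illustration and not taken from the ellipsis framework; all macro names (`PP_NARG`, `PP_CAT`, `SUM`) are hypothetical, and the `static_assert` checks need C11, although the macros themselves only rely on C99 features.

```c
#include <assert.h>

/* Select the 9th argument; the numbers appended below shift into that slot. */
#define PP_ARG_N(a1, a2, a3, a4, a5, a6, a7, a8, N, ...) N

/* Count 1 to 8 arguments (an empty argument list is not handled). */
#define PP_NARG(...) PP_ARG_N(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)

/* Two-level paste so that PP_NARG is expanded before ## is applied. */
#define PP_CAT(a, b) PP_CAT_(a, b)
#define PP_CAT_(a, b) a##b

/* Dispatch on argument count: SUM(a, b) becomes SUM_2(a, b), and so on. */
#define SUM(...) PP_CAT(SUM_, PP_NARG(__VA_ARGS__))(__VA_ARGS__)
#define SUM_1(a) (a)
#define SUM_2(a, b) ((a) + (b))
#define SUM_3(a, b, c) ((a) + (b) + (c))

/* Both checks are resolved entirely at compile time. */
static_assert(PP_NARG(x, y, z) == 3, "three arguments counted");
static_assert(SUM(1, 2, 3) == 6, "dispatch to SUM_3");
```

Because the appended list always leaves at least one token for the trailing `...`, the idiom stays within the C99 rule that a variadic macro must receive an argument for its ellipsis.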

The documentation has been prepared in LaTeX and its generated PDF (with
300-ish pages of content) can be downloaded from
https://github.com/cHaR-shinigami/c_/blob/main/c_.pdf

I tried to maintain a formal style of writing throughout the document,
and as an unintended byproduct, some of the wording may seem overly
standardese. I am not sure if being a non-native English speaker was an
issue here, but I am certain that the writing can be made more
beginner-friendly in future revisions without loss of technical rigor.

Although this took considerably longer than I had anticipated, the
code is still not quite polished, and the dialect has not matured
enough to suggest that it will "wear well with experience". However, I
do hope that at least some parts of it can serve a greater purpose by
helping other programmers build something better. Bug reports on the
reference implementation, documentation typos, and general suggestions
on improving the dialect to widen its scope of application are always
welcome.

Regards,
cHaR
Ar Rakin
2025-03-01 16:16:47 UTC
Very interesting. I haven't looked at everything yet, but I just want
to give my opinion on the language syntax. I will post my detailed
opinions and takes later.

From the PDF documentation:

--------------------------------------------
#include <c._>

Int_ main(Int argc, Ptr(Char_) argv[const])
begin
    guard_(argc == 2)

    Auto_ count_ = 0U;
    guard_(input__(argv[1], count_) == 1)
    Var prices = new__(Float_ [count_]);
    guard_(prices)

    Var discounts = new__(Float_ [count_]);
    guard_(discounts)

    loop_(0, count_ - 1)
        print_("Enter price and discount (in %) for item", _i_ + 1);
        guard_(scan__((*prices)[_i_], (*discounts)[_i_]) == 2, 1)
    end

    Auto_ price_ = 0.f;
    op_(price_, +, prices)
    print_("Total price is", price_);
    op_(discounts, *, prices)
    op_(discounts, /, 100)

    loop_(0, count_ - 1)
        print_("Discount on item", _i_ + 1, "is", (*discounts)[_i_]);
    end

    Auto_ discount_ = .0f;
    op_(discount_, +, discounts)

    print_("Total discount is", discount_);
    print_("Final price is", price_ - discount_);
    print_("Have a nice day");

end
--------------------------------------------

I like the types being in pascal case (in the form of Int_ and int_).
The function and block syntax looks a little bit like Elixir. However, I
see that in some statements you do not use a semicolon. Is that how it
is supposed to be? If so, then it is a bit inconsistent.

Other than that, the dialect looks fancy. Good work!
--
Rakin
Ar Rakin
2025-03-01 16:19:30 UTC
Post by Ar Rakin
I like the types being in pascal case (in the form of Int_ and int_).
Would like to correct myself, I wanted to say in the form of Int_ and
*not* int_.
--
Rakin
cHaR
2025-03-01 17:12:38 UTC
Post by Ar Rakin
I like the types being in pascal case (in the form of Int_ and int_).
The function and block syntax looks a little bit like Elixir. However, I
see that in some statements you do not use a semicolon. Is that how it
is supposed to be? If so, then it is a bit inconsistent.
Other than that, the dialect looks fancy.  Good work!
Yes, the C_-specific statements do not need a semicolon, though we can
always use one for consistency. The only time it would cause issues is
with code like:

if (expr) op_(arr_ptr, +, 10);
else /* something else */

It will not compile: the extra semicolon becomes an empty statement that
ends the `if`, so the `else` gets detached from it. This only causes a
compilation error, not a runtime issue.
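
The effect can be reproduced in standard C with any macro that expands to a complete statement. The sketch below assumes, for illustration only, a macro that expands to a braced block; `ADD_TO_ALL` and `adjust` are hypothetical names, and the actual expansion of `op_` may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a statement-like macro: it expands to a
   complete compound statement, so no trailing semicolon is required. */
#define ADD_TO_ALL(arr, n, k) \
    { for (size_t i_ = 0; i_ < (n); i_++) (arr)[i_] += (k); }

static void adjust(int *arr, size_t n, int up)
{
    assert(arr != NULL);
    /* Writing "ADD_TO_ALL(arr, n, 10);" in the if-branch would add an
       empty statement after the block, ending the if; the following
       else would then have no if to attach to (a syntax error). */
    if (up) ADD_TO_ALL(arr, n, 10)
    else ADD_TO_ALL(arr, n, -10)
}
```

The common `do { ... } while (0)` wrapper takes the opposite approach: it makes the trailing semicolon mandatory, so the macro call reads like an ordinary statement and is safe before an `else`.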

Also, thanks for the feedback on type naming: the idea is to declare
everything as non-modifiable (Int or Char) as a default practice, and to
suffix an underscore later (Int_ or Char_) only if the variable needs
to be updated. While this is impractical for all variables, a good
starting point is to always declare function parameters as non-modifiable.