Post by Bart
Post by fir
Post by Paul Edwards
Hi.
I have been after a public domain C compiler for decades.
None of them reach C90 compliance. SubC comes the
closest but was written without full use of C90, which
makes it difficult to read. I'm after C90 written in C90.
A number of people have tried, but they always seem
to fall short. One of those attempts is pdcc. The
preprocessor was done, but the attempt (by someone
else) to add C code generation was abandoned.
I decided to take a look at it, and it looks to me like
a significant amount of work has already been done.
Also, my scope is limited - I am only after enough
functionality to get my 80386 OS (PDOS) compiled,
and I don't mind short=int=long = 32 bits, I don't
mind not having float. I don't use bitfields.
Anyway, I have had some success in making enhancements
https://sourceforge.net/p/pdos/gitcode/ci/3356e623785e2c2e16c28c5bf8737e72dfd39e04/
But I don't really know what I'm doing (I do know some
of the theory - but this is a particular design).
E.g. now that I have managed to get a variable passed to
a function, I want to pass the address of that variable
instead - i.e. &x instead of x - and I am not sure whether,
in the original (incomplete) design, this calls for a new
ADDRESS type, whether it belongs under VARREF, or whether
CC_EXPR_AMPERSAND is the right place.
I am happy to do the actual coding work - I'm just looking
for some nudges in the right direction if anyone can assist.
Thanks. Paul.
you mean there is no such compiler? raise a fund for someone to
write it and they will write it.. and if a few thousand people
contribute some money, it will get written
There are any number of open source C compilers. But they need to be
good enough (too many support only a subset, which may not be enough for
the OP) and they need to be public domain for the OP's purposes.
I am more in the camp of MIT or BSD license should be good enough for
most things.
Trying to go full public domain has a few of its own issues:
* Not always recognized as valid;
* Implicitly lacks "No Warranty" and "No Liability" protections for the
author (say, if someone wanted to file a lawsuit over the code being
buggy, etc).
* ...
There could almost be a "MIT Minus" or something, which could be, say,
MIT with a clause saying one is allowed to discard the license terms for
sake of derived works (but still offering protection from liability).
As for C compilers, I have a compiler for my own uses, but:
* MIT licensed;
* Doesn't target x86.
* Sorta implements C99 with various fragments of newer standards.
** Though, is a bit hit/miss on the now-optional parts.
** VLAs sorta exist but do not necessarily work correctly.
*** Currently unsupported in DLLs;
*** Seemingly may result in memory leaks if used.
*** Essentially, they are implemented via runtime library calls.
**** With memory provided indirectly via malloc.
Old target list (for which the code still exists):
* SH-4 (AKA: SuperH, most well known for SEGA Saturn and Dreamcast)
* BJX-1 (Was a highly modified version of SH-4)
* BTSR1 (a small SH inspired ISA, intended to be comparable to MSP430).
** Not maintained, RV32IC seems like a better option.
Currently active targets:
* BJX-2: Now a group of several closely related variants.
** All are 64-bit, most using a 48-bit VAS (some had a 32-bit VAS)
** Baseline: 16/32/64/96 bit instructions, 32 or 64 GPRs
** XG2: 32/64/96 bit, 64 GPRs
* RISC-V, RV64G + Custom Extensions
** Has some extensions which can help notably with performance.
** Can support plain RV64G as well.
** No current support for the 16-bit 'C' encodings.
* XG3RV
** Mostly a tweaked and repacked version of XG2 used alongside RV64G.
** The XG3 encoding space replaces the RV64 'C' (Compressed) extension.
** Both XG3 and RV64 instructions may be encoded at the same time.
** XG3 is used in a functionally-similar subset, just with 64 GPRs.
Not yet bothered with a target for RV32IC, GCC does this well enough.
* x86/x86-64/ARM: We generally have GCC and Clang.
Granted, GCC and Clang are both very large and slow/painful to rebuild
from source. My compiler is at least a lot smaller and easy to rebuild.
Likely, far more of the total effort of my project has ended up going
into my compiler than into the emulator or Verilog implementation though.
The BJX-2 register space had 64 registers and was split in half for the
RV64G modes (32 GPRs and 32 FPRs), whereas XG3 and my jumbo-prefix
extensions partly undo this split.
( Decided to try changing the way I write my ISA name as maybe adding a
hyphen will get me less trouble... ).
Though, partly this is because for performance BGBCC seems to need a lot
of registers (it could barely operate with the SH4's 16 GPRs, and still
has a fairly high spill-and-fill rate with 32 GPRs).
Though, I can note that with my compiler, XG3RV, despite not adding
much over RV64+Jumbo, beats both the code density and performance of
RV64G via "GCC -O3" (and also beats the code density of RV64GC, as in
this case fewer instructions wins over smaller instructions).
A big part of the performance delta between the ISAs could be addressed
by adding a few major features to RV64:
* Jumbo Prefixes: Prefix may extend 12-bit imm/disp fields to 33 bits;
** Also extends LUI, AUIPC, and JAL to 33-bit forms.
* Load/Store with a register index;
* Load/Store Pair.
With BGBCC vs GCC RV64G, this gives around a 30% speedup.
* It is closer to 70% if comparing against BGBCC with plain RV64G.
* BGBCC can't match GCC if both are targeting RV64G.
** I am not sure what GCC would do if it had my extensions.
The specific extensions here mostly target the dominant sources of
inefficiency in the RV64G encodings as they exist (the ISA design deals
poorly with values that exceed what can be encoded directly in an
immediate, ...).
The jumbo prefixes may also be used to merge the register space back
into a 64 register space (at the cost of using 64-bit instruction
encodings to do so), but this only extends the imm/disp fields to 23
bits (except for LUI/AUIPC/JAL, which always have an expanded 6b
register field with jumbo prefixes).
Note that J+AUIPC loads an address of PC +/- 4GB into Rd. Likewise,
J+JAL is +/- 4GB (with LSB as MBZ).
The relative performance gains from XG3RV vs extended RV64G were
smaller; it mostly serves to improve code density (it makes Doom roughly
16% smaller, and around 44% smaller than plain RV64G).
Main thing it has (in theory) is access to a lot of the specialized SIMD
instructions and similar that exist in my ISA but lack equivalents on
the RV64G side of things.
There are a few instructions that exist here which are tempting to add
as extended instructions to RV64:
* Compare Equal, Not-Equal, and Greater-Equal instructions (SEQ, SNE, SGE);
* Load/Store relative to GP with a larger displacement (TBD, 2).
Some notable features from BJX-2 were effectively made optional in XG3,
such as support for an SR.T bit (originally carried over from SuperH), and
predication (in BJX-2, instructions could be encoded for whether or not
to execute based on the status of the SR.T bit). However, no direct
architectural equivalent exists in RV64.
In XG3RV, the questionable design choice had been made to conceptually
hold these parts of the architectural state in the high-order bits of
PC and LR/RA (in my other ISA variants, LR merely captured these bits
from SR).
2: It is tempting to consider, possibly:
LW/LD, SW/SD, with an addressing mode like: [GP+Disp14u*4|8]
So, able to encode an access 64 or 128K relative to GP rather than +/-
2K. This would save some space over the use of a jumbo prefix (at least
with my compiler tending to use GP to access global variables).
Where, it would be "better" here if one could access most of the global
variables in a single 32-bit instruction, but this wouldn't fit in as
well with the existing ISA encodings.
Generally, BGBCC uses a modified PE/COFF variant.
* For RV64G, I switched it to default to using plain PE/COFF.
* Some people might find this slightly easier to deal with.
Though, can note that GNU binutils still has no idea how to handle RV64
PE/COFF, as it seemingly treats every machine-type as its own file
format (and does not support any RV64 + PE/COFF targets).
Where, for some of the other ISAs, BGBCC generates LZ4 compressed
binaries (file headers are uncompressed, but the rest of the image is
compressed). Rationale is mostly that loading binaries from an SDcard is
IO bound, and within the limits tested, LZ4 did best for executable code.
I have another byte-oriented LZ format (RP2) which works better for
general data, but seemingly worse on program binaries. Entropy-coded
formats were not used, as the speed cost of Huffman decoding is higher
than the time spent reading data from an SDcard.
The main difference seems to be that RP2 correlates match length and
distance (encoding a larger distance also encodes a longer
match-length field).
This correlation is true of most data, but less true of program
binaries. LZ4 has a fixed 16-bit match distance, and came out ahead.
TBD if I should add support for 64-bit ELF, but ELF kinda sucks IMHO
(and for ELF PIE binaries, roughly around half of the binary ends up
eaten by metadata).
Where, bloated binaries are bad for both loading time and memory use (it
is bad to have a 900K binary as an "ELF tax" when PE/COFF would have
only needed 400K; well, and say the binary LZ4 compresses down to around
260K in the latter case; though one will still need 400K in RAM).
Further not helped in this case by ELF needing to load a new copy of the
binaries for every program instance, whereas with my ABI, I was able to
share the read-only sections across multiple instances (only the
data/bss sections need to be instantiated per-process).
Can note that in my case, the PE/COFF "Global Pointer" entry in the Data
Directory is effectively used to express the start of ".data" (which is
also where the Global Pointer points), along with the combined size of
data and bss (if the size is non-zero).
Global Pointer:
RVA=Size=0: No Global Pointer
RVA!=0, Size=0: Global Pointer points here, may not be relocated.
RVA!=0, Size!=0: Start of data area, may be relocated per instance.
...