A bit of a story.

At SiCortex, we were well aware that structure layout and bitfields were compiler dependent, yet we had real requirements that high (?) level languages be able to directly interact with hardware defined structures.

So there was a chip spec that was the master truth about register layouts and hardware structures.  Wilson Snyder (of Verilator fame) wrote a postprocessor that swallowed the chip spec document and spit out header files in various languages, one of which was C Preprocessor compatible #defines for field offset, position, and width.

We programmers were careful to always use GET and SET macros to access objects with hardware specified layouts.
IIRC there were additional macros to make sure that hardware accesses always used the proper machine instructions and fences.

I don’t think these macros cost much, if any, performance, because they mostly resolved into shifts and masks with constants.
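
A minimal sketch of what such generated definitions and accessor macros might look like. Everything here is hypothetical: the register name, the offset, position, and width values, and the macro and function names are invented for illustration and are not the SiCortex originals. The fences mentioned above are target-specific and are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generated definitions for one field of one register.
   The names and numbers are made up; they come from no real chip spec. */
#define DMA_CTRL_OFFSET       0x10u  /* byte offset of the register */
#define DMA_CTRL_BURST_POS    4u     /* lowest bit of the field */
#define DMA_CTRL_BURST_WIDTH  3u     /* field width in bits */

static_assert(DMA_CTRL_BURST_POS + DMA_CTRL_BURST_WIDTH <= 32,
              "field must fit in a 32-bit register");

/* Mask of `width` low-order one bits; width must be 1..31 here. */
#define FIELD_MASK(width)  ((UINT32_C(1) << (width)) - 1u)

/* Extract a field from a register value. */
#define GET_FIELD(val, pos, width) \
    (((uint32_t)(val) >> (pos)) & FIELD_MASK(width))

/* Return `val` with the field replaced by `f`; other bits are unchanged. */
#define SET_FIELD(val, pos, width, f) \
    (((uint32_t)(val) & ~(FIELD_MASK(width) << (pos))) | \
     (((uint32_t)(f) & FIELD_MASK(width)) << (pos)))

/* Read-modify-write through a volatile pointer, so the compiler may not
   cache or drop the accesses. Real hardware would likely need fences too. */
static inline void dma_set_burst(volatile uint32_t *regs, uint32_t burst)
{
    volatile uint32_t *r = &regs[DMA_CTRL_OFFSET / sizeof(uint32_t)];
    uint32_t v = *r;
    *r = SET_FIELD(v, DMA_CTRL_BURST_POS, DMA_CTRL_BURST_WIDTH, burst);
}

static inline uint32_t dma_get_burst(const volatile uint32_t *regs)
{
    return GET_FIELD(regs[DMA_CTRL_OFFSET / sizeof(uint32_t)],
                     DMA_CTRL_BURST_POS, DMA_CTRL_BURST_WIDTH);
}
```

With constant position and width, an optimizing compiler would normally fold each macro into one shift and one or two mask operations.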

It’s too bad that C doesn’t have standardized rules for structures, because even when hardware isn’t involved, one must be extremely careful with cacheline alignment of fields in order to get good performance.  __attribute__((aligned(64))) for the win, but it is ugly.  I don’t think there is even a portable way to get the cacheline size.
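
For illustration, C11 offers alignas as a standard spelling of the GCC attribute. The sketch below assumes a 64-byte line, which is common but not universal; the struct and field names are made up.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size: a build-time guess, not something C can query. */
#define CACHELINE 64

/* Two counters updated by different threads. Putting each on its own
   cache line avoids false sharing between them. */
struct counters {
    alignas(CACHELINE) uint64_t produced;
    alignas(CACHELINE) uint64_t consumed;
};

/* Compile-time checks that the layout is what was intended. */
static_assert(alignof(struct counters) == CACHELINE,
              "struct starts on a line boundary");
static_assert(offsetof(struct counters, consumed) == CACHELINE,
              "fields sit on separate lines");
static_assert(sizeof(struct counters) == 2 * CACHELINE,
              "struct is padded to whole lines");
```

The static assertions turn a layout assumption into a build failure instead of a silent performance loss.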

Then there is compiler register promotion, and volatile.

-L


On Mar 3, 2025, at 11:59, Paul Winalski <paul.winalski@gmail.com> wrote:

On Mon, Mar 3, 2025 at 6:51 AM Peter Yardley <peter.martin.yardley@gmail.com> wrote:
A problem I have with C as a systems language is that there is no certainty about the representation of a structure in memory, and hence about its representation when it is written to disk.

I believe you are correct.  Alignment and padding within structs in C are implementation-dependent.
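
One common workaround, sketched here under assumptions, is to never write the struct itself to disk but to encode each field into a byte buffer with a layout chosen by the programmer. The record type, its 5-byte little-endian disk format, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: a 1-byte tag followed by a 32-bit length. */
struct record {
    uint8_t  tag;
    uint32_t length;
};

/* The disk format chosen here is 5 bytes. sizeof(struct record) is
   commonly 8 because of padding after `tag`, but that is up to the
   implementation, which is exactly the problem. */
enum { RECORD_DISK_SIZE = 5 };

static_assert(sizeof(struct record) >= RECORD_DISK_SIZE,
              "in-memory struct may be padded, never smaller");

/* Encode field by field, little-endian, so the bytes on disk do not
   depend on compiler padding or host byte order. */
static void record_encode(const struct record *r,
                          uint8_t out[RECORD_DISK_SIZE])
{
    out[0] = r->tag;
    out[1] = (uint8_t)(r->length);
    out[2] = (uint8_t)(r->length >> 8);
    out[3] = (uint8_t)(r->length >> 16);
    out[4] = (uint8_t)(r->length >> 24);
}

static void record_decode(struct record *r,
                          const uint8_t in[RECORD_DISK_SIZE])
{
    r->tag = in[0];
    r->length = (uint32_t)in[1]
              | ((uint32_t)in[2] << 8)
              | ((uint32_t)in[3] << 16)
              | ((uint32_t)in[4] << 24);
}
```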
 
I will be happy to be corrected but I remember this behaviour to be compiler dependent. Other languages such as Bliss and to perhaps a lesser degree Pascal had implicit control of this.

In the case of BLISS, I would call it explicit control.  BLISS is a peculiar language in that it has only one data type, the BLISS value, which is a contiguous set of bits of fixed length, typically the size of the target machine word.  There are four dialects of BLISS that differ mainly in the size of the BLISS value:  16 for BLISS-16 (PDP-11), 32 for BLISS-32 (VAX, x86), 36 for BLISS-36 (PDP-10), and 64 for BLISS-64 (Alpha, Itanium, x86-64).

Data typing in BLISS is a function of the operations performed on the BLISS value rather than a property of the value itself.  BLISS is also unusual in that data are retrieved from memory or registers using an explicit fetch operator.  There is no distinction between lvalues and rvalues.  BLISS is also an expression language rather than a statement language.  Everything, including procedural code, is an expression with a BLISS value and can be used as such.  For example:

     a = (if .b then .c else .d)

This code fetches the BLISS value whose address is b ('.' is the fetch operator).  The IF expression treats this fetched value as a logical true/false.  If the value is true, the value whose address is c is fetched, otherwise the value whose address is d is fetched.  The value of the IF expression is the fetched value.  The = expression then stores that value at the address represented by a.  The stored value is also the value of the = expression itself.  The semicolon (;) is an expression separator with the semantics "discard the current expression value".  It is common practice to write procedural code with operations separated by semicolons, which gives you a syntax similar to statement-oriented languages.  But you don't have to code that way.
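
For readers who think in C, a rough analogue of the expression above treats every name as a pointer and the fetch operator as a dereference. This is only an approximation: the typedef and function name are invented, and C's nonzero test stands in for whatever truth test a given BLISS dialect applies to a value.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a machine-word BLISS value (an assumption of this sketch). */
typedef intptr_t bliss_value;

/* Rough C rendering of:   a = (if .b then .c else .d)
   Each parameter plays the role of a BLISS name, i.e. an address.
   The result of the store is also the value of the whole expression. */
static bliss_value select_store(bliss_value *a, const bliss_value *b,
                                const bliss_value *c, const bliss_value *d)
{
    assert(a && b && c && d);
    return *a = (*b ? *c : *d);
}
```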

BLISS does have data aggregates, both vectors (one-dimensional arrays of BLISS values) and C-style structs.  The latter allows one to attach identifier names to pieces of the aggregate, supplying the same information as with the fetch operator (starting bit within the BLISS value, length in bits, whether or not it's to be sign-extended).  In C the programmer merely specifies the order of the fields within a struct and their data types.  The compiler actually lays out the positions of the fields.  In BLISS the programmer must explicitly position the fields and specify their lengths.
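
The three pieces of information a field carries (starting bit, length, sign extension) can be mimicked in C with an explicit descriptor. This is a hypothetical sketch for a 32-bit word, not BLISS syntax or semantics; the struct and function names are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical field descriptor, positioned explicitly by the programmer. */
struct field {
    unsigned pos;        /* starting bit within the word, 0 = least significant */
    unsigned size;       /* length in bits, 1..32 */
    bool     sign_extend;
};

/* Fetch a field from a 32-bit word according to its descriptor. */
static int32_t fetch_field(uint32_t word, struct field f)
{
    assert(f.size >= 1 && f.size <= 32 && f.pos + f.size <= 32);
    uint32_t mask = (f.size == 32) ? UINT32_MAX
                                   : ((UINT32_C(1) << f.size) - 1u);
    uint32_t v = (word >> f.pos) & mask;
    if (f.sign_extend) {
        /* Flip the field's sign bit, then subtract it back out; this
           propagates the sign bit into the upper bits of the word. */
        uint32_t sign = UINT32_C(1) << (f.size - 1);
        v = (v ^ sign) - sign;
    }
    /* Assumes the usual two's complement unsigned-to-signed conversion. */
    return (int32_t)v;
}
```

Because the descriptor, not the compiler, fixes where each field lives, the layout is the same under every compiler.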