A bit of a story.
At SiCortex, we were well aware that structure layout and bitfields were compiler
dependent, yet we had real requirements that high (?) level languages be able to directly
interact with hardware defined structures.
So there was a chip spec that was the master truth about register layouts and hardware
structures. Wilson Snyder (of Verilator fame) wrote a postprocessor that swallowed the
chip spec document and spit out header files in various languages, one of which was C
Preprocessor compatible #defines for field offset, position, and width.
We programmers were careful to always use GET and SET macros to access objects with
hardware specified layouts.
IIRC there were additional macros to make sure that hardware accesses always used the
proper machine instructions and fences.
I don’t think the use of these macros cost much or any performance because they mostly
resolved into shifts and masks with constants.
It’s too bad that C doesn’t have standardized rules for structures, because even when
hardware isn’t involved, one must be extremely careful with cacheline alignment of fields
in order to get good performance. __attribute__((aligned(64))) for the win, but it is
ugly. I don’t think there is even a portable way to get the cacheline size.
Then there is compiler register promotion, and volatile.
-L
On Mar 3, 2025, at 11:59, Paul Winalski
<paul.winalski(a)gmail.com> wrote:
On Mon, Mar 3, 2025 at 6:51 AM Peter Yardley <peter.martin.yardley(a)gmail.com> wrote:
The problem I have with C as a systems language is that
there is no certainty about the representation of a structure in memory, and hence when it is
written to disk.
I believe you are correct. Alignment and padding within structs in C are
implementation-dependent.
I will be happy to be corrected but I remember
this behaviour to be compiler dependent. Other languages such as Bliss and, perhaps to a
lesser degree, Pascal had implicit control of this.
In the case of BLISS, I would call it explicit control. BLISS is a peculiar
language in that it has only one data type, the BLISS value which is a contiguous set of
bits of fixed length, typically the size of the target machine word. There are four
dialects of BLISS that differed mainly in the size of the BLISS value: 16 for BLISS-16
(PDP-11), 32 for BLISS-32 (VAX, x86), 36 for BLISS-36 (PDP-10), and 64 for BLISS-64
(Alpha, Itanium, x86-64).
Data typing in BLISS is a function of the operations performed on the BLISS value rather
than a property of the value itself. BLISS is also unusual in that data are retrieved
from memory or registers using an explicit fetch operator. There is no distinction
between lvalues and rvalues. BLISS is also an expression language rather than a statement
language. Everything, including procedural code, is an expression with a BLISS value and
can be used as such. For example:
a = (if .b then .c else .d)
This code fetches the BLISS value whose address is b ('.' is the fetch
operator). The IF expression treats this fetched value as a logical true/false. If the
value is true, the value whose address is c is fetched, otherwise the value whose address
is d is fetched. The value of the IF expression is the fetched value. The = expression then
stores that value at the address represented by a. The stored value is also the value of
the = expression itself. The semicolon (;) is an expression separator with the semantics
"discard the current expression value". It is common practice to write
procedural code with operations separated by semicolons, which gives you a syntax similar
to statement-oriented languages. But you don't have to code that way.
BLISS does have data aggregates, both vectors (one-dimensional arrays of BLISS values)
and C-style structs. The latter allow one to attach identifier names to pieces of the
aggregate, supplying the same information as with the fetch operator (starting bit within
the BLISS value, length in bits, whether or not it's to be sign-extended). In C the
programmer merely specifies the order of the fields within a struct and their data types.
The compiler actually lays out the positions of the fields. In BLISS the programmer must
explicitly position the fields and specify their lengths.