At Wed, 27 May 2020 18:11:33 +0200, "Thomas Paulsen"
<thomas.paulsen(a)firemail.de> wrote:
Subject: Re: [TUHS] History of popularity of C
> When I'm doing C I always have the CPU and its instructions in mind.
And that's exactly what might trip you up unless you _exactly_
understand how the language standard defines the operations of the
abstract virtual machine (right down to the implications of every
sequence point in the code); how compilers and optimizers do and (more
importantly) do not work when mapping the abstract virtual machine
operations into real-world machine instructions; and how _all_
instances of "undefined behaviour" can arise, and exactly what the
optimizer is allowed to do when and if it spots UB conditions in the
code.
A big part of the problem is that the C Standard mandates compilation
will and must succeed (and allows this success to be totally silent too)
even if the code contains instances of undefined behaviour. This means
that the successful execution of the generated code may depend on what
optimization level was chosen. Code that does security tests on input
values might be entirely and silently eliminated by the optimizer
because of some innocuous-seeming UB instance, and this is exactly what
has happened in the Linux kernel, for example (probably more than once).
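
A minimal sketch of how a security test can vanish, assuming a
hypothetical length check (HEADER_LEN, BUF_MAX and both function names
are invented for illustration; this is not the kernel code referred to
above). The naive version relies on the CPU wrapping on overflow, but
signed overflow is UB in the abstract machine, so an optimizer may
assume it never happens and drop the test:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define HEADER_LEN 100   /* hypothetical header size */
#define BUF_MAX    4096  /* hypothetical buffer size */

/* Written with the CPU in mind: "if the add wraps, reject".
 * Signed overflow is UB, so the compiler may treat
 * (len + HEADER_LEN < len) as always false and remove the branch. */
bool fits_naive(int len)
{
    if (len + HEADER_LEN < len)   /* UB when len > INT_MAX - HEADER_LEN */
        return false;
    return len + HEADER_LEN <= BUF_MAX;
}

/* Written against the abstract machine: test before adding,
 * so no overflow can ever occur. */
bool fits_checked(int len)
{
    if (len < 0 || len > INT_MAX - HEADER_LEN)
        return false;
    return len + HEADER_LEN <= BUF_MAX;
}

/* Usage: exercises only the well-defined variant. */
void fits_checked_examples(void)
{
    assert(fits_checked(10));
    assert(!fits_checked(INT_MAX));
    assert(!fits_checked(-1));
}
```

Whether fits_naive() keeps its test can depend on the compiler and the
optimization level, which is the point made above.
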
UB can be introduced quite innocently just by moving sequence points in
variable references in ways that are not necessarily obvious even to
seasoned programmers (and indeed "seasoned" programmers are often the
ones whose old-fashioned coding habits might lead to the introduction of
serious problems in such a way).
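
As a small illustration of the sequence point issue (the function name
copy_ints is made up for this sketch), compare a compact old-style copy
loop, shown only in the comment, with a form that keeps the read and
the increment in separate full expressions:

```c
#include <assert.h>
#include <stddef.h>

/* The compact form
 *
 *     while (i < n)
 *         dst[i] = src[i++];
 *
 * both reads i (in dst[i]) and modifies i (in i++) with no sequence
 * point in between, which is undefined behaviour. */
void copy_ints(int *dst, const int *src, size_t n)
{
    size_t i = 0;

    while (i < n) {
        dst[i] = src[i];   /* end of full expression: sequence point */
        i++;               /* i is modified only after that point */
    }
}

/* Usage example for the well-defined form. */
void copy_ints_example(void)
{
    const int src[3] = { 1, 2, 3 };
    int dst[3] = { 0, 0, 0 };

    copy_ints(dst, src, 3);
    assert(dst[0] == 1 && dst[1] == 2 && dst[2] == 3);
}
```

The two loops look equivalent when thinking in machine instructions;
only the second has a meaning the language actually defines.
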
I've found dozens of instances of UB in mature and well tested code, and
sometimes only by luck of having chosen the "right" compiler and enabled
its feature of introducing illegal instructions in places where UB might
occur, _and_ having had the luck to test in such a way as to encounter
the specific code path where this UB occurred.
I would claim it's truly safer now to write C without understanding the
underlying mechanics of the CPU and memory, but rather by just paying
very close attention to the detailed semantics of the language,
understanding only the abstract virtual C machine, and hoping your
compiler will at least warn if anything even remotely suspicious is done
in your code; and lastly (but perhaps most importantly) avoiding like
the plague any coding constructs which might make UB harder to spot
(e.g. never ever initialize local variables in their definitions when
pointers are involved).
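
One reading of that last piece of advice, sketched with a hypothetical
struct conn and made-up function names: an initializer in a definition
can hide a pointer dereference ahead of the NULL check, and once the
dereference has happened the optimizer is allowed to assume the pointer
was not NULL.

```c
#include <assert.h>
#include <stddef.h>

struct conn { int fd; };   /* hypothetical type for illustration */

/* Risky: the dereference hides in the definition, before the check.
 * If c is NULL this is already UB, so the compiler may conclude that
 * c cannot be NULL and delete the test that follows. */
int conn_fd_risky(struct conn *c)
{
    int fd = c->fd;

    if (c == NULL)
        return -1;
    return fd;
}

/* Safer: define first, check, and only then dereference. */
int conn_fd_safe(struct conn *c)
{
    int fd;

    if (c == NULL)
        return -1;
    fd = c->fd;
    return fd;
}

/* Usage: only the safe variant is ever handed a NULL pointer. */
void conn_fd_examples(void)
{
    struct conn c = { 7 };

    assert(conn_fd_safe(&c) == 7);
    assert(conn_fd_safe(NULL) == -1);
}
```

Keeping the assignment as a separate statement also makes the ordering
of check and dereference visible at a glance when reviewing the code.
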
Unfortunately the new "most advanced" C compilers also make it quite a
bit more difficult for those of us writing C code that must have
specific actions on the bare metal hardware, e.g. in embedded systems,
kernels, hardware drivers, etc.; including especially where UB detection
tools are far more difficult to use.
--
Greg A. Woods <gwoods(a)acm.org>
Kelowna, BC +1 250 762-7675 RoboHack <woods(a)robohack.ca>
Planix, Inc. <woods(a)planix.com> Avoncote Farms <woods(a)avoncote.ca>