Just responding to random things that I noticed:
You don't need special syntax for tail calls. Elimination should be done
transparently whenever a call is the last thing that gets executed. Special
syntax will merely allow confused people to use it in the wrong place and
get even more confused.
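A minimal sketch of what I mean (a toy fact(), purely for illustration):
the recursive call is already in tail position, so an optimizing compiler
can reuse the frame today with no new syntax; the only thing missing is a
guarantee from the standard.

    #include <stdio.h>

    /* Tail-recursive factorial: the recursive call is the last thing
     * executed, so a compiler may turn it into a loop on its own.
     * Nothing in the standard requires that, which is the real gap;
     * new syntax wouldn't change where the call has to sit. */
    static unsigned long fact(unsigned long n, unsigned long acc)
    {
        if (n <= 1)
            return acc;
        return fact(n - 1, acc * n);    /* tail position */
    }

    int main(void)
    {
        printf("%lu\n", fact(10, 1));   /* 3628800 */
        return 0;
    }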
malloc(0) should return a unique pointer, so that after "T* a = malloc(0);
T* b = malloc(0);" we have "a != (T*)0 && a != b". Without this, malloc(0)
acts differently from malloc(n) for n > 0.
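Spelled out as a runnable check (whether it passes depends on your libc:
glibc, for one, hands back distinct pointers, while implementations that
return a null pointer for malloc(0) fail it):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Under the behavior I'm arguing for, malloc(0) acts like
         * malloc(n) for n > 0: a distinct, non-null pointer each time.
         * The standard currently allows either this or a null return. */
        void *a = malloc(0);
        void *b = malloc(0);

        printf("a=%p b=%p distinct-and-non-null=%s\n", a, b,
               (a != NULL && b != NULL && a != b) ? "yes" : "no");

        free(a);
        free(b);
        return 0;
    }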
Note that, except for arrays, function arguments and results are copied, so
copying a struct makes perfect sense. Passing arrays by reference may have
been due to residual Fortran influence! [Just guessing] Also note that this
one exception has been the cause of many problems.
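To make the exception concrete (toy types, just for illustration):

    #include <stdio.h>

    struct point { int x, y; };

    /* The struct parameter is a copy; changes stay local. */
    static void bump_struct(struct point p) { p.x = 99; }

    /* The "array" parameter decays to a pointer to the caller's first
     * element, so the change is visible outside -- the one exception. */
    static void bump_array(int a[2]) { a[0] = 99; }

    int main(void)
    {
        struct point pt = { 1, 2 };
        int arr[2] = { 1, 2 };

        bump_struct(pt);
        bump_array(arr);

        printf("pt.x=%d (unchanged)  arr[0]=%d (changed)\n", pt.x, arr[0]);
        return 0;
    }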
In any case, you have not argued convincingly why dynamic memory
allocation should be in the language (runtime) :-) And adding it wouldn't
have fixed any of the existing problems with the language.
Bakul
On Sep 28, 2024, at 9:58 AM, G. Branden Robinson wrote:
At 2024-09-28T09:34:14-0400, Douglas McIlroy wrote:
C's
refusal to specify dynamic memory allocation in the language
runtime (as opposed to, eventually, the standard library)
This complaint overlooks one tenet of C: every operation in what you
call "language runtime" takes O(1) time. Dynamic memory allocation
is not such an operation.
A fair point. Let me argue about it anyway. ;-)
I'd make three observations. First, K&R did _not_ tout this in their
book presenting ANSI C. I went back and checked the prefaces,
introduction, and the section presenting a simple malloc()/free()
implementation. The tenet you claim for the language is not explicitly
articulated and, if I squint really hard, I can only barely perceive
(imagine?) it deeply between the lines in some of the prefatory material
to which K&R mostly confine their efforts to promote the language. In
my view, a "tenet" is something more overt: the sort of thing U.S.
politicians try to get hung on the walls of every public school
classroom, like Henry Spencer's Ten Commandments of C[1] (which itself
does not mention this "core language has only O(1) features" principle).
Second, in reviewing K&R I was reminded that structure copying is part
of the language. ("There are no operations that manipulate an entire
array or string, although structures may be copied as a unit."[2])
Doesn't that break the tenet right there?
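To put a number on it (a contrived struct, purely for illustration): the
single "=" below copies every byte of the embedded array, and a compiler
will typically emit a block transfer or a memcpy() call for it.

    #include <stdio.h>
    #include <string.h>

    struct buf {
        char data[4096];    /* copied along with the rest of the struct */
        size_t len;
    };

    int main(void)
    {
        struct buf a, b;

        memset(&a, 0, sizeof a);
        strcpy(a.data, "hello");
        a.len = 5;

        b = a;    /* one core-language operator, O(sizeof(struct buf)) work */

        printf("%s (%zu bytes per assignment)\n", b.data, sizeof a);
        return 0;
    }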
Third, and following on from the foregoing, your point reminded me of my
youth programming non-pipelined machines with no caches. You could set
your watch by (just about) any instruction in the set--and often did,
because we penurious microcomputer guys often lacked hardware real-time
clocks, too. That is to say, for a while, every instruction had an
exactly predictable and constant cycle count. (The _value_ of that
constant might depend on the addressing mode, because that would have
consequences on memory fetches, but the principle stood.) When the Z80
extended the 8080's instruction set, they ate from the Tree of Knowledge
with block-copy instructions like LDIR and LDDR, and all of a sudden you
had instructions with O(n) cycle counts. But as a rule, programmers
seemed to welcome this instead of recognizing it as knowing sin, because
you generally knew worst-case how many bytes you'd be copying and could take
that into account. (The worst worst case was a mere 64kB!)
Further, Z80 home computers in low-end configurations (that is, no disk
drives) often did a shocking thing: they ran with all interrupts masked.
All the time. The one non-maskable interrupt was RESET, after which you
weren't going to be resuming execution of your program anyway. Not from
the same instruction, at least. As I recall the TRS-80 Model I/III/4
didn't even have logic on the motherboard to decode the Z80's "interrupt
mode 2", which was vectored, I think. Even in the "high-end"
configurations of these tiny machines, you got a whopping ONE interrupt
to play with ("IM 1").
Later, when the Hitachi 6309 smuggled similar block-transfer decadence
into its extensions to the Motorola 6809 (to the excitement of us
semi-consciously Unix-adjacent OS-9 users), they faced a starker problem,
because the 6809 didn't wall off interrupts in the same way the 8080 and
Z80 did. They therefore presented the programmer with the novelty of the
restartable instruction, and a new generation of programmers became
acquainted with the hard lessons time-sharing minicomputer people were
familiar with.
My point in this digression is that, in my opinion, it's tough to hold
fast to the O(1) tenet you claim for C's core language and to another at
the same time: the language's identity as a "portable assembly
language". Unless every programmer has control over the compiler--and
they don't--you can't predict when the compiler will emit an O(n) block
transfer instruction. You'll just have to look at the disassembly.
_Maybe_ you can retain purity by...never copying structs. I don't think
lint or any other tool ever checked for this. Your advocacy of this
tenet is the first time I've heard it presented.
If you were to suggest to me that most of the time I've spent in my life
arguing with C advocates was with rotten exemplars of the species and
therefore was time wasted, I would concede the point.
There's just so danged _many_ of them...
Your hobbyhorse awakened one of mine.
malloc was in v7, before the C standard was written. The standard
spinelessly buckled to allow malloc(0) to return 0, as some
implementations gratuitously did.
What was the alternative? There was no such thing as an exception, and
if a pointer was an int and an int was as wide as a machine address,
there'd be no way to indicate failure in-band, either.
If the choice was that or another instance of atoi()'s wincingly awful
"does this 0 represent an error or successful conversion of a zero
input?" land mine, ANSI might have made the right choice.
I can't imagine that any program ever
actually wanted the feature. Now
it's one more undefined behavior that lurks in thousands of programs.
Hoare admitted to only one billion-dollar mistake. No one dares count
how many to write in C's ledger. This was predicted, wasn't it?
Everyone loved C because it was fast: it was performant, because it
never met a runtime check it didn't eschew--recall again Kernighan
punking Pascal on this exact point--and it was quick for the programmer
to write because it never met a _compile_-time check it didn't eschew.
C was born as a language for wizards who never made mistakes.
The problem is that, like James Madison's fictive government of angels,
such entities don't exist. The staff of the CSRC itself may have been
overwhelmingly populated with frank, modest, and self-deprecating
people--and I'll emphasize here that I'm aware of no accounts that this
is anything but true--but C unfortunately played a part in stoking a
culture of pretension among software developers. "C is a language in
which wizards program. I program in C. Therefore I'm a wizard." is how
the syllogism (spot the fallacy) went. I don't know who does more
damage--the people who believe their own BS, or the ones who know
they're scamming their colleagues.
There are two arguments for malloc(0). Most
importantly, it caters for
a limiting case for aggregates generated at runtime--an instance of
Kernighan's Law, "Do nothing gracefully". It also provides a way to
create a distinctive pointer to impart some meta-information, e.g.
"TBD" or "end of subgroup", distinct from the null pointer, which
merely denotes absence.
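The first argument I can follow; I take it to mean the usual degenerate-input
case, something like this (a made-up read_samples(), purely for
illustration), where n may legitimately be zero and a usable pointer from
malloc(0) spares the caller a special case:

    #include <stdio.h>
    #include <stdlib.h>

    /* Read n doubles into a freshly allocated array. If malloc(0)
     * returns a usable (freeable) pointer, n == 0 needs no special
     * case; if it returns NULL, "empty" and "out of memory" collide. */
    static double *read_samples(size_t n)
    {
        double *v = malloc(n * sizeof *v);
        if (v == NULL && n > 0)
            return NULL;                    /* real allocation failure */
        for (size_t i = 0; i < n; i++)
            if (scanf("%lf", &v[i]) != 1) {
                free(v);
                return NULL;
            }
        return v;
    }

    int main(void)
    {
        double *empty = read_samples(0);    /* the limiting case */
        printf("empty group: %p\n", (void *)empty);
        free(empty);                        /* free() copes either way */
        return 0;
    }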
I think I might be confused now. I've frequently seen arrays of structs
initialized from lists of literals ending in a series of "NULL"
structure members, in code that antedates or ignores C99's wonderful
feature of designated initializers for aggregate types.[3] How does
malloc(0) get this job done and what benefit does it bring?
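(For reference, the idiom I mean is the sentinel-terminated table, here with
C99 designated initializers and invented command names just for show:)

    #include <stdio.h>

    struct cmd {
        const char *name;
        int (*handler)(void);
    };

    static int do_help(void) { return 0; }
    static int do_quit(void) { return 1; }

    /* Sentinel-terminated table: the all-zero entry marks the end,
     * and C99 designated initializers name the fields explicitly. */
    static const struct cmd commands[] = {
        { .name = "help", .handler = do_help },
        { .name = "quit", .handler = do_quit },
        { 0 }               /* the "NULL structure member" terminator */
    };

    int main(void)
    {
        for (const struct cmd *c = commands; c->name != NULL; c++)
            printf("command: %s\n", c->name);
        return 0;
    }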
Last time I posted to TUHS I mentioned a proposal for explicit tail-call
elimination in C. I got the syntax wrong. The proposal was "return
goto;". The WG14 document number is N2920 and it's by Alex Gilding.
Good stuff.[4] I hope we see it in C2y.
Predictably, I must confess that I didn't make much headway on
Schiller's 1975 "secure kernel" paper. Maybe next time.
Regards,
Branden
[1] https://web.cs.dal.ca/~jamie/UWO/C/the10fromHenryS.html
I can easily imagine that the tenet held at _some_ point in C's
history. It's _got_ to be the reason that the language relegates
memset() and memcpy() to the standard library (or to the programmer's
own devising)! :-O
[2] Kernighan & Ritchie, _The C Programming Language_, 2nd edition, p. 2
Having thus admitted the camel's nose to the tent, K&R would have
done the world a major service by making memset(), or at least
bzero(), a language feature, the latter perhaps by having "= 0"
validly apply to an lvalue of non-primitive type. Okay,
_potentially_ a major service. You'd still need the self-regarding
wizard programmers to bother coding it, which they wouldn't in many
cases "because speed". Move fast, break stuff.
C++ screwed this up too, and stubbornly stuck by it for a long time.
https://cplusplus.github.io/CWG/issues/178.html
[3] https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2920.pdf