Here is some code from typo.
int table[2]; /*keep these four cards in order*/
int tab1[26];
int tab2[730];
char tab3[19684];
...
er = read(salt,table,21200);
Note the use of the word 'card'.
The past is a different country.
-rob
On Sat, Sep 21, 2024 at 7:07 AM Warner Losh <imp(a)bsdimp.com> wrote:
On Fri, Sep 20, 2024 at 9:16 PM Bakul Shah via TUHS <tuhs(a)tuhs.org> wrote:
You are a bit late with your screed. You will find posts
with similar sentiments starting back in the 1980s in Usenet
groups such as comp.lang.{c,misc,pascal}.
Perhaps a more interesting (but likely pointless) question
is what is the *least* that can be done to fix C's major
problems.
Compilers can easily add bounds checking for the array[index]
construct but ptr[index] can not be checked, unless we make
a ptr a heavy weight object such as (address, start, limit).
One can see how code can be generated for code such as this:
Foo x[count];
Foo* p = x + n; // or &x[n]
Code such as "Foo *p = malloc(size);" would require the
compiler to know how malloc behaves to be able to compute
the limit. But for a user to write a similar function will
require some language extension.
[Of course, if we did that, adding proper support for
multidimensional slices would be far easier. But that
is an exploration for another day!]
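A minimal sketch of the (address, start, limit) idea described above, written as ordinary C rather than as a compiler feature. The names fat_ptr, fat_from_array and fat_load are hypothetical, and the element type is fixed to int only to keep the example short:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "heavy weight" pointer: the current address plus the
 * bounds of the object it was derived from. */
typedef struct {
    int *addr;   /* where the pointer currently points */
    int *start;  /* first element of the underlying array */
    int *limit;  /* one past the last element */
} fat_ptr;

/* Equivalent of `int *p = x + n;` for an array x of `count` ints.
 * Assumes n <= count, so that base + n is itself a valid pointer. */
static fat_ptr fat_from_array(int *base, size_t count, size_t n)
{
    fat_ptr p = { base + n, base, base + count };
    return p;
}

/* Checked version of `*out = p[index]`. Returns false instead of
 * reading outside [start, limit). Negative indexes are allowed as
 * long as they stay inside the underlying array. */
static bool fat_load(fat_ptr p, ptrdiff_t index, int *out)
{
    ptrdiff_t lowest  = p.start - p.addr;  /* <= 0 */
    ptrdiff_t highest = p.limit - p.addr;  /* exclusive upper bound */

    if (index < lowest || index >= highest)
        return false;
    *out = p.addr[index];
    return true;
}
```

With `int x[4]` and `p = fat_from_array(x, 4, 2)`, the indexes -2 through 1 are accepted and anything else is refused. A malloc-like function would have to supply start and limit itself, which is where the language extension mentioned above comes in.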
The CHERI architecture extensions do this. They push this info into
hardware, where all pointers point to a region (gross simplification) and
also grant you rights to that area (including read/write/execute). It's
really cool, but it does come at a cost in performance. Each pointer is a
pointer plus a capability that's basically a cryptographically signed bit
of data holding the bounds and access permissions associated with the
pointer. There are more details on their web site:
https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/
CHERI-BSD is a FreeBSD variant that runs on both CHERI variants (aarch64
and riscv64) and where most of the research has been done. There's also a
Linux variant.
Members of this project know way too many of the corner cases of the C
language from porting most popular software to the CHERI... And have gone
on screeds of their own. The only one I can easily find is
https://people.freebsd.org/~brooks/talks/asiabsdcon2017-helloworld/hellowor…
Warner
> Converting enums to behave like Pascal scalars would
> likely break things. The question is, can such breakage
> be fixed automatically (by source code conversion)?
>
> C's union type is used in two different ways: 1: similar
> to a sum type, which can be done type safely and 2: to
> cheat. The compiler should produce a warning when it can't
> verify a typesafe use -- one can add "unsafe" or some such
> to let the user absolve the compiler of such check.
>
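The two uses of a union mentioned above can be sketched side by side. The type and function names here are hypothetical, and float_bits assumes that float is a 32-bit IEEE 754 value:

```c
#include <stdint.h>

/* Use 1: a hand-rolled sum type. The tag records which member is live,
 * and every access goes through a switch on that tag. */
typedef enum { SHAPE_CIRCLE, SHAPE_RECT } shape_kind;

typedef struct {
    shape_kind kind;
    union {
        struct { double radius; } circle;
        struct { double w, h; } rect;
    } u;
} shape;

static double shape_area(const shape *s)
{
    switch (s->kind) {
    case SHAPE_CIRCLE:
        return 3.141592653589793 * s->u.circle.radius * s->u.circle.radius;
    case SHAPE_RECT:
        return s->u.rect.w * s->u.rect.h;
    }
    return -1.0;  /* only reached if the tag is corrupt */
}

/* Use 2: "cheating". Store one member, read another, and so
 * reinterpret the same bytes as a different type.
 * Assumes float is 32 bits wide. */
static uint32_t float_bits(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}
```

Nothing in the language ties `kind` to the member being read, so the first use is type safe only by convention. A compiler that wanted to warn would need to see that each member access is guarded by the matching tag test.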
> [May be naively] I tend to think one can evolve C this way
> and fix a lot of code &/or make a lot of bugs more explicit.
>
> > On Sep 20, 2024, at 10:11 AM, G. Branden Robinson <
> g.branden.robinson(a)gmail.com> wrote:
> >
> > At 2024-09-21T01:07:11+1000, Dave Horsfall wrote:
> >> Unless I'm mistaken (quite possible at my age), the OP was referring
> >> to that in C, pointers and arrays are pretty much the same thing i.e.
> >> "foo[-2]" means "take the pointer 'foo' and go back two things"
> >> (whatever a "thing" is).
> >
> > "in C, pointers and arrays are pretty much the same thing" is a common
> > utterance but misleading, and in my opinion, better replaced with a
> > different one.
> >
> > We should instead say something more like:
> >
> > In C, pointers and arrays have compatible dereference syntaxes.
> >
> > They do _not_ have compatible _declaration_ syntaxes.
> >
> > Chapter 4 of van der Linden's _Expert C Programming: Deep C Secrets_
> > (1994) tackles this issue head-on and at length.
> >
> > Here's the salient point.
> >
> > "Consider the case of an external declaration `extern char *p;` but a
> > definition of `char p[10];`. When we retrieve the contents of `p[i]`
> > using the extern, we get characters, but we treat it as a pointer.
> > Interpreting ASCII characters as an address is garbage, and if you're
> > lucky the program will coredump at that point. If you're not lucky it
> > will corrupt something in your address space, causing a mysterious
> > failure at some point later in the program."
> >
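The mismatch in the quotation needs two source files to reproduce, so the sketch below stays in one file and only shows why the two declarations are not interchangeable. The names arr, ptr and bytes_as_address are made up for the illustration:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static char arr[10] = "ABCDEFGHI";  /* ten bytes of character storage */
static char *ptr = arr;             /* one object holding an address */

/* Same dereference syntax, different objects underneath. */
static size_t array_size(void)   { return sizeof arr; }  /* 10 */
static size_t pointer_size(void) { return sizeof ptr; }  /* sizeof(char *) */

/* Roughly what code compiled against a mismatched `extern char *p;`
 * would load: the first bytes of the array, taken as an address.
 * The value is only computed here, never dereferenced. */
static uintptr_t bytes_as_address(void)
{
    uintptr_t bogus = 0;
    size_t n = sizeof bogus < sizeof arr ? sizeof bogus : sizeof arr;

    memcpy(&bogus, arr, n);
    return bogus;
}
```

`arr[2]` and `ptr[2]` both yield 'C', which is the compatible dereference syntax. The value returned by bytes_as_address() is the ASCII text "ABCD..." read as a number, which is the garbage address the quotation warns about.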
> >> C is just a high level assembly language;
> >
> > I disagree with this common claim too. Assembly languages correspond to
> > well-defined machine models.[1] Those machine models have memory
> > models. C has no memory model--deliberately, because that would have
> > gotten in the way of performance. (In practice, C's machine model was
> > and remains the PDP-11,[2] with aspects thereof progressively sanded off
> > over the years in repeated efforts to salvage the language's reputation
> > for portability.)
> >
> >> there is no such object as a "string" for example: it's just an "array
> >> of char" with the last element being "\0" (viz: "strlen" vs. "sizeof").
> >
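For reference, the "strlen" vs. "sizeof" distinction on a small array. The name `word` is arbitrary:

```c
#include <stddef.h>
#include <string.h>

/* An array of four chars: 'a', 'b', 'c', '\0'. */
static const char word[] = "abc";

/* strlen() scans for the terminator and does not count it. */
static size_t word_length(void)  { return strlen(word); }

/* sizeof reports the storage of the array, terminator included. */
static size_t word_storage(void) { return sizeof word; }
```

word_length() is 3 and word_storage() is 4. The agreement lasts only while `word` is an array. Once it has decayed to a `const char *`, sizeof reports the size of the pointer instead.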
> > Yeah, it turns out we need a well-defined string type much more
> > powerfully than, it seems, anyone at the Bell Labs CSRC appreciated.
> > string.h was tacked on (by Nils-Peter Nelson, as I understand it) at the
> > end of the 1970s and C aficionados have defended the language's
> > purported perfection with such vigor that they annexed the haphazardly
> > assembled standard library into the territory that they defend with much
> > rhetorical violence and overstatement. From useless or redundant return
> > values to const-carelessness to Schlemiel the Painter algorithms in
> > implementations, it seems we've collectively made every mistake that
> > could be made with Nelson's original, minimal API, and taught those
> > mistakes as best practices in tutorials and classrooms. A sorry affair.
> >
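The "Schlemiel the Painter" pattern is what repeated strcat() calls produce: each call walks the destination from its first byte to find the end, so building a string from n pieces costs on the order of n squared. One possible way around it, sketched below, is to carry the end position along. The helper `append` is hypothetical and not part of string.h:

```c
#include <stddef.h>
#include <string.h>

/* Append src to the string in dst[0..cap), where *end points at the
 * current terminating '\0'. On success, advance *end to the new
 * terminator and return 0. If the result would not fit, leave the
 * buffer untouched and return -1. */
static int append(char *dst, size_t cap, char **end, const char *src)
{
    size_t used = (size_t)(*end - dst);
    size_t n = strlen(src);

    if (n >= cap - used)       /* need room for n bytes plus '\0' */
        return -1;
    memcpy(*end, src, n + 1);  /* copies the terminator too */
    *end += n;
    return 0;
}
```

A caller starts with `buf[0] = '\0'` and `end = buf`, then calls append() once per piece. No call rescans what earlier calls wrote, and the return value says whether the piece fit.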
> > So deep was this disdain for the string as a well-defined data type, and
> > moreover one conceptually distinct from an array (or vector) of integral
> > types that Stroustrup initially repeated the mistake in C++. People can
> > easily roll their own, he seemed to have thought. Eventually he thought
> > again, but C++ took so long to get standardized that by then, damage was
> > done.
> >
> > "A string is just an array of `char`s, and a `char` is just a
> > byte"--another hasty equivalence that surrendered a priceless hostage to
> > fortune. This is the sort of fallacy indulged by people excessively
> > wedded to machine language programming and who apply its perspective to
> > every problem statement uncritically.
> >
> > Again and again, with signed vs. unsigned bytes, "wide" vs. "narrow"
> > characters, and "base" vs. "combining" characters, the champions of the
> > "portable assembly" paradigm charged like Lord Cardigan into the pike
> > and musket lines of the character type as one might envision it in a
> > machine register. (This insistence on visualizing register-level
> > representations has prompted numerous other stupidities, like the use of
> > an integral zero at the _language level_ to represent empty, null, or
> > false literals for as many different data types as possible. "If it
> > ends up as a zero in a register," the thinking appears to have gone, "it
> > should look like a zero in the source code." Generations of code--and
> > language--cowboys have screwed us all over repeatedly with this hasty
> > equivalence.)
> >
> > Type theorists have known better for decades. But type theory is (1)
> > hard (it certainly is, to cowboys) and (2) has never enjoyed a trendy
> > day in the sun (for which we may be grateful), which means that it is
> > seldom on the path one anticipates to a comfortable retirement from a
> > Silicon Valley tech company (or several) on a private yacht.
> >
> > Why do I rant so splenetically about these issues? Because the result
> > of such confusion is _bugs in programs_. You want something concrete?
> > There it is. Data types protect you from screwing up. And the better
> > your data types are, the more care you give to specifying what sorts of
> > objects your program manipulates, the more thought you give to the
> > invariants that must be maintained for your program to remain in a
> > well-defined state, the fewer bugs you will have.
> >
> > But, nah, better to slap together a prototype, ship it, talk it up to
> > the moon as your latest triumph while interviewing with a rival of the
> > company you just delivered that prototype to, and look on in amusement
> > when your brilliant achievement either proves disastrous in deployment
> > or soaks up the waking hours of an entire team of your former colleagues
> > cleaning up the steaming pile you voided from your rock star bowels.
> >
> > We've paid a heavy price for C's slow and seemingly deeply grudging
> > embrace of the type concept. (The lack of controlled scope for
> > enumeration constants is one example; the horrifyingly ill-conceived
> > choice of "typedef" as a keyword indicating _type aliasing_ is another.)
> > Kernighan did not help by trashing Pascal so hard in about 1980. He was
> > dead right that Pascal needed, essentially, polymorphic subprograms in
> > array types. Wirth not speccing the language to accommodate that back
> > in 1973 or so was a sad mistake. But Pascal got a lot of other stuff
> > right--stuff that the partisanship of C advocates refused to countenance
> > such that they ended up celebrating C's flaws as features. No amount of
> > Jonestown tea could quench their thirst. I suspect the truth was more
> > that they didn't want to bother having to learn any other languages.
> > (Or if they did, not any language that anyone else on their team at work
> > had any facility with.) A rock star plays only one instrument, no?
> > People didn't like it when Eddie Van Halen played keyboards instead of
> > guitar on stage, so he stopped doing that. The less your coworkers
> > understand your work, the more of a genius you must be.
> >
> > Now, where was I?
> >
> >> What's the length of "abc" vs. how many bytes are needed to store it?
> >
> > Even what is meant by "length" has several different correct answers!
> > Quantity of code points in the sequence? Number of "grapheme clusters"
> > a.k.a. "user-perceived characters" as Unicode puts it? Width as
> > represented on the output device? On an ASCII device these usually had
> > the same answer (control characters excepted). But even at the Bell
> > Labs CSRC in the 1970s, thanks to troff, the staff knew that they didn't
> > necessarily have to. (How wide is an em dash? How many bytes represent
> > it, in the formatting language and in the output language?)
> >
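Two of those answers can be compared directly for UTF-8 text. The sketch below assumes its input is valid UTF-8 and counts code points by skipping continuation bytes. It says nothing about grapheme clusters or display width, which need Unicode tables. The names utf8_code_points and sample are made up:

```c
#include <stddef.h>
#include <string.h>

/* Count code points in a NUL-terminated string, assuming valid UTF-8.
 * Continuation bytes have the bit pattern 10xxxxxx. Every other byte
 * starts a new code point. */
static size_t utf8_code_points(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* 'a', U+2014 EM DASH (three bytes in UTF-8), 'b'. The literal is
 * split so that 'b' is not read as part of the hex escape. */
static const char sample[] = "a\xE2\x80\x94" "b";
```

For `sample`, strlen() gives 5, sizeof gives 6 and utf8_code_points() gives 3. That makes three different "lengths" before the output device is even considered.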
> >> Giggle... In a device driver I wrote for V6, I used the expression
> >>
> >> "0123"[n]
> >>
> >> and the two programmers whom I thought were better than me had to ask
> >> me what it did...
> >>
> >> -- Dave, brought up on PDP-11 Unix[*]
> >
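For anyone else who has to ask: a string literal is an array of char, so it can be subscripted like any other array. In the sketch below the function names are made up, and the `& 3` mask is an addition that keeps the index inside the four digits:

```c
/* "0123"[i] is *("0123" + i): pick the i-th character of the literal. */
static char digit_char(unsigned n)
{
    return "0123"[n & 3];
}

/* Subscripting is commutative, so i["0123"] is the same expression. */
static char digit_char_swapped(unsigned n)
{
    return (n & 3)["0123"];
}
```

Both functions return '2' for n == 2. The second form is legal C but is best kept as a curiosity.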
> > I enjoy this application of that technique, courtesy of Alan Cox.
> >
> > fsck-fuzix: blow 90 bytes on a progress indicator
> >
> > static void progress(void)
> > {
> >     static uint8_t progct;
> >     progct++;
> >     progct&=3;
> >     printf("%c\010", "-\\|/"[progct]);
> >     fflush(stdout);
> > }
> >
> >> I still remember the days of BOS/PICK/etc, and I staked my career on
> >> Unix.
> >
> > Not a bad choice. Your exposure to and recollection of other ways of
> > doing things, I suspect, made you a more valuable contributor than those
> > who mazed themselves with thoughts of "the Unix way" to the point that
> > they never seriously considered any other.
> >
> > It's fine to prefer "the C way" or "the Unix way", if you can
> > intelligibly define what that means as applied to the issue in dispute,
> > and coherently defend it. Demonstrating an understanding of the
> > alternatives, and being able to credibly explain why they are inferior
> > approaches, is how to do advocacy correctly.
> >
> > But it is not the cowboy way. The rock star way.
> >
> > Regards,
> > Branden
> >
> > [1] Unfortunately I must concede that this claim is less true than it
> > used to be thanks to the relentless pursuit of trade-secret means of
> > optimizing hardware performance. Assembly languages now correspond,
> > particularly on x86, to a sort of macro language that imperfectly
> > masks a massive amount of microarchitectural state that the
> > implementors themselves don't completely understand, at least not in
> > time to get the product to market. Hence the field day of
> > speculative execution attacks and similar. It would not be fair to
> > say that CPUs of old had _no_ microarchitectural state--the Z80, for
> > example, had the not-completely-official `W` and `Z` registers--but
> > they did have much less of it, and correspondingly less attack
> > surface for screwing your programs. I do miss the days of
> > deterministic cycle counts for instruction execution. But I know
> > I'd be sad if all the caches on my workaday machine switched off.
> >
> > [2] https://queue.acm.org/detail.cfm?id=3212479
>
>