On Mon, Sep 30, 2024 at 4:22 PM Rik Farrow <rik(a)rikfarrow.com> wrote:
> This is the 'problem' with C/C++: it's not the language itself so
> much as the people who are allowed, or forced, to use it.
Programmer ability is certainly an issue, but I would suggest that
another problem goes back to what Rob was alluding to: compiler
writers have taken too much advantage of UB, making it difficult to
write well-formed programs that last.
The `realloc` function I mentioned earlier is a good case in point;
the first ANSI C standard says this: "If ptr is a null pointer, the
realloc function behaves like the malloc function for the specified
size. ... If size is zero and ptr is not a null pointer, the object it
points to is freed." While the description of `malloc` doesn't say
anything about what happens when `size` is 0, perhaps making
`realloc(NULL, 0)` nominally UB, the behavior of `realloc(ptr, 0)` is
clearly well defined when `ptr` is not null, and it's entirely
possible that programs were written with that well-defined behavior as
an assumption. (Worth mentioning is that this language was changed in
C99, and implementations started diverging from there.)
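To make that concrete, here's a minimal sketch (my own illustration,
not code from any particular program) of the sort of resize helper the
C90 wording licensed, next to what one now has to write to stay
well-defined under C23:

#include <stdlib.h>

/*
 * Hypothetical helper written against the C90 wording, under which
 * realloc(p, 0) with a non-null p freed the old object.
 */
void *
resize(void *p, size_t n)
{
        return realloc(p, n);   /* fine per C90; UB per C23 when n == 0 */
}

/* A C23-safe version has to handle the zero-size case itself. */
void *
resize_c23(void *p, size_t n)
{
        if (n == 0) {
                free(p);
                return NULL;
        }
        return realloc(p, n);
}

The point isn't that the fix is hard; it's that code written in good
faith against the old wording now has to be found and changed.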
But now, C23 has made `realloc(ptr, 0)` UB, regardless of the value of
`ptr`, and since compiler writers have given themselves license to
take an extremely broad view of what they can do when a program
exhibits UB, programs that were well-defined with respect to C90 may
well stop working properly when compiled with modern compilers. I
don't think this is hypothetical; C programs that have appeared to
work as expected for years have broken, and will continue to break,
when compiled with a newer compiler, because the programmer tripped a
UB trigger somewhere along the way, likely without even recognizing
it. Moreover, I don't believe there are any non-trivial C programs out
there that don't have such timebombs lurking throughout. How could
they not, when things that were previously well-defined can become UB
in subsequent revisions of the standard?
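As one illustration of the kind of timebomb I mean (a sketch of a
well-known pattern, not code lifted from any particular project), an
overflow check that was idiomatic for decades can simply evaporate:

#include <limits.h>

/*
 * A once-common wrap-around check.  Because signed overflow is UB, a
 * compiler may assume x + 1 never wraps, conclude the test is always
 * false, and compile this as an unconditional "return x + 1".
 */
int
increment_checked(int x)
{
        if (x + 1 < x)          /* meant to catch x == INT_MAX */
                return INT_MAX;
        return x + 1;
}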
Perhaps I've mentioned it before, but a great example of the
surprising nature of UB is the following function:
unsigned short mul(unsigned short a, unsigned short b) { return a * b; }
Is this tiny function always well-defined? Sadly, no, at least not on
most common platforms where `int` is 32 bits and `short` is 16. On
such platforms, the "usual arithmetic conversions" will kick in before
the multiplication, and the values will be converted to _signed_ ints
and _then_ multiplied; the product will then be converted back to
`unsigned short`. And while the conversions in both directions are
well-defined, there exist values a and b of type unsigned short such
that a*b overflows a signed 32-bit int (consider 0xffff * 0xffff,
which is 0xfffe0001, well above INT_MAX), and signed integer overflow
is UB; a compiler would be well within its rights to assume that such
overflow can never occur and to generate, say, a saturating
multiplication instruction if it so chose. That would work, be
perfectly legal, and almost certainly surprise the programmer.
The fix is simple, of course:
unsigned short
mul(unsigned short a, unsigned short b)
{
        /* Widen to unsigned int so the operands are not promoted to
         * signed int; unsigned arithmetic cannot overflow into UB. */
        unsigned int aa = a, bb = b;
        return aa * bb; /* conversion back to unsigned short is well-defined */
}
But one would have to know to write such a thing in the first place.
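For concreteness, a tiny driver (mine, just to show the arithmetic; it
assumes the fixed `mul` above is in the same file) illustrates what
the well-defined version must produce: 0xffff * 0xffff is 0xfffe0001,
and the truncating conversion back to unsigned short keeps only the
low 16 bits:

#include <stdio.h>

int
main(void)
{
        /* 0xffff * 0xffff == 0xfffe0001; reduced mod 65536 that's 1. */
        unsigned short r = mul(0xffff, 0xffff);
        printf("%#x\n", r);     /* prints 0x1 */
        return 0;
}

With the original, promoted-to-int version, there is no such guarantee
at all.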
> Many, if not all, of the people on this list have worked with great
> programmers, when most programmers are average at best. I saw some
> terrible things back when doing technical sales support for a startup
> selling a graphics library with C bindings. I came away convinced
> that most of the 'programmers' I was training were truly clueless.
My sense is that tossing in bad programmers is just throwing gasoline
onto a dumpster fire, particularly when they look to charlatans like
Robert Martin or Allen Holub for education and inspiration instead of
seeking out proper sources.
- Dan C.