[TUHS] Re: yet another C discussion (YACD) and: Rust is not C++

31 Jan 2023

That example was a simplified bit of code from a widely used code base. All
I need to do is change the function g to a pointer to function, or have it
be provided by a .so, and all bets are off.
In any event, the important thing here is not that y should be initialized,
or should not; it's that it is not possible to get a consistent answer on
the question, from people who have been writing in C for decades.
ron
On Mon, Jan 30, 2023 at 6:56 PM Dan Cross <crossd(a)gmail.com> wrote:
...
 On Mon, Jan 30, 2023 at 8:49 PM Alejandro Colomar
 <alx.manpages(a)gmail.com> wrote:
  Hello Ron,
 On 1/30/23 20:35, ron minnich wrote:
 > I don't know how many ways there are to say this, but Rust and C/C++ are
 > fundamentally different at the lowest level.
 >
 > If you are just looking at Rust syntax in a superficial way, you might be
 > excused for thinking it's "C with features / C++ with differences."
 >
 > But that's not how it is. It's like saying C is "just like assembly" because
 > labels have a ':' in them; or that Unix is "just like RSX" because they have
 > vaguely similar commands.
 >
 > Here's a real question that came up where I work: should the code shown below be
 > accepted (this is abstracted from a real example that is in use ... everywhere)?
 > We had one code analyzer that said, emphatically, NO; one person said YES,
 > another MAYBE. One piece of code, 3 answers :-)
 >
 > char f() {
 >     char *y;
 >     g(&y);
 >     return *y;
 > }
 >
 > A specific question: should y be initialized to NULL?
 No.  At least not if you don't want to use the value NULL in your program.
 Using NULL as something to avoid Undefined Behavior is wrong, and it will
 contribute to hide programmer errors.
 Sorry, I think this misses the point: how do you meaningfully tell
 that `g` did something to `y` so that it's safe to indirect in the
 `return`?
 On the other hand, one could write,
 char f() {
     char *y = NULL;
     g(&y);
     if (y == NULL)
         panic("g failed");
     return *y;
 }
 C, of course, can't tell in the original. And while you can now tell
 that `g` did _something_ to `y`, you still really don't know that `y`
 points to something valid.
 These days, compilers and static analyzers are smart enough to detect
 uninitialized variables, even across Translation Units, and throw an error,
 letting the programmer fix such bugs, when they occur.
 In many cases, yes, but not in all. That would be equivalent to
 solving the halting problem.
 The practice of initializing always to NULL and 0 provides no value, and
 silences all of those warnings, thus creating silent bugs, that will bite some
 cold winter night.
 I know some static analyzers (e.g., clang-tidy(1)) do warn when you don't
 initialize variables and especially pointers (well, you need to enable the
 warning that does that, but it can warn).  That warning is there due to some
 coding style or certifications that require it.  I recommend disabling those
 bogus warnings, and forgetting about the bogus coding style or certification
 that requires you to write bogus code.
 Oh my.
 > The case to set y to NULL: otherwise it has an unknown value and it's unsafe.

 Is an undefined value less safe than an unexpected one?  I don't think so.  At
 least compilers can detect the former, but not the latter.
 > The case against setting y to NULL: it is pointless, as it slows the code down
 > slightly and g is going to change it anyway.
 Performance is a very minor thing.  But it's a nice side-effect that doing the
 right thing has performance advantages.  Readability is a good reason (and in
 fact, the compiler suffers that readability too, which is the cause of the
 silencing of the wanted warnings).
 > The case maybe: Why do you trust g() to always set it? Why don't you trust g()?
 > convince me.
 Well, it depends on the contract of g().  If the contract is that it may not
 initialize the variable, then sure, initialize it yourself, or even better,
 check for g()'s errors, and react when it fails and doesn't initialize it.

 If the contract is that it should always initialize it, then trust it blindly.
 The compiler will tell you when it doesn't happen (that is, when g() has a bug).
 The number of situations where the compiler can't tell whether `g` has
 a bug is unbounded.
 > You can't write this in Rust with this ambiguity. It won't compile. In fact, &
 > doesn't mean in Rust what it does in C.
 I don't know Rust.  Does it force NULL initialization?  If so, I guess it's a
 bad design choice.  Unless Rust is so different that it can detect such
 programmer errors even having defined default initialization, but I can't
 imagine how that is.
 Rust enforces that all variables must be initialized prior to use.
 Whether they're initialized with a zero value or something else is up
 to the programmer; but not initializing is a compile-time error.
 For example:
 | fn main() {
 |     let x;
 |     if thing_is_true() {
 |         x = 5;
 |     } else {
 |         x = 3;
 |     }
 |     println!("x={x}");
 | }
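As a rough sketch of how the earlier f/g question might look under this rule, assume a hypothetical `g` that returns an `Option` instead of writing through an out-parameter. The `ok` flag and the string value are invented purely for illustration and are not from the code under discussion. The point is only that `y` cannot be read on any path where it has not been assigned.

```rust
// Hypothetical stand-in for the C function `g`: instead of writing
// through an out-parameter, it returns the value, or None on failure.
fn g(ok: bool) -> Option<&'static str> {
    if ok {
        Some("y")
    } else {
        None
    }
}

// Rough counterpart of the C function `f`.
fn f(ok: bool) -> Option<char> {
    let y: &str; // declared, but not yet initialized
    match g(ok) {
        Some(s) => y = s,
        // If this arm did nothing (`None => {}`), the read of `y` below
        // would be rejected at compile time (error E0381), because `y`
        // would be possibly uninitialized on that path.
        None => return None,
    }
    y.chars().next()
}

fn main() {
    println!("{:?}", f(true));
    println!("{:?}", f(false));
}
```

The caller of `f` is then also forced to deal with the failure case, since it receives an `Option<char>` rather than a bare `char`.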
 In fact, this is good; this allows us to employ a technique called,
 "Type-Driven Development", whereby we can create some type that
 encodes an invariant about the object. An object of that type is
 written in such a way that once it has been initialized, the mere
 existence of the object is sufficient to prove that the invariant
 holds, and need not be retested whenever the object is used. For
 example:
 | #[repr(transparent)]
 | struct PageFrameAddr(u64);
 | impl PageFrameAddr {
 |     fn new_round_down(addr: u64) -> PageFrameAddr {
 |         PageFrameAddr(addr & !0xFFF)
 |     }
 | }
 Here, "PageFrameAddr" contains a 4KiB-aligned page address.  Since the
 only way to create one of these is by the, `new_round_down` associated
 method that masks off the low bits, we can be sure that if we get one
 of these, the contained address is properly aligned.  In C, we'd
 pretty much have to test at the site of use.
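A minimal sketch of how a caller might rely on that invariant, assuming the type lives in its own module so that the private field really cannot be set from outside. The module name `mem`, the `addr` accessor, and the `page_index` consumer are hypothetical additions for illustration.

```rust
mod mem {
    #[repr(transparent)]
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct PageFrameAddr(u64); // field is private to this module

    impl PageFrameAddr {
        // The only constructor: masks off the low 12 bits.
        pub fn new_round_down(addr: u64) -> PageFrameAddr {
            PageFrameAddr(addr & !0xFFF)
        }

        // Hypothetical read-only accessor for the wrapped address.
        pub fn addr(self) -> u64 {
            self.0
        }
    }
}

use mem::PageFrameAddr;

// Hypothetical consumer: it does not re-check alignment, because the
// argument's type already guarantees a 4KiB-aligned address.
fn page_index(pfa: PageFrameAddr) -> u64 {
    pfa.addr() >> 12
}

fn main() {
    let pfa = PageFrameAddr::new_round_down(0x1234_5678);
    assert_eq!(pfa.addr() & 0xFFF, 0);
    println!("{:#x} -> page {}", pfa.addr(), page_index(pfa));
    // `PageFrameAddr(0x1234_5678)` would not compile here: the tuple
    // field is private outside `mem`, so the mask cannot be bypassed.
}
```

The guarantee depends on module privacy: code inside `mem` could still construct an unaligned value directly, so the module boundary is where the invariant has to be audited.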
 This is an extremely powerful technique; cf Alexis King's blog post,
 "Parse Don't Validate"
 (https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/)
 and Cliff Biffle's talk on the Hubris embedded RTOS
 (https://talks.osfc.io/osfc2021/talk/JTWYEH/)
 > Sorry to be such a pedant, but I was concerned that we not fall into the "Rust
 > is C++ all over again" trap.
 >
 > As for replacing C, the velocity of Rust is just astonishing. I think folks have
 > been waiting for something to replace C for a long time, and Rust, with all its
 > headaches and issues, is likely to be that thing.
 Modern C is receiving a lot of improvements from C++ and other languages.  It's
 getting really good in fixing the small issues it had in the past (and GNU C
 provides even more good things).  GNU C2x is quite safe and readable, compared
 to say ISO C99 versions.
 C23 looks like it will be a better language than C11, but I don't know
 that even JeanHeyd would suggest it's "quite safe". :-/
         - Dan C.
 I don't think C will ever be replaced.  And I hope it doesn't.
 Possibly, something like with Plan9 and Unix/Linux will happen.  The good things
 from other languages will come back in one form or another to C.  The
 not-so-good ones will be discarded.
 >
 > Personally, I still prefer Go, but I can also see which way the wind is blowing,
 > especially when I see Rust use exploding in firmware and user mode, and now even
 > in the Linux kernel.
 Cheers,
 Alex
