At 2023-01-01T21:29:17+0100, Paul Ruizendaal wrote:
Mothy Roscoe gave a really interesting keynote at OSDI'21:
https://www.youtube.com/watch?v=36myc8wQhLo
What an interesting keynote! At first he makes the case that OS
research is dead (not unlike Rob’s similar observation 20 years
before).
However, he goes on to point out a specific challenge that he
feels is in dire need of research and innovation. In short his case is
that a modern SoC is nothing like a VAX, but Linux still treats all
hardware like a VAX.
As do C programmers.
https://queue.acm.org/detail.cfm?id=3212479
That makes figuring out where the host OS image is and getting it
loaded into memory a real pain in the booty; not to mention on
modern systems where you’ve got to do memory training as part of
initializing DRAM controllers and the like.
That was my immediate pain point in doing the D1 SoC port.
Unfortunately, the manufacturer only released the DRAM init code as
compiler ‘-S’ output and the 1,400-page datasheet does not discuss its
registers. Maybe this is atypical, as I heard in the above keynote
that NXP provides 8,000-page datasheets with their SoCs. Luckily,
similar code for ARM chips from this manufacturer was available as C
source and I could reverse engineer the assembler file back to about
1,800 lines of C. See
https://gitlab.com/pnru/xv6-d1/-/blob/master/boot0/sdram.c
I don't think it's atypical. I was pretty annoyed trying to use the
data sheet to program a simple timer chip on the ODROID-C2; the sheet
was simply erroneous about the address of one of the registers you
needed to poke to set up a repeating timer. When I spoke to people more
experienced with banging on modern devices, I was mostly met with
resignation and shrugged shoulders. I grew up in the hobbyist era; these
days it is _only_ OS nerds who program these devices, and these OS nerds
don't generally handle procurement themselves. Instead, purchasing
managers do, and those people don't have to face the pain.
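For what it's worth, the amount of code at stake is tiny. Something
like the sketch below--base address, offsets, and bit assignments all
invented, nothing here matches the real Amlogic part--is about all a
repeating timer needs, which is exactly why one wrong register address
in the sheet is so maddening.

    /* All addresses and bit assignments here are invented for
     * illustration; they are not the ODROID-C2's actual timer block. */
    #include <stdint.h>

    #define TIMER_BASE     0xC1100000u           /* hypothetical MMIO base     */
    #define TIMER_MUX      (TIMER_BASE + 0x00u)  /* mode: clock select, enable */
    #define TIMER_A_RELOAD (TIMER_BASE + 0x04u)  /* period, in timer ticks     */

    static inline void mmio_write32(uintptr_t addr, uint32_t val)
    {
        *(volatile uint32_t *)addr = val;
    }

    /* Start timer A as a free-running periodic timer. */
    void timer_a_start_periodic(uint32_t ticks)
    {
        mmio_write32(TIMER_A_RELOAD, ticks);
        /* bit 16 = enable, low bits = timebase select; both invented */
        mmio_write32(TIMER_MUX, (1u << 16) | 0x3u);
    }
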
Data sheets are only as good as they need to be to move the product,
which means they don't need to be good at all, since the people who pay
for them look only at the advertised feature list and the price.
It does all the expected things (set voltage, switch on the main
clock, set up the PLL, calculate delays in terms of clocks, train for
the line delays, probe address multiplexing, etc.) by accessing a
multitude of registers that appear directly connected to the
controller fabric. Yet, at the same time it has only a single entry
point (“init_DRAM”) that takes 24 (poorly documented) words of 32 bits
to define the desired configuration (with options to specify “default”
or “auto”).
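To give a flavor of that interface: it boils down to one call against
a block of 24 magic words. The sketch below is illustration only; the
real layout is in the sdram.c linked above, and the field names and
grouping are guesses, not the actual definitions.

    /* Illustration only: the real layout is in sdram.c; these field
     * names are guesses at the flavor, not the actual definitions.  */
    #include <stdint.h>

    struct dram_cfg {            /* 24 x 32-bit words, mostly magic   */
        uint32_t clk_mhz;        /* DRAM clock                        */
        uint32_t type;           /* memory type selector              */
        uint32_t zq;             /* ZQ calibration value              */
        uint32_t odt_en;         /* on-die termination enable         */
        uint32_t timing[8];      /* packed tRCD/tRP/tRAS/... fields   */
        uint32_t mode_reg[4];    /* DRAM mode register values         */
        uint32_t tuning[8];      /* "default"/"auto" training knobs   */
    };                           /* 4 + 8 + 4 + 8 = 24 words          */

    /* The single entry point; voltage, PLL, delay training and address
     * probing all happen behind it, against hundreds of registers.   */
    uint32_t init_DRAM(int type, struct dram_cfg *cfg);
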
Why does the main processor need to run this code? Why is there not a
small RV32E CPU inside this controller that runs the 1,800 lines and
exposes just the 24 words as registers to the main CPU? Are the square
mils really that expensive? Or is this the reason that many SoCs (but
not the D1) have a small side CPU to do this for all devices?
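If the controller did carry its own small core running those 1,800
lines, the host's share of the work could collapse to something like
this (register map entirely imaginary):

    /* Entirely imaginary register map, just to show how small the
     * host's share of the work would become.                         */
    #include <stdint.h>

    #define DRAMC_CFG(n)  (0x04000000u + 4u * (n))  /* config word n, 0..23 */
    #define DRAMC_GO       0x04000060u              /* write 1 to start     */
    #define DRAMC_STATUS   0x04000064u              /* bit 0 done, bit 1 err*/

    static inline void wr32(uintptr_t a, uint32_t v)
    {
        *(volatile uint32_t *)a = v;
    }

    static inline uint32_t rd32(uintptr_t a)
    {
        return *(volatile uint32_t *)a;
    }

    int dram_init(const uint32_t cfg[24])
    {
        for (unsigned i = 0; i < 24; i++)
            wr32(DRAMC_CFG(i), cfg[i]);   /* hand over the 24 config words */
        wr32(DRAMC_GO, 1);                /* controller's own core trains  */
        while ((rd32(DRAMC_STATUS) & 1u) == 0)
            ;                             /* wait for done                 */
        return (rd32(DRAMC_STATUS) & 2u) ? -1 : 0;
    }
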
I can't speak to the economic arguments but it seems wasteful to me to
have an auxiliary CPU that is--or _should_ be--used only to bring up the
board. The other problem I have is more ominous.
"Ah, BMC. Now every computer comes with an extra full-fledged computer!
The main computer is for your use, and the other computer is for the use
of the attacker." — Russ Allbery
But the real problem is with the software these auxiliary processors
run. My first encounter with auxiliary processors was with the TRS-80
Model 16 line in the early 1980s, which was also available as an upgrade
option from the Model II. The II ran a Zilog Z80. The Model 16 upgrade
added a 68000 board which became the main CPU, and the Z80 was relegated
to I/O management. This turned out to work really well. I gather that
big IBM iron was doing this sort of thing decades earlier. That at
least is an honorable and useful application of the additional core,
instead of smuggling your keystrokes and network packets out to
interested intelligence agencies and harvesters for targeted
advertising.
[...]
One of the things on my mind was that SoC boards change quite quickly:
the D1 was last year’s hit with hobbyists, next year it will be
something else. Not nice if one has to redo 3,500-10,000 lines of code
for each board. Although I am not a fan of Forth, I can see that it is
useful when the controller IP blocks of a SoC (or the PCI cards of
discrete systems) expose some form of re-targetable program needed to
boot it up. The manufacturer BSP code is just not plug & play.
Yes. I too think Forth is a little weird but I appreciate its
power/memory footprint ratio. I really admire OpenFirmware and was
intensely disappointed when the RISC-V community didn't take the
opportunity to resurrect it. Forth code seems a lot more auditable to
me than machine language blobs.
Maybe the correct solution for my archeology is to just use the
simpler FPGA SoC as a target.
Maybe it's the correct solution for anyone who cares about a verified
boot process, too. No one who sells silicon seems to.
For a long time I have wondered why early Xenix did not make the jump
to a product that was split between a BIOS and a BDOS part, so that
they could sell BDOS-part updates to the total installed base. The
BDOS part could even be in some kind of p-code. Considering that they
heavily invested in their “revenue bomb” C-compiler at the time, this
type of thinking was close to their hart
You _really have_ been writing for RISC-V lately... ;-)
(the Amsterdam Compiler Kit was a similar idea). I am talking ’81-’83
here; thereafter it is clear that their economic interest was to focus
on DOS.
There are 3 possibilities:
1. It did not make technical sense
2. It did not make economic sense
3. It did make sense, but it simply did not happen
So, yes, I was conflating a lot of different thoughts into a single
solution, without first thinking clearly about the question.
Someone is bound to suggest that p-code was, or still is, detrimental to
boot times. But cores are so much faster than memory buses, and JIT
compilers so far advanced over what was available in the 1980s, that I
wonder if that is simply a myth today for any implementation that didn't
go out of its way to be slow.
For sure, that is undisputed. But could it have been even more
successful? Maybe the main reason for "no BIOS attempt" was purely
economic: for the companies, having to port to each and every machine
created customer lock-in, and for the professionals, it created an
industry of well-paying porting & maintenance jobs. The customers were
willing to pay for it. Why kill the golden goose?
Never compete on margins when you can extract monopoly rents, leverage
asymmetric information, impose negative externalities, and defraud your
customer or the public generally. Everybody who stops learning
economics after their first micro class is a sheep awaiting your shears.
Regards,
Branden