On Mon, Jun 10, 2024 at 8:33 AM Will Senn <will.senn(a)gmail.com> wrote:
There's an interesting dive into PID 0 linked to
from osnews:
https://blog.dave.tf/post/linux-pid0/
In the article, the author delves into the history of the scheduler a bit
- going back to Unix v4 (his assembly language skills don't go to PDP
variants).
I like the article for two reasons - 1) its clarity, and 2) it points out
the self-reinforcing nature of our search ecosystem.
I'm left with the question - how did scheduling work in v0-v4? and the
observation that search really sucks these days.
It's an interesting and well-written article, but I think it's not quite
correct.
It links to sched in the V4 code [1] but there's nothing there about pid 0.
The right place to link would be the code in main that installs the
"system process" into the process table [2].
So yes, in V4, the scheduler is a process in a meaningful sense,
but I don't think pid 0 is a meaningful process identifier for it.
Nothing actually *identifies* the scheduler by using the number 0.
After a process has exited and its parent has called wait,
its process table entry is set to p_pid = 0 [3]. Surely pid 0 does
not also identify those processes at the same time that it identifies
the system process. If there are many processes in the table with
pid 0, it's difficult to see pid 0 as any kind of identifier at all!
Instead it seems pretty clear that pid 0 represents the concept "no pid".
This makes sense since the kernel memory started out zeroed,
so using the zero pid for "nothing here" avoided separate reinitialization.
The same is true for process status 0 meaning "unused".
Similarly, inode 0 is "no inode" (useful to mark the end of a directory
entry list), and disk block number 0 is "no block" (useful to mark
an unallocated block in a file).
(Go's emphasis on meaningful zero values is in the same spirit.)
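A minimal C sketch of the "zero means nothing here" idea, not the V4 code itself. The names proctab, proc_alloc, proc_free, count_no_pid, and the table size are hypothetical; only p_pid and the meaning of status 0 come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

#define NPROC 8                      /* hypothetical table size */

enum { SUNUSED = 0, SRUN = 1 };      /* status 0 must mean "unused" */

struct proc {
    int p_stat;                      /* 0 = slot unused */
    int p_pid;                       /* 0 = no pid */
};

/* Static storage starts out zeroed, so every slot begins life
 * unused and with no pid, without any initialization loop. */
static struct proc proctab[NPROC];
static int next_pid = 1;             /* 0 is never handed out */

/* Claim the first unused slot, or return NULL if the table is full. */
struct proc *proc_alloc(void)
{
    for (size_t i = 0; i < NPROC; i++) {
        if (proctab[i].p_stat == SUNUSED) {
            proctab[i].p_stat = SRUN;
            proctab[i].p_pid = next_pid++;
            return &proctab[i];
        }
    }
    return NULL;
}

/* Release a slot by zeroing it, as after the parent has waited. */
void proc_free(struct proc *p)
{
    assert(p != NULL);
    p->p_stat = SUNUSED;
    p->p_pid = 0;
}

/* Count the slots whose pid is 0. Many slots can share it at once,
 * which is why it cannot identify any one of them. */
int count_no_pid(void)
{
    int n = 0;
    for (size_t i = 0; i < NPROC; i++)
        if (proctab[i].p_pid == 0)
            n++;
    return n;
}
```

In this sketch a freshly zeroed table and a table full of reaped processes look the same, which is the point: no separate "empty" marker is needed.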
Reading the V1 sources seems to confirm this hypothesis:
V1 does not have a process table entry for any kernel process,
and yet it still uses pid 1 for the first process [4].
In V1 the user struct has a u.uno field holding the process number
as an index into the process table. That field too is 1-indexed,
because it is convenient for u.uno==0 to mean "no process".
In particular, swap (analogous here to V4 swtch) understood that if
called when u.uno==0 the process is exiting and need not be
saved for reactivation [5]. The kernel goes out of its way to use
u.uno==0 instead of u.uno==-1: all the code that indexes an array
by u.uno has to subtract 1 (or 2 for words) from the address being
indexed to account for the first entry being 1 and not 0.
Presumably this is because of wanting to use the zero value as "no uno".
(And it's not any less efficient, since the -1 or -2 can be applied to
the base address during linking.)
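To make the 1-indexing concrete, here is a rough C rendering, assuming a per-process status byte array. The names stat_table, cur_uno, set_current, stat_of, and must_save are hypothetical and do not appear in the V1 sources; cur_uno merely stands in for u.uno.

```c
#include <assert.h>

#define NPROC 4                          /* hypothetical process limit */

static unsigned char stat_table[NPROC];  /* storage for uno 1..NPROC */
static int cur_uno;                      /* stand-in for u.uno; 0 = "no process" */

/* Make `uno` the current process; 0 means there is none. */
void set_current(int uno)
{
    assert(uno >= 0 && uno <= NPROC);
    cur_uno = uno;
}

/* Address of the status byte for process `uno`.
 * The -1 maps the 1-based uno onto 0-based storage. In assembly the
 * same adjustment can be folded into the address constant, so it
 * costs nothing at run time. */
unsigned char *stat_of(int uno)
{
    assert(uno >= 1 && uno <= NPROC);
    return &stat_table[uno - 1];
}

/* A context switch only needs to save state if there is a current
 * process; uno 0 means the caller is exiting. */
int must_save(void)
{
    return cur_uno != 0;
}
```

The cost of the convention is the `uno - 1` in every accessor; the benefit is that a zeroed u.uno already means "no process".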
The obvious question to ask then is not why pids start at 1
but why, in contrast to all these examples, uids start at 0.
My guess is that there was simply no need for "no uid" and
in contrast having the zero value mean "root" worked out nicely.
Perhaps Ken will correct me if I'm reading this all wrong.
As to the question of how scheduling worked in V1, the swap code
is walking over runq looking for the highest priority runnable process [6].
Every process image except the one running was saved on disk,
so the only decision was which one to read back in.
This is in contrast to the V4 scheduler, which juggles multiple
in-memory process images at once and splits out the decisions
about what to run from the code that moves processes to and
from the disk.
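As a rough illustration of the V1-style choice described above (pick the highest-priority runnable process and read it back in), here is a C sketch. The queue layout is an assumption for illustration: the number of priority levels, the linked-list representation, and the names runq_head, link_next, runq_put, and runq_pick are all hypothetical rather than taken from u3.s.

```c
#include <assert.h>

#define NQ    3                    /* assumed number of priority levels */
#define NPROC 8                    /* hypothetical process limit */

/* runq_head[q] is the uno of the first process queued at level q,
 * with level 0 the highest priority; 0 means the queue is empty. */
static int runq_head[NQ];
/* link_next[uno] is the next uno on the same queue, 0 = end.
 * Slot 0 is unused because uno 0 means "no process". */
static int link_next[NPROC + 1];

/* Append process `uno` to the queue for priority level `prio`. */
void runq_put(int prio, int uno)
{
    assert(prio >= 0 && prio < NQ);
    assert(uno >= 1 && uno <= NPROC);
    link_next[uno] = 0;
    if (runq_head[prio] == 0) {
        runq_head[prio] = uno;
        return;
    }
    int tail = runq_head[prio];
    while (link_next[tail] != 0)
        tail = link_next[tail];
    link_next[tail] = uno;
}

/* Walk the queues from highest to lowest priority and dequeue the
 * first process found. Returns its uno, or 0 if nothing is runnable. */
int runq_pick(void)
{
    for (int q = 0; q < NQ; q++) {
        int uno = runq_head[q];
        if (uno != 0) {
            runq_head[q] = link_next[uno];
            link_next[uno] = 0;
            return uno;
        }
    }
    return 0;
}
```

With every other image on disk, the result of runq_pick is the whole scheduling decision in this sketch: it names the one process to swap in next.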
Best,
Russ
[1] https://github.com/dspinellis/unix-history-repo/blob/Research-V4/sys/ken/sl…
[2] https://github.com/dspinellis/unix-history-repo/blob/Research-V4/sys/ken/ma…
[3] https://github.com/dspinellis/unix-history-repo/blob/Research-V4/sys/ken/sy…
[4] https://github.com/dspinellis/unix-history-repo/blob/Research-V1/u0.s#L200
[5] https://github.com/dspinellis/unix-history-repo/blob/Research-V1/u3.s#L40
[6] https://github.com/dspinellis/unix-history-repo/blob/Research-V1/u3.s#L9