One of the things I really appreciate about participating in this community and studying
Unix history (and the history of other systems) is that it gives one a firm intellectual
footing from which to evaluate where one is going: without understanding where one is and
where one has been, it's difficult to assert that one isn't going sideways or
completely backwards. Maybe either of those outcomes is appropriate at times (paradigms
shift; we make mistakes; etc.), but generally we want to be moving mostly forward.
The danger in immersing ourselves in history, where we must consider and appreciate the
set of problems that shaped the evolutionary paths leading to the systems we study, is
that our thinking can calcify around the assumption that those systems continue to meet
the needs of today's problems. It is therefore always important to reevaluate our base
assumptions in light of disconfirming evidence or (in our specific case) changing
environments.
To that end, I found Timothy Roscoe's (ETH) joint keynote address at
ATC/OSDI'21 particularly compelling. He argues that what we call the
"operating system" now controls only a fraction of a modern computer, and that
in many ways our models of "the computer" are outdated and incomplete,
resulting in systems that are artificially constrained and insecure, with separate
components that are unaware of one another and therefore frequently conflict.
Further, hardware is ossifying around the need to present a system interface
that can be controlled by something like Linux (used here as a proxy for
Unix-like operating systems generally), simultaneously broadening the divide and
making it ever more entrenched.
Another theme in the presentation is that, to the limited extent the broader systems
research community is approaching OS topics at all, it is focusing almost
exclusively on Linux rather than on novel systems; where non-Linux systems are featured
(something like 3 accepted papers between SOSP and OSDI in the last two years out of $n$),
the systems described are largely Linux-like. Here the presentation reminded me of Rob
Pike's "Systems Software Research is Irrelevant" talk (slides of which are
available in various places, though I know of no recording of that talk).
Roscoe's contention is that all of this should be seen as both a challenge and an
opportunity for new research into operating systems specifically: what would it look like
to architect a new system that takes a holistic approach to all of this hardware? We have
new tools that can make this tractable, so why don't we do it? Part of it is bias,
but part of it is that we've lost sight of the larger picture. My own question is:
have we become entrenched in a world of systems that are "good enough"?
One thing he does NOT mention is system interfaces to userspace software; he doesn't
seem to have any quibbles with, say, the Linux system call interface, the process model,
etc. He's mostly talking about taking the hardware into account. Also, in fairness,
his highlighting a "small" portion of the system and saying, "that's
what the OS drives!" reminds me a bit of the US voter maps that color vast tracts of
largely unpopulated land a certain shade as having voted for a particular
candidate, without normalizing for population (land doesn't vote, people do, though
in the US the two are related in how they affect the overall outcome of, say, a
presidential election).
I'm curious about other people's thoughts on the talk and the overall topic.
https://www.youtube.com/watch?v=36myc8wQhLo
- Dan C.
One thing I've realized is that as the unit of computing becomes more and more
abundant (one-off
HW->mainframes->minis->micros->servers->VMs->containers), the OS
becomes less and less visible and other software components become
more important. It's an implementation detail, like a language runtime,
and software developers are increasingly ill-equipped to work at this
layer. Public cloud/*aaS is a major blow to interesting general-purpose
OS work in commercial computing, since businesses outsource
more and more of their workloads. The embedded world (which
includes phones/Fuchsia, accelerator firmware/payloads, RTOSes, etc.) and the
academic world (e.g. Cambridge's CHERI) may have to sustain OS research
for the foreseeable future.
There is plenty of systems work going on, but it takes place in
different ways: userspace systems are completely viable and do not
require switching to microkernels. Intel's DPDK/SPDK is one
ecosystem, Kubernetes another; there is a ton of rich systems work
in the latter with eBPF/XDP etc. I used to dismiss it, but it
is no longer possible to do so rationally. I would go as far as
saying Kubernetes is _the_ datacenter OS and has subsumed Linux itself
as the primary system abstraction for the next while... even Microsoft
has a native implementation on Server 2022. It looks different and
smells different, but being able to program compute/storage/network
fabric with one abstraction is the holy grail of cluster computing, and
interestingly it lets you swap the lower-layer implementations out
with less risk but also less fanfare.
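
To make the eBPF/XDP point concrete, here is a minimal toy sketch of my own (not
from any particular project; it assumes a clang/libbpf toolchain): an XDP program
that drops IPv6 frames in the NIC driver, below the socket layer that the
classical OS model revolves around.

    /* xdp_drop6.c: toy XDP sketch, assuming clang/libbpf; not production code.
     * Drops IPv6 frames at the driver level, before the kernel network stack
     * (and certainly before any socket API) ever sees them.
     *
     * Build (assumed toolchain):  clang -O2 -g -target bpf -c xdp_drop6.c
     * Attach (iproute2):          ip link set dev eth0 xdp obj xdp_drop6.o sec xdp
     */
    #include <linux/bpf.h>
    #include <linux/if_ether.h>
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_endian.h>

    SEC("xdp")
    int drop_ipv6(struct xdp_md *ctx)
    {
            void *data     = (void *)(long)ctx->data;
            void *data_end = (void *)(long)ctx->data_end;
            struct ethhdr *eth = data;

            /* The in-kernel verifier requires an explicit bounds check
             * before any access to packet data. */
            if ((void *)(eth + 1) > data_end)
                    return XDP_PASS;

            if (eth->h_proto == bpf_htons(ETH_P_IPV6))
                    return XDP_DROP;

            return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";

The interesting part is less the program than the model: verified, JIT-compiled
code injected into a running kernel, with no change to the traditional system
call interface.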
Regards,
Kevin