On 12/14/22, Larry McVoy <lm@mcvoy.com> wrote:
> Wasn't there some statement that QNX dropped some of these? Copy plus
> context switch?
Yeah, QNX/L4-type IPC is usually just copying followed by a context
switch. Some such kernels (like seL4) don't even have full message
queues per endpoint (the scheduler queue sort of functions as an IPC
queue). Mach IPC is slow because of stuff like permission checking,
which I would assume involves iteration over a list of permitted
threads. QNX/L4-type kernels usually either use constant-time
permission checks (like the old "clans and chiefs" model used by L3
and early L4, or the more modern capability-oriented model used by
seL4 and some others), or lack kernel permission checking entirely,
leaving it up to servers.
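To give an idea of why the capability-style check is constant-time, here's
a rough sketch of what the send check on a capability-oriented kernel
boils down to, as opposed to walking a list of permitted threads. None of
these names or structures come from a real kernel; they're just made up
for illustration:

/* Hypothetical illustration: a capability-style send check is just an
 * index into the sender's capability table, so the cost doesn't grow
 * with the number of threads allowed to talk to the endpoint. */
struct cap {
    unsigned type;          /* e.g. CAP_ENDPOINT */
    unsigned rights;        /* e.g. RIGHT_SEND | RIGHT_RECV */
    void    *object;        /* kernel object the capability names */
};

#define CAP_ENDPOINT 1
#define RIGHT_SEND   0x1

struct cap_table {
    struct cap *slots;
    unsigned    nslots;
};

/* O(1): a bounds check plus a flag test, no list traversal. */
static struct cap *lookup_send_cap(struct cap_table *t, unsigned idx)
{
    if (idx >= t->nslots)
        return 0;
    if (t->slots[idx].type != CAP_ENDPOINT ||
        !(t->slots[idx].rights & RIGHT_SEND))
        return 0;
    return &t->slots[idx];
}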
Another issue is that Mach-type kernels don't have what is known as
"direct process switching" AFAIK. When a synchronous message is sent
on a QNX/L4-type kernel, the kernel immediately switches to the
receiving process, bypassing the scheduler queue entirely, with the
remainder of the sender's timeslice being given to the receiver
(depending on the kernel, priorities may factor into this, so it isn't
always quite that simple). Mach-like kernels often require the sender
to wait for the kernel to decide to schedule the receiver based on the
run queue, and then once the reply is sent there's another wait for the
kernel to decide to schedule the sender again, which makes for rather
poor performance.
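Roughly, the fast path on an L4/QNX-type kernel has the shape below. This
is only a sketch to show the idea; none of the names are from any actual
kernel, and the real thing obviously has to deal with priorities, message
registers, and so on:

/* Hypothetical sketch of a synchronous send with direct process
 * switching: the message is copied and control goes straight to the
 * receiver, with no trip through the run queue. */
struct thread;

extern void copy_message(struct thread *from, struct thread *to);
extern void switch_context(struct thread *from, struct thread *to);
extern void enqueue_runnable(struct thread *t);

void ipc_send(struct thread *sender, struct thread *receiver)
{
    copy_message(sender, receiver);

    /* The sender blocks waiting for the reply; it is NOT put back on
     * the run queue, and the receiver runs on the remainder of the
     * sender's timeslice (priorities can complicate this). */
    switch_context(sender, receiver);
}

/* A Mach-style path instead looks more like:
 *   copy_message(sender, receiver);
 *   enqueue_runnable(receiver);   -- receiver waits for the scheduler
 *   schedule();                   -- and later the sender waits again
 *                                    for the reply to be scheduled
 */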
On 12/14/22, Bakul Shah <bakul@iitbombay.org> wrote:
> On Dec 11, 2022, at 7:09 PM, Andrew Warkentin <andreww591@gmail.com> wrote:
>> It's not necessarily true that microkernels are significantly slower.
> uKernels are usually quite fast as they do so little. What can be slow
> is emulating a Unix-like OS on top, due to context switches. For instance,
> a user process doing read() will have the following context switches:
> userProc->uK->FileSystem->uK->diskDriver->uK->FileSystem->uK->userProc
> or worse (I didn't account for a few things). Because of this even some
> uKernels run a few critical services + drivers in supervisor mode.
> But the overall slowdown of such a Unix emulation will very much depend on
> the workload and also what kind of performance improvements you are willing
> to try in a complex kernel vs. the same services running in user mode.
Yeah, excessive vertical layering can be bad for performance. QNX
normally follows a process-per-subsystem-instance architecture, so the
chain is just:
client -> (kernel) -> diskServer -> (kernel) -> client
where the disk server includes the disk driver, partition table driver
(if applicable), and disk filesystem. The VFS layer isn't even
involved at all in reads, writes, and the like (it's pretty much only
there to deal with operations that involve paths and not those that
only involve FDs), whereas some other Unix-like microkernel OSes have
it act as an intermediary on all FS operations. A lot of the time,
protection domains correspond to subsystem instances rather than to
layer instances, so there really isn't much harm in merging all layers
of a subsystem into a single process for the sake of performance. When
there is a benefit to separation of layers into different processes,
it is possible to use tap-type drivers to allow running subsystem
instances that only contain some of the layers (QNX doesn't do this
AFAIK, but my OS will).
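To make the shape of that single round trip concrete, here's roughly what
it looks like with the classic Neutrino message-passing calls (sketched
from memory, so details may be slightly off; the request struct and the
server internals are made up, only MsgSend/MsgReceive/MsgReply are the
real API):

/* One synchronous round trip: client -> (kernel) -> diskServer ->
 * (kernel) -> client. The disk server owns the driver, partition
 * handling, and filesystem, so a single MsgSend/MsgReply pair covers
 * the whole read -- no VFS process in the middle. */
#include <sys/neutrino.h>
#include <sys/types.h>
#include <stddef.h>

struct read_req { int fd; size_t len; off_t off; };

/* Server side: the same process that talks to the disk answers the client. */
void disk_server_loop(int chid)
{
    struct read_req req;
    char buf[4096];

    for (;;) {
        int rcvid = MsgReceive(chid, &req, sizeof req, NULL);
        if (rcvid < 0)
            continue;
        /* ...driver + filesystem work happens here, in-process... */
        size_t n = req.len < sizeof buf ? req.len : sizeof buf;
        MsgReply(rcvid, n, buf, n);   /* unblocks the client directly */
    }
}

/* Client side: one synchronous call; the kernel switches to the server
 * and back, with the reply data copied into dst. */
ssize_t client_read(int coid, int fd, void *dst, size_t len, off_t off)
{
    struct read_req req = { fd, len, off };
    return MsgSend(coid, &req, sizeof req, dst, len);
}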