At Tue, 23 Feb 2021 20:20:55 -0700, Warner Losh <imp(a)bsdimp.com> wrote:
Subject: Re: [TUHS] Abstractions

> I booted a FreeBSD/i386 4 system, sans compilers and a few other things,
> off a 16MB CF card in the early 2000s. I did both static (one binary) and
> dynamic and found dynamic worked a lot better for the embedded system...

I guess it may depend on your measure of "better"?

With a single static-linked binary on a modern demand-paged system with
shared text pages, the effect is that almost all of the instructions for
any and all programs (and of course for all the libraries) are almost
always already paged in at any given time.  The result is that program
startup requires so few page-in faults that it appears to happen
instantaneously.
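
If you want to see that effect directly, here's a minimal sketch (my own
illustration, nothing from the systems above) using getrusage(2):
ru_majflt counts the page faults that needed real I/O, while ru_minflt
counts the ones satisfied from memory (e.g. text pages already resident
because another process shares them).  Build it both static and dynamic
and compare the numbers on a cold cache:

    /* pagefaults.c -- print the page-fault cost of this program's own
     * startup.  (Illustrative sketch only.) */
    #include <stdio.h>
    #include <sys/resource.h>

    int
    main(void)
    {
        struct rusage ru;

        if (getrusage(RUSAGE_SELF, &ru) == -1) {
            perror("getrusage");
            return 1;
        }
        printf("major (I/O) page faults: %ld\n", (long)ru.ru_majflt);
        printf("minor (reclaim) faults:  %ld\n", (long)ru.ru_minflt);
        return 0;
    }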

Because of this my little i386 image feels faster at the command line
(e.g. running on an old Soekris box, even when the root filesystem is on
a rather slow flash drive) than any of the fastest non-static-linked
systems I've ever used -- that is of course until it is asked to do any
actual computing or other I/O operations.  :-)

So, in an embedded system there will be many influencing factors, such
as how many exec()s there are during normal operations.
For machines with oodles of memory and very fast and large SSDs (and
using any kernel with a decently tuneable paging system) one can simply
static-link all binaries separately and achieve similar results, at
least for programs that are run relatively often.

For example, the time for a full system build of, say, NetBSD with a
fully static-linked host system and toolchain is remarkably lower than
on a fully dynamic-linked system, since all the extra processing (and
esp. any extra I/O) done by the "stupid" dynamic linker (i.e. the one
that's ubiquitous in modern unixy systems) is completely and forever
eliminated.  I haven't even measured the difference in years now because
I find fully dynamic-linked systems too painful to use for intensive
development of large systems.

Taking this to the opposite extreme, one need only use modern macOS on a
machine with an older spinning-rust hard drive that has a loud seek arm
to hear and feel how incredibly slow even the simplest tasks can be,
e.g. typing "man man" after a reboot or a few days of not running
"man".

This is because, on top of the "stupid" dynamic linker that's needed to
start the "man" program, a huge stinking pile of additional wrappers has
been added to all of the toolchain command-line tools, and these require
doing even more gratuitous I/O operations (as well as running perhaps
millions more gratuitous instructions) for infrequent invocations
(luckily these wrappers seem to cache some of the most expensive
overhead).  (Note: "man" is not in the same boat as, e.g., the toolchain
progs, and I'm not quite sure why it churns so much on first
invocation.)

My little static-linked i386 system can run "man man" several (many?)
thousand times before my old iMac can display even the first line of
output.  And that's for a small, simple program -- just imagine the
immense atrocities necessary to run a program that links to several
dozen libraries (e.g. the typical GUI application like a web browser,
with the saving grace that we don't usually restart browsers in a loop
the way we restart compilers; but, e.g., /usr/bin/php on macOS links to
21 libraries, and even the linker (ld) needs 7 dynamic libraries).
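
If anyone wants to reproduce that kind of comparison, here's a rough
harness (my own sketch, nothing measured from either system above) that
just fork()s and exec()s a given program N times and reports the total
wall-clock time; point it at static-linked and dynamic-linked builds of
the same small program and compare:

    /* execloop.c -- time N fork/exec cycles of a given program, e.g.
     *   ./execloop 1000 /usr/bin/true
     * (Illustrative sketch only.) */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int
    main(int argc, char *argv[])
    {
        struct timespec t0, t1;
        long i, n;
        pid_t pid;

        if (argc < 3) {
            fprintf(stderr, "usage: %s count prog [args...]\n", argv[0]);
            return 1;
        }
        n = strtol(argv[1], NULL, 10);

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (i = 0; i < n; i++) {
            if ((pid = fork()) == -1) {
                perror("fork");
                return 1;
            }
            if (pid == 0) {
                execv(argv[2], &argv[2]);
                _exit(127);        /* exec failed */
            }
            waitpid(pid, NULL, 0);
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        printf("%ld execs in %.3f seconds\n", n,
            (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
        return 0;
    }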

BTW, a non-stupid dynamic linker would work the way Multics did (and to
some extent I think that's more how dynamic linking worked in AT&T UNIX
(SysVr3.2) on the 3B2s), but such things are so much more complicated in
a flat address space.  Pre-binding, such as I think macOS and IRIX do
(and maybe can be done with the most modern binutils), is somewhat like
Multics "bound segments" (though still less flexible and perhaps less
performant).
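
The closest a flat-address-space Unix gets to that kind of on-demand
linking out of the box is probably dlopen(3): the library isn't mapped
until the program asks for it, RTLD_LAZY defers binding of the library's
own function references until first call, and dlsym(3) resolves a symbol
by name at run time, vaguely like chasing a reference through a Multics
linkage section.  A tiny sketch (the library and function names are made
up purely for illustration; on Linux link with -ldl, on the BSDs and
macOS it's in libc):

    /* dlopen_lazy.c -- load a (hypothetical) libexample.so at run time
     * and resolve a (hypothetical) symbol by name.  (Illustrative
     * sketch only.) */
    #include <stdio.h>
    #include <dlfcn.h>

    int
    main(void)
    {
        void *handle;
        int (*do_thing)(int);

        handle = dlopen("libexample.so", RTLD_LAZY);
        if (handle == NULL) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }
        do_thing = (int (*)(int))dlsym(handle, "do_thing");
        if (do_thing == NULL) {
            fprintf(stderr, "dlsym: %s\n", dlerror());
            return 1;
        }
        printf("do_thing(42) = %d\n", do_thing(42));
        dlclose(handle);
        return 0;
    }
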
--
Greg A. Woods <gwoods(a)acm.org>
Kelowna, BC +1 250 762-7675 RoboHack <woods(a)robohack.ca>
Planix, Inc. <woods(a)planix.com> Avoncote Farms <woods(a)avoncote.ca>