On Mon, Jan 3, 2022 at 5:26 PM Theodore Ts'o <tytso(a)mit.edu> wrote:
On Mon, Jan 03, 2022 at 04:15:08PM -0500, Dan Cross
wrote:
For that
matter, you
probably don't need to use ifconfig/route --- just let the DHCP client
server of your choice take care of setting up the network, and you're
done.
That'll work on a laptop or on my home network (where I set up the
DHCP server). In a large-scale datacenter environment, maybe not so
much.
I've seen a number of large-scale data center environments that use
DHCP; and those that don't in Linux-land generally just edit a config
file in /etc/network/interfaces or /etc/network/interfaces.d --- and
then the init scripts would just use "ifup -a" which would bring up
the network either using DHCP or the hard-coded values in
/etc/network/interaces. No need to have sysadmins manually typing
"route add ..." commands except in extreme debugging situations.
So, would you say that those are...supported by a bunch of shell
scripts that would have to be changed if those systems migrated to the
new tools for whatever reason? :-D
I would
suggest that the number of users who want to run MySQL is much
smaller than the number who want to have a functioning network. But
you're right; it's not that hard to adapt. That was kind of the point;
there have been cases where Linux users have adapted to one degree or
another.
I actually suspect the number of users who want to run MySQL is
actually larger than the number of users who need to manually
configure a network using /sbin/ifconfig and /sbin/route.... :-)
Touche.
The `apt
install net-tools` thing is a red-herring, though: that's
explicitly why I mentioned Google prod. What works on a single system
doesn't necessarily scale to O(10^6) machines supporting O(10^7)
separately running instances of O(10^4) services in O(10) globally
distributed datacenters, that's just a single organization.
Google prod's easy --- Google maintains its own minimal prod
distribution where a small set of packages (less than a few dozen)
that were originally forked from Debian a number of years ago. So
it's trivial for us, because net-tools is installed by default and
there is **no** engineering upside for us to switch to ip/ss --- so we
haven't. :-)
Recall that I was saying that the effort if Google decided, for
whatever reason, that it wanted to switch, would be multiyear. Giving
into the inertia of staying where Google is today is the easy part. As
you yourself acknowledge, the engineering effort required to switch to
the newfangled tools is non-zero. However, this creates a conundrum;
if Google wanted (again for whatever reason) to avail itself of the
new functionality provided by the new tools, it now has a choice:
either roll the new tools out, or extend the existing tools to provide
the required functionality. This is not something I would describe as
"easy." Sure, rolling the new commands out to machines would be
relatively straight-forward (though I've worked on those systems and
getting things into the production image was non-trivial politically,
if not technically), but transitioning to use them not so much.
This sort of begs the question: why not default to shipping the old
tools (route, netstat, ifconfig) and let people install the _new_
tools if they want their Linux machine to behave like a cisco router?
Google made this choice and sure: as you say, this is a distro-level
question. But it sure seems like the broad consensus around the major
distros is the opposite.
Similarly, we're still using a *very* old version
of bash (having
stayed back across multiple major releases) because it's easier to
stick with an old version of bash and keep up with the security fixes,
than it would be to audit a gazillion shell scripts and shell script
fragments for the various potential and subtle backwards-incomaptible
changes that bash has made over the past decade.
This is an aside, but one of my continual gripes about Google was that
it took monumental efforts to deprecate anything (lookin' at you, disk
device names), often because no one had paid attention to trying to
maintain portability between different systems, let alone different
versions of the same system.
But bear in mind that even Google's monorepo doesn't exist in a
hermetic bubble: third party code and upstream dependencies for
externally imposed requirements; in some cases, that forces upgrades
of libraries, toolchains, and yes, even userspace tools.
Incidentally, this was one of the big arguments for getting the Rust
toolchain pulled into google3: even though Google wasn't authoring
first-party Rust code for prod, we _knew_ there were third-party
dependencies that would force us to have a toolchain for the same
reason we have a FORTRAN toolchain. That I was one of the people who
benefited from a Rust toolchain in g3 wasn't entirely coincidental, as
our team finally forced language tools' hands there.
Larry has told a similar story about the advantages of
using a "dead"
language like Tcl for Bitkeeper, since he didn't have to deal with
gratuitous version skew problems caused by backwards incomaptible
changes in Perl or Python. But if you manage your own userspace,
including affording to do your own FEDRAMP-compliant security updates,
you can just choose *not* to upgrade to newer versions of bash or
network utilities. Who *says* you have to use bash 5.1? Or switch to
ip/ss?
The interesting thing about this is that it ignores the unseen costs
of staying with the older versions of things. The kernel was a great
example, actually, for reasons you know well (though perhaps not super
appropriate for discussion here). This creates a negative feedback
loop: we can't upgrade anything because something will break, so we
stay with the old stuff until forced to move.
On the other hand, we did spend untold engineering
hours migrating
from Python 2.7 to 3.x (and *wow* was that painful) because in that
case, that was considered less work and more secure than independently
supporting Python 2.7 ad infinitum (also, we have fewer Python
programs than we have shell script fragments in config files like
Makefiles as well actual shell scripts).
...and also because third-party dependencies started dropping support
for Python 2.
I would love to see language stats for the claim that there's more
shell scripts than Python over there. I'd be mildly though perhaps not
completely surprised if that were true. I suppose if "shell script
fragments" includes `a && b` then I could see it. I'm not cleared for
that anymore though. But I also remember when /usr/bin/python2.4 was
really python2.7. :-(
So the decision about whether to follow breaking
interfaces/language
changes is an engineering decision that's made on a case by case
basis, as it should be for all organizations. In the case of bash and
net-tools, one decision was made.
Inertia is a hell of a drug.
In the case of Python, a difference
decision was made --- although a lot of people really weren't happy
that the Python developers didn't appreciate the cost of breaking
backwards compatibility (or rather, they decided that they didn't care
and prioritized their own convenience over that of their users).
This is why I'm a big supporter of Linus's, "Thou Shalt Not Break
Userspace" rule; backwards compatibility at the lowest levels is
important! I'd draw that line higher than some people, but hey, if
you don't like the instability of Python, you can do what Bitkeeper
did and base your extension language/scripts on something like Tcl
instead.
I guess what I'm saying is that "thou shalt not break userspace" is a
bit of a misnomer; perhaps it should be rephrased as, "thou shalt not
change the kernel in such a way that userspace code no longer works."
He is, after all, referring specifically to the kernel interface,
right? But just as sufficiently advanced technology is said to be
indistinguishable from magic, a missing command like `ifconfig`
_might_ be indistinguishable from an incompatible system interface as
far as a user is concerned.
P.S. If you want to see the horror of trying to
support a Python
program that has to be able to run on a wide variety of Linux distros,
running different versions of Python (from RHEL 6 and Python 2.3 up to
the newest bleeding edge Python 3.x), take a look at the Google's
gcloud program. Alas, unlike the Google prod environment, we can't
dictate what distro our Cloud customers might choose to use.
The gcloud command (part of the Google Cloud SDK) has to play all
sorts of portability games, and people need to continually test across
a wide range of Python versions and Linux distros to provide assurance
that it continues to correctly as new features are added and bugs are
fixed. I *completely* understand why Larry chose to implement
Bitkeeper extensions in Tcl, since it avoids this problem.
No thank you! I'm good. :-)
Bt yeah, it's why I wouldn't want to change
dot-dot, since having seen
the pain the Python developers inflected on me, I wouldn't want to
inflict it on others...
Hmm. I wonder if I have any 386 a.out binaries floating around. :-)
- Dan C.