seems to have the right file (registration required, but it's free, use a
disposable email).
Beats my having to find a SCSI adaptor, a QIC-150 drive, and trying to read
my old QIC-150 tape with the source code on it...
Gilles
On Tue, Sep 4, 2018 at 13:48, Kevin Bowling <kevin.bowling(a)kev009.com> wrote:
On Sun, Sep 2, 2018 at 12:43 PM, Theodore Y. Ts'o <tytso(a)mit.edu> wrote:
On Sat, Sep 01, 2018 at 10:05:06PM -0700, Kevin Bowling wrote:
Sorry this is just bogus about being weak compared to Solaris. Are
you looking back with rosy glasses or have you scanned the code in the
past couple years? I have and there is nothing particularly special
about Solaris internals here or elsewhere.
I haven't looked at Solaris code; I had just *assumed* that if they
were selling million dollar E10k's, they would have had NUMA support
at *least* as good as SGI's Irix. And it would have been an excuse
for their pathetic performance on UP and 2-4 SMP systems.
One would hope so, but that was the strategy that got them eaten by a
grue. Another funny anecdote about this aloofness: Linux on sparc64
uses the Relaxed Memory Order mode that the hardware offers.
Solaris: Total Store Order. There are tons of things like this in
the code that blow my mind. I would have been pissed if I were on the
hardware side of SPARC.
Keep in mind IBM wants to sell RockHoppers and E980s (4 drawers, 16
sockets, 768 threads) for dedicated Linux use which have similar
north/south and east/west off chip networks. They have a lot of very
talented people on the firmware, kernel, compilers to make these
things work fast, including Paul.
...
Where you start going beyond Linux-like NUMA IMO is when you get
Irix-like features of page copying, migration, and multiple advanced
placement policies.
One thing to consider is that IBM really only cared about optimizing
hardware for DB2, Oracle, and WebSphere. That's one of the reasons why
you didn't see much in the way of innovative file system work, ala
ZFS. There was no business justification for pouring 100+ engineer
years into developing a next-generation file system --- and they had
already done that once for GPFS, a cluster file system. As
far as the local disk file system was concerned, the only real business
value it had was to serve as a program loader for DB2 and WebSphere. :-)
(I'm exaggerating a little for effect, but *only* a little.)
Hmm, I think they've been pretty earnest at wanting to be 2+ years
ahead of the general market with POWER for as long as I can see, lots
of HPC money has been subsidizing that. Depends on the workload but
bus and memory bandwidth right now with PCIe Gen4 and NVLink can
really cut down on server sprawl. I've met with the GM/chief
architect and they see OpenPOWER positioned as a full frontal
competitor to Intel Xeon. I'm fairly disappointed in my
contemporaries for not recognizing the value of a completely open
source firmware and on chip controller stack; especially after the
recent snafu where Intel changed the microcode license to disallow
benchmarks and claimed it was an accident.
Your statements make sense to me with respect to AIX, as Linux has
been the main effort since the 2000s. GPFS looks neat, I wish it were
open or at least internals documented well enough to study the
implementation academically.
So as far as NUMA was concerned, there would almost certainly not have
been much perceived business value in having sophisticated
auto-migration for arbitrary workloads in the kernel. Something basic
which was good enough for Oracle, DB2, etc., was all that would be
needed. (And if you needed to hire consultants from IBM Global
Services to mind-meld with the configuration documentation in order to
get the best out of your Rockhopper.... well, shucks, darn. :-)
That's probably the dirty little secret. It's long been profitable to
carefully plan software interrupt handlers, user threads, and memory
allocation even on pedestrian servers if they are running a fixed
function. I guess Google's Borg and the new workalikes could do
semi-automagic things with cgroups these days. There is evidence of
people getting pretty crazy with it when we see things like Intel
cache allocation features.
At IBM the business people really did make the funding decisions of
what to work on. ZFS could have never happened at IBM because no one
would have thought that even a tiny number of IBM's current or
potential customer base would abandon AIX or Linux and switch to
Solaris, or buy Sun hardware instead of IBM hardware --- just for the
sake of ZFS. And that's how decision-makers at IBM really thought.
(And to be fair to those decision-makers, IBM is still in business as
a free-standing business --- and Sun is not.)
Agreed, one of these companies is doing pretty well with a fat
dividend yield, the other has basically been dismantled for all but a
couple remaining desirable platform control points like Java and
MySQL.
Many things in tech are happy accidents and a small number of
motivated people at the right place and time. A Sun engineer admitted
on some video I've seen that the green light was really given for ZFS
because they got stumped by some UFS bugs.. once enough of ZFS was
written to test the end to end checksumming features they found out
some of these heisenbugs were LSI HBA and disk firmware issues :o)
Surveying some of these filesystems.. JFS2 is decent, nowhere near
the capabilities of ZFS but even today it's not in dire need of
replacement.. I suspect another issue complementary to your point is
the standalone storage business is many $B of revenue. ESS/DS8000 and
the like are preferred revenue. IBM and HP were more in the SAN game
than Sun and SGI, who let customers configure the systems themselves to be
used as storage (Sun was using VxFS for a long time, SGI had some CXFS
things IIRC). Tru64 had a pretty interesting filesystem on paper,
curious if you ever looked at its design since they open sourced it.
Regards,
Kevin
--
*Gilles Gravier* - Gilles(a)Gravier.org
GSM : +33618347147 and +41794728437
Skype : ggravier | PGP Key : 0x8DE6D026
<http://pgp.mit.edu:11371/pks/lookup?search=0x8DE6D026&op=index>