Another problem with arrangements of small UNIX commands in pipelines is
that any particular arrangement in use suffers from reliability and usability
problems:
1. No way to test the whole, since in general each application has a unique
structure with a potentially different choice of components. (A shell
program executes whatever commands are on the system, not those it might
have been tested with.)
2. No comprehensive error reporting (at best, reporting from individual
commands), and
3. No way to provide support.
On a much smaller scale, imagine a component stereo setup that is
delivering bad sound. You have a turntable, an arm, a cartridge, a pre-amp,
an amp, speakers, and cables and wires, typically from seven or more
different manufacturers. Not one of them would be able to help you with
support. The dealer would, if you bought the whole lot from them. Or you
could pay a consultant. This is one reason why so-called console stereos
were popular in the 1960s, even though they generally delivered inferior
sound.
This isn't a criticism of sorting with UNIX commands; it's a broader
criticism of the UNIX software tools approach for serious application
development.
Of course, one could build a single system out of components, and package
it all together as a tested and supported product. That's exactly what
object-oriented programming does, and very successfully.
Marc
On Sat, Jan 18, 2025 at 8:50 AM Paul Winalski <paul.winalski(a)gmail.com>
wrote:
On Sat, Jan 18, 2025 at 10:17 AM Larry McVoy
<lm(a)mcvoy.com> wrote:
On Sat, Jan 18, 2025 at 04:51:15PM +0200,
Diomidis Spinellis wrote:
But I can't stop thinking that, in common
with the mainframes these programs were running on, they represent a
mindset
that has been surpassed by superior ideas.
I disagree. Go back and read the reply where someone was talking about
sorting datasets that spanned multiple tapes, each of which was much
larger than local disk. sort(1) can't begin to think about handling
something like that.
I have a lot of respect for how Unix does things: if the problem fits,
the Unix answer is simpler and more flexible; it's better. If
the problem doesn't fit, the Unix answer is awful.
cmd < data | cmd2 | cmd3
involves a LOT of data copying. A custom answer that did all of that in
one address space is a lot more efficient but also a lot more special
purpose. Unix wins on flexibility and simplicity; special purpose
wins on performance.
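To make the copying concrete, here is a small hypothetical example: count
ERROR lines per date, assuming the date is the first space-separated field
of each log line (the file name and format are invented for illustration).

    # Pipeline form: each "|" pushes every surviving line through a kernel
    # pipe buffer, so the data is copied between address spaces at every stage.
    grep 'ERROR' app.log | cut -d' ' -f1 | sort | uniq -c

    # Fused form: one awk process filters and counts in a single address
    # space, with no pipe copies (output is unsorted, unlike the pipeline).
    awk '/ERROR/ { n[$1]++ } END { for (d in n) print n[d], d }' app.log

The pipeline is easier to compose and rearrange; the fused program touches
the data once, which is the special-purpose performance win described above.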
Another consideration: the smaller System/360 mainframes ran DOS (Disk
Operating System) or TOS (Tape Operating System, for shops that didn't have
disks). These were both single-process operating systems, so there was no way
the Unix method of chaining programs together could have worked on them.
OS MFT (Multiprogramming with a Fixed number of Tasks) and MVT
(Multiprogramming with a Variable number of Tasks) were multiprocess
systems, but they lacked any interprocess communication mechanism (such as
Unix pipes).
True databases in those days were rare, expensive, slow, and of limited
capacity. The usual way to, say, produce a list of customers who owed
money, sorted by how much they owed, would be:
[1] scan the data set for customers who owed money and write that out to
tape(s)
[2] use sort/merge to sort the data on tape(s) in the desired order
[3] run a program to print the sorted data in the desired format
It is important in step [2] to keep the tapes moving. Start/stop
operations waste a ton of time. Most of the complexity of the mainframe
sort/merge programs was in I/O management to keep the devices as busy as
possible. The gold standard for sort/merge in the IBM world was a
third-party program called SyncSort. It cost a fortune but was well worth
it for the big shops.
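For comparison, here is a rough sketch of what steps [1]-[3] might look like
as a Unix pipeline, assuming a flat customers.dat file whose made-up layout
is one "name amount-owed" record per line:

    # [1] keep customers who owe money; [2] sort by amount, largest first;
    # [3] format the report
    awk '$2 > 0' customers.dat | sort -k2,2nr |
        awk '{ printf "%-20s %10.2f\n", $1, $2 }'

On a data set spanning multiple tapes, each larger than the local disk, even
the temporary files behind sort(1) would not fit, which is exactly the scale
problem described in the replies above.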
So the short, bottom line answer is that the Unix way wasn't even possible
on the smaller mainframes and was too inefficient for the large ones.
-Paul W.