On Wed, Mar 4, 2020 at 11:04 PM Random832 <random832(a)fastmail.com> wrote:
> Hardly *any* commands you'd use in a pipeline really operate on
> unstructured bytes. Compression, I suppose. But other than that, you
> have just as much need to know what commands operate on what
> structure in Unix as in PowerShell - the only difference is that the
> serialization is explicitly part of the interface... and due to the
> typical inability to escape delimiters, leaky.
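That leakiness is easy to demonstrate: a file name may legally contain
the delimiter itself, and a line-oriented stream has no way to escape
it (plain sh here, not daudin):

$ cd "$(mktemp -d)"
$ touch 'one' "$(printf 'two\nthree')"  # two files; one name contains a newline
$ ls | wc -l                            # downstream, two names look like three records
3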
Another difference is that probably most people on this list are extremely
familiar with the various quirks and I/O nuances of the tools many have
been using every day for decades. Just as the native speakers of a natural
language can't so easily see/appreciate its complexity (e.g., pronunciation
in English!), I suspect many of us have internalized these idiosyncrasies.
I teach occasional shell/Python courses to absolute beginners (no computing
experience at all) and came to appreciate how weird the shell is (in the
sense of having baked-in historical accidents that cannot / will not /
should not be "corrected"). Some of my appreciation of that was due to
discussions on this list (e.g., regarding comment syntax, and the :
command) - so thanks!
I know what follows won't be to everyone's taste, but I like Python
and I love shell pipelines, so I tried to write a shell that gives you
both and allows fairly free mixing of UNIX tools and Python.
You can send anything down its pipelines - lines of text, atoms, numbers,
Python objects, whatever (in the Python _ variable). Of course the
receiving end of the pipeline needs to know (or figure out) what it's
getting. One advantage is that you have a carefully designed
programming language (no offence intended!) underlying the shell, so
you can, e.g., write shell functions in Python (and put them in a
start-up file if you want), pipe regular UNIX output into them, and
pipe their output into whatever's next (more Python, another UNIX
command, etc.). Probably almost no one would want to do the following
regularly on the command line, but you could:
>>> from os import stat
>>> def fd(): return [name for (mtime, name) in sorted((stat(f).st_mtime, f) for f in _)]
>>> ls | fd() | tail -n 3
Here I've stuck a simple Python function (DSU - see [1]) between two
UNIX commands and used it to get the three most recently modified
files.
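If DSU is unfamiliar, here's the same idea in plain Python, step by
step (the file names are just made up):

>>> from os import stat
>>> files = ['notes.txt', 'todo.txt']                   # hypothetical names
>>> decorated = [(stat(f).st_mtime, f) for f in files]  # 1. decorate with the sort key
>>> decorated.sort()                                    # 2. sort on the key
>>> names = [name for (mtime, name) in decorated]       # 3. undecorate, oldest first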
You probably wouldn't want to do this either, but you could:
>>> seq 0 9 | list(map(lambda x: 2 ** int(x), _)) | tee /tmp/powers-of-two | sum(map(int, _))
1023
>>> cat /tmp/powers-of-two
1
2
4
8
16
32
64
128
256
512
Of course it also lets you do things you *would* want to do :-)
More at https://github.com/terrycojones/daudin. Python has fairly
nice tools for reading and evaluating Python code, which meant that
getting a first version of this implemented took only one evening of
playing around.
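The basic trick looks something like this (a deliberately simplified
sketch, not the real code): try to compile the input as Python and, if
that fails, hand it to a real shell.

import subprocess

def run(source, env):
    try:
        # Will this parse as a Python expression?
        code = compile(source, '<repl>', 'eval')
    except SyntaxError:
        # No: pass the whole line to a real shell.
        return subprocess.run(source, shell=True, capture_output=True, text=True).stdout
    # Yes: evaluate it in the session's namespace (where _ lives).
    return eval(code, env)

So run('sum(map(int, _))', {'_': [1, 2, 3]}) evaluates the Python,
while run('ls /tmp', {}) falls through to the shell.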
It's pretty simple (and still has plenty of rough edges). Apologies if
this seems like self-promotion, but I very much enjoy thinking about things
in this thread and about how we work with information. I'm also constantly
blown away by how elegant UNIX is and how the core ideas have endured.
Pipelines are really wonderful, a "natural" alternative to function
composition as a mathematician or programmer would do it (see point #1
at https://github.com/terrycojones/daudin#background--thanks), and I
wanted to build a shell that preserved that while giving you Python.
The overview of their history on pages 67-70 of bwk's recent book [2]
is very interesting.
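To make point #1 concrete: if ls, fd, and tail were ordinary
functions, the composed form of the earlier example would read
inside-out, while the pipeline reads in the order the data flows (the
first line is illustrative pseudocode, not daudin syntax):

tail(fd(ls()), n=3)          # composition: read outward from the innermost call
>>> ls | fd() | tail -n 3    # pipeline: read left to right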
Terry
[1] https://en.wikipedia.org/wiki/Schwartzian_transform
[2] https://www.amazon.com/UNIX-History-Memoir-Brian-Kernighan/dp/1695978552