[Resending as this got squashed a few days ago. Jon, sorry for the
duplicate. Again.]
On Sun, Jan 12, 2020 at 4:38 PM Jon Steinhart <jon(a)fourwinds.com> wrote:
> [snip]
> So I think that the point that you're trying to make, correct me if I'm
> wrong,
> is that if lists just knew how long they were you could just ask and that
> it
> would be more efficient.
>
What I understood was that, by translating into a lowest-common-denominator
format like text, one loses much of the semantic information implicit in a
richer representation. In particular, much of the internal knowledge (like
type information...) is lost during translation and presentation. Put
another way, with text as usually used by the standard suite of Unix tools,
type information is implicit, rather than explicit. I took this to be less
an issue of efficiency and more of expressiveness.
It is, perhaps, important to remember that Unix works so well because of
heavy use of convention: to take Doug's example, the total number of
commands might be easy to find with `wc` because one assumes each command
is presented on a separate line, with no gaudy header or footer information
or extraneous explanatory text.
This sort of convention, where each logical "record" is a line by itself,
is pervasive on Unix systems, but is not guaranteed. In some sense, those
representations are fragile: a change in output might break something else
downstream in the pipeline, whereas a representation that captures more
semantic meaning is more robust in the face of change but, as in Doug's
example, often harder to use. The Lisp Machine had all sorts of cool
information in the image and a good Lisp hacker familiar with the machine's
structures could write programs to extract and present that information.
But doing so wasn't trivial in the way that '| wc -l' in response to a
casual query is.
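To make the casual-query case concrete, here is a minimal sketch (assuming a conventional /bin with one command per directory entry):

```shell
# Because ls emits one name per line when writing to a pipe, counting
# the commands in /bin reduces to counting lines: the line-per-record
# convention does all the work.
ls /bin | wc -l
```

The moment a tool decorates its output (say, `ls -l` with its leading `total` line), the count is off by one, which is the fragility of convention in a nutshell.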
> While that may be true, it sort of assumes that this is something so common
> that
> the extra overhead for line counting should be part of every list. And it
> doesn't
> address the issue that while maybe you want a line count I may want a
> character
> count or a count of all lines that begin with the letter A. Limiting this
> example
> to just line numbers ignores the fact that different people might want
> different
> information that can't all be predicted in advance and built into every
> program.
>
This I think illustrates an important point: Unix conventions worked well
enough in practice that many interesting tasks were not just tractable, but
easy and in some cases trivial. Combining programs was easy via pipelines.
Harder stuff involving more elaborate data formats was possible, but, well,
harder and required more involved programming. By contrast, the Lisp
machine could do the hard stuff, but the simple stuff also required
non-trivial programming.
The SQL database point was similarly interesting: having written programs
to talk to relational databases, yes, one can do powerful things; but the
amount of programming required is significant at a minimum and often
substantial.
> It also seems to me that the root problem here is that the data in the
> original
> example was in an emacs-specific format instead of the default UNIX text
> file
> format.
>
> The beauty of UNIX is that with a common file format one can create tools
> that
> process data in different ways that then operate on all data. Yes, it's
> not as
> efficient as creating a custom tool for a particular purpose, but is much
> better
> for casual use. One can always create a special purpose tool if a
> particular
> use becomes so prevalent that the extra efficiency is worthwhile. If
> you're not
> familiar with it, find a copy of the Communications of the ACM issue where
> Knuth
> presented a clever search algorithm (if I remember correctly) and McIlroy
> did a
> critique. One of the things that Doug pointed out was that while Don's
> code was
> more efficient, by creating a new pile of special-purpose code he
> introduced bugs.
>
The flip side is that one often loses information in the conversion to
text: yes, there are structured data formats with text serializations that
can preserve the lost information, but consuming and processing those with
the standard Unix tools can be messy. Seemingly trivial changes in text,
like reversing the order of two fields, can break programs that consume
that data. Data must be suitable for pipelining (e.g., perhaps free-form
text must be free of newlines or something). These are all limitations.
Where I think the argument went awry is in not recognizing that very often
those problems, while real, are at least tractable.
> Many people have claimed, incorrectly in my opinion, that this model fails
> in the
> modern era because it only works on text data. They change the subject
> when I
> point out that ImageMagick works on binary data. And, there are now stream
> processing utilities for JSON data and such that show that the UNIX model
> still
> works IF you understand it and know how to use it.
>
Certainly. I think you hit the nail on the head with the proviso that one
must _understand_ the Unix model and how to use it. If one does so, it's
very powerful indeed, and it really is applicable more often than not. But
it is not a panacea (not that anyone suggested it is). As an example, how
do I apply an unmodified `grep` to arbitrary JSON data (which may span more
than one line)? Perhaps there is a way (I can imagine a 'record2line'
program that consumes a single JSON object and emits it as a syntactically
valid one-liner...) but I can also imagine all sorts of ways that might go
wrong.
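A naive sketch of such a hypothetical record2line (the function name is mine, and it assumes no string value contains an embedded newline; a robust version would need a real JSON parser, e.g. `jq -c .`):

```shell
# Collapse one pretty-printed JSON object into a single line so that
# line-oriented tools like grep see the whole record at once.
# Naive: breaks if any string value itself contains a newline.
record2line() { tr '\n' ' ' | tr -s ' '; }

# grep -c now counts matching records rather than matching fragments.
printf '{\n  "editor": "vi",\n  "bytes": 1706152\n}\n' \
    | record2line | grep -c '"editor": "vi"'
```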
- Dan C.
[I originally asked the following on Twitter which was probably not the smartest idea]
I was recently wondering about the origins of Linux, i.e. Linus Torvalds doing his MSc and deciding to write Linux (the kernel) for the i386 because Minix did not support the i386 properly. While this is perfectly understandable, I was trying to understand why, as he was in academia, he did not decide to write a “free X” for a different X. The example I picked was Plan 9, simply because I always liked it, but X could be any number of other operating systems which he would have been exposed to in academia. This all started in my mind because I was thinking about my friends who were CompSci university students with me at the time. They were into all sorts of esoteric stuff like Miranda-based operating systems, building a complete interface builder for X11 on SunOS including sparkly mouse pointers, etc. (I guess you could define it as “the usual frivolous MSc projects”), and I was comparing their choices with Linus’.
The answers I got varied from “the world needed a free Unix and BSD was embroiled in the AT&T lawsuit at the time” to “Plan 9 also had a restrictive license” (to the latter my response was that “so did Unix, and that’s why Linus built Linux!”), but I don’t feel any of the answers addressed my underlying question: what in his exposure to other operating systems made Unix the choice?
Personally I feel that if we had a distributed OS now instead of Linux we’d be better off with the current architecture of the world so I am sad that "Linux is not Plan 9" which is what prompted the question.
Obviously I am most grateful for being able to boot the Mathematics department’s MS-DOS i486 machines with Linux 0.12 floppy disks and not having to code Fortran 77 in Notepad, and for eventually taking over the department with Linux-based X terminals connected to the departmental servers (Sun, DEC Alpha, IBM RS/6000s). Before Linux they had been running eXeed (sp?) on Windows 3.11! In this respect Linux definitely filled a huge gap.
Arrigo
Hi,
Have you ever used shell level, $SHLVL, in your weekly-to-daily use of Unix?
I had largely dismissed it until a recent conversation in a newsgroup.
I learned that shelling out of programs also increments the shell level.
E.g. :shell or :!/bin/sh in vim.
Someone also mentioned quickly starting a new sub-shell from the current
shell for quick transient tasks, e.g. dc / bc, mount / cp / umount,
{,r,s}cp, etc., in an existing terminal window, to avoid cluttering the
first terminal's history with the transient commands.
That got me to wondering if there were other uses for shell level
($SHLVL). Hence my question.
This is more about using (contemporary) shells on Unix, than it is about
Unix history. But I suspect that TUHS is one of the best places to find
the most people that are likely to know about shell level. Feel free to
reply to COFF if it would be better there.
--
Grant. . . .
unix || die
I thought Benno Rice’s argument a bit unorganized and ultimately unconvincing, but I think the underlying point that we should from time to time step back a bit and review fundamentals has some merit. Unfortunately he does not distinguish much between a poor concept and a poor implementation.
For example, what does “everything is a file” mean in Unix?
- Devices and files are accessed through the same small API?
- All I/O is through unstructured byte streams?
- I/O is accessed via a single unified name space? etc.
Once that is clear, how can the concept then best be applied to USB devices?
Or: is there a fundamental difference between windows-style completion ports and completion signals?
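Taking the first reading of the phrase, a small sketch of what "the same small API" buys in practice (the temporary file is just scaffolding for the example):

```shell
# One reading of "everything is a file": the same small open/read/write
# API serves a device node and an ordinary file alike, so generic tools
# compose over both without special cases.
tmp=$(mktemp)
head -c 8 /dev/urandom > "$tmp"   # read from a device, write to a file
od -An -tx1 "$tmp"                # inspect the result with the same tool
rm -f "$tmp"
```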
Many of the underlying questions have been considered in the past, with carefully laid out arguments in various papers. In my view it is worthwhile to go back to these papers and see how the arguments pro and contra various approaches were weighed then and considering if the same still holds true today.
Interestingly, several points that Benno touches upon in his talk were also the topic of debate when Unix was transitioning to a 32-bit address space and incorporating networking in the early ’80s, as the TR/4 and TR/3 papers show. Of course, the system that CSRG delivered is different from the ambitions expressed in these papers, and for sure opinions on the best choices differed as much back then as they will now - and that makes for an interesting discussion.
Rich was kind enough to look through the Joyce papers to see if it contained "CSRG Tech Report 4: Proposals for Unix on the VAX”. It did.
As list regulars will know I’ve been looking for that paper for years as it documents the early ideas for networking and IPC in what was to become 4.2BSD.
It is an intriguing paper that discusses a network API that is imo fairly different from what ended up being in 4.1a and 4.2BSD. It confirms Kirk McKusick’s recollection that select was modelled after the Ada select statement. It also confirms Clem Cole’s recollection that the initial ideas for 4.2BSD were significantly influenced by the ideas of Richard Rashid (Aleph/Accent/Mach).
Besides IPC and networking, it also discusses file systems and a wide array of potential improvements in various other areas.
> If you search for "Jolitz"
Oh, I meant in the DDJ search box, not a general Web search.
> One of the items listed in WP, "Copyright, Copyleft, and Competitive
> Advantage" (Apr/1991) wasn't in the search results .. Since it's not in
> the 'releases' page, it might not really be part of the series?
Also, the last article in the series ("The Final Step") says the series was 17
articles long, not the 18 you get if you include "Copyright".
Noel
>Date: Tue, 07 Jan 2020 14:57:40 -0500.
>From: Doug McIlroy <>
>To: tuhs(a)tuhs.org, thomas.paulsen(a)firemail.de
>Subject: Re: [TUHS] screen editors
>Message-ID: <202001071957.007JveQu169574(a)coolidge.cs.dartmouth.edu>
>Content-Type: text/plain; charset=us-ascii
.. snip ..
>% wc -c /bin/vi bin/sam bin/samterm
>1706152 /bin/vi
> 112208 bin/sam
> 153624 bin/samterm
>These numbers are from Red Hat Linux.
>The 6:1 discrepancy is understated because
>vi is stripped and the sam files are not.
>All are 64-bit, dynamically linked.
That's a real big vi in RHL. Looking at a few (commercial) unixes I get
SCO UNIX 3.2V4.2 132898 Aug 22 1996 /usr/bin/vi
- /usr/bin/vi: iAPX 386 executable
Tru64 V5.1B-5 331552 Aug 21 2010 /usr/bin/vi
- /usr/bin/vi: COFF format alpha dynamically linked, demand paged
sticky executable or object module stripped - version 3.13-14
HP-UX 11.31 748996 Aug 28 2009 /bin/vi
-- /bin/vi: ELF-32 executable object file - IA64