> Message: 7
> Date: Thu, 15 Jul 2021 10:28:04 -0400
> From: "Theodore Y. Ts'o"
> Subject: Re: [TUHS] head/sed/tail (was The Unix shell: a 50-year view)
>
> On Wed, Jul 14, 2021 at 10:38:06PM -0400, Douglas McIlroy wrote:
>> Head might not have been written if tail didn't exist. But, unlike head,
>> tail strayed from the tao of "do one thing well". Tail -r and tail -f are
>> as cringeworthy as cat -v.
>>
>> -f is a strange feature that effectively turns a regular file into a pipe
>> with memory by polling for new data. A clean, general alternative
>> might be to provide an open(2) mode that makes reads at the current
>> file end block if some process has the file open for writing.
>
> OTOH, this would mean adding more functionality (read: complexity)
> into the kernel, and there has always been a general desire to avoid
> pushing <stuff> into the kernel when it can be done in userspace. Do
> you really think using a blocking read(2) is somehow superior
> to using select(2) to wait for new data to be appended to the file?
>
> And even if we did this using a new open(2) mode, are you saying we
> should have a separate executable in /bin which would then be
> identical to cat, except that it uses a different open(2) mode?
Yes, it would put more complexity into the kernel, but maybe it is conceptually elegant.
Consider a classic pipe or a socket and the behaviour of read(2) on those objects. The read(2) behaviour that Doug proposes for a regular file would bring it in line with that for a classic pipe or a socket. Hence, maybe it should not be a mode, but the standard behaviour.
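To make the contrast concrete, here is a bounded sketch (my own illustration, not tail's actual source) of the polling a tail -f style follower must do today. Under Doug's proposed semantics, the size-check-and-sleep loop would collapse to a single read(2) that blocks until a writer appends.

```shell
# Simulate a polling follower on a growing file (bounded for illustration;
# a real follower would sleep between polls and loop forever).
f=$(mktemp)
offset=0
for i in 1 2 3; do
    printf 'line %s\n' "$i" >> "$f"        # writer appends
    size=$(wc -c < "$f")
    if [ "$size" -gt "$offset" ]; then     # reader polls for growth
        tail -c +"$((offset + 1))" "$f"    # emit only the new bytes
        offset=$size
    fi
done
rm -f "$f"
```

The loop prints each appended line exactly once; the bookkeeping around offset and size is exactly the machinery that a blocking read at end-of-file would make unnecessary.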
I often think that around 1981 the Unix community missed an opportunity to really think through how networking should integrate with the foundations of Unix. It seems to me that at that time there was an opportunity to merge files, pipes and sockets into a coherent, simple framework. If the 8th edition file-system-switch had been introduced already in V6 or V7, maybe this would have happened.
On the other hand, the installed base was probably already too large in 1981 to still make breaking changes to core concepts. V7 may have been the last chance saloon for that.
Paul
Below is a response from two of the authors with my response to it
inline. Not very impressed. Hopefully they'll get a clue and up
their game. In any case, enough time spent on it.
Jon
Michael Greenberg writes:
>
> HotOS isn't exactly a conventional venue---you might notice that many
> if not most HotOS papers don't fit the outline you've given.
I'm aware of that, and have participated in many unconventional venues
myself. I wasn't saying that papers must fit that outline, but I do
believe that they should contain that information. There's a big
difference between a discussion transcript and a paper; I believe that
papers, especially those published under the auspices of a prestigious
organization such as the ACM, should adhere to a higher standard.
> I'd definitely appreciate detailed feedback on any semantic errors
> we've made!
Unfortunately I can't help you here; that was feedback from
a reader who doesn't want to spend any more time on this.
> Your summary covers much of what we imagined!
>
> As I understand it, the primary goals of the paper were to (a) help
> other academics think about the shell as a viable area for research, (b)
> highlight some work we're doing on JIT compilation, (c) make the case
> for JIT approaches to the shell in general (as it's well adapted to its
> dynamism), and (d) explore various axes on which the shell could be
> improved. It seems like we've done a good job communicating (b) to you,
> but perhaps not the rest. Sorry again to disappoint.
I certainly hope that you understand the primary goals of your own paper.
Point-by-point:
 (a) While this is a valid point I don't understand why the paper didn't
just state it in a straightforward manner. There are several common
forms. One is to list issues in the introduction while explaining
which one(s) will be addressed in the paper. Another is in the
conclusion where authors list work still to be done.
 (b) At least for me this goal was not accomplished because there were no
examples. Figure 1 by itself is insufficient given that the code
used to generate the "result" is not shown. It would have been much
more illuminating had the paper not only shown that code but also the
optimized result. Professionals don't blithely accept magic data.
 (c) The paper failed to make this case to me for several reasons.
   As I understand it, the paper is somewhat about applying JIT
   compilation techniques to interconnected processes. While most
   shells include syntax to support the construction of such, it's
   really independent of the shell. For completeness, I have a vague
   memory of shell implementations for non-multitasking systems that
   sequentially ran pipelined programs passing intermediate results
   via temporary files. The single "result" reminds me of something
that I saw at a science fair when my child was in middle school;
I looked at one team's results and asked "What makes you think that
a sample size of one is significant?" The lack of any benchmarks
or other study results that justified the work also undermined the
case. It reads more like the authors had something that they wanted
to play with rather than doing serious research. The paper does not
examine the percentage of shell scripts that actually benefit from
JIT compilation; for all the reader may know it's such a small number
that hand-optimizing just those scripts might be a better solution.
I suppose that the paper fits into the apparently modern philosophy
of expecting tools to fix up poorly written code so that programmers
don't have to understand what they're doing.
 (d) In my opinion the paper didn't do this at all. There was no
   analysis of "the shell" showing weaknesses and an explanation
   of why one particular path was taken. And there was no discussion
   of what is being done with shells that causes whatever problems you
   were addressing, or whether some up-front sanity checking might
   ameliorate them. Again, being a geezer I'm reminded of past events
   that repeat themselves. I joined a start-up company decades ago
   that was going to speed up circuit simulation 100x by plugging
   custom-designed floating-point processing boards into a multiprocessor
   machine. I managed to beat that 100x just by cleverly crafting the
database and memory management code. The fact that the company founder
never verified his idea led to a big waste of resources. But, he did
manage to raise venture capital which is akin to getting DARPA funds.
Nikos Vasilakis writes:
>
> To add to Michael's points, HotOS "position" papers are often
> intended as provocative, call-to-arms statements targeting primarily
> the research community (academic and industrial research). Our key
> position, which we possibly failed to communicate, is "Hey researchers
> everywhere, let's do more research on the shell! (and here's why)".
While provocative name-calling and false statements seem to have become
the political norm in America, I don't think that they're appropriate in
a professional context.
In my experience a call-to-arms isn't productive unless those called
understand the nature of the call. I'm reminded of something that happened
many decades ago; my partner asked me to proof a paper that she was writing
for her master's degree. I read it over with a confused look and asked her
what she was trying to say. She responded, and I told her to write that
down and stop trying to sound like someone else. Turned her into a much
better writer. So if the paper wanted to say "Hey researchers ..." it
should have done so instead of being obtuse.
To continue on this point and Michael's (a) above, I don't see a lot of
value in proclaiming that research can be done. I think that a more
fruitful approach is to cover what has been done, what you're doing,
and what you see but aren't doing.
> For our more conventional research papers related to the shell, which
> might cover your concerns about semantics, correctness, and
> performance please see next. These three papers also provide important
> context around the HotOS paper you read:
> ...
Tracking down your other work was key to understanding this paper. It's
my opinion that my having to do so is illustrative of the problems with the
paper.
> Thank you for taking the time to read our paper and comment on it.
> Could you please share the emails of everyone mentioned at the end of
> your email? We are preparing a longer report on a recent shell
> roundtable, and would love to get everyone's feedback!
While I appreciate the offer to send the longer report, it would only be
of interest if it was substantially more professional. There is no interest
in reviewing work that is not clearly presented, has not been proofread
and edited, includes unsubstantiated, pejorative, or meaningless statements,
includes incorrect examples or statistically irrelevant results. Likewise,
there is also no interest if the homework hasn't been done to put the
report in context with regards to prior art and other current work.
Jon Steinhart <jsacm(a)fourwinds.com>
Warner Losh <imp(a)bsdimp.com>
John Cowan <cowan(a)ccil.org>
Larry McVoy <lm(a)mcvoy.com>
John Dow <jmd(a)nelefa.org>
Andy Kosela <akosela(a)andykosela.com>
Clem Cole <clemc(a)ccc.com>
Steve Bourne does not want to give out his address
On the subject of tac (concatenate and print files in reverse), I can
report that the tool was written by my late friend Jay Lepreau in the
Department of Computer Science (now, School of Computing) at the
University of Utah. The GNU coreutils distribution for src/tac.c
contains a copyright for 1988-2020.
I searched my TOPS-20 PDP-10 archives, and found no source code for
tac, but I did find an older TOPS-20 executable in Jay's personal
directory with a file date of 17-Mar-1987. There isn't much else in
that directory, so I suspect that he just copied over a needed tool
from his Department of Computer Science TOPS-20 system to ours in the
College of Science.
----------------------------------------
P.S. Jay was the first to get Steve Johnson's Portable C Compiler,
pcc, to run on the 36-bit PDP-10, and once we had pcc, we began the
move from writing utilities in Pascal and PDP-10 assembly language to
doing them in C. The oldest C file for pcc in our PDP-10 archives is
dated 17-Mar-1981, with other pcc files dated to mid-1983, and final
compiler executables dated 12-May-1986. Four system header files are
dated as late as 4-Oct-1986, presumably patched after the compiler was
built.
Later, Kok Chen and Ken Harrenstien's kcc provided another C compiler
that added support for byte datatypes, where a byte could be anything
from 1 to 36 bits. The oldest distribution of kcc in our archives is
labeled "Fifth formal distribution snapshot" and dated 20-Apr-1988.
My info-kcc mailing list archives date from the list beginning, with
an initial post from Ken dated 27-Jul-1986 announcing the availability
of kcc at sri-nic.arpa.
By mid-1987, we had a dozen Sun workstations and an NFS fileserver; they
marked the beginning of our move to a Unix workstation environment,
away from large, expensive, and electricity-gulping PDP-10 and VAX
mainframes.
By the summer of 1991, those mainframes were retired. I recall
speaking to a used-equipment vendor about our VAX 8600, which cost
about US$450K (discounted academic pricing) in 1986, and was told that
its value was depreciating about 20% per month. Although many of us
missed TOPS-20 features, I don't think anyone was sad to say goodbye
to VMS. We always felt that the VMS developers worked in isolation
from the PDP-10 folks, and thus learned nothing from them.
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- University of Utah FAX: +1 801 581 4148 -
- Department of Mathematics, 110 LCB Internet e-mail: beebe(a)math.utah.edu -
- 155 S 1400 E RM 233 beebe(a)acm.org beebe(a)computer.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------
Some comments from someone (me) who tends to be pickier than
most about cramming programs together and endless sets of
options:
I, too, had always thought sed was older than head. I stand
corrected. I have a long-standing habit of typing sed 10q but
don't spend much time fussing about head.
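The habit is easy to check: sed's q command quits after printing the addressed line, so sed 10q duplicates head's default output.

```shell
# Print the first ten lines, no head(1) required:
seq 20 | sed 10q      # prints 1 through 10
seq 20 | head -n 10   # identical output
```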
When I arrived at Bell Labs in late summer 1984, tail -f was
in /usr/bin and in the manual; readslow was only in /usr/bin.
readslow was like tail -f, except it either printed the entire
file first or (option -e) started at the end of the file.
I was told readslow had come first, and had been invented in a
hurry because people wanted to watch in real time the moves
logged by one of Belle's chess matches. Vague memory says it
was written by pjw; the name and the code style seem consistent
with that.
Personally I feel like tail -r and tail -f both fit reasonably
well within what tail does, since both have to do with the
bottom of the file, though -r's implementation does make for
a special extra code path in tail so maybe a separate program
is better. What I think is a bigger deal is that I have
frequently missed tail -r on Linux systems, and somehow hadn't
spotted tac; thanks to whoever here (was it Ted?) pointed it
out first!
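For anyone stuck on a system with neither tail -r (BSD) nor tac (GNU coreutils), the classic sed hold-space idiom reverses line order portably:

```shell
# Reverse line order: G appends the hold space to each line after the
# first, h saves the accumulated reversal, $p prints it at end of input.
printf 'a\nb\nc\n' | sed -n '1!G;h;$p'    # prints c, b, a
```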
On the other hand, adding data-processing functions to cat has
never made sense to me. It seems to originate from a mistaken
notion that cat's focus is printing data on terminals, rather
than concatenating data from different places. Here is a test:
if cat -v and cat -n and all that make sense, why shouldn't
cat also subsume tr and pr and even grep? What makes converting
control characters and numbering lines so different from swapping
case and adding page headers? I don't see the distinction, and
so I think vis(1) (in later Research) makes more sense than cat -v
and nl(1) (in Linux for a long time) more sense than cat -n.
(I'd also happily argue that given nl, pr shouldn't number lines.
That a program was in V6 or V7 doesn't make it perfect.)
And all those special options to wc that amounted to doing
arithmetic on the output were always just silly. I'm glad
they were retracted.
On the other other hand, why didn't I know about tac? Because
there are so damn many programs in /usr/bin these days. When
I started with UNIX ca. 1980, the manual (even the BSD version)
was still short enough that one could sit down and read it through,
section by section, and keep track of what one had read, and
remember what all the different tools did. That hasn't been
true for decades. This could be an argument for adding to
existing programs (which many people already know about) rather
than adding new programs (which many people will never notice).
The real problem is that the system is just too damn big. On
an Ubuntu 18.04 system I run, ls /usr/bin | wc -l shows 3242
entries. How much of that is redundant? How much is rarely or
never used? Nobody knows, and I suspect few even try to find
out. And because nobody knows, few are brave enough to throw
things away, or even trim out bits of existing things.
One day in the late 1980s, I helped out with an Introduction
to UNIX talk at a DECUS symposium. One of the attendees noticed
the `total' line in the output of ls, and asked why is that there?
doesn't that contradict the principles of tools' output you've
just been talking about? I thought about it, and said yes,
you're right, that's a bit of old history and shouldn't be
there any more. When I got home to New Jersey, I took the
`total' line out of Research ls.
Good luck doing anything like that today.
Norman Wilson
Toronto ON
What was the first machine to run rogue? I understand that it was written
by Glenn Wichman and Michael Toy at UC Santa Cruz ca. 1980, using the
`curses` library (Ken Arnold's original, not Mary Ann's rewrite). I've seen
at least one place that indicates it first ran on 6th Edition, but that
doesn't sound right to me. The first reference I can find in BSD is in 2.79
("rogue.doc"), which also appears to be the first release to ship curses.
Anyone have any info? Thanks!
- Dan C.
In this week's BSDNow.tv podcast, available at
https://www.bsdnow.tv/409
there is a story about a new conference paper on the Unix shell. The
paper is available at
Unix shell programming: the next 50 years
HotOS '21: Workshop on Hot Topics in Operating Systems, Ann
Arbor, Michigan, 1 June, 2021--3 June, 2021
https://doi.org/10.1145/3458336.3465294
The tone is overall negative, though they do say nice things about
Doug McIlroy and Steve Johnson, and they offer ideas about
improvements.
List readers will have their own views of the paper. My own is that,
despite its dark corners, the Bourne shell has served us
extraordinarily well, and I have been writing in it daily for decades
without being particularly bothered by the many issues raised by the
paper's authors. Having dealt with so-called command shells on
numerous other operating systems, at least the Unix shells rarely get
in my way.
> From: Jon Steinhart
> I use UNIX instead of Unix as that's what I believe is the correct form.
Well, Bell documentation uses "UNIX" through V6:
https://minnie.tuhs.org//cgi-bin/utree.pl?file=V6/usr/doc/start/start
"Unix" starts to appear with V7:
https://minnie.tuhs.org//cgi-bin/utree.pl?file=V7/usr/doc/setup
As mentioned, the trademark is "UNIX".
I don't really have a fixed position on _the_ way to spell it: when I'm
writing about a specific version (e.g. V6) I use the capitalization as of
that version; for general text I'd probably use 'Unix', as that seems to be
general now. But I could easily be convinced otherwise.
Noel
Once again, thanks to everybody who has contributed to making this a
better letter. Many of you have asked to be co-signers. Please
let me know if I've included your name by mistake or if you'd like
your name to be added. And, of course, let me know if any more
edits are required.
BTW, except where I'm quoting the paper I use UNIX instead of Unix
as that's what I believe is the correct form. Please let me know
if that's not correct.
Thanks,
Jon
I read the "Unix Shell Programming: The Next 50 Years" paper
expecting some well thought out wisdom. I was sorely disappointed.
The paper is lacking the generally accepted form of:
o What problem are you trying to solve?
o What have others done?
o What's our approach?
o How does it do?
Some particulars:
o The paper never defines what is meant by the term "Unix shell."
I think that you're using it to mean a command interpreter as
described in the POSIX 1003.2 documents.
o The paper makes liberal use of the term "Unix" such as "... in
every Unix distribution." While systems descended from UNIX
abound, few actual instances of UNIX exist today.
o There is no 50-year-old UNIX shell. I started using UNIX in the
early 1970s, and the command interpreter at the time (Ken Thompson's
shell) was nothing like later shells such as the Bourne shell (sh
since research V7 UNIX), Korn shell (ksh), C shell (csh), and the
Bourne again shell (bash). UNIX mainstreamed the notion of a
command interpreter that was not baked into the system. The paper
lacks any discussion of prior art. In practice, shell implementations
either predate the POSIX standard or were built afterwards and
include non-standard extensions.
o The paper repeatedly claims that the shell has been largely ignored by
academia and industry. Yet, it does not include any references to
support that claim. In fact, the large body of published work on
shells and ongoing work on shells such as zsh shows that claim to be
incorrect.
o The large number of pejorative statements detracts from the academic
value of the paper. And, in my opinion, these statements are provably
false. It reads as if the authors are projecting their personal
opinions onto the rest of the world.
o The paper applies universal complaints such as "unmaintainable" to the
shell; it doesn't call out any shell-specific problems. It doesn't
explain whether these complaints are against scripts, implementations,
or both. One of the reasons for the longevity of the family of shells
descended from Bourne's sh is that experienced practitioners have been
able to write easily maintainable code. Scripts written in the 1980s
are still around and working fine.
o The paper seems to complain about the fact that the shell is documented.
This is astonishing. Proper documentation has always been a key
component of being a professional, at least in my decades of experience.
As a matter of fact, my boss telling me that "nobody will give a crap
about your work unless you write a good paper" when I was a teenager
at Bell Labs is what led me to UNIX and roff.
o The paper includes non-sequiturs such as discussions about Docker
and systemd that have nothing to do with the shell.
o The paper has many "no-op" statements such as "arguably improved" that
provide no useful information.
o The example on page 105 doesn't work as there is no input to "cut".
o The single result in Figure 1 is insufficient evidence that the
approach works on a wide variety of problems.
o The paper gives the appearance that the authors don't actually understand
the Bourne shell semantics. Not just my opinion; Steve Bourne expressed
that in an email to me after he read your paper, and I consider him to be
a subject matter expert.
o The paper confuses the performance of the shell with the performance of
external commands executed by the shell.
o Proofreading should have caught things like "improve performance
performance" on page 107 among others.
I think that the paper is really trying to say:
o Programmable command interpreters such as those found in UNIX based
systems have been around for a long time. For this paper, we're
focusing on the GNU bash implementation of the POSIX P1003.2 shell.
Other command interpreters predate UNIX.
o This implementation is used more often than many other scripting
languages because it is available and often installed as the default
command interpreter on most modern systems (UNIX-based and otherwise).
In particular, it is often the default for Linux systems.
o The shell as defined above is being used in more complex environments
than existed at the time of its creation. This exposes a new set of
performance issues.
o While much work has been done by the bash implementers, it's primarily
been in the area of expanding the functionality, usually in a
backward-compatible manner. Other shells such as the original ksh and
later ash and zsh were implemented with an eye towards the performance
of the internals and user perspectives.
o Performance optimization using modern techniques such as JIT compilation
has been applied to other languages but not to POSIX shell implementations.
This paper looks at doing that. It is unsurprising that techniques that
have worked elsewhere work here too.
It's hard to imagine that the application of this technique is all that's
required for a 50-year life extension. The title of this paper implies
that it's going to be comprehensive but instead concentrates on a couple
of projects. It ignores other active work on shells such as "fish". While
it wouldn't eliminate the issues with the paper, they would not have been
quite so glaring had it had a more modest title such as "Improving POSIX
Shell Performance with JIT Compilation".
Jon Steinhart plus John Cowan, Warner Losh,
John Dow, Steve Bourne, Larry McVoy, and Clem Cole
I not only found this paper offensive, but was more offended that
ACM would publish something like this and give it an award to boot.
I'm planning to send the authors and ACM what's below. Would
appreciate any feedback that you could provide to make it better.
Thanks,
Jon
I read your "Unix Shell Programming: The Next 50 Years" expecting
some well thought out wisdom from learned experiences. I was
sorely disappointed.
o The paper never defines what is meant by the term "Unix shell."
I think that you're using it to mean a command interpreter as
described in the POSIX 1003.2 documents.
o There is no 50-year-old Unix shell. I started using Unix in the
early 1970s, and the command interpreter at the time (Ken Thompson's
shell) was nothing like later shells such as the Bourne shell (sh
since research V7 Unix), Korn shell (ksh), C shell (csh), and the
Bourne again shell (bash). The paper is missing any discussion of
prior art. In practice, shell implementations either predate the
POSIX standard or were built afterwards and include non-standard
extensions.
o The paper repeats a claim that the shell has been largely ignored by
academia and industry. Yet, it does not include any references that
support that claim. My own experience and thus opinion is the
opposite, making the veracity of your claim questionable. As a reader,
such unsubstantiated claims make me treat the entire content as suspect.
o The paper applies universal complaints such as "unmaintainable" to the
shell; it doesn't call out any shell-specific problems. It doesn't
explain whether these complaints are against the scripts, the
implementation, or both. One of the reasons for the longevity of the
sh/bash shells is that experienced practitioners have been able to
write easily maintainable code. Scripts written in the 1980s are
still around and working fine.
o The paper seems to complain that the fact that the shell is documented
is a problem. This is an astonishing statement. In my decades as a
practicing professional, teacher, and author, proper documentation is a key
component of being a professional.
o The paper is full of non-sequiturs such as discussions about Docker
and systemd that have nothing to do with the shell.
o The paper has many "nop" statements such as "arguably improved" that
don't actually say anything.
o Examples, such as the one on page 105, don't work.
o Proofreading should have caught things like "improve performance
performance" on page 107 among others.
o The references contain many more items than the paper actually
references. Did you plagiarize the bibliography and forget to
edit it?
o The single result in Figure 1 is insufficient evidence that the
approach works on a wide variety of problems.
o The paper makes it appear that the authors don't actually understand
the semantics of the original Bourne shell. Not just my opinion; I
received an email from Steve Bourne after he read your paper, and I
consider him to be a subject matter expert.
The paper is lacking the generally accepted form of:
o What problem are you trying to solve?
o What have others done?
o What's our approach?
o How does it do?
Filtering out all of the jargon added for buzzword compliance, I think
that the paper is really trying to say:
o Programmable command interpreters such as those found in Unix-based
systems have been around for a long time. For this paper, we're
focusing on the GNU bash implementation of the POSIX P1003.2 shell.
Other command interpreters predate Unix.
o This implementation is used more often than many other scripting
languages because it is available and often installed as the default
command interpreter on most modern systems (Unix-based or otherwise).
In particular, it is often the default for Linux systems.
o The shell as defined above is being used in ways that are far more
complex than originally contemplated when Bourne created the original
syntax and semantics, much less the new features added by the POSIX
standards committee. The combination of both the POSIX and bash
extensions to the Bourne shell exposes a new set of limitations and
issues such as performance.
o While much work has been done by the bash implementors, it's primarily
been in the area of expanding the functionality, usually in a
backward-compatible manner. Other shells such as the original ksh and
later ash and zsh were implemented with an eye towards the performance
of the internals and user perspectives.
o Performance optimization using modern techniques such as JIT has been
applied to other languages but not to POSIX shell implementations. This
paper looks at doing that. It is unsurprising that techniques that have
worked elsewhere work here too.
o It's hard to imagine that the application of this technique is all that's
required for a 50-year life extension. The title of this paper implies
that it's going to be comprehensive rather than just being a shameless
plug for an author's project.
Of course, this doesn't make much of a paper. Guessing that that's why it
was so "bulked up" with irrelevancies.
It appears that all of you are in academia. I can't imagine that a paper
like this would pass muster in front of any thesis committee, much less
get that far. Not only for content, but for lack of proofreading and
editing. The fact that the ACM would publish such a paper eliminates any
regret that I may have had in dropping my ACM membership.
Thanks to everyone who provided me feedback on the first pass, especially
those who suggested "shopt -u flame-off". Here's the second version.
Once again, would appreciate feedback.
Thanks,
Jon
I read your "Unix Shell Programming: The Next 50 Years" expecting
some well thought out wisdom. I was sorely disappointed.
The paper is lacking the generally accepted form of:
o What problem are you trying to solve?
o What have others done?
o What's our approach?
o How does it do?
Some particulars:
o The paper never defines what is meant by the term "Unix shell."
I think that you're using it to mean a command interpreter as
described in the POSIX 1003.2 documents.
o There is no 50-year-old Unix shell. I started using Unix in the
early 1970s, and the command interpreter at the time (Ken Thompson's
shell) was nothing like later shells such as the Bourne shell (sh
since research V7 Unix), Korn shell (ksh), C shell (csh), and the
Bourne again shell (bash). Unix mainstreamed the notion of a
command interpreter that was not baked into the system. The paper
lacks any discussion of prior art. In practice, shell implementations
either predate the POSIX standard or were built afterwards and
include non-standard extensions.
o The paper repeats a claim that the shell has been largely ignored by
academia and industry. Yet, it does not include any references that
support that claim. In fact, the large body of published work on
shells and ongoing work on shells such as zsh shows that claim to be
incorrect.
o The paper applies universal complaints such as "unmaintainable" to the
shell; it doesn't call out any shell-specific problems. It doesn't
explain whether these complaints are against the scripts, the
implementation, or both. One of the reasons for the longevity of the
family of shells descended from Bourne's sh is that experienced
practitioners have been able to write easily maintainable code.
Scripts written in the 1980s are still around and working fine.
o The paper seems to complain that the fact that the shell is documented
is a problem. This is astonishing. Proper documentation has always
been a key component of being a professional in my decades of experience.
As a matter of fact, my boss telling me that "nobody will give a crap
about your work unless you write a good paper" when I was a teenager
at Bell Labs is what led me to UNIX and nroff.
o The paper includes non-sequiturs such as discussions about Docker
and systemd that have nothing to do with the shell.
o The paper has many "nop" statements such as "arguably improved" that
don't actually say anything.
o Examples, such as the one on page 105, don't work as there is no input
to "cut".
o The single result in Figure 1 is insufficient evidence that the
approach works on a wide variety of problems.
o The paper gives the appearance that the authors don't actually understand
the semantics of the original Bourne shell. Not just my opinion; I
received an email from Steve Bourne after he read your paper, and I
consider him to be a subject matter expert.
o Proofreading should have caught things like "improve performance
performance" on page 107 among others.
I think that the paper is really trying to say:
o Programmable command interpreters such as those found in Unix-based
systems have been around for a long time. For this paper, we're
focusing on the GNU bash implementation of the POSIX P1003.2 shell.
Other command interpreters predate Unix.
o This implementation is used more often than many other scripting
languages because it is available and often installed as the default
command interpreter on most modern systems (Unix-based or otherwise).
In particular, it is often the default for Linux systems.
o The shell as defined above is being used in ways that are far more
complex than originally contemplated when Bourne created the original
syntax and semantics, much less the new features from ksh adopted by
the POSIX standards committee. The combination of both the POSIX and
bash extensions to the Bourne shell exposes a new set of limitations
and issues such as performance.
o While much work has been done by the bash implementors, it's primarily
been in the area of expanding the functionality, usually in a
backward-compatible manner. Other shells such as the original ksh and
later ash and zsh were implemented with an eye towards the performance
of the internals and user perspectives.
o Performance optimization using modern techniques such as JIT compilation
has been applied to other languages but not to POSIX shell implementations.
This paper looks at doing that. It is unsurprising that techniques that
have worked elsewhere work here too.
It's hard to imagine that the application of this technique is all that's
required for a 50-year life extension. The title of this paper implies
that it's going to be comprehensive but instead concentrates on a couple
of projects. It ignores other active work on shells such as "fish". While
the issues with the paper remain, they would not have been quite so glaring
had it had a more modest title such as "Applying JIT Compilation to the
POSIX Shell".