Every time someone starts spouting about how unsafe
C is, and how all the world's problems would be solved
if only people would stop using it, I think of Flon's
Axiom, for 35 years my favourite one-liner about
programming and languages:
There does not now, nor will there ever, exist a
programming language in which it is the least bit
hard to write bad programs.
Flon's Axiom comes from a short note, "On Research
in Structured Programming", published in SIGPLAN
Notices in October 1975. It's just as true today.
Over the years I've seen people misinterpret the
Axiom as an argument against looking for better
programming languages at all, but that's not what
it means. (Read the original note--it's a page
and a half--for full context; it is, alas, behind
ACM's Digital Library paywall.) There are certainly
languages that make certain sorts of mistakes easier
or harder, or are easier or harder to read, but in
the end most of that really is up to the programmer.
Programming well requires a lot of thought and care
and careful rereading, and often throwing half the
code out and re-doing it better, and until we can
have a programming community the majority of whom
are up to those challenges, we will continue to have
crashes and security vulnerabilities and other
embarrassing bugs aplenty, no matter what language
is used.
Norman Wilson
Toronto ON
The latest issue of the IEEE Annals of the History of Computing was published
electronically today, and it has an article that I expect many
TUHS list readers will enjoy reading:
Notes on the History of Fork and Join
http://dx.doi.org/10.1109/MAHC.2016.34
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- University of Utah FAX: +1 801 581 4148 -
- Department of Mathematics, 110 LCB Internet e-mail: beebe(a)math.utah.edu -
- 155 S 1400 E RM 233 beebe(a)acm.org beebe(a)computer.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------
All, sorry this is slightly off-topic. I'm trying to
find out what fstat(2) returns when the file descriptor
is a pipe. The POSIX/Open Group documentation doesn't
really specify what should be returned. Does anybody have
any pointers?
Thanks, Warren
P.S. Why? xv6 has fstat() but returns an error if the
file descriptor isn't associated with an i-node. I'm
trying to work out if/how to fix it.
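The question can be probed directly on any POSIX-like system. The following is a minimal sketch, not a statement of what any standard requires: the helper names probe_pipe_fstat and print_pipe_stat are made up for this illustration, and only the file type (S_ISFIFO) is something one should expect to be consistent; the link count and size are printed precisely because they vary between systems.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a pipe, write nbytes of zeros into it,
 * and fstat() the read end.  Returns 0 on success, -1 on any failure. */
int probe_pipe_fstat(size_t nbytes, struct stat *out)
{
    int fds[2];
    char buf[64] = {0};
    int rc;

    assert(out != NULL);
    assert(nbytes <= sizeof buf);   /* stay far below any pipe buffer size */

    if (pipe(fds) == -1)
        return -1;
    if (nbytes > 0 && write(fds[1], buf, nbytes) != (ssize_t)nbytes) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    rc = fstat(fds[0], out);        /* the call the thread is asking about */
    close(fds[0]);
    close(fds[1]);
    return rc;
}

/* Print the fields discussed in the thread; values other than the
 * file type are system-dependent. */
void print_pipe_stat(const struct stat *st)
{
    printf("is FIFO: %d  nlink: %lu  size: %lld\n",
           S_ISFIFO(st->st_mode) ? 1 : 0,
           (unsigned long)st->st_nlink,
           (long long)st->st_size);
}
```

Calling probe_pipe_fstat(10, &st) followed by print_pipe_stat(&st) from a small main() on several systems is a quick way to see how much the answers differ.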
I remember once, long ago--probably in the early 1980s--writing
a program that expected fstat on a pipe to return the amount of
data buffered in the pipe. It worked on the system on which
I wrote the code. Then I tried it on another, related but
different UNIX, and it didn't work. So if POSIX/SUS don't
prescribe a standard, I don't think one should pretend there
is one, and (as I learned back then) it's unwise to depend
on the result, except I think it's fair not to expect fstat
to fail on any valid file descriptor.
I'm pretty sure that in 7/e and earlier, fstat on a pipe
reported a regular file with zero links. There was a reason
for this: the kernel in fact allocated an i-node from a
designated pipe device (pipedev) file system, usually the
root. So the excuse that `there's no i-node' was just wrong.
In last-generation Research systems, when pipes were streams
(and en passant became full duplex, which caused no trouble
at all but simplified life elsewhere--I think I was the one
who realized that meant we didn't need pseudo-ttys any more),
the system allocated a pair of in-core i-nodes when a pipe
was created. As long as such an i-node cannot be accidentally
confused with one belonging to any disk file system, this
causes no trouble at all, and since it is possible to have
more than one disk file system this is trivially possible
just by reserving a device number. (In fact by then our
in-core i-nodes were marked with a file system type as well,
and pipes just became their own file system.) stat always
returned size 0 for (Research) stream pipes, partly because
nobody cared enough, partly because the implementation of
streams didn't keep an exact count of all the buffered data
all along the stream, just a rough one sufficient for flow
control. Besides, with a full-duplex pipe, which direction's
data should be counted?
Returning to the original question, I'd suggest that:
-- fstat(fd) where fd is a pipe should succeed
-- the file should be reported to have zero links,
since that is the case for a pipe (unless a named pipe,
but if you support those you probably have something
else to stat anyway)
-- the file type should be IFIFO if that type exists
in xv6 (which it wouldn't were it a real emulation of
6/e, but I gather that's not the goal), IFREG otherwise
-- permissions probably don't matter much, but for
sanity's sake should be some reasonable constant.
Norman Wilson
Toronto ON
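The suggestions above can be sketched as a small self-contained model. Everything here is hypothetical: the toy_* names are invented for illustration and are not xv6's own structures or constants. The list above says nothing about the size field, so reporting the count of unread bytes is an assumption of this sketch; returning a constant 0, as the Research stream pipes did, would be equally defensible.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; they are not taken from xv6. */
enum toy_type { TOY_IFREG, TOY_IFIFO };

struct toy_stat {
    enum toy_type type;   /* file type reported to the caller */
    int nlink;            /* link count */
    unsigned mode;        /* permission bits */
    unsigned size;        /* reported size in bytes */
};

struct toy_pipe {
    unsigned nread;       /* total bytes read from the pipe so far */
    unsigned nwrite;      /* total bytes written to the pipe so far */
};

#define TOY_PIPE_MODE 0600u   /* "some reasonable constant" */

/* Fill *st for a pipe.  Always succeeds, per the first suggestion. */
int toy_pipe_stat(const struct toy_pipe *p, struct toy_stat *st)
{
    assert(p != NULL && st != NULL);
    st->type = TOY_IFIFO;              /* IFIFO, assuming the type exists */
    st->nlink = 0;                     /* an unnamed pipe has no links */
    st->mode = TOY_PIPE_MODE;
    st->size = p->nwrite - p->nread;   /* assumption: unread byte count */
    return 0;
}
```

Keeping the permissions in one named constant means every pipe reports the same value, which is about all that "sanity's sake" asks for.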
> From: Warren Toomey
> xv6 is a Unix-like OS written for teaching purposes.
I'm fairly well-aware of Xv6; I too am planning to use it in a project.
But back to the original topic, it sounds like there's a huge amount of
variance in the semantics of doing fstat() on a pipe. V6 doesn't special-case
it in any way, but it sounds as if other systems do.
What V6 does (to complete the list) is grow the temporary file being used to
buffer the pipe contents up to a certain maximum size, whereupon it halts the
writer, and waits for the reader to catch up - at which point it truncates
the file, and adjusts the read and write pointers back to 0. So fstat() on
V6, which doesn't special-case pipes in any way for fstat(), apparently
returns 'waiting_to_be_read' plus 'already_read'.
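The bookkeeping described in the paragraph above can be modelled in a few lines. This is a toy sketch, not the V6 kernel code: the toy_v6_* names are invented, no data is actually stored, and the model truncates as soon as the reader catches up rather than reproducing the kernel's exact timing. The 4KB limit is the figure quoted elsewhere in this thread.

```c
#include <assert.h>

#define TOY_PIPSIZ 4096u   /* maximum size of the backing file (4KB) */

struct toy_v6_pipe {
    unsigned size;   /* length of the temporary file, i.e. the write pointer */
    unsigned roff;   /* read pointer: bytes already consumed */
};

/* Append up to n bytes.  Returns the number accepted; 0 means the file
 * is at its maximum size and a real writer would be put to sleep. */
unsigned toy_v6_write(struct toy_v6_pipe *p, unsigned n)
{
    unsigned room;

    assert(p->roff <= p->size && p->size <= TOY_PIPSIZ);
    room = TOY_PIPSIZ - p->size;
    if (n > room)
        n = room;
    p->size += n;
    return n;
}

/* Consume up to n bytes.  When the reader has caught up with the writer,
 * truncate the file and reset both pointers to 0. */
unsigned toy_v6_read(struct toy_v6_pipe *p, unsigned n)
{
    unsigned avail;

    assert(p->roff <= p->size);
    avail = p->size - p->roff;
    if (n > avail)
        n = avail;
    p->roff += n;
    if (p->roff == p->size) {
        p->roff = 0;
        p->size = 0;
    }
    return n;
}

/* With no special case for pipes, fstat() reports the file length:
 * bytes waiting to be read plus bytes already read. */
unsigned toy_v6_fstat_size(const struct toy_v6_pipe *p)
{
    return p->size;
}
```

After writing 100 bytes and reading 40, the model reports a size of 100, not 60, which is the 'waiting_to_be_read' plus 'already_read' behaviour described above.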
>>> xv6 has fstat() but returns an error if the file descriptor isn't
>>> associated with an i-node.
>> ?? All pipe file descriptors should have an inode?
To answer my own question, after a quick look at the Xv6 sources (on my
desktop ;-); it turns out that Xv6 handles pipes completely differently;
instead of borrowing an inode, they have special 'pipe' structures. Hence the
error return in fstat() on Xv6. (That difference also limits the amount of
buffered data in a pipe to 512 bytes. So don't expect high throughput from a
pipe on Xv6! :-)
So I guess you get to pick which semantics you want fstat() on a pipe to have
there: V6's, V7's (see below), or something else! :-)
> 7th Ed seems to return the amount of free space in the pipe, if I read
> the code correctly:
I'm not sure of that (see below), but I think it would make more sense to
return the amount of un-read data (which is what I think it does do), as the
closest semantics to fstat() on a file.
It might also make sense to return the amount of free space (to a writer), and
the amount of data available to read (to a reader), since those are the
numbers users will care about. (Although then fstat() on the write side of a
pipe will have semantics which are inconsistent with fstat() on files. And if
the user code knows the maximum amount of buffering in a pipe, it could work
out the available write space from that, and the amount currently un-read.)
> fstat()
> {
> ...
> /* Call stat1() with the current offset in the pipe */
> stat1(fp->f_inode, uap->sb, fp->f_flag&FPIPE? fp->f_un.f_offset: 0);
> }
> stat1()
> {
> ...
> ds.st_size = ip->i_size - pipeadj;
I'm too lazy to go read the code (even though I already have it :-), but V7
seems to usually be very similar to V6. So, what I suspect this code does is
pass the expression:
((fp->f_flag & FPIPE) ? fp->f_un.f_offset : 0)
as 'pipeadj' (to account for the amount that's already been read), and then
returns (ip->i_size - pipeadj), i.e. the amount remaining un-read, as the
size.
Noel
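The reading of the V7 code given above boils down to one subtraction. A minimal sketch, assuming that interpretation is right (the message itself says it was made without rereading the source); toy_v7_st_size is a hypothetical name, and plain long values stand in for the kernel's types.

```c
#include <assert.h>

/* i_size:   length of the inode backing the open file
 * f_offset: current offset of this open file (the read position for a pipe)
 * is_pipe:  nonzero when the FPIPE flag would be set
 *
 * Mirrors "ds.st_size = ip->i_size - pipeadj" from the quoted fragment. */
long toy_v7_st_size(long i_size, long f_offset, int is_pipe)
{
    long pipeadj = is_pipe ? f_offset : 0;   /* bytes already read */

    assert(pipeadj >= 0 && pipeadj <= i_size);
    return i_size - pipeadj;                 /* bytes still unread */
}
```

For a pipe whose backing file holds 100 bytes of which 40 have been read, this yields 60; for an ordinary file the adjustment is 0 and the full length is reported.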
> From: Warren Toomey
> I'm trying to find out what fstat(2) returns when the file descriptor
> is a pipe.
In V6, it returns information about the file (inode) used as a temporary
storage area for data which has been written into the pipe, but not yet read;
i.e. it's an un-named file with a length which varies between 0 and 4KB.
> xv6 has fstat() but returns an error if the file descriptor isn't
> associated with an i-node.
?? All pipe file descriptors should have an inode?
Noel
Hi all, I'm working on a Unix-related project, and I thought I'd ask if
anybody here might help.
There's a pared-down Unix-like system, xv6, which is inspired by 6th Edition
Unix and the Lions Commentary. Its purpose is to teach OS principles.
The website and book are here:
https://pdos.csail.mit.edu/6.828/2014/xv6.html
https://pdos.csail.mit.edu/6.828/2014/xv6/book-rev8.pdf
Unfortunately, while the kernel is nice, they don't provide much of
a run-time environment, so it feels too much of a toy to use. I had the
idea of porting a small set of libraries and commands over to get it to
the point where it feels a bit like 7th Edition.
I've made a start by using the Minix 2.0 libraries and commands, see
https://github.com/DoctorWkt/xv6-minix2 and the NOTES file. I now realise
that bringing up a libc plus associated commands will involve a fair bit of
work.
So, if anybody is interested in helping, let me know.
Thanks in advance, Warren
Dave Horsfall:
Not Henry Spencer, perchance?
=====
Since the Canadian in question had been working in the US since
1964 or so, he must by now be pushing 70 years old.
I haven't seen Henry for some years, but I don't think he has
aged that much.
Norman Wilson
Toronto ON
> Date: Sat, 30 Jul 2016 15:30:36 +0000
> From: Michael Kjörling <michael(a)kjorling.se>
> To: tuhs(a)tuhs.org
> Subject: Re: [TUHS] History repeating itself
>
> On 30 Jul 2016 10:15 -0400, from cowan(a)mercury.ccil.org (John Cowan):
>>> Who needs FedEx?
>>
>> Well, latency counts for something too, as does radius: if I want to
>> send bulk data from New York to London (a very normal thing to do),
>> your station wagon isn't going to count for much.
>
> You could, however, get an economy class flight ticket and load up
> your suitcase with either HDDs or SDXCs (I suspect SDXCs would be
> better per amount of data from the perspective of both volume and
> weight, and would take better to handling). Given FedEx's prices,
> _once you have the infrastructure set up_ (which you'll need whether
> you have someone travel with the media, by air or by stationwagon, or
> FedEx it), that _might_ even compare favorably in terms of bytes
> transferred per second per dollar. (Now that's a measurement of
> throughput I don't think I've seen before; B/s/$.) Of course, you'd
> need someone who can babysit the suitcase, which potentially adds to
> the cost, but the stationwagon traditionally hasn't been self-driving
> either, and most of a transatlantic flight isn't active time on part
> of the person travelling with the suitcase so you could go with an
> overnight flight and allow the person to sleep.
>
> If you want to reduce the risk of the bag getting handled roughly or
> lost in handling, reduce the above to carry-on luggage; it will still
> provide a quite respectable throughput.
>
> ... ...
>
> It might not be the absolute cheapest approach, but it seems rather
> hard to beat in terms of throughput per dollar for bulk data transfer,
> especially if you already have someone who would travel anyway and can
> be convinced to take a company-approved suitcase in return for having
> their ticket paid for.
>
> --
> Michael Kjörling • https://michael.kjorling.se • michael(a)kjorling.se
> “People who think they know everything really annoy
> those of us who know we don’t.” (Bjarne Stroustrup)
>
To set up the 'infrastructure' might be the tricky part. Many years ago
I flew from Montreal to Amsterdam and had two stacks of 5-1/4"
diskettes with me. No papers, confiscated in Amsterdam.
Cheers,
Rudi
Hi folks,
My root partition for Unix v6 is almost full and /dev/rk0 only has 83 blocks.
The trouble is I wanted to compile bc.y and I think it needs around
300 blocks of temporary space. I was wondering if there was a way to
set up Unix v6 so that it could use one of the other drives for tmp
space. I tried to set up a link using ln but it seems I can't link
across filesystems.
The exact error is "26: Intermediate file error".
I managed to rearrange things so that /dev/rk0 had over 300 blocks of
free space and it fixed the problem, but I'm curious if there was
another solution.
Mark
Clem Cole:
Also to be fair, Dennis did symlinks before 4.2. They were part of the V8
I believe.
=======
I'm pretty sure they came from Berkeley nevertheless. I don't know
the exact order of events, but the 8th Edition kernel was essentially
that from one of the later 4.1x BSDs, hacked in 1127 to remove sockets
and FFS (were they even there yet?), then to add Dennis's stream I/O
system, Tom Killian's original /proc, and Peter Weinberger's neta
network-file-system client. Perhaps a few other hooks as well.
Symlinks were already there, and although we made some limited careful
use of them, they made nobody very happy because they made such a big
irregular lump in so many things: file system no longer a tree,
difference between stat and lstat, and so on.
One thing 8/e did differently from Berkeley was that ls by default
hid symlinks rather than trotting them out proudly. If f was a
symlink, ls -l f showed the state of the target file, not that of
the link; one had to do ls -lL f to see the symlink itself.
That reflected a general feeling that symlinks should be neither
seen nor heard unless necessary.
Norman Wilson
Toronto ON
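The "difference between stat and lstat" mentioned above is easy to see on a modern POSIX system. A minimal sketch: the helper name compare_stat_lstat and the /tmp file names are made up for this illustration, and it assumes /tmp is writable. stat() follows the link and describes the target; lstat() describes the link itself.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create a temporary file and a symlink to it, then
 * record whether stat() and lstat() each report the link as a symlink.
 * Returns 0 on success, -1 on failure.  Cleans up after itself. */
int compare_stat_lstat(int *stat_is_link, int *lstat_is_link)
{
    char target[] = "/tmp/symdemoXXXXXX";
    char linkname[64];
    struct stat via_stat, via_lstat;
    int fd, rc = -1;

    assert(stat_is_link != NULL && lstat_is_link != NULL);

    fd = mkstemp(target);              /* creates an ordinary file */
    if (fd == -1)
        return -1;
    close(fd);

    snprintf(linkname, sizeof linkname, "%s.lnk", target);
    if (symlink(target, linkname) == 0) {
        if (stat(linkname, &via_stat) == 0 &&
            lstat(linkname, &via_lstat) == 0) {
            *stat_is_link = S_ISLNK(via_stat.st_mode) ? 1 : 0;
            *lstat_is_link = S_ISLNK(via_lstat.st_mode) ? 1 : 0;
            rc = 0;
        }
        unlink(linkname);
    }
    unlink(target);
    return rc;
}
```

An ls that reports the target's state by default, as described above for 8/e, corresponds to calling stat(); showing the link itself corresponds to lstat().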
William Pechter:
Only thing I can think of is add another drive or partition and mount it
as /tmp.
=====
You say that as if it's a bad thing.
Norman Wilson
Toronto ON
mount >> ln -s
Just to be clear: I don't pine at all for UUCP.
I do still think it's a mistake that e-mail addresses and
domain names run backwards from the way directories and
filenames run. That's what I miss about !norman vs
norman@.
But it's all a Beta-vs-VHS matter these days, like a lot
of unfortunate design decisions that have become standard
over the years. Like git winning out over hg, which is
sort of like the VAX/VMS command language winning out over
the Bourne shell. (To toss another pebble into the pond
to see what the ripples look like, rather in the manner
of Rob and Dave.)
Norman Wilson
Toronto ON
I recently noticed that lpr has a symlink option ("-s") on Solaris but
not on Apple. Is there anything here historically except prudence and
small drives?
N.
> I heard that Bob Morris was asked for his initials, he said “rm”, they insisted on a middle initial, which he didn’t have, so he supplied “h”, hence “rhm”.
True in principle, but when it happened and who "they" were, is lore
beyond my ken. I presume it was before he joined Bell Labs. At the
labs, interoffice communications typically used initials, so the
DMR, JFO, RHM convention was well established. Only the affectation
of lower-case only was new--and that was the fault of unicase Model
33. Who wanted to SHOUT EVERYTHING they wrote, or litter it with escapes?
doug
Google was not the first place Rob and Dave had fun with names.
At one point, Rob had a duplicate entry in /etc/passwd,
with login name r, password empty, normal userid/groupid/home
directory, special shell. The shell program checked whether
it was running on a particular host and a particular hardwired
serial line: if yes, it ran the program that started the Research
version of the window system for our bitmapped terminals;
otherwise it just exited. The idea seemed to be to let him
log in quickly in his office.
I think that by the time I arrived at Bell Labs he'd stopped
using it, because it no longer worked, because we no longer
ran serial lines directly from computers to offices--everyone
was connected via serial-port Datakit instead.
While I was there, senior management bought a Cray X-MP/24 for
the research group. (Thank you for using AT&T.) Since it too
was accessible via Datakit (using a custom hardware interface
built by Alan Kaplan, but that's another story), it had to have
a hostname. It was either Dave or Rob, I forget which, who
suggested 3k, because (a) it was a supercomputer, so `big bang'
seemed to fit; (b) it was Arno Penzias, then VP for Research,
who got us the money, so `big bang' and 3K radiation seemed
even more appropriate; and, most important, (c) it was fun to
see whether a hostname beginning with a digit broke anything.
So far as I recall, nothing broke. Some people who were
involved with TCP/IP networking at the labs were frightened
about it; I don't remember whether that Cray was ever connected
to an IP network so I don't know whether anything went wrong
there. Of course such names are not a problem today, but
in those long-lost days when nobody worried much about buffer
overflows either, such bugs were much more common. Weren't they?
Norman Wilson
Toronto ON
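The kind of breakage a name like 3k might have exposed can be sketched as follows. This is purely illustrative: both functions are hypothetical, and nothing in the account above says any particular program contained such a check. It does fit the period, though, since the original host-name rules (RFC 952) required a leading letter and the later host requirements (RFC 1123) relaxed that. The sketch contrasts a shortcut that treats any name starting with a digit as a numeric address with a check that actually parses the string.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Hypothetical shortcut: "starts with a digit, so it must be an address".
 * A host called 3k is misclassified by this test. */
int naive_is_address(const char *name)
{
    assert(name != NULL);
    return isdigit((unsigned char)name[0]) ? 1 : 0;
}

/* Stricter check: accept only strings that parse as a dotted-quad
 * IPv4 address. */
int strict_is_address(const char *name)
{
    struct in_addr addr;

    assert(name != NULL);
    return inet_pton(AF_INET, name, &addr) == 1;
}
```

With the naive test, 3k would be handed to address-parsing code instead of a name lookup; the strict test classifies it as a name.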
Time to start a new thread :-)
Back when Unix was really Unix and dinosaurs strode the earth, login names
were restricted to just 8 characters, so you had to be inventive when
signing up lots of students every term (ObUS: semester).
A wonderful Japanese girl, Eriko Kinoshita, applied for an account on some
box somewhere. Did I mention that login names defaulted to the first 8
characters of the surname?
Understandably annoyed, Plan B for assigning logins was applied, which was
the first name followed by the first letter of the surname.
Sigh...
--
Dave Horsfall DTM (VK2KFU) "Those who don't understand security will suffer."
One gets used to login names. Sometime in the '80s I got 'rubl' and I'm still using it.
Of course in this age of the World Wild Web that may make me easily
trackable. Nothing to hide though :-)
Gr[aeiou]g Lehey:
And I wanted greg@, but it was taken. So I ended up with grog@, and
I've had that for nearly 30 years.
=====
I was !norman for some years, but when I left Bell
Labs for the real world 26 years ago, I was forced
to switch to norman@.
That was part of the price I paid for trading suburban
New Jersey for downtown Toronto. On the whole it was
a more-than-satisfactory trade, and emerging to the
real world broadened my perspectives in many areas,
but being stuck with Hideous Naming was certainly a
minor disadvantage.
Norman Wilson
Toronto ON
research!norman no more
On Jul 14, 2016 7:01 PM, "Peter Jeremy" <peter(a)rulingia.com> wrote:
>
> On 2016-Jul-15 08:36:56 +1000, Dave Horsfall <dave(a)horsfall.org> wrote:
> >On Thu, 14 Jul 2016, Clem Cole wrote:
> >And on the Mac and FreeBSD, they still are (as well as being builtins).
>
> FreeBSD provides a convenient list of what commands are (currently) builtin
> to the provided shells and available externally:
> https://www.freebsd.org/cgi/man.cgi?builtin
>
Bash man page does as well along with command -v (and hash IIRC) letting
you know.
I've always been curious though - what was the reason behind implementing
/bin/[ ? IDK any shell where this isn't implemented - I always assumed it's
a POSIX compatibility stopgap older systems needed to stay compliant with
their shipped shell.
I remember hearing that originally the Unix shell had control structures
(e.g. if, while, case) implemented through external commands. However,
I can't see this reflected in the source code. The 7th Edition Bourne
shell has these commands built-in (usr/src/cmd/sh/cmd.c), while the 6th
Edition (usr/source/s2/sh.c) seems to lack them completely.
The only external command I found was glob, which performed wildcard
expansion.
Am I missing something? Was this implemented in a version that was
never released? If so, does anyone know how this implementation worked?
(Nested commands might require holding some sort of globally
accessible stack.)
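One mechanism that lets a separate program steer a shell without any shared stack is the file offset that a parent and child share on an inherited descriptor. The 6th Edition is generally described as shipping if and goto as separate programs rather than as code in sh.c, with goto repositioning the command file the shell was reading; the sketch below assumes that design and demonstrates only the underlying mechanism. The function name and the /tmp file name are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child (standing in for an external command)
 * seeks on a descriptor inherited from the parent (standing in for the
 * shell).  Returns the offset the parent observes afterwards, or -1 on
 * failure. */
off_t offset_after_child_seek(off_t target)
{
    char path[] = "/tmp/gotodemoXXXXXX";
    int fd, status;
    pid_t pid;
    off_t seen = -1;

    assert(target >= 0);

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                     /* file lives on until fd is closed */

    pid = fork();
    if (pid == 0) {
        /* Child: move the shared offset, then exit like a command would. */
        _exit(lseek(fd, target, SEEK_SET) == target ? 0 : 1);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        /* Parent: query the current offset without moving it. */
        seen = lseek(fd, 0, SEEK_CUR);
    }
    close(fd);
    return seen;
}
```

Because the offset lives in the open file description shared by both processes, the parent's next read would continue from wherever the child left it.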
> As far as I know, it [|] has always been used as 'or' on computers.
I was on the NPL (eventually PL/I) committee when IBM 'generously'
increased the 360 character set from 48 to 60. George Radin grabbed
| for OR, before IBM announced the character set. Previously
the customary use for | in logic was the "Sheffer stroke", which
we now know as NAND. So "always" is ever since it became available.
Was PL/I the first to adopt it?
Doug
Dave Horsfall:
I still remember when the pipe command was "^" (pointy hat).
====
I still remember--barely--when \136 was up-arrow, not caret!
I don't think pipe was ever only ^, but that ^ was a
synonym for | added to make it easier to use on older
upper-case terminals that had no |. Those (remaining
few) who were there at the time can perhaps clarify.
I still habitually quote shell arguments containing ^,
even though I haven't used a shell that required that
since late 1984 (Rob had removed the special meaning
from /bin/sh before I arrived at Bell Labs). On the
other hand, I still cannot be bothered to get used to
quoting arguments containing !; I just disable all
that history and editing bloatware whenever possible.
Norman Wilson
Toronto ON