>> From: Doug McIlroy <doug(a)cs.dartmouth.edu>
>> The spec below isn't hard: just hook two buffer chains together and
>> twiddle a couple of file descriptors.
> In thinking about how to implement it, I was thinking that if there was
> any buffered data in an output pipe, that the process doing the
> splice() would wait (inside the splice() system call) on all the
> buffered data being read by the down-stream process.
> ...
> As a side-benefit, if one adopted that line, one wouldn't have to deal
> with the case (in the middle of the chain) of a pipe-pipe splice with
> buffered data in both pipes (where one would have to copy the data
> across); instead one could just use the exact same code for both cases
So a couple of days ago I suffered a Big Hack Attack and actually wrote the
code for splice() (for V6, of course :-).
It took me a day or so to get 'mostly' running. (I got tripped up by pointer
arithmetic issues in a number of places, because V6 declares just about
_everything_ to be "int *", so e.g. "ip + 1" doesn't produce the right value
for sleep() if ip is declared to be "struct inode *", which is what I did
automatically.)
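The pitfall here is ordinary C pointer scaling: "p + 1" advances by the size of whatever p is declared to point at. A minimal sketch in modern C follows; `struct inode_like` and both function names are hypothetical stand-ins for illustration, not the V6 declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel structure; NOT the real V6 struct inode. */
struct inode_like {
    int flags;
    int count;
    int addr[8];
};

/* Byte distance from p to p + 1 when p is typed as a struct pointer. */
size_t step_as_struct(struct inode_like *p)
{
    assert(p != NULL);
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Byte distance from p to p + 1 when the same address is typed as int *. */
size_t step_as_int(struct inode_like *p)
{
    assert(p != NULL);
    int *ip = (int *)p;
    return (size_t)((char *)(ip + 1) - (char *)ip);
}
```

The same expression therefore names two different addresses depending on the declaration, which matters whenever the address itself is used as a value (as with the argument to sleep()).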
My code only had one real bug so far (I forgot to mark the user's channels as
closed, which resulted in their file entries getting sub-zero usage counts
when the middle (departing) process exited).
However, now I have run across a real problem: I was just copying the system
file table entry for the middle process' input channel over to the entry for
the downstream's input (so further reads on its part would read the channel
the middle process used to be reading). Copying the data from one entry to
another meant I didn't have to go chase down file table pointers in the other
process' U structure, etc.
Alas, this simple approach doesn't work.
Using the approach I outlined (where the middle channel waits for the
downstream pipe to be empty, so it can discard it and do the splice by
copying the file table entries) doesn't work, because the downstream process
is in the middle of a read call (waiting for more data to be put in the
pipe), and it has already computed a pointer to the pipe's inode, and it's
looping waiting for that inode to have data.
So now I have to regroup and figure out how to deal with that. My most likely
approach is to copy the inode data across (so I don't have to go mess with the
downstream process to get it to go look at another inode), but i) I want to
think about it a bit first, and ii) I have to check that it won't screw
anything else up if I move the inode data to another slot.
Noel
> From: Mark Longridge <cubexyz(a)gmail.com>
> I was wondering if there might be a better way to do a shutdown on
> early unix.
Not really; I don't seem to recall our having one on the MIT V6 machine.
(We did add a 'reboot' system call so we could reboot the machine without
having to take the elevator up to the machine room [the console was on our
floor, and the reboot() call just jumped into the hardware bootstrap], but in
the source it doesn't even bother to do an update(). Well, I shouldn't say
that: I only have the source for the kernel, which doesn't; I don't at the
moment have access to the source for the rest of the system - although I do
have some full dump tapes, once I can work out how to read them. Anyway, so
maybe the user command for rebooting the system did a sync() first.)
I suppose you could set the switch register to 173030 and send a 'kill -1 1',
which IIRC kills off all shells except the one on the console, but somehow
I doubt you're running multi-user anyway... :-)
Noel
>> the cp command seems different from all other versions, I'm not sure I
>> understand it so I used the mv command instead which worked as expected.
>
> I'm intrigued; in what way is it different?
It seems that one must first cp a file to another file then do a mv to
actually put it into a different directory:
e.g. while in /usr/src
as crt0.s
cp a.out crt0.o
mv crt0.o /usr/lib
...rather than trying to just "cp a.out /usr/lib/crt0.o"
Mark
Yes, an evil necessary to get things going.
The very definition of original sin.
Doug
Larry McVoy wrote:
>>>> For stdio, of course, one would need fsplice(3), which must flush the
>>>> in-process buffers--penance for stdio's original sin of said buffering.
>>> Err, why is buffering data in the process a sin? (Or was this just a
>>> humorous aside?)
>> Process A spawns process B, which reads stdin with buffering. B gets
>> all it deserves from stdin and exits. What's left in the buffer,
>> intended for A, is lost. Sinful.
> It really depends on what you want. That buffering is a big win for
> some use cases. Even on today's processors reading a byte at a time via
> read(2) is costly. Like 5000x more costly on the laptop I'm typing on:
> Err, why is buffering data in the process a sin? (Or was this just a
> humorous aside?)
Process A spawns process B, which reads stdin with buffering. B gets
all it deserves from stdin and exits. What's left in the buffer,
intended for A, is lost. Sinful.
> From: Doug McIlroy <doug(a)cs.dartmouth.edu>
> The spec below isn't hard: just hook two buffer chains together and
> twiddle a couple of file descriptors.
How amusing! I was about to send a message with almost the exact same
description - it even had the exact same syntax for the splice() call! A
couple of points from my thoughts which were not covered in your message:
In thinking about how to implement it, I was thinking that if there was any
buffered data in an output pipe, that the process doing the splice() would
wait (inside the splice() system call) on all the buffered data being read by
the down-stream process.
The main point of this is for the case where the up-stream is the head of the
chain (i.e. it's reading from a file), where one more or less has to wait,
because one will want to set the down-stream's file descriptor to point to
the file - but one can't really do that until all the buffered data was
consumed (else it will be lost - one can't exactly put it into the file :-).
As a side-benefit, if one adopted that line, one wouldn't have to deal with
the case (in the middle of the chain) of a pipe-pipe splice with buffered
data in both pipes (where one would have to copy the data across); instead
one could just use the exact same code for both cases, and in that case the
wait would be until the down-stream pipe can simply be discarded.
One thing I couldn't decide is what to do if the upstream is a pipe with
buffered data, and the downstream is a file - does one discard the buffered
data, write it to the file, abort the system call so the calling process can
deal with the buffered data, or what? Perhaps there could be a flag argument
to control the behaviour in such cases.
Speaking of which, I'm not sure I quite grokked this:
> If file descriptor fd0 is associated with a pipe and fd1 is not, then
> fd1 is updated to reflect the effect of buffered data for fd0, and the
> pipe's other descriptor is replaced with a duplicate of fd1.
But what happens to the data? Is it written to the file? (That's the
implication, but it's not stated directly.)
> The same statement holds when "fd0" is exchanged with "fd1" and "write"
> is exchanged with "read".
Ditto - what happens to the data? One can't simply stuff it into the input
file? I think the 'wait in the system call until it drains' approach is
better.
Also, it seemed to me that the right thing to do was to bash the entry in the
system-wide file table (i.e. not the specific pointers in the u area). That
would automatically pick up any children.
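The sharing this relies on is visible from user space: after a fork, parent and child descriptors refer to the same file table entry, so a change made through one is seen through the other. A minimal sketch, assuming a POSIX system; the function name and scratch-file template are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Open a scratch file, fork, let the child move the file offset, and
 * report the offset the parent sees afterwards. Returns -1 on failure.
 */
long parent_offset_after_child_seek(long child_target)
{
    char path[] = "/tmp/ftableXXXXXX";  /* hypothetical scratch file name */
    assert(child_target >= 0);

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file lives on until fd is closed */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: its descriptor refers to the same file table entry. */
        lseek(fd, (off_t)child_target, SEEK_SET);
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid) {
        close(fd);
        return -1;
    }

    long seen = (long)lseek(fd, 0, SEEK_CUR);  /* the parent never moved it */
    close(fd);
    return seen;
}
```

The parent sees the offset the child set, which is the property that would let a change to the shared entry reach every process holding it.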
Finally, there are 'potential' security issues (I say 'potential' because I'm
not sure they're really problems). For instance, suppose that an end process
(i.e. reading/writing a file) has access to that file (e.g. because it
executed a SUID program), but its neighbour process does not. If the end
process wants to go away, should the neighbour process be allowed access to
the file? A 'simple' implementation would do so (since IIRC file permissions
are only checked at open time, not read/write time).
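The "checked at open time" behaviour can be seen on a current POSIX system with a regular file. This is only a sketch; the function name, the scratch-file template and the sample contents are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Open a scratch file for reading, then strip all its permission bits
 * and read through the descriptor that is already open.
 * Returns the number of bytes read, or -1 on failure.
 */
long read_after_chmod_zero(void)
{
    char path[] = "/tmp/permXXXXXX";  /* hypothetical scratch file name */
    char buf[16];
    long n = -1;

    int wfd = mkstemp(path);
    if (wfd < 0)
        return -1;

    int rfd = open(path, O_RDONLY);   /* permission check happens here */
    if (rfd >= 0 && write(wfd, "secret", 6) == 6 && chmod(path, 0) == 0) {
        /* Mode is now 000, but no further check is made on read(). */
        n = (long)read(rfd, buf, sizeof buf);
        assert(n <= (long)sizeof buf);
    }

    if (rfd >= 0)
        close(rfd);
    close(wfd);
    unlink(path);
    return n;
}
```

A descriptor handed to a neighbour process would carry the same already-granted access, which is the concern raised above.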
I don't pretend that this is a complete list of issues - just what I managed
to think up while considering the new call.
> For stdio, of course, one would need fsplice(3), which must flush the
> in-process buffers--penance for stdio's original sin of said buffering.
Err, why is buffering data in the process a sin? (Or was this just a
humorous aside?)
Noel
Larry wrote in separate emails
> If you really think that this could be done I'd suggest trying to
> write the man page for the call.
> I already claimed splice(2) back in 1998; the Linux guys did
> implement part of it ...
I began to write the following spec without knowing that Linux had
appropriated the name "splice" for a capability that was in DTSS
over 40 years ago under a more accurate name, "copy". The spec
below isn't hard: just hook two buffer chains together and twiddle
a couple of file descriptors. For stdio, of course, one would need
fsplice(3), which must flush the in-process buffers--penance for
stdio's original sin of said buffering.
Incidentally, the question is not abstract. I have code that takes
quadratic time because it grows a pipeline of length proportional
to the input, though only a bounded number of the processes are
usefully active at any one time; the rest are cats. Splicing out
the cats would make it linear. Linear approaches that don't need
splice are not nearly as clean.
Doug
SPLICE(2)
SYNOPSIS
int splice(int fd0, int fd1);
DESCRIPTION
Splice connects the source for a reading file descriptor fd0
directly to the destination for a writing file descriptor fd1
and closes both fd0 and fd1. Either the source or the destination
must be another process (via a pipe). Data buffered for fd0 at
the time of splicing follows such data for fd1. If both source
and destination are processes, they become connected by a pipe. If
the source (destination) is a process, the file descriptor
in that process becomes write-only (read-only).
If file descriptor fd0 is associated with a pipe and fd1 is not,
then fd1 is updated to reflect the effect of buffered data for fd0,
and the pipe's other descriptor is replaced with a duplicate of fd1.
The same statement holds when "fd0" is exchanged with "fd1" and
"write" is exchanged with "read".
Splice's effect on any file descriptor propagates to shared file
descriptors in all processes.
NOTES
One file must be a pipe lest the spliced data stream have no
controlling process. It might seem that a socket would suffice,
ceding control to a remote system; but that would allow the
uncontrolled connection file-socket-socket-file.
The provision about a file descriptor becoming either write-only or
read-only sidesteps complications due to read-write file descriptors.
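For contrast, here is roughly what a middle process has to do without such a call: stay alive and copy every byte, as cat does. This is a sketch in portable C, not an implementation of the spec above; `copy_until_eof` is a hypothetical name, and the 512-byte buffer is an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Shuttle every byte from fd0 to fd1 until EOF on fd0. The proposed
 * splice(fd0, fd1) would let the calling process drop out instead.
 * Returns the number of bytes copied, or -1 on error.
 */
long copy_until_eof(int fd0, int fd1)
{
    char buf[512];
    long total = 0;
    ssize_t n;

    assert(fd0 >= 0 && fd1 >= 0);
    while ((n = read(fd0, buf, sizeof buf)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted: retry the read */
            return -1;
        }
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(fd1, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;         /* interrupted: retry the write */
                return -1;
            }
            off += w;                 /* handle short writes */
        }
        total += n;
    }
    return total;
}
```

Every byte crosses the kernel boundary twice per such process, which is the per-stage cost that splicing the process out of the chain would remove.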
> From: Dave Horsfall <dave(a)horsfall.org>
> crt0.s -> C Run Time (support). It jiggers the stack pointer in some
> obscure manner
It's the initial startup; it sets up the arguments into the canonical C form,
and then calls main(). (It does not do the initial stack frame; a canonical
call to CSV from inside main() will do that.) Here are the exact details:
On an exec(), once the exec() returns, the arguments are available at the
very top of memory: the arguments themselves are at the top, as a sequence of
zero-terminated byte strings. Below them is an array of word pointers to the
arguments, with a -1 in the last entry. (I.e. if there are N arguments, the
array of pointers has N+1 entries, with the last being -1.) Below that is a
word containing the size of that array (i.e. N+1).
The Stack Pointer register points to that count word; all other registers
(including the PC) are cleared.
All CRT0.s does is move that argument count word down one location on the
stack, adjust the SP to point to it, and put a pointer to the argument
pointer table in the now-free word (between the argument count, and the first
element of the argument pointer table). Hence the canonical C main() argument
list of:
int argc;
int **argv;
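The shuffle described above can be modelled with a toy word-addressed memory. This is only a sketch of the description, not the assembler source: memory is an array of ints, "addresses" are indices into it, and `crt0_adjust` is a hypothetical name.

```c
#include <assert.h>

/*
 * On entry, mem[sp] holds the count word and mem[sp + 1] is the first
 * entry of the argument pointer table. Returns the new stack pointer.
 */
int crt0_adjust(int mem[], int sp)
{
    assert(sp >= 1);        /* need one free word below the count */
    mem[sp - 1] = mem[sp];  /* move the count down one location */
    mem[sp] = sp + 1;       /* the now-free word points at the table */
    return sp - 1;          /* SP points at the count again */
}
```

After the call, the words at the new SP read, in order: the count, a pointer to the table, then the table itself, which is the argc/argv layout main() expects.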
If/when main() returns, it takes the return value (passed in r0) and calls
exit() with it. (If using the stdio library, that exit() flushes the buffers
and closes all open files.) Should _that_ return, it does a 'sys exit'.
There are two variant forms: fcrt0.s arranges for the floating point
emulation to be loaded, and hooked up; mcrt0.s (much more complicated)
arranges for process monitoring to be done.
Noel
Hi folks,
Yes I have managed to compile Hello World on v1/v2.
the cp command seems different from all other versions, I'm not sure I
understand it so I used the mv command instead which worked as
expected.
I had to "as crt0.s" and put crt0.o in /usr/lib and then it compiled
without issue.
Is the kernel in /etc? I saw a core file in /etc that looked like it
would be about the right size. No unix file in the root directory
which surprised me.
At least I know what crt0.s does now. I guess a port of unirubik to
v1/v2 is in the cards (maybe).
Mark
Hi folks,
I'm interested in comparing notes with C programmers who have written
programs for Unix v5, v6 and v7.
Also I'm interested to know if there's anything similar to the scanf
function for unix v5. Stdio and iolib I know well enough to do file IO
but v5 predates iolib.
Back in 1988 I tried to write a universal rubik's cube program which I
called unirubik and after discovering TUHS I tried to backport it to
v7 (which was easy) and v6 (which was a bit harder) and now I'm trying
to backport it to v5. The v5 version currently doesn't have any
file IO capability as yet. Here are a few links to the various
versions:
http://www.maxhost.org/other/unirubik.c.v7
http://www.maxhost.org/other/unirubik.c.v6
http://www.maxhost.org/other/unirubik.c.v5
Also I've compiled the file utility from v6 in v5 and it seemed to
work fine. Once I got /dev/mt0 working for unix v5 (thanks to Warren's
help) I transferred the binary for the paging utility pg into it. This
version of pg I believe was from 1BSD.
I did some experimenting with math functions which can be seen here:
http://www.maxhost.org/other/math1.c
This will compile on unix v5.
My initial impression of Unix v5 was that it was a primitive and
almost unusable version of Unix but now that I understand it a bit
better it seems a fairly complete system. I'm a bit foggy on what the
memory limits are with v5 and v6. Unix v7 seems to run under simh
emulating a PDP-11/70 with 2 megabytes of ram (any more than that and
the kernel panics).
Also I'd be interested in seeing the source code for Ken Thompson's
APL interpreter for Unix v5. I know it does exist as it is referenced
in the Unix v5 manual. The earliest version I could find was dated Oct
1976 and I've written some notes on it here:
http://apl.maxhost.org/getting-apl-11-1976-to-work.txt
Ok, that's about it for now. Is there any chance of going further back
to v4, v3, v2 etc?
Mark