TUHS

tuhs@tuhs.org

6 participants
6532 discussions

by Norman Wilson

After a day and an evening of fighting with modern hardware, the modern tangle that passes for UNIX nowadays, and modern e-merchandising, I am too lazy to go look up the details. But as I remember it, two syncs was indeed probably enough. I believe that when sync(2) returned, all unflushed I/O had been queued to the device driver, but not necessarily finished, so the second sync was just a time-filling no-op. If all the disks were in view, it probably sufficed just to watch them until all the lights (little incandescent bulbs in those days, not LEDs) had stopped blinking.

I usually typed sync three or four times myself. It gave me a comfortable feeling (the opposite of a syncing feeling, I suppose). I still occasionally type `sync' to the shell as a sort of comfort word while thinking about what I'm going to do next. Old habits die hard.

(sync; sync; sync)

Norman Wilson
Toronto ON
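For comparison, here is a minimal sketch of how a present-day program would wait for its own data rather than typing sync several times. It uses the POSIX fsync() call, which blocks until the data for one file has been handed to the device, whereas POSIX allows sync() to return once the writes have merely been scheduled. The helper name write_durably is made up for this illustration and is not anything from early UNIX.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `text` to `path`, then wait until the
 * kernel reports that the file's data has reached the device.
 * Returns 0 on success, -1 on any error. */
int write_durably(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    const char *p = text;
    size_t len = strlen(text);
    while (len > 0) {                 /* write() may accept fewer bytes */
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Unlike a bare sync(), fsync() does not return until the
     * transfer for this one file has completed (or failed). */
    int rc = fsync(fd);
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

A caller would use it as write_durably("notes.txt", "text\n") and check for a zero return.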

10 years, 11 months

Re: [TUHS] the sin of buffering [offshoot of excise process from a pipeline]

by jnc＠mercury.lcs.mit.edu

> From: Doug McIlroy <doug(a)cs.dartmouth.edu>

> Process A spawns process B, which reads stdin with buffering. B gets
> all it deserves from stdin and exits. What's left in the buffer,
> intended for A, is lost.

Ah. Got it.

The problem is not with buffering as a generic approach; the problem is that you're trying to use a buffering package intended for simple, straightforward situations in one which doesn't fall into that category! :-)

Clearly, either B has to i) be able to put back data which was not for it ('ungets' as a system call), or ii) not read the data that's not for it - but that may be incompatible with the concept of buffering the input (depending on the syntax, and thus the ability to predict where the data B wants ends, the only way to avoid the need for ungetc() might be to read a byte at a time).

If B and its upstream (U) are written together, that could be another way to deal with it: if U knows where B's syntactic boundaries are, it can give it advance warning, and B could then use a non-trivial buffering package to do the right thing. E.g. if U emits 'records' with a header giving the record length X, B could tell its buffering package 'don't read ahead more than X bytes until I tell you to go ahead with the next record'.

Of course, that's not a general solution; it only works with prepared U's. Really, the only general, efficient way to deal with that situation that I can see is to add 'ungets' to the operating system...

Noel
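The record-header idea can be sketched as follows. This is only an illustration under stated assumptions: the framing (a two-byte big-endian length followed by the payload) and the names write_record, read_record and read_exact are invented here, not taken from any existing package. The point is that the reader never asks the kernel for more bytes than the header promised, so nothing meant for another process is consumed.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Read exactly `len` bytes; returns 0 on success, -1 on EOF or error. */
int read_exact(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Upstream side (U): emit a 2-byte big-endian length, then the payload.
 * This framing is an assumption made for the example. */
int write_record(int fd, const void *data, uint16_t len)
{
    unsigned char hdr[2] = { (unsigned char)(len >> 8),
                             (unsigned char)(len & 0xff) };
    assert(data != NULL);
    if (write(fd, hdr, 2) != 2)
        return -1;
    if (write(fd, data, len) != (ssize_t)len)
        return -1;
    return 0;
}

/* Downstream side (B): read one record and not a byte more.
 * Returns the payload length, or -1 on error or if it exceeds `cap`. */
long read_record(int fd, unsigned char *buf, size_t cap)
{
    unsigned char hdr[2];
    if (read_exact(fd, hdr, 2) < 0)
        return -1;
    size_t len = ((size_t)hdr[0] << 8) | hdr[1];
    if (len > cap)
        return -1;
    if (read_exact(fd, buf, len) < 0)
        return -1;
    return (long)len;
}
```

After B calls read_record() once and exits, whatever U wrote after that record is still sitting in the pipe for A. As the message says, this only helps when U has been prepared to emit the headers.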

10 years, 11 months

Re: [TUHS] Excise process from a pipe

by jnc＠mercury.lcs.mit.edu

>> From: Doug McIlroy <doug(a)cs.dartmouth.edu>

>> The spec below isn't hard: just hook two buffer chains together and
>> twiddle a couple of file descriptors.

> In thinking about how to implement it, I was thinking that if there was
> any buffered data in an output pipe, that the process doing the
> splice() would wait (inside the splice() system call) on all the
> buffered data being read by the down-stream process.
> ...
> As a side-benefit, if one adopted that line, one wouldn't have to deal
> with the case (in the middle of the chain) of a pipe-pipe splice with
> buffered data in both pipes (where one would have to copy the data
> across); instead one could just use the exact same code for both cases

So a couple of days ago I suffered a Big Hack Attack and actually wrote the code for splice() (for V6, of course :-).

It took me a day or so to get 'mostly' running. (I got tripped up by pointer arithmetic issues in a number of places, because V6 declares just about _everything_ to be "int *", so e.g. "ip + 1" doesn't produce the right value for sleep() if ip is declared to be "struct inode *", which is what I did automatically.)

My code only had one real bug so far (I forgot to mark the user's channels as closed, which resulted in their file entries getting sub-zero usage counts when the middle (departing) process exited).

However, now I have run across a real problem: I was just copying the system file table entry for the middle process' input channel over to the entry for the downstream's input (so further reads on its part would read the channel the middle process used to be reading). Copying the data from one entry to another meant I didn't have to go chase down file table pointers in the other process' U structure, etc.

Alas, this simple approach doesn't work. The approach I outlined (where the middle channel waits for the downstream pipe to be empty, so it can discard it and do the splice by copying the file table entries) fails because the downstream process is in the middle of a read call (waiting for more data to be put in the pipe), and it has already computed a pointer to the pipe's inode, and it's looping waiting for that inode to have data.

So now I have to regroup and figure out how to deal with that. My most likely approach is to copy the inode data across (so I don't have to go mess with the downstream process to get it to go look at another inode), but i) I want to think about it a bit first, and ii) I have to check that it won't screw anything else up if I move the inode data to another slot.

Noel
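The pointer-arithmetic trap mentioned above is easy to show in isolation. In the sketch below, struct inode is a made-up stand-in (it is not the V6 layout) and the two function names are hypothetical; all it demonstrates is that "p + 1" advances by the size of whatever type p is declared to point at, so the same expression yields a different address, and hence a different sleep() channel, under the two declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in structure for illustration only; NOT the real V6 inode. */
struct inode {
    int i_flag;
    int i_count;
    int i_size;
    int i_addr[8];
};

/* Bytes covered by "ip + 1" when ip is declared struct inode *. */
size_t step_as_struct(struct inode *ip)
{
    assert(ip != NULL);
    return (size_t)((char *)(ip + 1) - (char *)ip);
}

/* Bytes covered by "ip + 1" when the same address is declared int *,
 * the style the message attributes to the V6 sources. */
size_t step_as_int(struct inode *ip)
{
    assert(ip != NULL);
    int *wp = (int *)ip;      /* same address, different pointee type */
    return (size_t)((char *)(wp + 1) - (char *)wp);
}
```

step_as_struct() reports sizeof(struct inode) while step_as_int() reports sizeof(int), so code that waits on one address and wakes up the other will never meet.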

10 years, 11 months

Re: [TUHS] shutdown for pre-v7 unix

by jnc＠mercury.lcs.mit.edu

> From: Mark Longridge <cubexyz(a)gmail.com>

> I was wondering if there might be a better way to do a shutdown on
> early unix.

Not really; I don't seem to recall our having one on the MIT V6 machine.

(We did add a 'reboot' system call so we could reboot the machine without having to take the elevator up to the machine room [the console was on our floor, and the reboot() call just jumped into the hardware bootstrap], but in the source it doesn't even bother to do an update(). Well, I shouldn't say that: I only have the source for the kernel, which doesn't; I don't at the moment have access to the source for the rest of the system - although I do have some full dump tapes, once I can work out how to read them. Anyway, so maybe the user command for rebooting the system did a sync() first.)

I suppose you could set the switch register to 173030 and send a 'kill -1 1', which IIRC kills off all shells except the one on the console, but somehow I doubt you're running multi-user anyway... :-)

Noel

10 years, 11 months

Unix v1/v2 cp command

by Mark Longridge

>> the cp command seems different from all other versions, I'm not sure I
>> understand it so I used the mv command instead which worked as expected.
>
> I'm intrigued; in what way is it different?

It seems that one must first cp a file to another file then do a mv to actually put it into a different directory:

e.g. while in /usr/src

    as ctr0.s
    cp a.out ctr0.o
    mv ctr0.o /usr/lib

...rather than trying to just "cp a.out /usr/lib/ctr0.o"

Mark

10 years, 11 months

Re: [TUHS] the sin of buffering [offshoot of excise process from a pipeline]

by Doug McIlroy

Yes, an evil necessary to get things going. The very definition of original sin.

Doug

Larry McVoy wrote:

>>>> For stdio, of course, one would need fsplice(3), which must flush the
>>>> in-process buffers--penance for stdio's original sin of said buffering.

>>> Err, why is buffering data in the process a sin? (Or was this just a
>>> humorous aside?)

>> Process A spawns process B, which reads stdin with buffering. B gets
>> all it deserves from stdin and exits. What's left in the buffer,
>> intended for A, is lost. Sinful.

> It really depends on what you want. That buffering is a big win for
> some use cases. Even on today's processors reading a byte at a time via
> read(2) is costly. Like 5000x more costly on the laptop I'm typing on:

10 years, 11 months

the sin of buffering [offshoot of excise process from a pipeline]

by Doug McIlroy

> Err, why is buffering data in the process a sin? (Or was this just a
> humorous aside?)

Process A spawns process B, which reads stdin with buffering. B gets all it deserves from stdin and exits. What's left in the buffer, intended for A, is lost. Sinful.

10 years, 11 months

Re: [TUHS] Excise process from a pipe

by jnc＠mercury.lcs.mit.edu

> From: Doug McIlroy <doug(a)cs.dartmouth.edu>

> The spec below isn't hard: just hook two buffer chains together and
> twiddle a couple of file descriptors.

How amusing! I was about to send a message with almost the exact same description - it even had the exact same syntax for the splice() call! A couple of points from my thoughts which were not covered in your message:

In thinking about how to implement it, I was thinking that if there was any buffered data in an output pipe, that the process doing the splice() would wait (inside the splice() system call) on all the buffered data being read by the down-stream process.

The main point of this is for the case where the up-stream is the head of the chain (i.e. it's reading from a file), where one more or less has to wait, because one will want to set the down-stream's file descriptor to point to the file - but one can't really do that until all the buffered data was consumed (else it will be lost - one can't exactly put it into the file :-).

As a side-benefit, if one adopted that line, one wouldn't have to deal with the case (in the middle of the chain) of a pipe-pipe splice with buffered data in both pipes (where one would have to copy the data across); instead one could just use the exact same code for both cases, and in that case the wait would be until the down-stream pipe can simply be discarded.

One thing I couldn't decide is what to do if the upstream is a pipe with buffered data, and the downstream is a file - does one discard the buffered data, write it to the file, abort the system call so the calling process can deal with the buffered data, or what? Perhaps there could be a flag argument to control the behaviour in such cases.

Speaking of which, I'm not sure I quite grokked this:

> If file descriptor fd0 is associated with a pipe and fd1 is not, then
> fd1 is updated to reflect the effect of buffered data for fd0, and the
> pipe's other descriptor is replaced with a duplicate of fd1.

But what happens to the data? Is it written to the file? (That's the implication, but it's not stated directly.)

> The same statement holds when "fd0" is exchanged with "fd1" and "write"
> is exchanged with "read".

Ditto - what happens to the data? One can't simply stuff it into the input file? I think the 'wait in the system call until it drains' approach is better.

Also, it seemed to me that the right thing to do was to bash the entry in the system-wide file table (i.e. not the specific pointers in the u area). That would automatically pick up any children.

Finally, there are 'potential' security issues (I say 'potential' because I'm not sure they're really problems). For instance, suppose that an end process (i.e. reading/writing a file) has access to that file (e.g. because it executed a SUID program), but its neighbour process does not. If the end process wants to go away, should the neighbour process be allowed access to the file? A 'simple' implementation would do so (since IIRC file permissions are only checked at open time, not read/write time).

I don't pretend that this is a complete list of issues - just what I managed to think up while considering the new call.

> For stdio, of course, one would need fsplice(3), which must flush the
> in-process buffers--penance for stdio's original sin of said buffering.

Err, why is buffering data in the process a sin? (Or was this just a humorous aside?)

Noel
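The remark that permissions are checked at open time rather than at read/write time can be checked on a present-day POSIX system with a few lines. In this sketch the function name read_after_revoke is invented for the example: it opens a file, removes every permission bit from it, and then reads through the descriptor it already holds.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: returns the number of bytes read through a
 * descriptor that was opened before all access bits were removed,
 * or -1 if any setup step fails. The scratch file is deleted. */
long read_after_revoke(const char *path)
{
    assert(path != NULL);
    unlink(path);                     /* ignore failure: start clean */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    long result = -1;
    char buf[6];
    if (write(fd, "secret", 6) == 6 &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        chmod(path, 0) == 0)          /* nobody may open it from now on */
        result = (long)read(fd, buf, sizeof buf);

    close(fd);
    unlink(path);
    return result;
}
```

The read succeeds even though a fresh open() by an unprivileged user would now be refused, which is why handing an already open descriptor to a neighbour process raises the question asked above.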

10 years, 11 months

Re: [TUHS] Excise process from a pipe

by Doug McIlroy

Larry wrote in separate emails

> If you really think that this could be done I'd suggest trying to
> write the man page for the call.

> I already claimed splice(2) back in 1998; the Linux guys did
> implement part of it ...

I began to write the following spec without knowing that Linux had appropriated the name "splice" for a capability that was in DTSS over 40 years ago under a more accurate name, "copy". The spec below isn't hard: just hook two buffer chains together and twiddle a couple of file descriptors.

For stdio, of course, one would need fsplice(3), which must flush the in-process buffers--penance for stdio's original sin of said buffering.

Incidentally, the question is not abstract. I have code that takes quadratic time because it grows a pipeline of length proportional to the input, though only a bounded number of the processes are usefully active at any one time; the rest are cats. Splicing out the cats would make it linear. Linear approaches that don't need splice are not nearly as clean.

Doug

SPLICE(2)

SYNOPSIS

    int splice(int fd0, int fd1);

DESCRIPTION

    Splice connects the source for a reading file descriptor fd0 directly to the destination for a writing file descriptor fd1 and closes both fd0 and fd1. Either the source or the destination must be another process (via a pipe). Data buffered for fd0 at the time of splicing follows such data for fd1. If both source and destination are processes, they become connected by a pipe. If the source (destination) is a process, the file descriptor in that process becomes write-only (read-only).

    If file descriptor fd0 is associated with a pipe and fd1 is not, then fd1 is updated to reflect the effect of buffered data for fd0, and the pipe's other descriptor is replaced with a duplicate of fd1. The same statement holds when "fd0" is exchanged with "fd1" and "write" is exchanged with "read".

    Splice's effect on any file descriptor propagates to shared file descriptors in all processes.

NOTES

    One file must be a pipe lest the spliced data stream have no controlling process. It might seem that a socket would suffice, ceding control to a remote system; but that would allow the uncontrolled connection file-socket-socket-file.

    The provision about a file descriptor becoming either write-only or read-only sidesteps complications due to read-write file descriptors.
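The sentence about effects propagating to shared file descriptors rests on a property existing systems already have: descriptors produced by dup() (or inherited across fork()) refer to one shared file table entry. The sketch below is not an implementation of the proposed splice(); it only demonstrates that sharing, using a made-up function name, by reading through one descriptor and observing the offset through its duplicate.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: read `nread` bytes (at most 10) through one
 * descriptor and return the file offset as seen through a dup() of
 * it, or -1 on failure. */
long offset_seen_by_duplicate(const char *path, size_t nread)
{
    char buf[10];
    assert(path != NULL && nread <= sizeof buf);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    unlink(path);                     /* file lives on until closed */

    int twin = dup(fd);               /* shares fd's file table entry */
    long result = -1;
    if (twin >= 0 &&
        write(fd, "0123456789", 10) == 10 &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read(fd, buf, nread) == (ssize_t)nread)
        result = (long)lseek(twin, 0, SEEK_CUR);

    if (twin >= 0)
        close(twin);
    close(fd);
    return result;
}
```

Reading four bytes through fd moves the offset reported for twin to 4 as well, because both descriptors name the same entry. A kernel that rewrote that one entry during a splice would therefore affect every descriptor sharing it.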

10 years, 11 months

Re: [TUHS] Hello World compiled in v1/v2

by jnc＠mercury.lcs.mit.edu

> From: Dave Horsfall <dave(a)horsfall.org>

> crt0.s -> C Run Time (support). It jiggers the stack pointer in some
> obscure manner

It's the initial startup; it sets up the arguments into the canonical C form, and then calls main(). (It does not do the initial stack frame; a canonical call to CSV from inside main() will do that.) Here are the exact details:

On an exec(), once the exec() returns, the arguments are available at the very top of memory: the arguments themselves are at the top, as a sequence of zero-terminated byte strings. Below them is an array of word pointers to the arguments, with a -1 in the last entry. (I.e. if there are N arguments, the array of pointers has N+1 entries, with the last being -1.) Below that is a word containing the size of that array (i.e. N+1). The Stack Pointer register points to that count word; all other registers (including the PC) are cleared.

All CRT0.s does is move that argument count word down one location on the stack, adjust the SP to point to it, and put a pointer to the argument pointer table in the now-free word (between the argument count, and the first element of the argument pointer table). Hence the canonical C main() argument list of:

    int argc;
    int **argv;

If/when main() returns, it takes the return value (passed in r0) and calls exit() with it. (If using the stdio library, that exit() flushes the buffers and closes all open files.) Should _that_ return, it does a 'sys exit'.

There are two variant forms: fcrt0.s arranges for the floating point emulation to be loaded, and hooked up; mcrt0.s (much more complicated) arranges for process monitoring to be done.

Noel
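A rough C rendering of the hand-off described above may help. It is only a sketch under loud assumptions: the real crt0.s is a few lines of PDP-11 assembler working on the stack, whereas this version models the pointer table as an ordinary C array, ignores the count word and finds the end of the table by walking to the -1 entry, and returns main's value instead of calling exit(). The names ARG_END, count_args, start and echo_argc are all invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The -1 word that terminates the argument pointer table. */
#define ARG_END ((char *)(intptr_t)-1)

typedef int (*main_fn)(int argc, char **argv);

/* Count the pointers that precede the -1 terminator. */
int count_args(char **table)
{
    int n = 0;
    assert(table != NULL);
    while (table[n] != ARG_END)
        n++;
    return n;
}

/* Hypothetical stand-in for crt0: hand the table to a main()-style
 * function and pass its return value along (the real code would
 * call exit() with it). */
int start(char **table, main_fn entry)
{
    assert(entry != NULL);
    return entry(count_args(table), table);
}

/* Sample main()-style function used to exercise start(). */
int echo_argc(int argc, char **argv)
{
    (void)argv;
    return argc;
}
```

Given a table holding pointers to "echo" and "hi" followed by ARG_END, start(table, echo_argc) hands echo_argc an argc of 2 and the table itself as argv.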

10 years, 11 months
