TUHS

tuhs@tuhs.org


Spell - was tmac: Move macro diagnostics away from `quotes'

by Doug McIlroy

> rules used ... to create British spelling from an American
> English database often leave a lot to be desired.

Among the BUGS listed for spell(1) in v7 was "British spelling was done by an American". Nevertheless, at least one British expat thanked me for spell -b. He had been using the original "spell", and ignoring its reports of British "misspellings". But, he said, long exposure to American writing had infected his writing. Spell -b was a blessing, for it revealed where his usage wobbled between traditions.

7 years, 9 months

Re: [TUHS] Spell - was tmac: Move macro diagnostics away from `quotes'

by Doug McIlroy

> I am curious if anyone on the list remembers much
> about the development of the first spell checkers in Unix?

Yes, intimately. They had no relationship to the PDP 10.

The first one was a fantastic tour de force by Bob Morris, called "typo". Aside from the file "eign" of the very most common English words, it had no vocabulary. Instead it evaluated the likelihood that any particular word came from a source with the same letter-trigram frequencies as the document as a whole. The words were then printed in increasing order of likelihood. Typos tended to come early in the list.

Typo, introduced in v3, was very popular until Steve Johnson wrote "spell", a remarkably short shell script that (efficiently) looks up a document's words in the wordlist of Webster's Collegiate Dictionary, which we had on line. The only "real" coding he did was to write a simple affix-stripping program to make it possible to look up plurals, past tenses, etc. If memory serves, Steve's program is described in Kernighan and Pike. It appeared in v5.

Steve's program was good, but the dictionary isn't an ideal source for real text, which abounds in proper names and terms of art. It also has a lot of rare words that don't pull their weight in a spell checker, and some attractive nuisances, especially obscure short words from Scots, botany, etc, which are more likely to arise in everyday text as typos than by intent.

Given the basic success of Steve's program, I undertook to make a more useful spelling list, along with more vigorous affix stripping (and a stop list to avert associated traps, e.g. "presenation" = pre+senate+ion). That has been described in Bentley's "Programming Pearls" and in http://www.cs.dartmouth.edu/~doug/spell.pdf.

Morris's program and mine labored under space constraints, so have some pretty ingenious coding tricks. In fact Morris has a patent on the way he counted frequencies of the 26^3 trigrams in 26^3 bytes, even though the counts could exceed 256. I did some heroic (and probabilistic) encoding to squeeze a 30,000 word dictionary into a 64K data space.

Doug
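
A rough sketch of the trigram idea described for "typo", in Python: count the document's own letter trigrams, score each word by the average log-frequency of its trigrams, and list the least likely words first. The scoring formula, the space padding, and the function names are assumptions made for illustration; this is not Morris's actual statistic, and it makes no attempt at his compact counter encoding.

```python
import math
import re
from collections import Counter


def trigrams(word):
    """Yield the letter trigrams of a word, padded with a space at each end."""
    padded = f" {word} "
    for i in range(len(padded) - 2):
        yield padded[i:i + 3]


def rank_words(text):
    """Return the distinct words of text, least likely (by trigram) first."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for w in words for t in trigrams(w))
    total = sum(counts.values())

    def score(word):
        # Average log relative frequency of the word's trigrams
        # (an assumed metric, chosen only to illustrate the idea).
        grams = list(trigrams(word))
        return sum(math.log(counts[g] / total) for g in grams) / len(grams)

    return sorted(set(words), key=score)
```

Calling `rank_words` on a document's text puts words built from trigrams that are rare in that document at the front of the list, which is where typos would be expected to cluster.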

7 years, 9 months

Re: [TUHS] Spell - was tmac: Move macro diagnostics away from `quotes'

by jnc＠mercury.lcs.mit.edu

> From: Will Senn

> I see the man page for it in v6, but no executable.

The MIT V6+ system had it. They probably took it out of the distro because it was useless without the dictionary, which they didn't have the rights to distribute.

Here's the source:

  http://ana-3.lcs.mit.edu/~jnc/tech/unix/s2/spell.c
  http://ana-3.lcs.mit.edu/~jnc/tech/unix/s2/spell1.c
  http://ana-3.lcs.mit.edu/~jnc/tech/unix/s2/spell2.c
  http://ana-3.lcs.mit.edu/~jnc/tech/unix/s2/spell3.c

Noel

7 years, 9 months

TROFF made trivial

by Leah Neukirchen

Hi, I found this paper by bwk referenced in the Unix manpages, in v4 as: TROFF Made Trivial (unpublished), in v5 as: TROFF Made Trivial (internal memorandom), also in the v6 "Unix Reading List", but not anymore in v7. Anyone have a copy or a scan? -- Leah Neukirchen <leah(a)vuxu.org> http://leah.zone

7 years, 9 months

Re: [TUHS] UNIX on S/370

by jnc＠mercury.lcs.mit.edu

> From: Larry McVoy

> So tape I can see being more weird, but isn't raw disk just "don't put
> it in buffer cache"?

On machines/controllers which are capable of it, with raw devices DMA happens directly into the buffers in the process (which obviously has to be resident while the I/O is happening).

Noel

7 years, 9 months

DEC Born 60 years ago

by Clem Cole

Cute article from a few days ago in the Telegram about DEC with some nice pics that I thought I would pass on to this group. http://www.telegram.com/news/20171118/digital-equipment-born-60-years-ago-i…

7 years, 9 months

Re: [TUHS] Determining what was on a tape back in the day

by jnc＠mercury.lcs.mit.edu

> From: Will Senn

> I don't quite know how to investigate this other than to pore through the
> pdp11/40 instruction manual.

One of these:

  https://www.ebay.com/itm/Digital-pdp-Programming-Card-8-Pages/142565890514

is useful; it has a list of all the opcodes in numerical order; something none of the CPU manuals have, to my recollection. Usually there are a flock of these "pdp11 Programming Cards" on eBait, but I only see this one at the moment.

If you do any amount of work with PDP-11 binary, you'll soon find yourself recognizing the common instructions. E.g. MOV is 01msmr (octal), where 'm' is a mode specifier, and s and r are source and destination register numbers. (That's why PDP-11 people are big on octal; the instructions are easy to read in octal.) More here:

  http://gunkies.org/wiki/PDP-11_architecture#Operands

So 0127xx is a move of an immediate operand.

>> You don't need to mount it on DECTape drive - it's just blocks. Mount
>> it as an RK05 image, or a magtape, or whatever.

> I thought disk (RK05) and tape (magtape) blocks were different...

Well, you need to differentiate between DECtape and magtape - very different beasts.

DECtape on a PDP-11 _only_ supports 256 word (i.e. 512 byte) blocks, the same as most disks. (Floppies are an exception when it comes to disks - sort of. The hardware supports 128/256 byte sectors, but the usual driver - not in V6 or V7 - invisibly makes them look like 512-byte blocks.)

Magtapes are complicated, and I don't remember all the details of how Unix handles them, but the _hardware_ is prepared to write very long 'blocks', and there are also separate 'file marks' which the hardware can write, and notice.

But a magtape written in 512-byte blocks, with no file marks, can be treated like a disk; that's what the V6 distribution tapes look like:

  http://gunkies.org/wiki/Installing_UNIX_Sixth_Edition#Installation_tape_con…

and IIRC 'tp' format magtape tapes are written the same way, hardware-wise (so they look just like DECtapes).

Noel
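
To make the "01msmr" reading concrete, here is a small Python sketch that splits a MOV instruction word into its four octal operand fields and checks for the 0127xx immediate form. The function names are invented for this illustration, and only the word-sized MOV is handled.

```python
def decode_mov(word):
    """Split a 16-bit PDP-11 MOV instruction word into its operand fields.

    Returns (src_mode, src_reg, dst_mode, dst_reg), or None if the top
    four bits are not 0001 (octal 01....), i.e. not a word-sized MOV.
    """
    if (word >> 12) != 0o01:
        return None
    src_mode = (word >> 9) & 0o7
    src_reg = (word >> 6) & 0o7
    dst_mode = (word >> 3) & 0o7
    dst_reg = word & 0o7
    return src_mode, src_reg, dst_mode, dst_reg


def is_immediate_move(word):
    """True for 0127xx: source is mode 2 (autoincrement) on register 7 (PC)."""
    fields = decode_mov(word)
    return fields is not None and fields[:2] == (2, 7)
```

Each octal digit after the leading "01" maps to one field, which is the point about octal being the natural notation: `decode_mov(0o012700)` gives source mode 2 on register 7 and destination mode 0 on register 0.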

7 years, 9 months

Re: [TUHS] Some resources for V6/PDP/SIMH newbs like me

by jnc＠mercury.lcs.mit.edu

> From: Will Senn

> (e) UNIX assembler uses the characters $ and "*" where the DEC
> assemblers use "#" and "@" respectively.

Amusing: the "UNIX Assembler Reference Manual" says:

  The syntax of the address forms is identical to that in DEC assemblers, except that "*" has been substituted for "@" and "$" for "#"; the UNIX typing conventions make "@" and "#" rather inconvenient.

What's amusing is that in almost 40 years, it had never dawned on me that _that_ was why they'd made the @->*, etc change! "Duhhhh" indeed!

Interesting side note: the UNIX erase/kill characters are described as being the same as Multics', but since Bell pulled out of the Multics project fairly early, I wonder if they'd used it long enough to get '@' and '#' hardwired into their fingers. So I recently had the thought 'Multics was a follow-on to CTSS, maybe CTSS used the same characters, and that's how they got burned in'. So I looked in the "CTSS Programmer's Guide" (2nd edition), and no, according to it (pg. AC.2.02), the erase and kill characters on CTSS were '"' and '?'. So, so much for that theory!

> (l) The names "_edata" and "_end" are loader pseudo variables which
> define the size of the data segment, and the data segment plus the bss
> segment respectively.

That one threw me, too, when I first started looking at the kernel! I don't recall if I found documentation about it, or just worked it out: it is in the UPM, although not in ld(1) like one might expect (at least, not in the V6 UPM; although in V7:

  http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/man/man1/ld.1

it is there), but in end(3):

  http://minnie.tuhs.org/cgi-bin/utree.pl?file=V6/usr/man/man3/end.3

Noel

7 years, 9 months

redirection wildness in v7

by Will Senn

Why does the first of these incantations not present text, but the second does (word is a file)? Neither errors out.

  $ <word | sed 20q

  $ <word sed 20q

Thanks,

Will

-- GPG Fingerprint: 68F4 B3BD 1730 555A 4462 7D45 3EAA 5B6D A982 BAAF

7 years, 9 months

Re: [TUHS] UNIX on S/370

by jnc＠mercury.lcs.mit.edu

> From: Clem Cole <clemc(a)ccc.com> > IIRC Tom Lyons started a 370 port at Princeton and finished it at > Amdahl. But I think that was using VM Maybe this is my lack of knowledge of VM showing, but how did having VM help you over running on the bare hardware? Noel

7 years, 9 months

First permanent ARPAnet link

by Dave Horsfall

https://en.wikipedia.org/wiki/Leonard_Kleinrock#ARPANET ``The first permanent ARPANET link was established on November 21, 1969, between the IMP at UCLA and the IMP at the Stanford Research Institute.'' And thus from little acorns... -- Dave Horsfall DTM (VK2KFU) "Those who don't understand security will suffer."

7 years, 9 months

Re: [TUHS] Some resources for V6/PDP/SIMH newbs like me

by jnc＠mercury.lcs.mit.edu

> From: Will Senn > he is addressing an aspect that was not addressed in either of the > manual's entries and is very helpful for making the translation between > PDP-11 Macro Assembler and unix as. I'm curious - what aspect was that? Noel

7 years, 9 months

Re: [TUHS] Some resources for V6/PDP/SIMH newbs like me

by jnc＠mercury.lcs.mit.edu

> From: Will Senn <will.senn(a)gmail.com> > To bone up on assembly language, Lions's commentary is exceptionally > helpful in explaining assembly as it is implemented in V6. The manual > itself is really thin Err, which manual are you referring to there? Not the "UNIX Assembler Reference Manual": http://minnie.tuhs.org/cgi-bin/utree.pl?file=V6/usr/doc/as/as I would assume, but the 'as(I)' page in the UPM? Noel

7 years, 9 months

Re: [TUHS] Determining what was on a tape back in the day

by jnc＠mercury.lcs.mit.edu

> From: Will Senn

> I'm off to refreshing my pdp-11 assembly language skills...

A couple of things that might help:

- assemble mboot.s and 'od' the result, so when you see something that matches in the dump of the 0th block, you can look back at the assembler source, to see what the source looks like

- read the boot block into a PDP-11 debugger ('db' or 'cdb' on V6, 'adb' on V7; I _think_ 'adb' was available on V7, if not, there are some BSD's that have it) and use that to disassemble the code

Noel

7 years, 9 months

Re: [TUHS] Determining what was on a tape back in the day

by jnc＠mercury.lcs.mit.edu

> The 0th block does seem to contain some PDP-11 binary - a bootstrap of
> some sort. I'll look in more detail in a bit.

OK, I had a quick look, and it seems to be a modified version of mboot.s:

  http://minnie.tuhs.org/cgi-bin/utree.pl?file=V6/usr/source/mdec/mboot.s

I had a look through the rest of the likely files in 'mdec', and I didn't find a better match. I'm too lazy busy to do a complete dis-assembly, and work out exactly how it's different, though..

A few observations:

  000: 000407 000606 000000 000000 000000 000000 000000 000001

An a.out header, with the 0407 'magic' here performing its original intended function - to branch past the header.

  314: 105737 177560 002375

Some console I/O stuff - this two instruction loop waits for the input ready bit to be set.

  326: 042700 177600 020027 000101 103405 020027 000132 101002

More character processing - the first instruction clears the high bits of R0, and the next two sets of two instructions compare the contents with two characters (0101 and 0132), and branch.

  444: 000207 005000 021027 000407 001004 016020
  460: 000020 020006 103774 012746 137000 005007

This seems like the code that checks to see if the thing is an a.out file (note the 'cmp *r0, $0407'), but the code is different from that code in mboot.s; in that, the instruction before the 'clr r0' (at 0446 here) is a 'jsr', whereas in this it's an 'rts pc'. And the code after the 'cmp r0, sp' and branch is different too. I love the '05007' - not very often you see _that_ instruction!

  502: 012700 177350 012701 177342 012711 000003 105711

Clearly the code at 'taper:' (TC11 version).

Noel
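
The remark about 0407 "branching past the header" can be checked with a little arithmetic. The Python sketch below computes where a PDP-11 branch instruction lands, using the standard rule that the low byte is a signed word offset relative to the following instruction. The function name and the choice of examples are mine; the words and addresses are taken from the dump above.

```python
def branch_target(address, word):
    """Return the target address of a PDP-11 branch instruction at `address`.

    The low byte of the word is a signed offset counted in words, taken
    relative to the address of the next instruction (address + 2).
    """
    offset = word & 0o377
    if offset >= 0o200:          # sign-extend the 8-bit offset
        offset -= 0o400
    return address + 2 + 2 * offset


# 000407 at address 0: lands at octal 20, just past the 8-word header.
header_skip = branch_target(0o0, 0o000407)

# 002375 at address 0320 (third word of the row starting at 0314):
# lands back at 0314, closing the two-instruction console loop.
console_loop = branch_target(0o320, 0o002375)

print(oct(header_skip), oct(console_loop))
```

The first result is 16 bytes in, which is exactly the length of the eight-word header shown in the first dump row.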

7 years, 9 months

Re: [TUHS] Determining what was on a tape back in the day

by jnc＠mercury.lcs.mit.edu

> From: Will Senn

> I don't see this file in the tuhs source code index

OK, here it is:

  http://ana-3.lcs.mit.edu/~jnc/tech/unix/s2/stp.h
  http://ana-3.lcs.mit.edu/~jnc/tech/unix/s2/stp1.c
  http://ana-3.lcs.mit.edu/~jnc/tech/unix/s2/stp2.c
  http://ana-3.lcs.mit.edu/~jnc/tech/unix/s2/stp3.c

That MIT PWB1+ tape has so many treasures on it! (We've already seen all the early networking software.) I really must get around to curating it, and making the whole works available.

Noel

7 years, 9 months

Determining what was on a tape back in the day

by Will Senn

So, I came across this tape:

  http://bitsavers.trailing-edge.com/bits/DEC/pdp11/dectape/TU_DECtapes/unix6…

I was curious what was on it, so I read the description at:

  http://bitsavers.trailing-edge.com/bits/DEC/pdp11/dectape/TU_DECtapes.txt

  UNIX1   PURDUE UNIX TAPES
  UNIX2
  UNIX4
  UNIX6
  HARBA1  HARVARD BASIC TAPE 1
  HARBA2  HARVARD BASIC TAPE 2
  MEGTEK  MEGATEK UNIX DRIVER
  RAMTEK  RAMTEK UNIX DRIVER

Cool, sounds interesting, so I downloaded the unix6.dta file and fired up simh - after some fiddling, I figured out that I could get a boot prompt (is that actually from the tape?) if I:

  set cpu 11/40
  set en tc
  att tc0 unix6.dta
  boot tc0
  =

At that point, I was stuck - the usual tmrk, htrk, and the logical corollary tcrk didn't do anything except return me to the boot prompt.

I was thinking this was a sixth edition install tape of some sort, but if it is, I'm not able to figure it out. I thought I would load the tape into v7 and look at its content using tm or tp, but then I realized that I didn't have a device set up for TU56 and even if I did, I didn't know how to do a dir on a tape - yeah, I know, I will go read the manual(s) in chagrin.

In the meantime, my question for y'all is similar to my other recent questions, and it goes like this:

When you received an unmarked tape back in the day, how did you go about figuring out what was on it? What was your process (open the box, know by looking at it that it was an x rather than a y, load it into the tape reader and read some bytes off it and know that it was a z, use unix to read the tape using tm, tp, tar, dd, cpio or what, and so on)?

What advice would you give a future archivist to help them quickly classify bit copies of tapes :).

Thanks,

Will

-- GPG Fingerprint: 68F4 B3BD 1730 555A 4462 7D45 3EAA 5B6D A982 BAAF

7 years, 9 months

The Fourth Research Edition Unix Programmer's Manual

by Diomidis Spinellis

I don't think we had the Fourth Research Edition Unix Programmer's Manual available in typeset form. I played a bit with the troff manual pages on TUHS and managed to typeset it into PDF. You can find the PDF document at https://dspinellis.github.io/unix-v4man/v4man.pdf. I modernized the old shell scripts and corrected some minor markup glitches through commits that are recorded on a GitHub repository: https://github.com/dspinellis/unix-v4man. The process was surprisingly smooth. The scripts for generating the table of contents and the permuted index are based on the original ones. The few problems I encountered in the troff source had to do with missing spaces after requests, the ^F hyphenation character causing groff to complain, a failure of groff to honor .li requests followed by a line starting with a ., and two uses of a lowercase letter for specifying a font. I wrote from scratch a script to typeset everything into one volume. I could not find a shell script for typesetting the whole manual in any of the Research Editions. I assume the process of running the typesetter was so cumbersome, error prone, and time-consuming that it was manually performed on a page-by-page basis. Correct me if I'm wrong here. Diomidis Spinellis

7 years, 9 months

Re: [TUHS] Determining what was on a tape back in the day

by Steve Simon

It can be hard to visualise what is on a tape when you have no idea what is on there. Attached is a simple tool I wrote "back then", shamelessly copying an idea by Paul Scorer at Leeds Poly (my video systems lecturer). It is called tm (tape mark). -Steve

7 years, 9 months

Re: [TUHS] Determining what was on a tape back in the day

by jnc＠mercury.lcs.mit.edu

> From: Arthur Krewat > For anyone reading old tapes, I implore you to attempt to read data past > the soft EOT ;) The guy who read my tape does in fact do that; you'll notice my program has an option for looking for data after the soft EOT. Noel

7 years, 9 months

Re: [TUHS] Determining what was on a tape back in the day

by jnc＠mercury.lcs.mit.edu

> From: Will Senn

> I think I understand- the bytes that we have on hand are not device
> faithful representations, but rather are faithful representations of
> what is presented to the OS. That is, back in the day, a tape would be
> stored in various formats as would disks, but unix would show these
> devices as streams of bytes, and those streams of bytes are what
> have been preserved.

Yes and no.

To start with, one needs to differentiate three different levels; i) what's actually on the medium; ii) what the device controller presented to the CPU; and iii) what the OS (Unix in this case) presented to the users.

With the exception of magtapes (which had some semantics available through Unix for larger records, and file marks, the details of which escape me - but try looking at the man page for 'dd' in V6 for a flavour of it), you're correct about what Unix presented to the users.

As to what is preserved; for disks and DECtapes, I think you are broadly correct. For magtapes, it depends.

E.g. SIMH apparently can consume files which _represent_ magtape contents (i, above), and which include 'in band' (i.e. part of the byte stream in the file) meta-data for things like file marks, etc. At least one of the people who reads old media for a living, when asked to read an old tape, gives you back one of these files with meta-data in it. Here:

  http://ana-3.lcs.mit.edu/~jnc/tech/pdp11/tools/rdsmt.c

is a program which reads one of those files and converts the contents to a file containing just the data bytes. (I had a tape with a 'dd' save of a file-system on it, and wanted just the file-system image, on which I deployed a tool I wrote to grok 4.2 filesystems.)

Also, for disks, it should be remembered that i) and ii) were usually quite different, as what was actually on the disk included things like preambles, headers, CRCs, etc, none of which the CPU usually could even see. (See here:

  http://gunkies.org/wiki/RX0x_floppy_drive#Low-level_format

for an example. Each physical drive type would have its own specific low-level hardware format.)

So what's preserved is just an image of what the CPU saw, which is, for disks and DECtapes, generally the same as what was presented to the user - i.e. a pile of bytes.

Noel
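
For readers who want to see what stripping the 'in band' meta-data involves, here is a minimal Python sketch. It is not a port of rdsmt.c; it assumes the commonly documented SIMH tape-image layout (a 4-byte little-endian record length, the data padded to an even byte count, the length repeated, a zero length for a tape mark, and 0xFFFFFFFF for end of medium) and ignores error flags in the length word. The function name is invented for this example.

```python
import struct

TAPE_MARK = 0x00000000
END_OF_MEDIUM = 0xFFFFFFFF


def strip_tape_metadata(image):
    """Return the concatenated record data from a SIMH-style tape image.

    Assumed layout: each record is a 4-byte little-endian length, the data
    (padded to an even byte count), and a repeat of the length; a zero
    length is a tape mark, and 0xFFFFFFFF marks the end of the medium.
    """
    out = bytearray()
    pos = 0
    while pos + 4 <= len(image):
        (length,) = struct.unpack_from("<I", image, pos)
        pos += 4
        if length == END_OF_MEDIUM:
            break
        if length == TAPE_MARK:
            continue                      # file mark: carries no data bytes
        out += image[pos:pos + length]
        pos += length + (length & 1) + 4  # skip pad byte and trailing length
    return bytes(out)
```

Given the bytes of such a file, the function returns only the record contents, in order, which for a 'dd' save of a file system would be the file-system image itself. Anything after the end-of-medium marker is deliberately left unread in this sketch.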

7 years, 9 months

Re: [TUHS] Determining what was on a tape back in the day

by jnc＠mercury.lcs.mit.edu

> From: Will Senn

> So, I came across this tape:
> ...
> I was curious what was on it

'od' is your friend!

If you look here:

  http://mercury.lcs.mit.edu/~jnc/tech/V6Unix.html#dumpf

there's a thing which is basically 'od' and 'dd' rolled in together, which allows you to dump any block you want in a variety of formats (ASCII, 16-bit words in octal [very useful for PDP-11 binary], etc).

I wrote it under CygWin, for Windows, but it only uses the StdIO library, and similar programs (e.g. my usassembler) written that way work fine under Losenux. Try downloading it and compiling it - if it doesn't work, please let me know; it'd be worth fixing it so it does work on Linux.

> after some fiddling, I figured out that I could get a boot prompt (is
> that actually from the tape?)

The 0th block does seem to contain some PDP-11 binary - a bootstrap of some sort. I'll look in more detail in a bit.

> I was thinking this was a sixth edition install tape of some sort, but
> if it is, I'm not able to figure it out.

From what I can see, it's probably a tp-format tape: the 1st block contains some filenames which I can see in an ASCII dump of it:

  speakez/sbrk.s
  dcheck.c
  df.c
  intel/as80.c
  intel/optab.8080

> v7 and look at its content using tm or tp, but then I realized that I
> didn't have a device set up for TU56

You don't need to mount it on DECTape drive - it's just blocks. Mount it as an RK05 image, or a magtape, or whatever.

> When you received an unmarked tape back in the day, how did you go about
> figuring out what was on it?

Generally there would have been some prior communication, and the person sending it would have told you what it was (e.g. '800 bpi tar', or whatever).

> What advice would you give a future archivist to help them quickly
> classify bit copies of tapes :).

Like I said: "'od' is your friend!"!! :-)

Noel
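
As a stand-in for the 'od' plus 'dd' idea, the following Python sketch dumps one 512-byte block of an image as 16-bit octal words. It is not the dumpf program linked above; the function name, the line layout, and the little-endian word order (the PDP-11's) are choices made for this illustration.

```python
BLOCK_SIZE = 512  # bytes per block, i.e. 256 PDP-11 words


def dump_block(image, block, words_per_line=8):
    """Format one 512-byte block as lines of little-endian 16-bit octal words.

    Each line starts with the octal byte offset within the block.
    A trailing odd byte, if any, is ignored.
    """
    data = image[block * BLOCK_SIZE:(block + 1) * BLOCK_SIZE]
    lines = []
    for offset in range(0, len(data) - 1, 2 * words_per_line):
        chunk = data[offset:offset + 2 * words_per_line]
        words = [chunk[i] | (chunk[i + 1] << 8)
                 for i in range(0, len(chunk) - 1, 2)]
        lines.append(f"{offset:03o}: " + " ".join(f"{w:06o}" for w in words))
    return lines
```

Reading the image with `open("unix6.dta", "rb").read()` (the file name from the original question) and printing `dump_block(image, 0)` would show block 0, and `dump_block(image, 1)` the block where the tp directory filenames were spotted.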

7 years, 9 months

197[78] usenix conf. at columbia, magtapes "found in the street"?

by ron minnich

Random memories, possibly wrong.

In 1977/78 I was at udel and had done a fair amount of work on unix but as a lowly undergrad did not get to go to the Columbia Usenix meeting. Ed Szurkowski of udel went. Ed was the grad student who did hardware design for 11s for Autotote (another story) but also stood up a lot of the early unix 11s at udel starting in 1976, starting with an 11/70. Mike Muuss used to come up and visit us at udel and Mike and Ed would try to ask questions the other could not answer. Mike always had a funny story or two. Ed later went to Bell Labs and I lost track of him.

The directions for the MTA were fairly clear: it listed a stop that you under no circumstances should get off at, and if you did get off at, you should not go up to the street, lest you never return. This was no joke. Some places in NY were pretty hazardous in those days.

I *think* this was the meeting where Ken showed up with a bunch of magtapes, and Ed claimed that, in Ken's words, they were "... found in the street."

This part I remember well: Ed returning with two magtapes and our desire to upgrade. We at udel, like many places, had done lots of our own mods to the kernel, which we wanted to keep. So we ran a diff between trees, and I wrote a merge with TECO and ed which got it all put together. I later realized this was a very early form of 'patch', as it used patterns, not line numbers, to figure out how to paste things back together. I really got to love regex in those years.

Except for one file: the tools just would not merge them. Ed later realized there was one key difference that we had not noticed, a missing comment, namely, the Western Electric copyright notice ...

I'm kinda sorry that our "udel Unix" is lost to the great /dev/null, it would be interesting to see it now.

ron

7 years, 9 months

Re: [TUHS] Determining what was on a tape back in the day

by jnc＠mercury.lcs.mit.edu

> From: Clem Cole > stp is from the Harvard distribution. The MIT PWB1 system I have has the source; the header says: M. Ferentz Brooklyn College of CUNY September 1976 If it can't be found on TUHS, I can upload it. No man page, though. :-( Noel

7 years, 9 months

ed(1) and Pipes.

by Norman Wilson

Ralph Corderoy:

  ed(1) pre-dates pipes. When pipes came along, stderr was needed, and lots of new idioms were found to make use of them. Why didn't ed gain a `filter' command to accompany `r !foo' and `w !bar'?

===

I sometimes wonder that too.

When I use `ed,' it is usually really qed, an extended ed written by the late-1970s UNIX crowd here at U of T. (Rob Pike, Tom Duff, David Tilbrook, and Hugh Redelmeier, I think.) qed is something of a kitchen sink, full of clumsy programmability features that I never use. The things that keep me using it are:

-- Multiple buffers, each possibly associated with a different file or just anonymous

-- The ability to copy or move text (the traditional t and m commands) between buffers as well as within one

-- The ability to send part or all of a buffer to a shell command, to read data in from a shell command, or to send data out and replace it with that from the shell command:

  >mail user ...
  <ps -ef
  |tr a-z A-Z

I use the last quite often; it makes qed sort of a workbench for manipulating and mining text. One can do the same with the shell and temporary files, but using an editor buffer is much handier.

sam has similar abilities (but without all the needless programmability). Were sam less clumsy to use in its non-graphical mode, I'd probably have abandoned qed for sam.

Norman Wilson
Toronto ON (for real now)

7 years, 9 months
