> From: Diomidis Spinellis
> From the 2.11 BSD sources I understand that the PDP-11/70 MMU address
> and data registers, KDSA and KDSD, start at 0172360 and 0172320
> respectively ...
Expressed as 16-bit addresses, on a PDP-11 with mapping disabled, yes.
> I checked this by looking at /dev/mem.
I don't know about 2.11, but in other PDP-11 Unixes, /dev/mem gives access to
the actual CPU memory bus (which on a /34, etc, is the 18-bit address UNIBUS;
on a /70 it's a separate 22-bit address bus). In the /70 memory address
space, the 'I/O page' (which is where the PxR's live) is at the top end of it,
i.e. the registers are at 017772360 (KDSAR0), etc.
> What am I missing?
PDP-11's have a plethora of address spaces, of different sizes. You need to
always be aware of which one you're working in.
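
To make the arithmetic concrete, here is a minimal sketch in C of how one I/O page register address looks in each of the address spaces mentioned above. The helper name io_page_address() is made up for illustration and is not taken from any Unix source; the only fact it relies on is that the I/O page is the top 8 KB of whichever address space you are in.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the PDP-11 I/O page: the top 8 KB (020000 bytes) of an address space. */
#define IO_PAGE_SIZE 020000u

/*
 * Map a 16-bit I/O page address (0160000-0177777, mapping disabled) to the
 * address of the same register on a wider bus: bits = 18 (UNIBUS) or
 * bits = 22 (the 11/70 memory bus).
 * io_page_address() is a hypothetical helper, not part of any Unix source.
 */
static uint32_t io_page_address(uint16_t addr16, int bits)
{
    assert(bits == 18 || bits == 22);
    assert(addr16 >= 0160000);              /* must lie in the I/O page */
    uint32_t top = (uint32_t)1 << bits;     /* one past the highest address */
    return (top - IO_PAGE_SIZE) + (addr16 & (IO_PAGE_SIZE - 1));
}
```

With this, io_page_address(0172360, 18) gives 0772360 and io_page_address(0172360, 22) gives 017772360, the KDSAR0 address quoted above.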
> My goal is to access from the console the kernel's u area. According to
> mem(4) and the symbols in /unix, this should be at address 0140000.
In the kernel virtual address space, yes.
> Indeed, accessing it through /dev/kmem I get the expected results for
> e.g. u_comm and u_uid.
Because /dev/kmem gives access to kernel address space for the _current_
process.
> I have been unable to find it in the machine's physical memory
By far and away the easiest thing, for the _current_ process, is to
use /dev/kmem, which automagically applies the correct mapping.
For other processes, if the process is swapped in, there's some field in the
proc structure which says where in physical memory it is. IIRC, the user
struct and the kernel stack are stored in the very bottom of that.
(This article:
http://gunkies.org/wiki/Unix_V6_dump_analysis#Memory_layouts
goes into some detail for V6. Not sure how different 2.11 is; I know it uses
one block of kernel address space to map in code overlays, but I don't know
all the details of how it works.)
Anyway, using that, one could read the user area in /dev/mem, at the
appropriate location.
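
As a rough sketch of the address arithmetic involved (not code from any kernel): the PDP-11 MMU forms a physical address by taking the page address register selected by the top three bits of the virtual address, multiplying it by 64, and adding the offset within the page. The sketch below also assumes that the proc structure field mentioned above holds its location in the same 64-byte units ("clicks"), as it does in V6; check the 2.11 sources before relying on that. The names click_to_phys() and virt_to_phys() are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define CLICK_SHIFT 6          /* one click = 64 bytes */
#define PAGE_SHIFT  13         /* each of the 8 pages spans 8 KB of virtual space */
#define PAGE_MASK   017777u    /* offset within a page */

/* Byte address in physical memory of a region whose start is given in clicks. */
static uint32_t click_to_phys(uint16_t clicks)
{
    return (uint32_t)clicks << CLICK_SHIFT;
}

/*
 * Translate a 16-bit virtual address using eight page address register
 * values (par[0..7], each in clicks). This ignores the length and access
 * checks that the page descriptor registers would apply.
 */
static uint32_t virt_to_phys(const uint16_t par[8], uint16_t va)
{
    unsigned page = va >> PAGE_SHIFT;
    assert(page < 8);
    return click_to_phys(par[page]) + (va & PAGE_MASK);
}
```

For example, if the register for page 6 held 01234 (a made-up value), virtual address 0140000 would be byte 0123400 of physical memory, and that is the offset to seek to in /dev/mem.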
For swapped-out processes, a similar algorithm applies, but you'll
have to look in the swap device (obviously).
Noel
Interesting. My "speak" program had a trivial lexer that
recognized literal tokens, many of which were prefixes
of others, by maximum-munch binary search in a list of
1600 entries. Entries gave token+translation+rewrite.
The whole thing fit in 15K.
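
For readers who have not seen the technique, here is a minimal sketch in C of maximum-munch lookup by binary search in a sorted table of literals. It is not the speak code: the table contents, the names, and the strategy of probing the longest candidate prefix first and shortening on failure are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative token table; must be sorted in strcmp() order for bsearch(). */
static const char *const tokens[] = {
    "ch", "e", "ee", "ph", "s", "sh", "t", "th", "the"
};
#define NTOKENS  (sizeof tokens / sizeof tokens[0])
#define MAXTOKEN 3   /* length of the longest entry in the table */

struct probe { const char *text; size_t len; };

/* Compare the first len bytes of the input against one whole table entry. */
static int probe_cmp(const void *key, const void *elem)
{
    const struct probe *p = key;
    const char *entry = *(const char *const *)elem;
    int c = strncmp(p->text, entry, p->len);
    if (c != 0)
        return c;
    return entry[p->len] == '\0' ? 0 : -1;  /* entry is longer: probe sorts first */
}

/* Length of the longest table entry that is a prefix of input, or 0 if none. */
static size_t longest_match(const char *input)
{
    size_t n = strlen(input);
    assert(input != NULL);
    if (n > MAXTOKEN)
        n = MAXTOKEN;
    for (; n > 0; n--) {
        struct probe p = { input, n };
        if (bsearch(&p, tokens, NTOKENS, sizeof tokens[0], probe_cmp))
            return n;
    }
    return 0;
}
```

Given the table above, longest_match("then") is 3 ("the"), longest_match("thin") is 2 ("th"), and longest_match("tea") is 1 ("t"). A real table would carry the translation and rewrite alongside each token.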
Many years later I wrote a regex recognizer that special-cased
alternations of lots of literals. I believe Gnu's regex.c does
that, too. (My regex also supported conjunction and negation--
legitimate regular-language operations--implemented by
continuation-passing to avoid huge finite-state machines.)
We have here a case of imperfect communication in 1127. Had I
been conscious of the lex-explosion problem, I might have
thought of speak and put support for speak-like tables
into lex. As it happened, I only used yacc/lex once, quite
successfully, for a small domain-specific language.
Doug
Steve Johnson wrote:
I also gave up on lex for parsing fairly early. The problem was
reserved words. These looked like identifiers, but the state machine to
pick out a couple of dozen reserved words out of all identifiers was too
big for the PDP-11. When I wrote spell, I ran into the same problem.
I had some rules that wanted to convert plurals to singular forms that
would be found in the dictionary. Writing a rule to recognize .*ies
and convert the "ies" to "y" blew out the memory after only a handful of
patterns. My solution was to pick up words and reverse them before
passing them through lex, so I looked for the pattern "sei.*", converted
it to "y" and then reversed the word again. As it turned out, I only
owned spell for a few weeks because Doug and others grabbed it and ran
with it.
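
The reversal trick can be sketched in plain C (this is only an illustration of the idea, not the lex rules that spell used; the function names are invented). Reversing the word turns the suffix "ies" into the prefix "sei", which can be matched without scanning past an arbitrary-length stem first.

```c
#include <assert.h>
#include <string.h>

/* Reverse a NUL-terminated string in place. */
static void reverse(char *s)
{
    size_t n = strlen(s);
    for (size_t i = 0; i < n / 2; i++) {
        char t = s[i];
        s[i] = s[n - 1 - i];
        s[n - 1 - i] = t;
    }
}

/*
 * Rewrite a trailing "ies" to "y" by matching the prefix "sei" on the
 * reversed word. Modifies word in place (the result is never longer)
 * and returns 1 if the word was rewritten, 0 otherwise.
 */
static int ies_to_y(char *word)
{
    int changed = 0;
    assert(word != NULL);
    reverse(word);
    if (strncmp(word, "sei", 3) == 0 && word[3] != '\0') {
        /* Drop "sei", keep the rest, and put "y" in front. */
        memmove(word + 1, word + 3, strlen(word + 3) + 1);
        word[0] = 'y';
        changed = 1;
    }
    reverse(word);
    return changed;
}
```

Applied to a buffer holding "ponies", this leaves "pony"; a word such as "cats" is returned unchanged.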
From the 2.11 BSD sources I understand that the PDP-11/70 MMU address
and data registers, KDSA and KDSD, start at 0172360 and 0172320
respectively [1]. Yet, when I read the register contents I don't get
what I would expect to see: increasing by 0200 memory values for KDSA
and the same constant value for KDSD [2]. I checked this by looking at
/dev/mem.
# od -o /dev/mem 0172360 | head -1
0172360 000002 000016 001403 012700 000400 000402 012700 000200
# od /dev/mem 0172320 | head -1
0172320 101016 005064 000026 005067 175456 016467 000006 175430
I get the same results when I examine the memory through SIMH:
sim> examine 172360
172360: 000002
sim> examine 172362
172362: 000016
sim> examine 172364
172364: 001403
sim> examine 172320
172320: 101016
sim> examine 172322
172322: 005064
The MMU kernel instruction registers, KISA and KISD, contain similarly
nonsensical values as do the registers located at a different memory
location (0772320, 0772360) indicated in another source [3]. What am I
missing?
My goal is to access from the console the kernel's u area. According to
mem(4) and the symbols in /unix, this should be at address 0140000.
Indeed, accessing it through /dev/kmem I get the expected results for
e.g. u_comm and u_uid. However, I have been unable to find it in the
machine's physical memory, hence my question regarding the MMU's operation.
[1]
https://github.com/RetroBSD/2.11BSD/blob/master/usr/sys/pdpstand/M.s#L346
[2]
https://github.com/RetroBSD/2.11BSD/blob/master/usr/sys/pdpstand/M.s#L247
[3] https://gunkies.org/wiki/PDP-11_Memory_Management
Diomidis
This time looking into non-blocking file access. I realise that the term has wider application, but right now my scope is “communication files” (tty’s, pipes, network connections).
As far as I can tell, prior to 1979 non-blocking access did not appear in the Spider lineage, nor did it appear in the NCP Unix lineage. First appearance of non-blocking behaviour seems to have been with Chesson’s multiplexed files where it is marked experimental (an experiment within an experiment, so to say) in 1979.
The first appearance resembling the modern form appears to have been with SysIII in 1980, where open() gains an O_NDELAY flag and appears to have had two uses: (i) when used on TTY devices it makes open() return without waiting for a carrier signal (and subsequent read() / write() calls on the descriptor return with 0, until the carrier/data is there); and (ii) on pipes and fifo’s, read() and write() will not block on an empty/full pipe, but return 0 instead. This behaviour seems to have continued into SysVR1; I’m not sure when EAGAIN came into use as a return value for this use case in the SysV lineage. Maybe with SysVR3 networking?
In the Research lineage, the above SysIII approach does not seem to exist, although the V8 manual page for open() says under BUGS "It should be possible [...] to optionally call open without the possibility of hanging waiting for carrier on communication lines.” In the same location for V10 it reads "It should be possible to call open without waiting for carrier on communication lines.”
The July 1981 design proposals for 4.2BSD note that SysIII non-blocking files are a useful feature and should be included in the new system. In Jan/Feb 1982 this appears to be coded up, although not all affected files are under SCCS tracking at that point in time. Non-blocking behaviour is changed from the SysIII semantics, in that EWOULDBLOCK is returned instead of 0 when progress is not possible. The non-blocking behaviour is extended beyond TTY’s and pipes to sockets, with additional errors (such as EINPROGRESS). At this time EWOULDBLOCK is not the same error number as EAGAIN.
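
For comparison with the SysIII "return 0" semantics, here is a minimal sketch of what a present-day POSIX system does with a non-blocking read on an empty pipe: read() fails and sets errno. The helper name empty_pipe_read_errno() is made up for this example, and it uses today's O_NONBLOCK rather than the historical O_NDELAY.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Read from an empty pipe whose read end is non-blocking.
 * Returns the errno value from the failed read(), 0 if read() returned
 * without error, or -1 if setting up the pipe failed.
 */
static int empty_pipe_read_errno(void)
{
    int fd[2];
    char byte;
    int result;

    if (pipe(fd) == -1)
        return -1;

    int flags = fcntl(fd[0], F_GETFL);
    if (flags == -1 || fcntl(fd[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    /* The write end is still open and nothing has been written. */
    ssize_t n = read(fd[0], &byte, 1);
    result = (n == -1) ? errno : 0;   /* capture errno before close() */

    close(fd[0]);
    close(fd[1]);
    return result;
}
```

POSIX allows EAGAIN and EWOULDBLOCK to be the same value or different ones, so portable code checks for both.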
It would seem that the differences between the BSD and SysV lineages in this area persisted until around 2000 or so.
Is that a fair summary?
- - -
I’m not quite sure why the Research lineage did not include non-blocking behaviour, especially in view of the man page comments. Maybe it was seen as against the Unix philosophy, with select() offering sufficient mechanism to avoid blocking (with open() the hard corner case)?
In the SysIII code base, the FNDELAY flag is stored on the file pointer (i.e. with struct file). This has the effect that the flag is shared between processes using the same pointer, but can be changed in one process (using fcntl) without the knowledge of others. It seems more logical to me to have made it a per-process flag (i.e. with struct user) instead. In this aspect the SysIII semantics carry through to today’s Unix/Linux. Was this semantic a deliberate design choice, or simply an overlooked complication?
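
The sharing is easy to observe on a current POSIX system. The sketch below (function name invented for illustration) sets O_NONBLOCK through one descriptor and reads it back through a dup()'ed copy; both refer to the same open file description, which is also what a child inherits across fork().

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Set O_NONBLOCK via one descriptor and check whether a dup()'ed copy
 * sees it. Returns 1 if the flag is visible through the copy, 0 if it
 * is not, and -1 if any system call failed.
 */
static int flag_shared_via_dup(void)
{
    int fd[2];
    int shared = -1;

    if (pipe(fd) == -1)
        return -1;

    int copy = dup(fd[0]);
    if (copy != -1) {
        int flags = fcntl(fd[0], F_GETFL);
        if (flags != -1 && fcntl(fd[0], F_SETFL, flags | O_NONBLOCK) != -1) {
            int seen = fcntl(copy, F_GETFL);   /* query through the copy */
            if (seen != -1)
                shared = (seen & O_NONBLOCK) != 0;
        }
        close(copy);
    }

    close(fd[0]);
    close(fd[1]);
    return shared;
}
```

On a conforming system this returns 1: the file status flags live with the open file description, not with the descriptor or the process.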
> I am now writing code in assembly for the PDP-11. I remember reading
> somewhere that the output from "AS" (my caps) is a bit meagre. I can't find
> an option to produce a text listing. Is it possible from AS, using command
> options (I can't see one) or perhaps from "LD"?
>
> Paul
>
> *Paul Riley*
I had the same problem. As I was porting to a different mini I had to write a new assembler. As you have undoubtedly seen, early ‘as’ was written in assembler and not so easy to use as a base. Hence I used Richard Miller’s AS for the Interdata as a base (available on TUHS):
https://www.tuhs.org/cgi-bin/utree.pl?file=Interdata732/usr/source/as
Later I discovered that the TUHS archive has source code for the original ‘as’ rewritten in C, a work by Roger Jaeger:
https://minnie.tuhs.org/Archive/Distributions/USDL/Mini-Unix/
Maybe adding a listing module to this version of ‘as’ is another possible route.
below...
On Thu, Jun 11, 2020 at 9:04 AM Paul Riley <pdr0663(a)icloud.com> wrote:
> Clem,
>
> Thanks for that. So this would compile on modern machines to a cross
> compiler for V6 also running on a modern machine? I note you say macro11,
> so not a Unix “as” style syntax, is that right?
>
Yes - the AT&T syntax was much simpler/less sugar than the DEC assembler.
But the differences are pretty easy to see. IIRC that assembler generates
DEC style linker objects and there is a companion linker that can create
DEC binary objects (*i.e.* 'obj' files) as well as traditional UNIX a.out
format. The entire tool suite was created originally to move code from
RT-11 to UNIX at Harvard and passed around the nascent USENIX community.
IIRC that version was forked from a BSD 2.x/NetBSD source repository and
folks were adding some fields/features in the DEC obj format that RSX
supported that RT-11 did not.
Go hunting and see what you find. My memory was that with the BSD 2.x
project, somebody added a DEC obj to UNIX binary (a.out) converter tool, so
that you could use ld(1) instead of using the DEC style linker that had
been included in the original.
It has been >>years<< since I was really familiar with any of this stuff.
A question about it came up last fall/winter on the simh mailing list,
which is where I found the URL.
FWIW: I offered the modern port, assuming you might want to run some of it
as a cross-systems on a newer OS with a modern compiler. But if you are
content running this on V6, then you might just want to go back to the
original. As I said, my memory is that's in the original USENIX Harvard
tape. All that should be in Warner's archives if not other places on the
Internet.
Just remember that a big problem with the original code is that it will be
written in pre-'White Book' C (that many of us learned years ago - not
even ANSI or Second edition - this used Lesk's portable C library etc.).
It sometimes looks a little strange to modern eyes. Also if you go
looking, IIRC, someone at Harvard ported the DEC Macro RT-11 library to
UNIX v6. In the late 1970s, I remember tjk, Danny Klein, Tron McConnell
and I, plus some of the folks over in the bio-med group (whose names I have
forgotten) had to port a number of assembler codes that had been written for the
earlier RT-11 systems to Unix for one of the projects we had. Some of it
got re-written in C, but I do remember we managed to use the Harvard
assembler somehow for parts of it. If my memory is correct, early VMS and
messing with BLISS compatibility could have been mixed up in the project
somehow, but I've long forgotten the details of what we were doing at the
time.
Have fun.
Team,
I am now writing code in assembly for the PDP-11. I remember reading
somewhere that the output from "AS" (my caps) is a bit meagre. I can't find
an option to produce a text listing. Is it possible from AS, using command
options (I can't see one) or perhaps from "LD"?
Paul
*Paul Riley*
I'm seeding this URL to TUHS as one would expect them to be interested in
the work from Warren and friends. FWIW: I tried to browse their archives
and was not impressed (I couldn't find anything).
https://www.softwareheritage.org/
> Steve Johnson's position paper on optimising compilers may amuse you:
> https://dl.acm.org/doi/abs/10.1145/567532.567542
Indeed. This passage struck a particular chord:
"I contend that the class of applications that depend on, for example, loop
optimization and dead code elimination for their efficient solution is of
modest size, growing smaller, and often very susceptible to expression in
applicative languages where the optimization is built into the individual
applicative operators."
I don't know whether I saw that note at the time, but since then I've
come to believe, particularly in regard to C, that one case of dead-code
elimination should be guaranteed. That case is if(0), where 0 is the
value of a constant expression.
This guarantee would take the place of many--possibly even
most--ifdefs. Every ifdef is an ugly intrusion and a pain to read.
Syntactically it occurs at top level completely out of sync with the
indentation and flow of text. Conversion to if would be a big win.
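
A minimal sketch of the style in question, with made-up names (TRACE, next_state). The traced code sits inside an ordinary if whose condition is a constant expression, so it is indented, parsed and type-checked like everything around it. Note that the C standard does not promise the dead branch is dropped from the object code; most compilers do drop it, and that is exactly the guarantee being asked for.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical compile-time switch: a constant expression, not a macro test. */
enum { TRACE = 0 };

static int next_state(int state)
{
    /* Replaces an #ifdef TRACE ... #endif block at top level. */
    if (TRACE) {
        fprintf(stderr, "leaving state %d\n", state);
    }
    return state + 1;
}
```

Changing TRACE to 1 turns the tracing on without touching the function body.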
Doug
Does anybody have any good resources on the history of the popularity of C?
I'm looking for data to resolve a claim that C is so prolific and
influential because it's so easy to write a C compiler.
Tyler