As I mentioned in another post, I'm writing an invited paper for an
upcoming issue of IEEE Transactions on Software Engineering that will be a
50-year retrospective of my original 1975 SCCS paper
(mrochkind.com/aup/talks/SCCS-Slideshow.pdf). Can some people here review a
couple of paragraphs for accuracy?
*Decentralized Version Control (DVCS)*
*While VCSs like CVS and Subversion were centralized and had
pre-commit merging, a further advance was towards decentralization, with
post-commit merging. Probably the first DVCS was Sun WorkShop TeamWare,
created by Larry McVoy and announced in 1992 [sun]. It was implemented as a
layer on top of SCCS. McVoy later commercialized a successor system called
BitKeeper [Bitkeeper], which was layered on a re-implementation of SCCS,
which he called BitSCCS. TeamWare and BitKeeper took advantage of the
interleaved delta algorithm, also known as a weave, to implement an
efficient way to represent merged deltas by reference, instead of
reproducing code inside the repository. This is a lot more complicated to
do with reverse deltas, introduced by RCS.*
*In 2005 Linus Torvalds, creator of Linux [linux], invented the DVCS Git
[git] for Linux development, and since then Git has become widely used and
has supplanted BitKeeper.*
[more about DVCS follows]
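
For readers who have not seen a weave, here is a minimal sketch of the idea in
Python. It is a simplification, not the actual SCCS file format: the tuple
markers, the WEAVE data, and the extract() function are all made up for
illustration. Each block of lines is tagged with the delta that inserted or
deleted it, and a version is extracted by naming the set of deltas to apply.

```python
# Simplified weave. ("I", n) opens a block inserted by delta n,
# ("D", n) opens a block deleted by delta n, ("E", n) closes the
# most recent block opened by delta n. Plain strings are text lines.
# This layout is a hypothetical stand-in for the real SCCS encoding.
WEAVE = [
    ("I", 1),
    "line one",
    ("D", 2),
    "line two",
    ("E", 2),
    ("I", 2),
    "line two, revised",
    ("E", 2),
    "line three",
    ("E", 1),
]


def extract(weave, included):
    """Return the text lines visible when the deltas in `included` apply."""
    active = []  # currently open (kind, delta) blocks, innermost last
    lines = []
    for item in weave:
        if isinstance(item, tuple):
            kind, delta = item
            if kind == "E":
                # Close the most recently opened block belonging to this delta.
                for i in range(len(active) - 1, -1, -1):
                    if active[i][1] == delta:
                        del active[i]
                        break
            else:
                active.append((kind, delta))
            continue
        # A line is visible if every enclosing insert is applied
        # and no enclosing delete is applied.
        inserted = all(d in included for k, d in active if k == "I")
        deleted = any(d in included for k, d in active if k == "D")
        if inserted and not deleted:
            lines.append(item)
    return lines


print(extract(WEAVE, {1}))
print(extract(WEAVE, {1, 2}))
```

Because every version is just a set of delta numbers over one shared body, a
merge can in principle be recorded by listing the deltas it includes rather
than by copying their lines, which is the "by reference" property the draft
describes.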
I don't want to add more detail that would make these paragraphs any
longer, but I do want them to be accurate. Thanks!
Marc Rochkind
--
*My new email address is mrochkind(a)gmail.com <mrochkind(a)gmail.com>*
Rob Pike:
According to the Unix room fortunes file, the actual quote is
SCCS: the source-code motel -- your code checks in but it never checks out. Ken Thompson
====
As a Unix-room-culture aside: I believe this quote was what
inspired Andrew Hume to call his backup system the File Motel.
Norman Wilson
Toronto ON
>> Does anyone know whether there are implementations of mmap that
>> do transparent file sharing? It seems to me that should be possible by
>> making the buffer cache share pages with mmapping processes.
> These days they all do. The POSIX rationale says:
> ... When multiple processes map the same memory object, they can
> share access to the underlying data.
Notice the weasel word "can". It is not guaranteed that they will do so
automatically without delay. Apparently each process may have a physically
distinct copy of the data, not shared access to a single location.
The Linux man page mmap(2), for example, makes it very clear that mmap
has a cache-coherence problem, at least in that system. The existence
of msync(2) is a visible symptom of the problem.
[Weasel words of my own: I have not read the POSIX definition of mmap.]
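
The question can at least be probed empirically. The following is a minimal
sketch in Python, assuming a system where the mmap module's default mapping is
shared (MAP_SHARED on Unix); the function name shared_mapping_demo is made up
for illustration. It maps the same file twice, writes through one mapping, and
checks what the other mapping and an ordinary read() see. It uses two mappings
in one process rather than two processes, and whatever it reports describes
only the system it runs on, not what POSIX guarantees.

```python
import mmap
import os
import tempfile


def shared_mapping_demo():
    """Write through one mapping; report what a second mapping and read() see."""
    fd, path = tempfile.mkstemp()
    try:
        # Give the file one page of content so it can be mapped.
        os.write(fd, b"\0" * mmap.PAGESIZE)

        # Two independent mappings of the same file.
        a = mmap.mmap(fd, mmap.PAGESIZE)
        b = mmap.mmap(fd, mmap.PAGESIZE)

        a[:5] = b"hello"
        seen_by_b = bytes(b[:5])  # no explicit synchronization yet

        a.flush()  # the module's counterpart of msync(2) on Unix
        os.lseek(fd, 0, os.SEEK_SET)
        seen_by_read = os.read(fd, 5)

        a.close()
        b.close()
        return seen_by_b, seen_by_read
    finally:
        os.close(fd)
        os.unlink(path)


print(shared_mapping_demo())
```

On a system whose buffer cache shares pages with mappings, both values should
come back as the bytes just written; a system with separate copies could show
stale data in the second mapping until the flush.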
Doug
On Mon, 16 Dec 2024, Konstantin Belousov wrote:
> On Mon, Dec 16, 2024 at 02:08:43PM -0500, John Levine wrote:
>> PS: I can believe there are some versions of linux that screwed up disk cache
>> coherency, but that just means they don't properly implement the spec, not for
>> the first time. I mean, it's not *that* hard to make all the maps point to the
>> same physical page frame, even on a machine like POWER with reverse page maps.
>
> This is not enough. There are (were ?) architectures, typically with the
> virtually addressed caches, which require all mappings of the same page
> to be suitably aligned, at least. ...
>
> If addresses of different mappings are not aligned, caches were not coherent.
I think we're in "so don't do that" territory. mmap() normally lets the
system pick the memory address to map so it can pick something suitably
aligned. You can pass the MAP_FIXED flag to tell it to map at a
particular address, but it can return EINVAL if the address doesn't work.
The POSIX description says "The use of MAP_FIXED is discouraged, as it may
prevent an implementation from making the most effective use of
resources."
It's not always trivial to make this work. On systems with reverse maps,
a physical page can only be mapped to one virtual address at a time, so
for shared pages it has to mark all of the aliases nonresident and on a
fault remap the page into the map of the process that is running. But
it's not rocket science, either.
R's,
John
> "John Levine" <johnl(a)taugh.com> wrote:
>> M4 was written in the 1970s by Kernighan and Ritchie in C ...
> In private mail, BWK told me that it was DMR who wrote m4. He
> then reimplemented it in Ratfor for "Software Tools".
> Arnold
The book says colorfully, "... [and] we are grateful to him for
letting us steal it."
Doug
> well after Unix had fledged, its developers at CSRC found it necessary
> and/or desirable to borrow back a Multics concept: they named it mmap().
As far as I know no Research version of Unix ever had mmap.
Multics had a segmented universal memory. A process incorporated
segments into its address space. The universal memory was normally
addressed via a hierarchical segment-name directory. With enhancement
to provide for multisegment "files", the directory could serve as a file
system and file I/O became data transfer between segments.
Unix originally imitated the Multics file system, but not the universal
memory. mmap(2) weakly imitates universal memory by allowing a process
to nominally incorporate a portion of a file into the process address space
at page-level granularity. However, an update is guaranteed to be visible
to the file and other processes only upon specific request.
Does anyone know whether there are implementations of mmap that
do transparent file sharing? It seems to me that should be possible by
making the buffer cache share pages with mmapping processes.
Doug
I'm curious if anyone has any history they can share about the BSD
"talk" program.
I was fond of this back when it was still (relatively) common, but
given the way it's architected I definitely see why it fell out of use
as the Internet grew. Still, does anybody know what the history behind
it is? Initially, I thought it was written by Mike Karels, but that
was just my speculation from SCCS spelunking, and looking at the
sources from 4.2, I see RCS header strings that indicate it was
written by "moore" (Peter Moore?). talk.c says, "Written by Kipp
Hickman".
It seems to have arrived pretty early on with respect to the
introduction of TCP/IP in BSD: the README alludes to some things
coming up in 4.1c. Clem, you seem to have had a hand in it, and are
credited (along with Peter Moore) for making it work on 4.1a.
So I guess the question is, what was the motivation? Was it just to
have a more pleasing user-to-user communications experience, or was
discussion across the network an explicit goal? There's a note in
talk.c ("Modified to run between hosts by Peter Moore, 8/19/82") that
suggests this wasn't the original intent. Who thought up the
character-at-a-time display mode?
Thanks for any insights.
- Dan C.
IEEE Transactions on Software Engineering has asked me to write a
retrospective on the influence of SCCS over the last 50 years, as my SCCS
paper was published in 1975. They consider it one of the most influential
papers from TSE's first decade.
There's a funny quote from Ken Thompson that circulates from time to time:
"SCCS, the source motel! Programs check in and never check out!"
But nobody seems to know what it means exactly. As part of my research, I
asked Ken what the quote meant, since I wanted to include it. He explained
that it refers to SCCS storing binary data in its repository file,
preventing UNIX text tools from operating on the file.
Of course, this is only one of SCCS's many weaknesses. If you have anything
funny about any of the others, post it here. I already have all the boring
usual stuff (e.g., long-term locks, file-oriented, no merging).
Marc Rochkind
mrochkind.com
I was thinking about this some more.
IIRC: Peter and I sketched out the protocol for the sockets version on a
whiteboard in our office one night after a beer and pizza run. Rick
Spicklemeir, Tom Quarles, and Jim Kleckner also participated in those bull
sessions. I started writing the program soon after that and had it working
to a point in a couple of hours. I don't remember the issues, but a couple
of them remained when I left for the USENIX conference later that week. When I
got back Peter had finished it and put it into RCS. The key is that the
coding was primarily Peter and myself, but Rick, TQ, and Jim all had
contributed in some manner, too.
As for the famous bug of using a VAX integer, you can squarely blame me.
As I said, having worked on networking for several years before my time
at UCB, I should have known better, but I did not even think about it. I
failed Henry's ten programming commandments and concluded that the world
was a VAX. Mea culpa.
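
For anyone who has not been bitten by this class of bug, here is a minimal
sketch in Python. It assumes the bug was the familiar one of putting an integer
on the wire in the sending host's byte order (little-endian on a VAX) instead
of network byte order. The function names and the value 517 are made up for
illustration and are not taken from the talk sources.

```python
import struct


def send_host_order_vax(value):
    """Encode a 16-bit integer the way a little-endian host stores it."""
    return struct.pack("<H", value)


def send_network_order(value):
    """Encode a 16-bit integer in network (big-endian) byte order."""
    return struct.pack("!H", value)


def receive_on_big_endian_host(wire):
    """Decode two wire bytes the way a big-endian host reads its own integers."""
    return struct.unpack(">H", wire)[0]


value = 517  # arbitrary example value, 0x0205
print(receive_on_big_endian_host(send_host_order_vax(value)))
print(receive_on_big_endian_host(send_network_order(value)))
```

Two little-endian hosts talking to each other never notice the problem, which
is why it is so easy to conclude that the world is a VAX.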
On Fri, Dec 13, 2024 at 10:03 AM Mark Seiden <mseiden(a)gmail.com> wrote:
> (I know there is a special place in hell for those who explain a joke,
> but, you asked…)
>
> it’s just an allusion to the Black Flag Roach Motel product (still being
> produced)
> which has a trademark on the phrase “Roaches Check in… But they Don’t
> Check Out”.
>
Yeah, I knew that much. My question to Ken was about what this was saying
about SCCS.
Marc