[looping in TUHS so my historical mistakes can be corrected]
Hi Alex,
At 2025-02-13T00:59:33+0100, Alejandro Colomar wrote:
> Just wondering... why not build a new PDF from source, instead of
> scanning the book?
A. I don't think we know for sure which version of troff was used to
format the V10 manual. _Probably_ Kernighan's research version,
which was similar to a contemporaneous DWB troff...but what
"contemporaneous" means in the 1989-1990 period is a little fuzzy.
Also, Kernighan may not have a complete source history of his
version of troff, it is presumably still encumbered by AT&T
copyrights, and he's been using groff for at least his last two
books (his Unix memoir and the 2nd edition of the AWK book).
B. It is hard to recreate a Research Unix V10 installation. My
understanding is that Unix V8-V10 were not full distributions but
patches. And because troff was commercial/proprietary software at
that time (the aforementioned DWB troff), I don't know if Kernighan's
"Research troff" escaped Bell Labs or how consistently it could be
expected to be present on a system. Presumably any of a variety of
DWB releases would have "worked fine". How much they would have
varied in extremely fiddly details of typesetting is an open
question. I can say with some confidence that the mm package saw
fairly significant development. Of troff itself (and the
preprocessors one bumps into in the Volume 2 white papers) I'm much
more in the dark.
C. Getting a scan out there tells us at least what one software
configuration, deemed acceptable by the book's producers, generated,
even if it's impossible to identify details of that software
configuration. That in turn helps us to judge the results of
_known_ software configurations--groff, and other troffs too.
D. troff is not TeX. Nothing like trip.tex has ever existed. A golden
platonic ideal of formatter behavior does not exist except in the
collective, sometimes contentious minds of its users.
> Doesn't groff(1) handle the Unix sources?
Assuming the full source of a document is available, and no part of its
toolchain requires software that is unavailable (like Van Wyk's "ideal"
preprocessor), if groff cannot satisfactorily render a document
produced by the Bell Labs CSRC, I'd consider that presumptively a
bug in groff. It's a rebuttable presumption--if one document in one
place relied upon a _bug_ in AT&T troff to produce correct rendering, I
think my inclination would be to annotate the problem somewhere in
groff's documentation and leave it unresolved.
For a case where groff formats a classic Unix document "better" (in
the sense of not unintentionally omitting a formatted equation) than
AT&T troff, see the following.
https://github.com/g-branden-robinson/retypesetting-mathematics
> I expect the answer is not licenses (because I expect redistributing
> the scanned original will be as bad as generating an apocryphal PDF in
> terms of licensing).
I've opined before that the various aspects of Unix "IP" ownership
appear to be so complicated and mired in the details of decades-old
contracts in firms that have changed ownership structures multiple
times, that legally valid answers to questions like this may not exist.
Not until a firm that thinks it holds the rights decides it's worth the
money to pay a bunch of archivists and copyright attorneys to go on a
snipe hunt.
And that decision won't be made unless said firm thinks the probability
is high that they can recover damages from infringers in excess of their
costs. Otherwise the decision simply sets fire to a pile of money.
...which isn't impossible. Billionaires do it every day.
> I sometimes wondered if I should run the Linux man-pages build system
> on the sources of Unix manual pages to generate an apocryphal PDF book
> of Volume 1 of the different Unix systems. I never ended up doing so
> for fear of AT&T lawyers (or whoever owns the rights to their manuals
> today), but I find it would be useful.
It's the kind of thing I've thought about doing. :)
If you do, I very much want to know if groff appears to misbehave.
Regards,
Branden