The EFF just published an article on the rise and fall of Gopher on
their Deeplinks blog.
"Gopher: When Adversarial Interoperability Burrowed Under the
Gatekeepers' Fortresses"
https://www.eff.org/deeplinks/2020/02/gopher-when-adversarial-interoperabil…
I thought it might be of interest to people here.
--
Michael Kjörling • https://michael.kjorling.se • michael(a)kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”
The tab/detab horse was still twitching, so I decided to beat it a little
more.
Doug's claim that tabs saving space was an urban legend didn't ring true,
but betting against Doug is a good way to get poor quick. So I tossed
together a perl script (version run through col -x is at the end of this
note) to measure savings. A simpler script just counted tabs,
distinguishing leading tabs, which I expected to be very common, from
embedded tabs, which I expected to be rare. In retrospect, embedded tabs
are common in (my) C code, separating structure types from the element
names and trailing comments. As Norman pointed out, genuine tabs often
preserve line to line alignment in the presence of small changes. So the
fancier script distinguishes between leading tabs and embedded tabs for
various possible tab stops. Small tab stops keep heavily indented code
lines short; large tab stops can save more space when tabbing past leading
blanks. My coding style uses "shiftwidth" of 4, which vi turns into spaces
or tabs, with "standard" tabs every 8 columns. My code therefore benefits
most with tabstops every 4 columns. A lot of code is indented 4 spaces,
which saves 3 bytes when replaced by a tab, but there is no saving with
tabstops at 8. Here's the output when run on itself (before it was
detabbed) and on a largish C program:
/home/jpl/bin/tabsave.pl /home/jpl/bin/tabsave.pl rsort.c
/home/jpl/bin/tabsave.pl, size 1876
2: Leading 202, Embedded 3, Total 205
4: Leading 303, Embedded 4, Total 307
8: Leading 238, Embedded 5, Total 243
rsort.c, size 209597
2: Leading 13186, Embedded 4219, Total 17405
4: Leading 19776, Embedded 5990, Total 25766
8: Leading 16506, Embedded 6800, Total 23306
The bytes saved by using tabs compared to the (detabbed) original size are
not chump change, with 2, 4 or 8 column tabstops. On ordinary text, savings
are totally unimpressive, usually 0. Your savings may vary. I think the
horse is now officially deceased. -- jpl
===
#!/usr/bin/perl -w
use strict;

my @Tab_stops = ( 2, 4, 8 );

# Return (leading, embedded) bytes saved on one detabbed line if runs of
# blanks ending at a tab stop were replaced by tabs every $stop_at columns.
sub check_stop {
    my ($line, $stop_at) = @_;
    my $pos = length($line);
    my ($leading, $embedded) = (0,0);
    while ($pos >= $stop_at) {
        $pos -= ($pos % $stop_at);  # Get to previous tab stop
        my $blanks = 0;
        while ((--$pos >= 0) && (substr($line, $pos, 1) eq ' ')) {
            ++$blanks;
        }
        if ($blanks > 1) {
            my $full = int($blanks/$stop_at);
            my $partial = $blanks - $full * $stop_at;
            my $savings = (--$partial > 0) ? $partial : 0;
            $savings += $full * ($stop_at - 1);
            if ($pos < 0) {
                $leading += $savings;
            } else {
                $embedded += $savings;
            }
        }
    }
    return ($leading, $embedded);
}

# Detab one file with col -x, then total the savings for each tab stop.
sub dofile {
    my $file = shift;
    my $command = "col -x < $file";
    my $notabsfh;
    unless (open($notabsfh, "-|", $command)) {
        # print, not printf: the message is not a format string
        print STDERR "Open failed on '$command': $!\n";
        return;
    }
    my $size = 0;
    my ($leading, $embedded) = (0,0);
    my @savings;
    for (my $i = 0; $i < @Tab_stops; ++$i) { $savings[$i] = [0,0]; }
    while (my $line = <$notabsfh>) {
        my $n = length($line);
        $size += $n;
        $line =~ s/(\s*)$//;
        for (my $i = 0; $i < @Tab_stops; ++$i) {
            my @l_e = check_stop($line, $Tab_stops[$i]);
            for (my $j = 0; $j < @l_e; ++$j) {
                $savings[$i][$j] += $l_e[$j];
            }
        }
    }
    print("$file, size $size\n");
    for (my $i = 0; $i < @Tab_stops; ++$i) {
        print(" $Tab_stops[$i]: ");
        my $l = $savings[$i][0];
        my $e = $savings[$i][1];
        my $t = $l + $e;
        print("Leading $l, Embedded $e, Total $t\n");
    }
    print("\n");
}

sub main {
    for my $file (@ARGV) {
        dofile($file);
    }
}

main();
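
As a rough cross-check of the arithmetic in the note above, here is a minimal sketch in Python (a separate illustration, not the Perl of tabsave.pl and not the author's method). It assumes a line that has already been detabbed, cuts it into cells of `stop` columns, and counts one tab replacing each run of two or more blanks that ends a cell. The name `tab_savings` is hypothetical.

```python
def tab_savings(line, stop):
    """Return (leading, embedded) bytes saved by entabbing one line.

    Assumption: `line` holds no tab characters (already detabbed).
    """
    line = line.rstrip()  # trailing whitespace is ignored
    leading = embedded = 0
    # Visit every cell of `stop` columns that lies wholly inside the line.
    for end in range(stop, len(line) + 1, stop):
        cell = line[end - stop:end]
        blanks = len(cell) - len(cell.rstrip(" "))
        if blanks >= 2:
            saved = blanks - 1  # `blanks` spaces become a single tab
            if line[:end].strip(" ") == "":
                leading += saved   # nothing but blanks before this stop
            else:
                embedded += saved  # blanks follow some text on the line


    return leading, embedded


print(tab_savings("    x", 4))           # (3, 0)
print(tab_savings("    x", 8))           # (0, 0)
print(tab_savings("    x", 2))           # (2, 0)
print(tab_savings("int     count;", 8))  # (0, 4)
```

The first two calls reproduce the point made in the note: a 4-space indent saves 3 bytes with tab stops every 4 columns and nothing with tab stops every 8.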
On Mar 11, 2021, at 10:08 AM, Warner Losh <imp(a)bsdimp.com> wrote:
>
> On Thu, Mar 11, 2021 at 10:40 AM Bakul Shah <bakul(a)iitbombay.org> wrote:
>> From https://www.freebsd.org/cgi/man.cgi?hosts(5)
>> For each host a single line should be present with the following information:
>> Internet address
>> official host name
>> aliases
>> HISTORY
>> The hosts file format appeared in 4.2BSD.
>
> While this is true wrt the history of FreeBSD/Unix, I'm almost positive that BSD didn't invent it. I'm pretty sure it was picked up from the existing host file that was published by sri-nic.arpa before DNS.
A different and more verbose format. See RFCs 810 & 952. Possibly because it had to serve more purposes?
> Warner
>
>>> On Mar 11, 2021, at 9:14 AM, Grant Taylor via TUHS <tuhs(a)minnie.tuhs.org> wrote:
>>> Hi,
>>>
>>> I'm not sure where this message best fits; TUHS, COFF, or Internet History, so please forgive me if this list is not the best location.
>>>
>>> I'm discussing the hosts file with someone and was wondering if there's any historical documentation around its format and what should and should not be entered in the file.
>>>
>>> I've read the current man page on Gentoo Linux, but suspect that it's far from authoritative. I'm hoping that someone can point me to something more authoritative to the hosts file's format, guidelines around entering data, and how it's supposed to function.
>>>
>>> A couple of sticking points in the other discussion revolve around how many entries a host is supposed to have in the hosts file and any ramifications for having a host appear as an alias on multiple lines / entries. To wit, how correct / incorrect is the following:
>>>
>>> 192.0.2.1 host.example.net host
>>> 127.0.0.1 localhost host.example.net host
>>>
>>>
>>>
>>> --
>>> Grant. . . .
>>> unix || die
>> _______________________________________________
>> COFF mailing list
>> COFF(a)minnie.tuhs.org
>> https://minnie.tuhs.org/cgi-bin/mailman/listinfo/coff
Hi,
I'm not sure where this message best fits; TUHS, COFF, or Internet
History, so please forgive me if this list is not the best location.
I'm discussing the hosts file with someone and was wondering if there's
any historical documentation around its format and what should and
should not be entered in the file.
I've read the current man page on Gentoo Linux, but suspect that it's
far from authoritative. I'm hoping that someone can point me to
something more authoritative to the hosts file's format, guidelines
around entering data, and how it's supposed to function.
A couple of sticking points in the other discussion revolve around how
many entries a host is supposed to have in the hosts file and any
ramifications for having a host appear as an alias on multiple lines /
entries. To wit, how correct / incorrect is the following:
192.0.2.1 host.example.net host
127.0.0.1 localhost host.example.net host
--
Grant. . . .
unix || die
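
To make the question above concrete, here is a minimal sketch in Python that parses the two sample lines using the field order quoted from hosts(5) (address, official host name, aliases) and lists every address a name maps to. The function names `parse_hosts` and `lookup` are hypothetical, and exact case-sensitive matching is a simplifying assumption. Which of several matches a real resolver returns is implementation-dependent and is not modelled here.

```python
def parse_hosts(text):
    """Parse hosts-style text into (address, official_name, aliases) tuples."""
    entries = []
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, split on blanks
        if len(fields) >= 2:                   # need an address and a name
            entries.append((fields[0], fields[1], fields[2:]))
    return entries


def lookup(entries, name):
    """Return the address of every entry that names `name`, in file order."""
    return [addr for addr, official, aliases in entries
            if name == official or name in aliases]


sample = """\
192.0.2.1 host.example.net host
127.0.0.1 localhost host.example.net host
"""

entries = parse_hosts(sample)
print(lookup(entries, "host"))       # ['192.0.2.1', '127.0.0.1']
print(lookup(entries, "localhost"))  # ['127.0.0.1']
```

With the sample as written, both `host` and `host.example.net` appear on two lines, so a lookup by name is ambiguous and the answer depends on how the resolver orders or merges matches.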
I am currently reading "Memoirs of a Computer Pioneer" by Maurice
Wilkes, MIT Press. The following text from p. 145 may amuse readers.
[p. 145] By June 1949 people had begun to realize that it was not so
easy to get a program right as had at one time appeared. I well
remember when this realization first came on me with full force. The
EDSAC was on the top floor of the building and the tape-punching and
editing equipment one floor below [...]. I was trying to get working my
first non-trivial program, which was one for the numerical integration
of Airy's differential equation. It was on one of my journeys between
the EDSAC room and the punching equipment that "hesitating at the angles
of stairs" the realization came over me with full force that a good part
of the remainder of my life was going to be spent in finding errors in my
own programs.
N.