And then I hacked the AltaVista desktop search to search the files, using Apache to filter content inline.
I know I'd love to feed it more data. The utzoo stuff is massive for 1991, but it's really trivial for 2019: around 10GB decompressed.
From: TUHS <tuhs-bounces@minnie.tuhs.org> on behalf of Larry McVoy <lm@mcvoy.com>
Sent: Thursday, November 21, 2019, 11:53 AM
To: Bakul Shah
Cc: tuhs@tuhs.org
Subject: Re: [TUHS] Steve Bellovin recounts the history of USENET
On Wed, Nov 20, 2019 at 07:50:53PM -0800, Bakul Shah wrote:
> On Wed, 20 Nov 2019 19:14:23 -0800 Larry McVoy wrote:
> > Yeah, I'd be super happy if he joined the list. I enjoyed reading
> > those, wished he had gone into more detail.
> >
> > On the Usenet topic, does anyone remember dejanews? Searchable
> > archive of all the posts to Usenet. Google bought them and then,
> > so far as I know, the searchable part went away.
> >
> > If someone knows how to search back to the beginnings of Usenet,
> > my early tech life is all there, I'd love to be able to show my kids
> > that. Big arguing with Mash on comp.arch, following Guy Harris on
> > comp.unix-wizards, etc.
>
> I have occasionally downloaded some mbox.zip files from
> https://archive.org/details/usenet
> But there are too many files there. Would be nice if there
> was a collaborative effort to organize them in a more usable,
> searchable state. Pretty much all of it (minus binaries
> groups) can be stored locally (or using some global
> namespace).
So is that all of Usenet?
--
---
Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm