On Thu, Nov 21, 2019 at 04:58:01PM +0100, Leah Neukirchen wrote:
>
> arnold@skeeve.com writes:
>
> > Jason Stevens <jsteve@superglobalmegacorp.com> wrote:
> >
> >> I keep a copy of the utzoo files.
> >
> > Any chance of getting them to Warren for storage? Or are they
> > generally available somewhere?
>
> They are also on archive.org:
> https://archive.org/details/utzoo-wiseman-usenet-archive
>
> --
> Leah Neukirchen <leah@vuxu.org> https://leahneukirchen.org/
I'm half tempted to take the archive.org Usenet files and throw them
into Elasticsearch and create a web front end for searching. Storage
would be expensive, but search would rock!
Has anyone definitively proven that any of the contents of these files are not in the searchable Google Groups interface? I don't really see any need to duplicate their efforts. I am 100% certain that Google got Deja News's entire archive, and 99% certain that it was fairly quickly supplemented with the University of Toronto material provided by Henry Spencer. Certainly the headers in a thread like this one would seem to indicate that the material all came from utzoo:

https://groups.google.com/forum/#!msg/net.unix-wizards/krbEHGQ95_o/QaV2LNSeMlgJ

(See "show original" for any message, in the dropdown box in the upper right-hand corner by the date.)

While Google has not shown a great deal of interest in Groups over the years (notably, the search was lacking or incomplete at various points), I would think that there is now enough acknowledgement of the historical importance of these messages that Google would at the very least do its best to preserve what it has. I would also imagine that if someone else had approached them with a substantial enough private archive, they would have accepted it, and not necessarily issued a big press release, depending on the time frame, but that's pure supposition on my part.

It would be fascinating to look through messages from before 1995 (when Deja News started archiving) to see if any clues can be unearthed about message sources other than utzoo.

As somewhat of an aside, my father was the head sysadmin at Deja News at the time of its purchase by Google. I may have recounted this story before, but it's worth sharing again. Google's entire purchase of Deja News involved a couple of Google engineers flying to Austin with a large disk array, letting it mirror over a weekend, and then flying back to California. Google did not, as far as I recall, take possession of any physical assets whatsoever.

-Henry