Re: [TUHS] The most surprising Unix programs

19 Mar 2020

On Thu, Mar 19, 2020 at 02:57:59PM -0600, Nelson H. F. Beebe wrote:
[...]

 If you want to tackle raw HTML from arbitrary sources, then I agree with
 you: most HTML on the Web is not grammar conformant, there are
 numerous vendor extensions, and the HTML is hideously idiosyncratic
 and irregularly formatted.

 The solution that I adopted 25 years ago was to write a
 grammar-recognizing, but violation-lenient, prettyprinter for HTML.  It
 has served well and I use it many times daily for my work in the BibNet
 Project and TeX Users Group bibliography archives, now approaching 1.55
 million entries.  The latest public release is available here:

        http://www.math.utah.edu/pub/sgml/

Thank you, I will have a longer look at those archives. My plan so far
was to explore HTML files with Common Lisp (CL) and SLIME (an
interactive CL mode for Emacs), which would allow me to find out what
I actually want to look for - well, hopefully :-).
--
Regards,
Tomasz Rola
--
** A C programmer asked whether computer had Buddha's nature.      **
** As the answer, master did "rm -rif" on the programmer's home    **
** directory. And then the C programmer became enlightened...      **
**                                                                 **
** Tomasz Rola          mailto:tomasz_rola@bigfoot.com             **
