At Thu, 17 Oct 2024 13:37:35 -0700, Bakul Shah via TUHS <tuhs(a)tuhs.org> wrote:
Subject: [TUHS] Re: On computerese
Unfortunately things are a bit more complicated now!
On freebsd:
$ apropos ls | wc
    1761   11158  121413
Someone forgot the "word" part of "keyword" I guess (but see below --
nothing new here). The same brain-damage happens on NetBSD (where all
the manual page processing has had a rather obnoxious and unnecessary
overhaul, for lack of a better description).
BTW, the first keyword I might search for when looking to find something
that does what ls(1) does would probably be "list", or maybe "files",
but not "ls" itself obviously! (Thus my earlier complaint that "files"
does not appear in the synopsis for ls(1).)
$ apropos '\<ls\>' | wc
       9     187    1260
So it seems FreeBSD's apropos(1) now allows regular expressions for the
keyword argument!
On my rather stock plain FreeBSD machine there are only two lines output
for '\<ls\>', and searching for '\<list\>' generates only 27 lines, all
quite reasonable.
At least this support for REs is well documented, assuming one would
think to read the manual page for apropos(1) before using it, so knowing
to use the RE word delimiters isn't too much of a stretch:
... uses case-insensitive extended regular expression matching
over manual names and descriptions
Use of word delimiters is even shown in some of the examples given.
I still fail to see why the default isn't/wasn't to treat the keyword
argument as only matching a whole word (+/- any suffixes).
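To make that concrete, here is a rough sketch of the kind of default
matching I have in mind, written as a filter over whatis-style lines. The
function name whole_word and the suffix list (s, es, d, ed, ing) are my own
assumptions for illustration, not anything taken from an existing
apropos(1); the keyword is still interpreted as an extended regular
expression, so metacharacters in it are not escaped.

```shell
# Hypothetical filter: read whatis-style lines on stdin and keep only
# those in which the keyword ($1) appears as a whole word, optionally
# followed by one of a few common English suffixes.  Case is ignored.
# The explicit boundary classes avoid relying on \< and \> support.
whole_word() {
    grep -E -i -e "(^|[^[:alnum:]_])$1(s|es|d|ed|ing)?([^[:alnum:]_]|\$)"
}
```

Something like "apropos list | whole_word list" would then trim the lax
output of any of the implementations discussed here, at the cost of a
fairly crude idea of what a suffix is.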
The new NetBSD implementation doesn't document what its arguments do,
though a quick experiment shows it doesn't parse regular expressions.
Sadly it doesn't handle its '-s' option properly either.
Seems word search on unix for such things needs to be beefed up....
Indeed, though "beefed down" might be the better direction.
At first glance it looks as though the 4.4BSD apropos(1) was also very
lax in matching keywords:
Each word is considered separately and case of letters is
ignored. Words which are part of other words are considered;
when looking for “compile”, apropos will also list all instances
of “compiler”.
I think proper exclusion of normal word suffixes (and maybe prefixes)
would suffice for a reasonable definition of "word", but a quick glance
at the source suggests that's not what it does.
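The laxness described in that manual excerpt can be mimicked with a plain
case-insensitive substring search. This is only an illustration of the
documented behaviour, not the 4.4BSD code; the function name
substring_match is hypothetical.

```shell
# Hypothetical filter mimicking the matching described in the 4.4BSD
# manual excerpt: the keyword ($1) is a fixed string, case is ignored,
# and it may match anywhere, including inside a longer word.
substring_match() {
    grep -i -F -e "$1"
}
```

Fed a list of whatis-style lines, "substring_match compile" keeps every
line mentioning "compiler" as well as "compile", which is exactly the
behaviour the manual page describes.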
Note that all of this mess is partially because the makewhatis.sh script
didn't make it into 4.4BSD (even though getNAME.c did), and furthermore
the ed(1) script in it won't work with modern BSD ed(1) implementations.
The makewhatis.sed script that is in 4.4BSD doesn't seem to do
anything useful with modern "nroff -man" output either. Sigh.
While looking up information about the different implementations I ran
across the following slightly amusing but mostly sad description of
ptx(1): <https://wiki.debian.org/WhyTheName#coreutils>
ptx: an inscrutable abbreviation for a word-salad generator.
PermuTed indeXes were tortuous concordances for manual pages
back in the days before tools like apropos. The GNU version was
created in 1999 as some sort of exercise in medieval
reenactment.
There's an embedded link therein to a somewhat more sedate description
of the UNIX Reference Manual's permuted index:
https://docstore.mik.ua/orelly/unix/upt/ch50_09.htm
--
Greg A. Woods <gwoods(a)acm.org>
Kelowna, BC +1 250 762-7675 RoboHack <woods(a)robohack.ca>
Planix, Inc. <woods(a)planix.com> Avoncote Farms <woods(a)avoncote.ca>