I like this anecdote because it points out the difference between being
able to handle and process bizarre conditions, as if they were something
that should work, which is maybe not that helpful, vs. detecting them
and doing something reasonable, like failing with a "limit exceeded"
message. A silent, insidious failure down the line because a limit was
exceeded is never good. If "fuzz testing" helps exercise limits and
identifies places where software hasn't realized it has exceeded its
limits, has run off the end of a table, etc., that seems like a good
thing to me.
On 05/21/2024 09:59 AM, Paul Winalski wrote:
On Tue, May 21, 2024 at 12:09 AM Serissa
<stewart@serissa.com> wrote:
Well this is obviously a hot button topic. AFAIK I was nearby
when fuzz-testing for software was invented. I was the main
advocate for hiring Andy Payne into the Digital Cambridge
Research Lab. One of his little projects was a thing that
generated random but correct C programs and fed them to
different compilers or compilers with different switches to
see if they crashed or generated incorrect results.
Overnight, his tester filed 300 or so bug reports against the
Digital C compiler. This was met with substantial pushback,
but mostly the issue was that many of the reports traced
to the same underlying bugs.
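The harness itself can be tiny; the hard part is the program
generator. A minimal sketch in Python of the compare-two-compilers
loop (hypothetical, not Andy Payne's actual tool; the compiler names
and the stand-in test program are assumptions):

import os
import subprocess
import tempfile

def compile_and_run(compiler, source_path, workdir):
    """Compile source_path with one compiler and run the result.
    Returns a (phase, output) pair so two compilers can be compared."""
    exe = os.path.join(workdir, compiler + ".out")
    build = subprocess.run([compiler, source_path, "-o", exe],
                           capture_output=True, text=True)
    if build.returncode != 0:
        return ("compile-failed", build.stderr)
    run = subprocess.run([exe], capture_output=True, text=True)
    return ("ran", run.stdout)

def differential_test(source_code, compilers=("gcc", "clang")):
    """Flag any disagreement between two compilers on one program."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "test.c")
        with open(src, "w") as f:
            f.write(source_code)
        a, b = (compile_and_run(cc, src, workdir) for cc in compilers)
        if a != b:
            print("disagreement -- candidate bug report:")
            print(" ", compilers[0], "->", a)
            print(" ", compilers[1], "->", b)

# A fixed, well-defined program stands in for the random generator here.
differential_test('#include <stdio.h>\n'
                  'int main(void) { printf("%d\\n", 6 * 7); return 0; }\n')

In the real tool the fixed program above would come from the random
generator, and each disagreement becomes a candidate bug report
against one of the compilers.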
Bill McKeeman expanded the technique and published
"Differential Testing for Software":
https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
In the mid-to-late 1980s Bill McKeeman worked with DEC's compiler product
teams to introduce fuzz testing into our testing process. As with the
C compiler work at DEC Cambridge, fuzz testing for other compilers
(Fortran, PL/I) also found large numbers of bugs.
The pushback from the compiler folks was mainly a matter of
priorities. Fuzz testing is very adept at finding edge conditions,
but most failing fuzz tests have syntax that no human programmer would
ever write. As a compiler engineer you have limited time to devote to
bug fixing. Do you spend that time addressing real customer issues
that have been reported or do you spend it fixing problems with code
that no human being would ever write? To take an example that really
happened, a fuzz test consisting of 100 nested parentheses caused an
overflow in a parser table (it could only handle 50 nested parens).
Is that worth fixing?
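For illustration, here is a minimal sketch (hypothetical Python, not
DEC's parser) of the two behaviors contrasted at the top of this
thread: a fixed-size nesting table that either fails loudly with a
"limit exceeded" diagnostic or silently wraps and gives a wrong
answer later:

MAX_DEPTH = 50   # the table size from the anecdote

def check_nesting(source, fail_loudly=True):
    """Track parenthesis nesting depth against a fixed table size."""
    depth = 0
    for ch in source:
        if ch == "(":
            depth += 1
            if depth > MAX_DEPTH:
                if fail_loudly:
                    # the reasonable behavior: detect and report the limit
                    raise OverflowError(
                        f"nesting limit exceeded ({MAX_DEPTH})")
                # the insidious behavior: wrap around and keep going with
                # a corrupted depth, so the failure surfaces much later
                depth -= MAX_DEPTH
        elif ch == ")":
            depth -= 1
    return depth == 0   # balanced?

fuzz_input = "(" * 100 + ")" * 100   # the 100-paren fuzz case

try:
    check_nesting(fuzz_input)                 # loud: raises at paren 51
except OverflowError as e:
    print("detected:", e)

print(check_nesting(fuzz_input, fail_loudly=False))  # silent: False, no error

The silent variant reports a balanced input as unbalanced with no
diagnostic at all, which is exactly the insidious downstream failure
described above.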
As you pointed out, fuzz test failures tend to occur in clusters and
many of the failures are eventually traced to the same underlying
bug, which leads to the counter-argument to the pushback: the fuzz
tests are finding real underlying bugs. Why not fix them before a
customer runs into them? That very thing did happen several times. A
customer-reported bug was fixed and suddenly several of the fuzz test
problems that had been reported went away. Another consideration is
that, even back in the 1980s, humans weren't the only ones writing
programs. There were programs writing programs and they sometimes
produced bizarre (but syntactically correct) code.
-Paul W.