I agree with your (as usual) perceptive analysis. Only stopping by to point
out that I took the buffering out of cat. I didn't have your perspicacity
on why it should happen, just a desire to remove all the damn flags. When I
was done, cat.c was 35 lines long. Do a read, do a write, continue until
EOF. Guess what? That's all you need if you want to cat files.
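For the curious, the shape of that loop in plain C (a sketch, not the
actual cat.c; it handles stdin and named files and nothing else):

    #include <fcntl.h>
    #include <unistd.h>

    /* Copy fd to standard output until EOF; 0 on success, -1 on error. */
    static int
    copy(int fd)
    {
        char buf[8192];
        ssize_t n;

        while ((n = read(fd, buf, sizeof buf)) > 0)
            if (write(1, buf, n) != n)
                return -1;
        return n == 0 ? 0 : -1;
    }

    int
    main(int argc, char *argv[])
    {
        int i, fd, status = 0;

        if (argc == 1)
            return copy(0) < 0;
        for (i = 1; i < argc; i++) {
            if ((fd = open(argv[i], O_RDONLY)) < 0) {
                status = 1;
                continue;
            }
            if (copy(fd) < 0)
                status = 1;
            close(fd);
        }
        return status;
    }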
Sad to say Bell Labs's cat door was hard to open and most of the world
still has a cat with flags. And buffers.
-rob
On Mon, May 13, 2024 at 11:35 PM Douglas McIlroy
<douglas.mcilroy@dartmouth.edu> wrote:
So fork() is a significant nuisance. How about the far more ubiquitous
problem of IO buffering?
On Sun, May 12, 2024 at 12:34:20PM -0700, Adam Thornton wrote:
But it does come down to the same argument as
https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.…
The Microsoft manifesto says that fork() is an evil hack. One of the cited
evils is that one must remember to flush output buffers before forking, for
fear it will be emitted twice. But buffering is the culprit, not the
victim. Output buffers must be flushed for many other reasons: to avoid
deadlock; to force prompt delivery of urgent output; to keep output from
being lost in case of a subsequent failure. Input buffers can also steal
data by reading ahead into stuff that should go to another consumer. In all
these cases buffering can break compositionality. Yet the manifesto blames
an instance of the hazard on fork()!
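For concreteness, the cited hazard in a dozen lines of C (illustrative
only; run it with stdout redirected to a file or pipe, so that stdio
fully buffers it, and the line comes out twice):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int
    main(void)
    {
        printf("buffered before fork\n"); /* sits in the stdio buffer */
        /* fflush(stdout); */             /* the remedy the manifesto has in mind */
        if (fork() == 0)
            exit(0);   /* child flushes its inherited copy of the buffer */
        wait(NULL);
        return 0;      /* parent flushes its own copy: the line appears twice */
    }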
To assure compositionality, one must flush output buffers at every
possible point where an unknown downstream consumer might correctly act on
the received data with observable results. And input buffering must never
ingest data that the program will not eventually use. These are tough
criteria to meet in general without sacrificing buffering.
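In stdio terms the output half of that criterion looks roughly like this
(a sketch; the "ready"/"ack" exchange is invented for illustration):

    #include <stdio.h>

    int
    main(void)
    {
        char line[1024];

        printf("ready\n");
        fflush(stdout);      /* a consumer may act on this, so it must leave our buffer now */
        while (fgets(line, sizeof line, stdin) != NULL) {
            printf("ack %s", line);
            fflush(stdout);  /* same reasoning after every complete reply */
        }
        return 0;
    }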
The advent of pipes vividly exposed the non-compositionality of output
buffering. Interactive pipelines froze: the output a user needed to see sat
unflushed in a buffer, waiting to be forced out by further input that the
user could not supply until that very output had appeared. This phenomenon
motivated cat -u, and stdio's convention of
line buffering for stdout. The premier example of input buffering eating
other programs' data was mitigated by "here documents" in the Bourne
shell.
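Note that stdio's convention covers only the case where stdout is a
terminal; a filter writing into a pipe is still fully buffered unless it
opts out itself. A sketch (using _IONBF instead would mirror cat -u,
which forgoes output buffering entirely):

    #include <stdio.h>

    int
    main(void)
    {
        int c;

        setvbuf(stdout, NULL, _IOLBF, 0); /* flush at each newline, even into a pipe */
        while ((c = getchar()) != EOF)
            putchar(c);
        return 0;
    }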
These precautions are mere fig leaves that conceal important special
cases. The underlying evil of buffered IO still lurks. The justification is
that it's necessary to match the characteristics of IO devices and to
minimize system-call overhead. The former necessity requires the attention
of hardware designers, but the latter is in the hands of programmers. What
can be done to mitigate the pain of border-crossing into the kernel? L4 and
its ilk have taken a whack. An even more radical approach might flow from
the "whitepaper" at
www.codevalley.com.
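For scale, here is the trade being defended, in illustrative C: copying a
megabyte one byte at a time costs on the order of a million kernel
crossings in each direction, while the same copy through an 8 KiB buffer
costs a few hundred in total.

    #include <unistd.h>

    /* One kernel crossing per byte: the cost buffering exists to avoid. */
    void
    copy_bytewise(int in, int out)
    {
        char c;

        while (read(in, &c, 1) == 1)
            write(out, &c, 1);
    }

    /* One crossing per 8 KiB block: the same work, far fewer border crossings. */
    void
    copy_buffered(int in, int out)
    {
        char buf[8192];
        ssize_t n;

        while ((n = read(in, buf, sizeof buf)) > 0)
            write(out, buf, n);
    }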
In any event, the abolition of buffering is a grand challenge.
Doug