[TUHS] Re: If forking is bad, how about buffering?

19 May 2024

Yes, many classic commands -- cat, cp, and others -- were sleekly and
succinctly written.
In part because they were devoid of error checking.
I recall how annoying it was one time in the early 70s to cp a bunch of
files to a file system that was out of space.
As I grew older, my concept of what constituted elegant programming changed.
UNIX was a *research* project, not a production system!
At one of the first UNIX meetings, somebody from an OSS (operations support
system) was talking about the limitations of UNIX when Doug asked, "Why are
you using UNIX?"
Marc
On Sun, May 19, 2024, 5:54 AM Andrew Warkentin <andreww591(a)gmail.com> wrote:
...
  On Sat, May 18, 2024 at 8:03 PM Larry McVoy <lm(a)mcvoy.com> wrote:

 On Sat, May 18, 2024 at 06:40:42PM -0700, Bakul Shah wrote:
  On May 18, 2024, at 6:21 PM, Larry McVoy <lm(a)mcvoy.com> wrote:

 On Sat, May 18, 2024 at 06:04:23PM -0700, Bakul Shah via TUHS wrote:
> [1] This brings up a separate point: in a microkernel even a simple
> thing like "foo | bar" would require a third process - a "pipe
> service", to buffer up the output of foo! You may have reduced
> the overhead of individual syscalls but you will have more of
> cross-domain calls!
 Do any micro kernels do address space to address space bcopy()? 
 mmapping the same page in two processes won't be hard but now
 you have complicated cat (or some iolib)! 
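
To make the "foo | bar" point concrete, here is a minimal sketch of a
user-space "pipe service": a third process that sits between a producer and
a consumer and relays the data. It is only an illustration under stated
assumptions. The function name relay_pipeline is made up, and ordinary kernel
pipes stand in for whatever IPC channels a microkernel would actually
provide. What it shows is that every chunk now crosses two process
boundaries instead of one.

```python
import os


def relay_pipeline(chunks):
    """Run foo -> pipe service -> bar and return what bar received."""
    a_r, a_w = os.pipe()   # foo -> service (stand-in for an IPC channel)
    b_r, b_w = os.pipe()   # service -> bar (stand-in for an IPC channel)

    foo = os.fork()
    if foo == 0:
        # Producer: only needs the write end of channel A.
        os.close(a_r); os.close(b_r); os.close(b_w)
        for chunk in chunks:
            os.write(a_w, chunk)
        os.close(a_w)
        os._exit(0)

    service = os.fork()
    if service == 0:
        # The "pipe service": read from foo, forward to bar.
        os.close(a_w); os.close(b_r)
        while True:
            data = os.read(a_r, 4096)
            if not data:            # EOF: every writer of channel A is gone
                break
            while data:             # os.write may write only part of the data
                written = os.write(b_w, data)
                data = data[written:]
        os.close(a_r); os.close(b_w)
        os._exit(0)

    # The parent plays the role of bar: it only reads channel B.
    os.close(a_r); os.close(a_w); os.close(b_w)
    received = bytearray()
    while True:
        data = os.read(b_r, 4096)
        if not data:
            break
        received += data
    os.close(b_r)
    os.waitpid(foo, 0)
    os.waitpid(service, 0)
    return bytes(received)
```

Calling relay_pipeline([b"hello ", b"world\n"]) moves each chunk through two
read/write pairs; with a kernel pipe the same data would need only one.
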
 I recall asking Linus if that could be done to save TLB entries, as in
 multiple processes map a portion of their address space (at the same
 virtual location) and then they all use the same TLB entries for that
 part of their address space.  He said it couldn't be done because the
 process ID concept was hard wired into the TLB.  I don't know if TLB
 tech has evolved such that a single process could have multiple "process"
 IDs associated with it in the TLB.
 I wanted it because if you could share part of your address space with
 another process, using the same TLB entries, then motivation for threads
 could go away (I've never been a threads fan but I acknowledge why
 you might need them).  I was channeling Rob's "If you think you need
 threads, your processes are too fat".
 The idea of using processes instead of threads falls down when you
 consider TLB usage.  And TLB usage, when you care about performance, is
 an issue.  I could craft you some realistic benchmarks, mirroring real
 world work loads, that would kill the idea of replacing threads with
 processes unless they shared TLB entries.  Think of an N-way threaded
 application, lots of address space used, that application uses all of the
 TLB.  Now do that with N processes and your TLB is N times less effective.
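
The "N times less effective" arithmetic can be written down as a deliberately
simplified model. It assumes that all workers touch the same set of pages,
that they compete for one TLB, and that TLB entries are tagged per address
space, so processes cannot reuse each other's entries. The function name and
the numbers in the usage note are illustrative assumptions, not measurements.

```python
def tlb_reach_pages(tlb_entries, workers, share_entries):
    """Distinct pages one TLB can cover for a group of cooperating workers.

    share_entries=True models threads (one address space, so one entry per
    page serves every worker). share_entries=False models separate processes
    whose entries are tagged per address space, so each worker needs its own
    entry for the same page.
    """
    if share_entries:
        return tlb_entries
    return tlb_entries // workers
```

Under these assumptions, a 1024-entry TLB covers 1024 pages for 8 threads,
but only 1024 // 8 = 128 pages when the same work is split across 8
processes.
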

 This was a conversation decades ago so maybe TLB tech now has solved this.
 I doubt it, if this was a solved problem I think every OS would say screw
 threads, just use processes and mmap().  The nice part of that model
 is you can choose what parts of your address space you want to share.
 That cuts out a HUGE swath of potential problems where another thread
 can go poke in a part of your address space that you don't want poked.
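
A minimal sketch of the "processes and mmap()" model described above: one
anonymous shared mapping is visible to both parent and child, while
everything else stays private to each process after fork(). The function
name is made up for illustration, and this assumes a Unix system, where
Python's mmap.mmap(-1, n) creates an anonymous MAP_SHARED mapping by default.

```python
import mmap
import os
import struct


def demo_selective_sharing():
    """Return (shared_value, private_value) as seen by the parent."""
    # Explicitly shared region: inherited by the child and backed by the
    # same physical pages in both processes.
    shared = mmap.mmap(-1, mmap.PAGESIZE)
    # Ordinary heap data: copied (copy-on-write) at fork, so private.
    private = [0]

    pid = os.fork()
    if pid == 0:
        struct.pack_into("i", shared, 0, 42)   # visible to the parent
        private[0] = 99                         # NOT visible to the parent
        os._exit(0)

    os.waitpid(pid, 0)
    shared_value = struct.unpack_from("i", shared, 0)[0]
    shared.close()
    return shared_value, private[0]
```

The parent sees the child's write to the shared page but not its write to
the private list, which is the isolation that threads do not give you.
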

 I've never been a fan of the rfork()/clone() model. With the OS I'm
 working on, rather than using processes that share state as threads, a
 process will more or less just be a collection of threads that share a
 command line and get replaced on exec(). All of the state usually
 associated with a process (e.g. file descriptor space, filesystem
 namespace, virtual address space, memory allocations) will instead be
 stored in separate container objects that can be shared between
 threads. It will be possible to share any of these containers between
 processes, or use different combinations between threads within a
 process. This would allow more control over what gets shared between
 threads/processes than rfork()/clone() because the state containers
 will appear in the filesystem and be explicitly bound to threads
 rather than being anonymous and only transferred on rfork()/clone().
 Emulating rfork()/clone() on top of this will be easy enough though.
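
The container idea can be sketched as a small data model. Everything here is
hypothetical: the class names, the "kinds" of state, and the paths are
invented for illustration and are not taken from the OS being described. The
sketch only shows threads explicitly bound to named state containers, and an
rfork()/clone()-style operation expressed as "share these bindings, copy the
rest".

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Container:
    """Hypothetical state container (fd space, namespace, address space...)."""
    path: str                                   # made-up filesystem name
    state: dict = field(default_factory=dict)   # stand-in for the real state


@dataclass
class Thread:
    """A thread is a set of explicit bindings: kind -> Container."""
    bindings: dict


def emulated_rfork(parent, share):
    """Sketch of rfork()/clone() emulated on top of containers.

    Kinds listed in `share` stay bound to the parent's container;
    every other kind gets a private copy of that container.
    """
    child = {}
    for kind, box in parent.bindings.items():
        if kind in share:
            child[kind] = box
        else:
            child[kind] = Container(box.path + ".copy",
                                    copy.deepcopy(box.state))
    return Thread(child)
```

For example, emulated_rfork(parent, share={"vm"}) would yield a thread that
shares the parent's address-space container but has its own copy of the file
descriptor container, roughly what a thread-like clone() with a private fd
table gives today.
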
