On 2020-Jun-01 07:58:02 -0700, Larry McVoy <lm@mcvoy.com> wrote:
>On Mon, Jun 01, 2020 at 01:32:56PM +1000, Dave Horsfall wrote:
>> On Mon, 1 Jun 2020, Rob Pike wrote:
>>
>> > I'm not quite sure why the Research lineage did not include
>> > non-blocking behaviour, especially in view of the man page comments.
>> > Maybe it was seen as against the Unix philosophy, with select()
>> > offering sufficient mechanism to avoid blocking (with open() the hard
>> > corner case)?
>> >
>> >That's it. Select was good enough for our purposes.
>>
>> After being dragged through both Berserkley and SysVile, I never did get the
>> hang of poll()/select() etc,,,
>
>I'm sure you could, select is super handy, think a network server like
>apache.
My view may be unpopular but I've always been disappointed that Unix
implemented blocking I/O only and then had to add various hacks to cover
up for the lack of asynchronous I/O. It's trivial to build blocking I/O
operations on top of asynchronous I/O operations. It's impossible to do
the opposite without additional functionality.
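
To make the layering concrete, here is a minimal sketch in Python rather than anything from a real kernel. The names async_read, blocking_read and _pool are made up for illustration. Since the underlying os.read() is itself blocking, the asynchronous version has to borrow a thread pool, which is exactly the sort of "additional functionality" meant above; going the other way is a one-liner.

```python
import os
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical helper pool: the extra machinery needed to get asynchronous
# behaviour out of a read() that can only block.
_pool = ThreadPoolExecutor(max_workers=4)


def async_read(fd: int, nbytes: int) -> Future:
    """Start a read and return immediately with a Future for the result."""
    return _pool.submit(os.read, fd, nbytes)


def blocking_read(fd: int, nbytes: int) -> bytes:
    """Blocking read built on the asynchronous one: submit, then wait."""
    return async_read(fd, nbytes).result()


if __name__ == "__main__":
    r, w = os.pipe()
    pending = async_read(r, 5)   # returns at once; the pipe is still empty
    os.write(w, b"hello")        # the caller was free to do other work
    print(pending.result())
    os.write(w, b"world")
    print(blocking_read(r, 5))
    os.close(r)
    os.close(w)
```

The blocking wrapper needs nothing beyond the asynchronous primitive itself, while the asynchronous primitive needed a whole pool of helper threads.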
I also found it disappointing that poll()/select() only worked on TTY and
network operations. HDDs are really slow compared to CPUs and it would be
really nice if a process could go and do something else whilst waiting for
a file to open.
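
For what it's worth, select() will accept a descriptor for a regular file; it just always reports it ready, so it tells you nothing about whether the read will have to wait for the disk. A small sketch, assuming a POSIX system (ready_without_waiting is a name invented for this example):

```python
import os
import select
import tempfile


def ready_without_waiting(fd: int) -> bool:
    """Return True if select() reports fd readable with a zero timeout."""
    readable, _, _ = select.select([fd], [], [], 0)
    return fd in readable


if __name__ == "__main__":
    # An empty pipe is not readable until somebody writes to it.
    r, w = os.pipe()
    print("empty pipe ready:", ready_without_waiting(r))
    os.write(w, b"x")
    print("pipe with data ready:", ready_without_waiting(r))
    os.close(r)
    os.close(w)

    # A regular file is reported ready whether or not its data is cached,
    # so select() cannot be used to avoid blocking on the disk.
    with tempfile.TemporaryFile() as f:
        print("regular file ready:", ready_without_waiting(f.fileno()))
```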
Lest anybody think this is a theoretical concern, Netflix has spent quite a bit of effort reducing the sources of latency in our system. The latency for open doesn't happen often, due to caching, but when it does it causes a hiccup for an nginx worker thread (since open is blocking). If you get enough hiccups, you wind up consuming all your worker threads, and latency for everybody suffers while waiting for these to complete (think of a flaky disk that suddenly takes a really long time for each of its I/Os, for one example).

We've hacked FreeBSD in various ways to reduce or eliminate this delay.... In FreeBSD we have to translate from the pathname to a vnode, and to do that we have to look up directories, indirect block tables, etc. All of these are surprising places where one could bottleneck... And if you try to access the vnode before all this is done, you'll wait for it (though in the case of sendfile it doesn't matter, since that's async and only affects the one I/O)...
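
The worker stall is easy to reproduce in miniature. This is only a sketch using Python's asyncio, not how nginx or FreeBSD handle it: slow_open is a hypothetical stand-in that sleeps to imitate an open() that misses every cache, and heartbeat stands in for the other requests the worker ought to keep serving. Calling the open inline freezes the event loop; handing it to a helper thread lets the loop keep ticking.

```python
import asyncio
import os
import tempfile
import time


def slow_open(path: str, delay: float) -> int:
    """Hypothetical stand-in for an open() that misses every cache."""
    time.sleep(delay)  # pretend the disk is having a bad day
    return os.open(path, os.O_RDONLY)


async def heartbeat(ticks: list, interval: float) -> None:
    """Represents the other requests this worker should keep serving."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def serve(path: str, delay: float, offload: bool) -> int:
    """Open path once and return how many heartbeats ran in the meantime."""
    ticks: list = []
    beat = asyncio.create_task(heartbeat(ticks, delay / 10))
    await asyncio.sleep(0)  # let the heartbeat run its first iteration
    if offload:
        loop = asyncio.get_running_loop()
        fd = await loop.run_in_executor(None, slow_open, path, delay)
    else:
        fd = slow_open(path, delay)  # stalls the whole event loop
    os.close(fd)
    beat.cancel()
    return len(ticks)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        for offload in (False, True):
            n = asyncio.run(serve(f.name, 0.2, offload))
            print(f"offload={offload}: {n} heartbeat(s) during the open")
```

Offloading only moves the wait onto another thread; a flaky disk can still exhaust the helper pool the same way it exhausts worker threads.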
So I could see how having an async open could introduce a lot of hair into the mix, depending on how you do it. Without a robust callback/AST mechanism, my brain is recoiling from the EALREADY errors in sockets for things that are already in progress... reads and writes are easy by comparison :) The kicker is that all of the kernel is callback driven. The upper half queues the request and then sleeps until the lower half signals it to wake up. And that signal is often just a wakeup done from the completion routine in the original request. All of that would be useful in userland for high-volume activity, but none of it is exposed...
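
A toy model of that queue/sleep/wakeup shape, in Python with threads standing in for the two halves. Nothing here is FreeBSD code; Request, lower_half, submit and submit_and_wait are all invented names, and error handling is left out. The point is that the blocking call is just the callback interface with a completion routine that does nothing but a wakeup, and the callback interface is the part userland never gets to see.

```python
import queue
import threading


class Request:
    """One queued operation plus the completion routine to run when done."""

    def __init__(self, func, args, on_complete):
        self.func = func
        self.args = args
        self.on_complete = on_complete


_requests = queue.Queue()


def lower_half() -> None:
    """Service queued requests and call each one's completion routine."""
    while True:
        req = _requests.get()
        req.on_complete(req.func(*req.args))


def submit(func, args, on_complete) -> None:
    """Callback interface: queue the request and return immediately."""
    _requests.put(Request(func, args, on_complete))


def submit_and_wait(func, *args):
    """Blocking interface: the completion routine is just a wakeup."""
    done = threading.Event()
    result = []

    def wakeup(value):
        result.append(value)
        done.set()

    submit(func, args, wakeup)
    done.wait()  # the "upper half" sleeps here until woken
    return result[0]


# Daemon thread, so it does not keep the process alive at exit.
threading.Thread(target=lower_half, daemon=True).start()

if __name__ == "__main__":
    print(submit_and_wait(pow, 2, 10))

    collected = []
    finished = threading.Event()
    submit(len, ("abc",), lambda v: (collected.append(v), finished.set()))
    finished.wait()
    print(collected)
```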
Warner