[TUHS] signals and blocked in I/O
lm at mcvoy.com
Sat Dec 2 09:09:34 AEST 2017
On Fri, Dec 01, 2017 at 11:03:02PM +0000, Ralph Corderoy wrote:
> Hi Larry,
> > > So OOM code kills a (random) process in hopes of freeing up some
> > > pages but if this process is stuck in disk I/O, nothing can be freed
> > > and everything grinds to a halt.
> > Yep, exactly.
> Is that because the pages have been dirty for so long they've reached
> the VM-writeback timeout even though there's no pressure to use them for
> something else? Or has that been lengthened because you don't fear
> power loss wiping volatile RAM?
I'm tinkering with the pageout daemon, so I'm trying to apply memory
pressure. I have 10 processes, each with 25GB malloced, and they just
walk their memory over and over. This is on a machine with 256GB of main
memory (2-socket Haswell, 28 CPUs, 28 1TB SSDs, on loan from Netflix).
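For anyone who wants to generate similar pressure, here is a minimal
sketch of one such walker. It is not the actual test program: the names
touch_pages and walk are made up for illustration, it uses an anonymous
mmap in place of malloc, and it assumes the walk writes to each page
(the description above does not say whether it reads or writes). Ten
walkers at 25GB each come to 250GB of anonymous memory on a 256GB
machine, which leaves very little for everything else.

```python
import mmap


def touch_pages(buf, page_size=mmap.PAGESIZE):
    """Write one byte in every page of buf; return the number of pages touched."""
    touched = 0
    for offset in range(0, len(buf), page_size):
        # Assumption: we write rather than read. A written page is dirty,
        # so the pager cannot just drop it; it must go to backing store
        # before the frame can be reused.
        buf[offset] = (buf[offset] + 1) & 0xFF
        touched += 1
    return touched


def walk(size_bytes, passes=1):
    """Map size_bytes of anonymous memory and walk it `passes` times.

    Returns the total number of page touches across all passes.
    """
    buf = mmap.mmap(-1, size_bytes)  # anonymous, zero-filled mapping
    try:
        total = 0
        for _ in range(passes):
            total += touch_pages(buf)
        return total
    finally:
        buf.close()
```

Calling walk(25 * 2**30, passes=1000) in each of 10 separate processes
would approximate the setup described; start with a much smaller size
on a machine you care about.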
It's the old "10 pounds of shit in a 5 pound bag" problem, same old stuff,
just a bigger bag.
The problem is that OOM can't kill the processes that are causing the
trouble, because they are stuck in disk wait. That's why I started asking
why you can't kill a process that's in the middle of I/O.