On Fri, Mar 05, 2021 at 01:50:32AM -0800, John Gilmore wrote:
John P. Linderman <jpl.jpl(a)gmail.com> wrote:
I have several 12 TB disks scattered about my
house. 5% of 12TB is 600GB.
At one point in history, ext2 performance was reported to suffer badly
if there was less than 5% of disk space available in an active
filesystem. My naive belief, probably informed by older and wiser heads
around Sun, was that when the file system was >95% full, ext2 spent a
lot of time seeking around in free lists finding single allocatable
blocks. And there were no built-in "defragmentation" programs that
could easily fix that.
I'll point out that BSD FFS, at least in BSD 4.3, reserves 10% of the
file system for reserved blocks. The reasoning given was the same as
with ext2: performance suffers when the file system gets really
full. It really depends on the workload, but for the worst
case workloads, I've seen performance degradations happen earlier than
90% or 95% full.
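For what it's worth, that reserved-space knob survives in ext2/3/4:
mke2fs reserves 5% for root by default, and tune2fs -m adjusts it
after the fact. A quick sketch against a scratch image file (no root
needed; the path and sizes are just for illustration):

```shell
# Create a 64 MB scratch image and format it as ext4.
# -F forces mkfs to run on a regular file (not a block device);
# -m 10 reserves 10% of the blocks, mimicking the old BSD FFS
# minfree default.
truncate -s 64M /tmp/scratch.img
mkfs.ext4 -q -F -m 10 /tmp/scratch.img

# Drop the reservation to the modern 5% default with tune2fs.
tune2fs -m 5 /tmp/scratch.img

# Show the resulting reservation.
tune2fs -l /tmp/scratch.img | grep -i 'reserved block count'
```

The reservation only keeps unprivileged writes away from the last few
percent; it doesn't defragment anything, it just tries to keep the
allocator out of the regime where performance falls off.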
Defragmentation is a hard problem, and it takes a *long* time, and/or
chews up a lot of write cycles on flash, persistent memory, etc. This
is especially true when the file system is very full, since there is
very little "swing space" available to copy data around, especially if
one is striving for the perfect state (e.g., all files are contiguous,
and the free space is contiguous). Defragmentation utilities were
something that was only really popular on Windows systems, and in the
last, oh, ten years or so, even in the Windows world defrag is not
something you'll see in their disk utilities.
Ext4 has a defrag program, but it's really primitive, and it only
works by attempting to defrag *files*. It doesn't try to defrag *free
space*, which is what you really need if you want to try to keep
performance up (and have space so as to keep files defragmented). The
main reason is that no one has sponsored work in this space ---
probably because it's way cheaper just to over-provision disk space.
If someone wants to try to implement it, I have a rough design about
how it might be done for ext4 --- but someone has to volunteer $$$ or
their own personal time in order to implement it. And if it's only
for yourself, it's probably cheaper just to spend an extra few bucks to
buy a bigger disk. I suspect it's for similar reasons that none of
the legacy Unix systems have defragmentation implemented either. It's
not something which makes a business case.
But hey, if someone is interested in working on it, I'll give the
standard open source answer --- patches gratefully accepted. I'll
even give you a rough design as a starting point. (And there's no
guarantee it would be a good idea for flash, given that seeks are
mostly free on flash and write cycles are a consideration; and how
many people are still using HDDs today? Newer operating systems, such
as Fuchsia, have started designing for a world where they only need to
care about file systems for flash.)
Cheers,
- Ted