On Mon, Aug 17, 2020 at 4:16 PM Dibyendu Majumdar <mobile(a)majumdar.org.uk>
wrote:
On Mon, 17 Aug 2020 at 17:13, Dan Cross
<crossd(a)gmail.com> wrote:
From my light skimming of V10 sources, it appears
that the various
components of the default C compiler (that is, not LCC) either
use
malloc/free or call `sbrk` directly.
Yes, it only uses sbrk().
No, that's not true; or at least it's only partially true. Note that I'm
talking specifically about 10th Edition here.
The compiler is implemented as several cooperating processes, some of which
employ different memory management strategies. Notice, for example, that
`ccom` uses `malloc` but not `sbrk`, while `c2` does the opposite and uses
`sbrk` but not `malloc`. On the other hand, `cpp` does no dynamic memory
allocation and instead has fixed-size symbol tables for preprocessor
definitions etc. Finally, the assembler uses `sbrk` but `ld` uses `malloc`.
One consequence I think is that sbrk()
expands the process memory without invalidating existing use of memory
- so the code is able to periodically expand heap while retaining all
existing allocations.
A simple workaround I used was to preallocate a heap and just stub out
sbrk() calls - so that works. So in essence given a single chunk of
memory (if large enough - which is still quite small by today's
standards) the compiler manages fine.
However I find this unsatisfactory and would like to improve it. But
it is a bit difficult to understand how the memory is being used.
So in a nutshell, `sbrk` works by setting the "break" pointer: that is, the
highest usable virtual address in the process's data segment. Stacks may be
at the top of the user portion of the address space; but I'd have to double
check the details. When the process starts up and gets going, one can
discover the initial value of that pointer by executing `sbrk(0)`: since
`sbrk` takes a delta and returns the old value of the break, the return
value will be (basically) the current end of the data segment.
By manipulating the break pointer via `sbrk`, one can cause the kernel to
allocate memory to a process. That memory can then be managed internally
(that is, within the process) via e.g. malloc _or_ a hand-rolled allocator
(e.g., a "bump" allocator): when you want to put something in memory, you
know where the next available pointer is (after the last one you
allocated), you know how much space is available on the heap (since you
allocated it using `sbrk`) and you presumably know how big the thing you
want to store is; if the current pointer + alignment + size of next object
is > break, you allocate more memory using `sbrk` and continue.
There are a couple of approaches you can take to modernize this. The first
you've already discovered: preallocate a heap that you manage with a stub
`sbrk` and be done with it. Another might be to try and find some otherwise
unallocated region of the address space and using `mmap()` to allocate
successive pages to that portion of the address space to simulate `sbrk`;
this may fail if you run into something that's already allocated in one of
those regions.
The most robust way might be the hardest, which is to replace the
`sbrk`-based allocator with calls to e.g. `malloc`. However, looking a
little more closely at 10th Ed's `c2`, for example`, this doesn't seem like
it would be all that bad: `sbrk` is mostly wrapped by a function called
`alloc` that takes an integer number of bytes to allocate and returns a
`struct node *`; the return value is cast to whatever type is needed
though. The use of `struct node *` as the return value is almost certainly
because the compiler was written before `void *` had achieved primacy in
the ANSI C world.
Anyway, I suspect if you simply replaced the guts of `alloc` with `return
malloc(an);` you'd get approximately what you want.
Memory can be used for declarations, trees (for expressions) and
strings as far as I can tell. Strings actually use the
tree
allocation, and just pretend that a node is a string.
Not exactly. The return type of `alloc` is `struct node *`, but that's just
a detail; in the dialect of C that e.g. 10th Ed's PCC is written in, the
return type doesn't matter: it's just a pointer. You can cast it to `char
*` and copy string data there if you want to. I don't know why `struct node
*` was chosen for `alloc`, but I suspect simply convenience.
It seems that tree memory is allocated in a stack discipline. But what
puzzled me is that when a tree starts, about 512 bytes
of memory are
left as gap for declarations to use. I have been trying to think in
what circumstances would you encounter a declaration while parsing an
expression - perhaps cast expressions? Anyway - if a declaration
occurs inside an expression (i.e. tree) then it only has 512 bytes
available. Of course this could be made bigger ... but at the least I
would like to have separate heaps for declarations, trees and strings.
This sounds like it's something independent of the allocator and more about
the program's strategy around allocation in general. It's a common trick in
C programs to e.g., allocate the space for some structs plus things that
may be pointed to by those structs in a single allocation; not only does
this cut down on overhead with respect to calling into the allocator, it
also gives you locality.
I don't think you want separate _heaps_ for the things you mention, though
you may want to allocate them separately using independent calls to e.g.
`malloc`. That's fine.
I guess no one really dug into this code - as presumably once PCC came
along the original compiler by Dennis stopped being
used.
10th Edition `cc` _is_ `pcc`. I haven't looked at the 7th Ed compiler in
detail.
- Dan C.