On Mon, 17 Aug 2020 at 17:13, Dan Cross <crossd@gmail.com> wrote:
> From my light skimming of V10 sources, it appears that the various components of the default C compiler (that is, not LCC) either use malloc/free or call `sbrk` directly.
Yes, it only uses sbrk().
No, that's not true; or at least it's only partially true. Note that I'm talking specifically about 10th Edition here.
The compiler is implemented as several cooperating processes, some of which employ different memory management strategies. Notice, for example, that `ccom` uses `malloc` but not `sbrk`, while `c2` does the opposite and uses `sbrk` but not `malloc`. On the other hand, `cpp` does no dynamic memory allocation and instead has fixed-size symbol tables for preprocessor definitions etc. Finally, the assembler uses `sbrk` but `ld` uses `malloc`.
One consequence I think is that sbrk()
expands the process memory without invalidating existing use of memory
- so the code is able to periodically expand heap while retaining all
existing allocations.
A simple workaround I used was to preallocate a heap and just stub out
sbrk() calls - so that works. So in essence given a single chunk of
memory (if large enough - which is still quite small by today's
standards) the compiler manages fine.
However I find this unsatisfactory and would like to improve it. But
it is a bit difficult to understand how the memory is being used.
So in a nutshell, `sbrk` works by setting the "break" pointer: that is, the highest usable virtual address in the process's data segment. Stacks may be at the top of the user portion of the address space; but I'd have to double check the details. When the process starts up and gets going, one can discover the initial value of that pointer by executing `sbrk(0)`: since `sbrk` takes a delta and returns the old value of the break, the return value will be (basically) the current end of the data segment.
By manipulating the break pointer via `sbrk`, one can cause the kernel to allocate memory to a process. That memory can then be managed internally (that is, within the process) via e.g. malloc _or_ a hand-rolled allocator (e.g., a "bump" allocator): when you want to put something in memory, you know where the next available pointer is (after the last one you allocated), you know how much space is available on the heap (since you allocated it using `sbrk`) and you presumably know how big the thing you want to store is; if the current pointer + alignment
+ size of next object is > break, you allocate more memory using `sbrk` and continue.
There are a couple of approaches you can take to modernize this. The first you've already discovered: preallocate a heap that you manage with a stub `sbrk` and be done with it. Another might be to try and find some otherwise unallocated region of the address space and using `mmap()` to allocate successive pages to that portion of the address space to simulate `sbrk`; this may fail if you run into something that's already allocated in one of those regions.
The most robust way might be the hardest, which is to replace the `sbrk`-based allocator with calls to e.g. `malloc`. However, looking a little more closely at 10th Ed's `c2`, for example`, this doesn't seem like it would be all that bad: `sbrk` is mostly wrapped by a function called `alloc` that takes an integer number of bytes to allocate and returns a `struct node *`; the return value is cast to whatever type is needed though. The use of `struct node *` as the return value is almost certainly because the compiler was written before `void *` had achieved primacy in the ANSI C world.
Anyway, I suspect if you simply replaced the guts of `alloc` with `return malloc(an);` you'd get approximately what you want.
Memory can be used for declarations, trees (for expressions) and
strings as far as I can tell. Strings actually use the tree
allocation, and just pretend that a node is a string.
Not exactly. The return type of `alloc` is `struct node *`, but that's just a detail; in the dialect of C that e.g. 10th Ed's PCC is written in, the return type doesn't matter: it's just a pointer. You can cast it to `char *` and copy string data there if you want to. I don't know why `struct node *` was chosen for `alloc`, but I suspect simply convenience.
It seems that tree memory is allocated in a stack discipline. But what
puzzled me is that when a tree starts, about 512 bytes of memory are
left as gap for declarations to use. I have been trying to think in
what circumstances would you encounter a declaration while parsing an
expression - perhaps cast expressions? Anyway - if a declaration
occurs inside an expression (i.e. tree) then it only has 512 bytes
available. Of course this could be made bigger ... but at the least I
would like to have separate heaps for declarations, trees and strings.
This sounds like it's something independent of the allocator and more about the program's strategy around allocation in general. It's a common trick in C programs to e.g., allocate the space for some structs plus things that may be pointed to by those structs in a single allocation; not only does this cut down on overhead with respect to calling into the allocator, it also gives you locality.
I don't think you want separate _heaps_ for the things you mention, though you may want to allocate them separately using independent calls to e.g. `malloc`. That's fine.
I guess no one really dug into this code - as presumably once PCC came
along the original compiler by Dennis stopped being used.
10th Edition `cc` _is_ `pcc`. I haven't looked at the 7th Ed compiler in detail.
- Dan C.