> I've been experimenting with writing a garbage collector lately. One of the problems I faced was how to walk the stack, heap, and data section for all the (potential) pointers. The solution that I came up with probably only works on 32-bit x86 Linux, and even then it may be making some bad assumptions about memory alignment, among other things. Still, this may be useful for other curious minds out there.
>
> First there are `etext`, `edata` and `end`. Those three symbols are not normally defined, so you can define variables with those names if you want. But if you only declare them:
>
>     extern unsigned int etext, edata, end;
>
> `ld` (or maybe something else, I can't find where I originally read this) will define those symbols if they don't resolve to anything else. Their values are meaningless; only their addresses are useful. `&etext` is the first address past the end of the text section, `&edata` is the first address past the end of the initialized data section, and `&end` is the first address past the end of the uninitialized data (bss) section, with the `sbrk()` heap starting at or after it. The type that you declare `etext` et al. with is probably unimportant.
>
> For memory allocators that use `sbrk()`, you can also use `sbrk(0)` to find the end of the heap. I've read that glibc uses `mmap()` for `malloc()`s that are over a certain size, so relying on `sbrk(0)` may break things if the memory actually comes from `malloc()`.
>
> The stack is a little bit trickier; everywhere I've looked suggests that there is no portable way to walk the stack. Here's the code snippet that I am using:
>
>     /* gulong is GLib's unsigned long */
>     register gulong *fp __asm ("%ebp");
>
>     /* skip the current frame because I don't want to walk the current
>        function */
>     gulong *ptr = (gulong *)*fp;
>     do {
>         gulong *saved = ptr + 4;
>         ptr = (gulong *)*ptr;
>
>         if (ptr == NULL) {
>             break;
>         }
>
>         // do something from "saved" to "ptr"
>     } while (1);
>
> As you can see, I first define a variable for `%ebp`, which is the x86 way of saying `$fp`. `0(%ebp)` contains the address of the previous frame pointer, and the outermost one in the chain is `NULL`. Incrementing by 4 may be a bad assumption, though; there may be cases where 2 is more appropriate. Since `ptr` is a `gulong *`, `ptr + 4` skips four words (16 bytes), whereas the saved frame pointer and the return address take up only two.
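
To make the section boundaries above a bit more concrete, here is a minimal sketch of how a conservative collector might test whether a word could be a pointer into static data or the `sbrk()` heap. It assumes Linux with a linker that provides `etext`, `edata` and `end` as described in `man 3 end`. The helper names (`in_range`, `in_initialized_data`, `in_bss`, `maybe_in_brk_heap`) are made up for illustration, and the symbols are declared as `char` here purely for convenience.

```c
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* Provided by the linker on Linux; only their addresses are meaningful. */
extern char etext, edata, end;

/* Hypothetical helper: true if p lies in the half-open range [lo, hi). */
static bool in_range(const void *p, const void *lo, const void *hi)
{
    uintptr_t a = (uintptr_t)p;
    return a >= (uintptr_t)lo && a < (uintptr_t)hi;
}

/* Between the end of text and the end of initialized data.
   Assumption: read-only data may also sit in this range. */
static bool in_initialized_data(const void *p)
{
    return in_range(p, &etext, &edata);
}

/* Zero-initialized statics live between &edata and &end. */
static bool in_bss(const void *p)
{
    return in_range(p, &edata, &end);
}

/* Over-approximation: the brk heap may start somewhat past &end,
   and anything malloc() obtained via mmap() is not covered at all. */
static bool maybe_in_brk_heap(const void *p)
{
    return in_range(p, &end, sbrk(0));
}
```

A scanner could then treat a word `w` as a potential pointer when something like `in_bss((void *)w) || maybe_in_brk_heap((void *)w)` holds, accepting that some hits will be false positives.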