TWCOS Kernel

Slab
Login

Slab Allocator

Introduction

The slab allocator provides the following main features:

Object types

Custom object allocation is achieved by creating a type definition. The type definition includes:

Example:

static slab_type_t threads[1] = {SLAB_TYPE(sizeof(thread_t), thread_mark, thread_finalize)}; ... thread_t * thread = slab_calloc(threads);

Slabs

Slab types - slab_type_t

Objects have a type, which indicates the object size and any custom marking and finalization.

The type also keeps track of slabs that contain instances of this object type. The slabs are kept in a linked list, to be keep track of memory allocated to each type.

Generic objects

Generic allocation is done using a pool of sized types, with generic garbage marking and no finalizer. When allocating memory, the size of the allocation is matched up to the closest pool type, and the allocation completed using that pool type.

This is the only generic heap allocation, and thus limits heap allocated data to sizes that fit in a slab (which is limited to a single page.)

Thus, heap allocation is not currently suitable for arbitrarily large data, such as memory used for transfer buffers to/from block devices.

Slabs

Slabs are allocated on demand, when the type being allocated has no more free slabs. A slab is a single page of kernel heap data, and consists of:

Slots

Slots consist of:

Once the allocator has located a free slot, the object is removed from the free list, put onto the correct garbage collector list (currently white list, but could be grey in the case of concurrent GC), and a pointer to the object data returned as a result of the allocation.

Garbage collection

The garbage collector is currently a simple mark and sweep collector, with tri-colour marking of reachable heap objects from a set of root objects. The root objects comprise:

Objects are one of three colours:

The garbage collector will not visit black objects again, but a black object may be made grey again as a result of concurrent updates to the object, necessitating rescanning. This would be to support incremental or concurrent garbage collection, which are currently not supported.

The collector is a conservative collector, interpreting root objects as pointers, and marking any pointers that point into the heap space. The result being that a non-pointer value that happens to point into the heap will result in the heap spaces "pointed" to by that non-pointer being marked as reachable and not garbage. While this may result in some leakage, in a higher half kernel, it is unlikely that non-pointer values will appear to be valid pointers to heap area.

Garbage collector algorithm

When it is determined that garbage collection is required, garbage collection operates using the following algorithms.

Garbage collection start

The garbage collector starts by stopping the world (this is a no-op for single CPU system), doing a little housekeeping (getting a starting timestamp, clearing memory stats) then queuing the garbage collection global roots for scanning (global memory annotated with GCROOT).

Garbage collection marking

This is the meat of the garbage collection. While there is memory queued to be scanned (as done when the GCROOT data is queued) or there are grey objects to be scanned, we do a garbage collection "step".

The top level marking algorithm is thus:

void slab_gc() { while(gclevel || grey) { slab_gc_step(); } }

Garbage collection step

Each step of garbage collection comprises the steps:

Partitioning in this way means the queue of non-heap data to be scanned is bounded by the number of non-heap memory ranges a custom object marking routine may generate.

For the generic marking, which looks only for pointers into the heap, marking will only ever generate new grey objects, which are queued on the end of the grey list.

As an example of a custom marker generating non-heap data to be scanned, consider the thread structure marking, which will queue non-heap memory for the thread kernel stack for marking (from the logical top (stack pointer) to the logical bottom of the stack.)

Thus, arbitrarily nested and complex data can be safely scanned, without fear of overflowing the kernel stack (in the case of recursive marking) or exceeding the bounds of a scan queue (in the case of a single scan queue), both of which were strategies used to track objects to be scanned, and caused failure of the garbage collection, in previous versions.

Garbage collection end

Eventually, once all live data has been scanned, no more grey non-heap or grey objects will be generated, marking will be finished.

At this point, we'll have two lists of heap objects:

At this point, we need to stop the world again, and:

At this point, the mark phase of garbage collection is finished.

Garbage collection incremental sweep

The garbage heap data on the garbage list is by definition not referenced. We can therefore allow the kernel to continue its operation, and sweep the garbage list when required or convenient, thus reducing the synchronous impact of garbage collection.

Garbage is swept incrementally when:

Incremental sweep on allocation

When allocating a new object, we search for free objects in existing slabs. If there are no free objects, we invoke an incremental sweep until we sweep an object of the type we're looking for, then retry the allocation.

On the assumption that the garbage list will be an indeterminate order of objects, we will likely only sweep a subset of the garbage objects until we find garbage object of the type we're looking for.

Incremental sweep in idle

In the idle thread, we sweep all remaining garbage data. This sweeping of garbage in idle time, when there is no other useful work to do anyway, effectively makes the sweeping of garbage data free. Along with idle time garbage marking, garbage collection becomes a cheap option for automatic memory management.

Security

Importantly, user provided data would not be scanned for pointers by the garbage collector, so it is impossible for user space data to poison the garbage collector and enact a denial-of-service attack on kernel heap memory.