Project 3: Memory Allocator (v1.0)

For our third project, we will develop a custom memory allocator. As we discussed in class, malloc isn’t provided by the OS, nor is it a system call; it is a library function that uses system calls to allocate and deallocate memory. In most implementations of malloc, these system calls involve at least one of sbrk and mmap.

sbrk is the simpler of the two – give it a size and it will increase the program break (a.k.a. bound) by that amount. On the other hand, mmap gives us more control but requires more work to use. Some malloc implementations use sbrk for small allocations and mmap for larger ones.

Our allocator will use sbrk to request and release memory. To reduce the number of system calls it needs to make, each sbrk request will be a multiple of the system page size; this means that if a program executes malloc(1) for a single byte of memory and our machine has a page size of 4096, we’ll request 4096 bytes instead. Likewise, a malloc for 4097 bytes will be rounded up to 8192 bytes. We’ll call these blocks of memory.
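As a rough illustration of the rounding described above, here is a minimal sketch. The helper name round_up_to_page is hypothetical, and the sketch assumes the page size is a power of two. It also ignores the metadata header discussed later, which a real request would need to include.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the project API): round `size` up to the
 * next multiple of `page_size`. Assumes page_size is a power of two.
 */
size_t round_up_to_page(size_t size, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (size + page_size - 1) & ~(page_size - 1);
}
```

With a page size of 4096, round_up_to_page(1, 4096) gives 4096 and round_up_to_page(4097, 4096) gives 8192, matching the examples in the text.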

You may wonder if this approach will waste memory, and the answer is yes… IF we don’t split memory blocks into sub-blocks. Herein lies the challenge: you will not only support allocating memory, but also use the free space management algorithms we discussed in class to split up and reuse empty regions:

You should be able to configure the active free space management algorithm via a function call, malloc_setfsm().

Managing Memory

Given this allocation strategy, we will prefix each sub-block with some metadata; this is how many malloc implementations work. Simply embed a struct at the start of each memory block that describes how large it is, whether it has been freed or not, and include a pointer to the next block in the chain. Looks like studying linked lists paid off after all! Here’s how this looks logically in memory, with each allocation prefixed with a metadata struct:



[Figure: Struct prefix for each memory allocation]

This means that even if we ignore our plan to round up to the nearest page, allocating a single byte of memory actually takes more space: the single byte, plus the size of the metadata.

Tracking Memory

We need a way to keep track of this information, and we will do so with two doubly-linked lists:

Here is an example metadata struct that contains block information:

struct mem_block {
    /**
     * The name of this memory block. If the user doesn't specify a name for the
     * block, it should be left empty (a single null byte).
     */
    char name[8];

    /** Size of the block */
    uint size;

    /** Links for our doubly-linked list of blocks: */
    struct mem_block *next_block;
    struct mem_block *prev_block;
};

It seems like some information is missing:

But don’t worry. We will align allocations to 16 bytes. That means the lowest 4 bits of the block’s size will always be zero, so we can use the 0th bit to store the free flag. Additionally, since a freed block will always have at least 16 bytes of data available after the header, we’ll store our free list pointers there. After all, the data portion won’t be in use if the block is free.
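One way the flag packing might look is sketched below. The helper names, FREE_FLAG, and SIZE_MASK are all hypothetical, and unsigned int stands in for uint so the sketch compiles outside the OS tree. Masking off the low 4 bits is a design choice here; only bit 0 is actually assigned a meaning in this project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FREE_FLAG 0x1u     /* bit 0 of size: set when the block is free */
#define SIZE_MASK (~0xFu)  /* sizes are multiples of 16, so the low 4 bits are spare */

/* Mirrors the example struct above. */
struct mem_block {
    char name[8];
    unsigned int size;
    struct mem_block *next_block;
    struct mem_block *prev_block;
};

/* Size of the block with the flag bits stripped off. */
unsigned int block_size(const struct mem_block *b)
{
    assert(b != NULL);
    return b->size & SIZE_MASK;
}

bool block_is_free(const struct mem_block *b)
{
    assert(b != NULL);
    return (b->size & FREE_FLAG) != 0;
}

/* Set or clear the free flag without disturbing the size. */
void block_set_free(struct mem_block *b, bool is_free)
{
    assert(b != NULL);
    if (is_free)
        b->size |= FREE_FLAG;
    else
        b->size &= ~FREE_FLAG;
}
```

Reading the size through block_size() everywhere keeps the flag from leaking into size arithmetic.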

Allocating Memory

If your program needs more memory, request additional blocks via sbrk. Place a metadata struct at the start of the memory and return a pointer to the ‘data’ portion of the memory shown in the first figure. Don’t return a pointer to the struct itself, because it will be overwritten by the user!
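The conversion between a header and its data portion can be sketched with plain pointer arithmetic. The function names below are hypothetical; the only assumption is that the data portion begins immediately after the struct.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the example struct above (unsigned int stands in for uint). */
struct mem_block {
    char name[8];
    unsigned int size;
    struct mem_block *next_block;
    struct mem_block *prev_block;
};

/* Header -> pointer handed back to the user (first byte after the header). */
void *block_to_data(struct mem_block *b)
{
    assert(b != NULL);
    return (void *)(b + 1);
}

/* User pointer -> header; this is the lookup free() and realloc() need. */
struct mem_block *data_to_block(void *ptr)
{
    assert(ptr != NULL);
    return (struct mem_block *)ptr - 1;
}
```

The two functions are inverses of each other, so data_to_block(block_to_data(b)) returns b.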

Memory allocations must be aligned to 16 bytes; in other words, the size of the memory blocks should be evenly divisible by 16. The minimum viable block’s data portion is 16 bytes, and the overall minimum size of a block is 48 bytes.
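To make the arithmetic concrete, here is a sketch of how a request might be turned into a total block size. The names align_up and block_size_for are hypothetical. The 48-byte minimum works out when the header is 32 bytes, which is what the example struct occupies on a typical 64-bit target; on other targets the header size may differ.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT     16
#define MIN_DATA_SIZE 16

/* Mirrors the example struct above (unsigned int stands in for uint). */
struct mem_block {
    char name[8];
    unsigned int size;
    struct mem_block *next_block;
    struct mem_block *prev_block;
};

/* Round n up to the next multiple of 16. */
size_t align_up(size_t n)
{
    return (n + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);
}

/* Total bytes (header + data) needed to satisfy a request of `request` bytes. */
size_t block_size_for(size_t request)
{
    size_t data = align_up(request);
    if (data < MIN_DATA_SIZE)
        data = MIN_DATA_SIZE;   /* a freed block must still hold the free pointers */

    size_t total = align_up(sizeof(struct mem_block) + data);
    assert(total % ALIGNMENT == 0);
    return total;
}
```

With a 32-byte header, block_size_for(1) is 48 and block_size_for(17) is 64.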

Once basic allocation works, you can start splitting blocks that are not 100% used. For instance, if a block is 4096 bytes in size but only 96 bytes are actually used, split the block in two: one 96-byte block, and one 4000-byte block.
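A split along these lines might look like the following sketch. The function name split_block is hypothetical, and the sketch assumes that size holds the whole block size (header included) with the free flag cleared. Marking the remainder as free and adding it to the free list are left out.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_BLOCK_SIZE 48   /* overall minimum block size from the text */

/* Mirrors the example struct above (unsigned int stands in for uint). */
struct mem_block {
    char name[8];
    unsigned int size;
    struct mem_block *next_block;
    struct mem_block *prev_block;
};

/*
 * Shrink `b` to `needed` bytes and turn the leftover space into a new block
 * linked right after it. Returns the new block, or NULL if the leftover
 * would be smaller than the minimum block size (in which case b is unchanged).
 */
struct mem_block *split_block(struct mem_block *b, unsigned int needed)
{
    assert(b != NULL && needed % 16 == 0 && needed <= b->size);

    unsigned int leftover = b->size - needed;
    if (leftover < MIN_BLOCK_SIZE)
        return NULL;

    struct mem_block *rest = (struct mem_block *)((char *)b + needed);
    rest->name[0] = '\0';
    rest->size = leftover;
    rest->prev_block = b;
    rest->next_block = b->next_block;
    if (b->next_block != NULL)
        b->next_block->prev_block = rest;

    b->next_block = rest;
    b->size = needed;
    return rest;
}
```

Splitting a 4096-byte block with needed = 96 leaves a 96-byte block followed by a 4000-byte block, as in the example above.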

When implementing your free space management algorithms, ties (i.e., blocks that satisfy the algorithm and are the same size) should be broken by choosing the first allocation you found based on the linked list order.
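The tie-breaking rule is easy to get wrong with an off-by-one comparison, so here is a sketch. It uses best fit purely as an assumed example of a free space management algorithm, and it walks a simplified, hypothetical free_node list rather than the real block structures.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a free block; not the project's real struct. */
struct free_node {
    unsigned int size;
    struct free_node *next;
};

/*
 * Best fit: return the smallest node that can hold `needed` bytes,
 * or NULL if none is large enough.
 */
struct free_node *best_fit(struct free_node *head, unsigned int needed)
{
    assert(needed > 0);

    struct free_node *best = NULL;
    for (struct free_node *cur = head; cur != NULL; cur = cur->next) {
        if (cur->size < needed)
            continue;
        /* Strict '<' keeps the earlier node when two candidates tie. */
        if (best == NULL || cur->size < best->size)
            best = cur;
    }
    return best;
}
```

Writing the comparison as <= instead would silently pick the last of several equally sized candidates.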

Freeing Memory

First, set the 0th bit of size to 1. Next, use the data payload portion of the block to store a pointer to the next free block. That’s it! This approach is why you can sometimes read ‘old’ values from memory that has been freed. After freeing a block, you should also check its neighboring blocks and merge with any that are free.
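The flag-setting and merging steps could be sketched as follows. The function names are hypothetical, and the sketch assumes that neighbors in the block list are also adjacent in memory. Free list maintenance (the pointers stored in the data portion) is omitted to keep the merge logic visible.

```c
#include <assert.h>
#include <stddef.h>

#define FREE_FLAG 0x1u

/* Mirrors the example struct above (unsigned int stands in for uint). */
struct mem_block {
    char name[8];
    unsigned int size;
    struct mem_block *next_block;
    struct mem_block *prev_block;
};

int is_free(const struct mem_block *b)
{
    return b != NULL && (b->size & FREE_FLAG) != 0;
}

/* Fold b's successor into b; b keeps its own flag bit. */
void absorb_next(struct mem_block *b)
{
    struct mem_block *n = b->next_block;
    assert(n != NULL);

    b->size += n->size & ~FREE_FLAG;
    b->next_block = n->next_block;
    if (n->next_block != NULL)
        n->next_block->prev_block = b;
}

/* Mark b free, merge with free neighbors, and return the surviving block. */
struct mem_block *free_and_merge(struct mem_block *b)
{
    assert(b != NULL);
    b->size |= FREE_FLAG;

    if (is_free(b->next_block))
        absorb_next(b);
    if (is_free(b->prev_block)) {
        b = b->prev_block;
        absorb_next(b);
    }
    return b;
}
```

Merging forward first and backward second means the surviving header is always the one at the lowest address.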

If an entire block has been freed (i.e., 4096 bytes or more are free at the end of the address space), then you should decrease the bound of the program with sbrk to release the memory for other programs to use.

Reallocating Memory

If the user wants to realloc a pointer, first check to see if the block can be resized in place. Ways this could happen:

If none of the situations above are possible (e.g., the block is too large to resize in place), simply malloc a new, appropriately sized block, copy the data there, and then free the old block.

Edge Cases: If the pointer passed into realloc is NULL, then it should behave like malloc instead since there is nothing to resize. Additionally, if the size passed into realloc is 0, then the block should be freed.
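The edge cases and the fallback path can be sketched on top of the standard library, which stands in here for your own malloc and free. The name realloc_sketch and the old_size parameter are hypothetical; in the real allocator the old size would come from the block header, and an in-place resize would be attempted before falling back to a copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

void *realloc_sketch(void *ptr, size_t old_size, size_t new_size)
{
    if (ptr == NULL)
        return malloc(new_size);        /* nothing to resize: act like malloc */

    if (new_size == 0) {
        free(ptr);                      /* size 0: act like free */
        return NULL;
    }

    /* From here on there is a real block and a real target size. */
    assert(ptr != NULL && new_size > 0);

    /* (An in-place resize would be attempted here.) */

    void *fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;                    /* leave the old block untouched on failure */

    /* Copy only what fits in both the old and the new block. */
    memcpy(fresh, ptr, old_size < new_size ? old_size : new_size);
    free(ptr);
    return fresh;
}
```

Checking the NULL pointer case first matters: realloc_sketch(NULL, 0, 0) should take the malloc path rather than try to free anything.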

Extra Features

Since we’re writing our own version of malloc, we might as well add some features while we’re at it.

Named Blocks: to help with debugging, you can optionally provide a name for each allocation. These names will be shown when state information is printed.

Memory State Information: your allocator should be able to print out the current memory state with the malloc_print() function. See the format below.

-- Current Memory State --
[BLOCK 0x7f0d774e7000-0x7f0d774e70a8] 168     [USED]  'Blk 1'
[BLOCK 0x7f0d774b0000-0x7f0d774b0050] 80      [USED]  'Blk 2'
[BLOCK 0x7f0d774af000-0x7f0d774af0a8] 168     [USED]  'Blk 3'
     ...
(list continues)

-- Free List --
[0x7f0d774e70a8] -> [0x7f0d774b0050] -> [0x7f0d774af0a8] -> (...) -> NULL

Each element is printed out in order, so there is an implied link between element 1 and element 2, and so on.

Leak Check: You can leverage the metadata we are tracking to find memory leaks, so add a malloc_leaks() function. malloc_leaks() will print the leaked blocks and a summary, and return true if leaks were found:

-- Leak Check --
[BLOCK 0x7f0d774e7000] 168     'Blk 1'
[BLOCK 0x7f0d774b0000] 80      'Blk 2'
     ...
  (list continues)

-- Summary --
542 blocks lost (892412 bytes)
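A leak check is essentially a walk over the block list that reports every block not marked free. The sketch below is one possible shape, written against hosted C's printf. The function name leak_check and its out-parameters are hypothetical, the column width is a guess from the sample output, and whether the reported size should include the header is left for you to decide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define FREE_FLAG 0x1u

/* Mirrors the example struct above (unsigned int stands in for uint). */
struct mem_block {
    char name[8];
    unsigned int size;
    struct mem_block *next_block;
    struct mem_block *prev_block;
};

/* Print every block still in use; return true if any were found. */
bool leak_check(const struct mem_block *head,
                unsigned int *lost_blocks, unsigned long *lost_bytes)
{
    assert(lost_blocks != NULL && lost_bytes != NULL);
    *lost_blocks = 0;
    *lost_bytes = 0;

    printf("-- Leak Check --\n");
    for (const struct mem_block *b = head; b != NULL; b = b->next_block) {
        if (b->size & FREE_FLAG)
            continue;                   /* freed blocks are not leaks */
        /* %.8s guards against a name that fills all 8 bytes with no terminator. */
        printf("[BLOCK %p] %-8u'%.8s'\n", (const void *)b, b->size, b->name);
        (*lost_blocks)++;
        *lost_bytes += b->size;
    }

    printf("\n-- Summary --\n%u blocks lost (%lu bytes)\n",
           *lost_blocks, *lost_bytes);
    return *lost_blocks > 0;
}
```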

Scribbling: C beginners often get tripped up by a seemingly strange behavior exhibited by malloc: sometimes they get a nice, clean chunk of memory to work with, and other times it will have ‘residual’ values that crash their program (usually when it’s being graded!). One solution to this, of course, is to use calloc() to clear the newly-allocated block. Since you are implementing your own memory allocator, you now understand why this happens: free() leaves old values in memory without cleaning them up.

To help find these memory errors, you will provide scribbling functionality: when scribbling is enabled, you will fill any new allocation with 0xAA (10101010 in binary). This means that if a program assumes memory allocated by malloc is zeroed out, it will be in for a rude awakening – for instance, what might’ve been assumed to be 0 in a single byte will now be 170 (10101010 instead of 00000000).

You should scribble new allocations (malloc()), reused blocks, and any new space in a realloc. Provide malloc_scribble() to toggle this feature.
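Scribbling itself comes down to a memset over the right byte range. In the sketch below, set_scribble, scribble, and scribble_grown are hypothetical helper names; the actual signature of malloc_scribble() is not specified here, so treat the toggle as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SCRIBBLE_BYTE 0xAA   /* 10101010 in binary, 170 in decimal */

static bool scribble_enabled = false;

/* Hypothetical toggle standing in for malloc_scribble(). */
void set_scribble(bool on)
{
    scribble_enabled = on;
}

/* Fill a new or reused data region, but only when scribbling is on. */
void scribble(void *data, size_t len)
{
    assert(data != NULL || len == 0);
    if (scribble_enabled && len > 0)
        memset(data, SCRIBBLE_BYTE, len);
}

/* After a realloc, scribble only the bytes past the old length. */
void scribble_grown(void *data, size_t old_len, size_t new_len)
{
    if (new_len > old_len)
        scribble((char *)data + old_len, new_len - old_len);
}
```

For a realloc that grows a block, only the new tail is filled so the user's existing data survives.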

Supported Functions

Grading and Submission

Check your changes into your OS repo as you work. You should test your allocator with a variety of test programs and commands to make sure it works.

This project is worth 13 points.

To receive 70% credit, implement:

To receive 80% credit, implement:

To receive 90% credit, implement:

To receive 95% credit, implement:

To receive full credit for this project:

Your grade will also include an additional 2 points from the code review / demo. Things that we will check: