Lab 9: Implementing mmap: shared, anonymous mappings

In this lab, we’ll put our paging knowledge to good use to build an mmap system call. While we won’t support all of the possibilities provided by the POSIX mmap, we’ll be able to share memory between two or more processes. That means we’ll have finally studied our last form of IPC – shared memory!

As we know, paging allows us to provide processes with a virtual address space that is mapped to actual, physical memory addresses. In theory, all we need to share memory between processes is to have their virtual address spaces point to one or more of the same physical page frames. Something like this:

     Process 1               Physical Memory               Process 2
 ┌───────────────┐          ┌───────────────┐          ┌───────────────┐
 │Page 1         │─────────▶│Page Frame 1   │    ┌─────│Page 1         │
 ├───────────────┤          ├───────────────┤    │     ├───────────────┤
 │Page 2         │────┐     │Page Frame 2   │◀───┼─────│Page 2         │
 ├───────────────┤    │     ├───────────────┤    │     ├───────────────┤
 │Page 3         │───┐│     │Page Frame 3   │    │ ┌───│Page 3         │
 ├───────────────┤   ││     ╔═══════════════╗    │ │   └───────────────┘
 │Page 4         │───┼┼────▶║Page Frame 4   ║◀───┼─┘
 └───────────────┘   ││     ╚═══════════════╝    │
                     ││     │Page Frame 5   │    │
                     ││     ├───────────────┤    │
                     │└────▶│Page Frame 6   │    │
                     │      ├───────────────┤    │
                     └─────▶│Page Frame 7   │    │
                            ├───────────────┤    │
                            │Page Frame 8   │◀───┘
                            └───────────────┘

Here, Process 1 and Process 2 are both able to access Page Frame 4, although their own virtual addresses for the page may differ (perhaps Process 1 has that location mapped at virtual address 0x4000 while Process 2 accesses it at address 0x3000).

Starting Out

We need to get familiar with how the kernel allocates pages of memory, so the first stop on our journey is kernel/kalloc.c. You will notice that in kinit(), we call freerange() which unfortunately does not have anything to do with free range chickens. Take a look at that code to understand what it does, which will then hopefully inspire you to think about…

Creating a system call

You may have noticed that the kernel stores all of its free pages of physical memory in a linked list. Create a new system call that iterates through this list and returns the amount of free memory in KiB (calculated as PGSIZE bytes per page). Add a companion user space utility to report this information, like so:

$ freemem
130012 KiB

On a system that has just started up, you should get something close to the amount of memory your QEMU virtual machine has. Check the -m flag in the Makefile to confirm.
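
The counting logic can be tried out on its own before wiring it into the kernel. Below is a minimal user-space sketch, not the actual xv6 code: struct run mirrors the free-list node in kernel/kalloc.c, while freemem_kib is a hypothetical helper name chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096 /* bytes per page, as in xv6 */

static_assert(PGSIZE % 1024 == 0, "PGSIZE must be a whole number of KiB");

/* Same shape as the free-list node in kernel/kalloc.c. */
struct run {
  struct run *next;
};

/* Hypothetical helper: walk the free list and report free memory in KiB. */
uint64_t
freemem_kib(const struct run *freelist)
{
  uint64_t pages = 0;

  for (const struct run *r = freelist; r != NULL; r = r->next)
    pages++;

  return pages * PGSIZE / 1024;
}
```

Inside the kernel, the walk would presumably start at kmem.freelist and hold kmem.lock for its duration, since other CPUs may allocate or free pages while we count.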

Memory Allocation

Create an mmap system call for the next part of the lab. To start out, have mmap simply allocate some memory, essentially acting like a very basic kernel-powered malloc. In fact, tracing through a user space malloc call should be fairly illuminating:

malloc -> sbrk -> growproc -> uvmalloc()

uvmalloc() in turn calls kalloc(), which gets us back to where we started, in kernel/kalloc.c. So if we want to have our system call allocate a new page of memory for the calling process, we need kalloc(). However, if you try to return the memory address from kalloc() directly to userspace, it won’t work!

To understand why, we need to think back to our real time clock lab. We were using memory-mapped I/O to access the RTC, but in order to do so, the hardware address needed to be in the kernel’s memory map. Similarly, the calling process needs a mapping to this physical page frame that we just retrieved with kalloc().

You will need to use the mappages function to accomplish this. The function signature looks like:

int mappages(
    pagetable_t pagetable, // page table of the process
    uint64 va,             // virtual memory address
    uint64 size,           // size (determines number of pages)
    uint64 pa,             // physical memory address
    int perm)              // permission bits

You should be able to figure out what to pass to mappages by looking at other calls that are made in the kernel, but to make life a little easier, a hint: use TRAPFRAME - 2 * PGSIZE for the virtual address and PTE_R | PTE_W | PTE_U for the permission bits. More on the va later.

After you have done the mapping, you can check that it worked correctly with the following:

uint64 pa = walkaddr(p->pagetable, va);

The result (pa) should be the same address as returned by kalloc. This means that the kernel was able to successfully take the virtual address, walk the page table, and determine what physical address it maps to. We can make this more robust with something like the following, where pa now refers to the pointer returned by kalloc():

if (walkaddr(p->pagetable, va) != (uint64) pa) {
  panic("invalid mmap");
}
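
To see why the mapping step matters, here is a toy model of the map-then-verify sequence that runs in user space. It is only a sketch under loud assumptions: the real xv6 page table has three levels, whereas this one is a flat array, and toy_pagetable, toy_mappages, and toy_walkaddr are hypothetical stand-ins for the kernel functions. The permission bit values follow the RISC-V PTE layout.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 4096
#define NPAGES 16          /* size of the toy address space, in pages */
#define PTE_V  (1u << 0)   /* entry is valid */
#define PTE_R  (1u << 1)   /* readable */
#define PTE_W  (1u << 2)   /* writable */
#define PTE_U  (1u << 4)   /* accessible from user mode */

/* Hypothetical flat page table: one entry per virtual page. */
struct toy_pagetable {
  uint64_t pa[NPAGES];
  unsigned perm[NPAGES];
};

/* Map one page at va to pa. Returns 0 on success, -1 on failure. */
int
toy_mappages(struct toy_pagetable *pt, uint64_t va, uint64_t pa, unsigned perm)
{
  assert(va % PGSIZE == 0 && pa % PGSIZE == 0);
  uint64_t vpn = va / PGSIZE;

  if (vpn >= NPAGES || (pt->perm[vpn] & PTE_V))
    return -1;             /* out of range or already mapped */
  pt->pa[vpn] = pa;
  pt->perm[vpn] = perm | PTE_V;
  return 0;
}

/* Translate va to a physical address, or 0 if user code cannot reach it. */
uint64_t
toy_walkaddr(const struct toy_pagetable *pt, uint64_t va)
{
  uint64_t vpn = va / PGSIZE;

  if (vpn >= NPAGES)
    return 0;
  if (!(pt->perm[vpn] & PTE_V) || !(pt->perm[vpn] & PTE_U))
    return 0;
  return pt->pa[vpn];
}
```

Before toy_mappages runs, the lookup returns 0, which is the toy equivalent of handing a process an address it has no mapping for. After the mapping is installed, the lookup returns the physical address we started with.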

Write a small test utility to call your mmap implementation. It should store a string at the memory address returned by mmap with strcpy and then print it out with printf. If everything works correctly, you’re ready for the next step! Here’s an example output:

$ memtest
-> hello world!
panic: freewalk: leaf

Don’t worry about the ‘freewalk’ panic for now. We will address that later. If your program gets this far, it means that it successfully executed but was not cleaned up properly (yet).

Memory Mapping

Now that we can successfully allocate a page of memory and use it, it’s time to move on to sharing memory between processes. First, think about how a process gets its page table set up: all processes are created via fork(), which creates a copy of the parent process. This is why when we create a child process, it doesn’t have access to its parent’s memory, because it simply receives a copy of the memory pages.

We need a way to make our special pages (and their mappings) created with mmap survive after fork() is called. That way the new process will be able to access the same physical address instead of just a copy. But how? We need to modify kfork() in kernel/proc.c.

To do this:

Check whether the process being fork()ed has any mapped memory pages.
- A good way to start is to add an mmap member to the proc struct that is set to 1 when a mapped page is present. You’ll need to set it in your mmap system call and check for it in kfork().
If it does, use walkaddr() to retrieve the physical address of each mapping.
Use mappages on the new process page table (np->pagetable) to create a mapping that points to the same physical address.
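
The steps above can be sketched with a toy model. This is an assumption-heavy illustration rather than kernel code: toy_pagetable is a flat array standing in for a real page table, an entry of 0 means "unmapped", and toy_share_mappings is a hypothetical helper name. The array lookup plays the role of walkaddr() and the array store plays the role of mappages().

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 4096ULL
#define NPAGES 16          /* size of the toy address space, in pages */

/* Hypothetical flat page table: pa[vpn] == 0 means "not mapped". */
struct toy_pagetable {
  uint64_t pa[NPAGES];
};

/*
 * For every mapped page in [start, end) of the parent, install a mapping
 * to the SAME physical address in the child. Returns the number of pages
 * shared this way.
 */
int
toy_share_mappings(const struct toy_pagetable *parent,
                   struct toy_pagetable *child,
                   uint64_t start, uint64_t end)
{
  assert(start % PGSIZE == 0 && end % PGSIZE == 0);
  assert(end <= NPAGES * PGSIZE);
  int shared = 0;

  for (uint64_t va = start; va < end; va += PGSIZE) {
    uint64_t pa = parent->pa[va / PGSIZE];  /* stands in for walkaddr() */
    if (pa == 0)
      continue;                             /* nothing mapped here */
    child->pa[va / PGSIZE] = pa;            /* stands in for mappages() */
    shared++;
  }
  return shared;
}
```

Note that nothing is copied: both tables end up holding the same physical address, which is exactly what makes the memory shared. In the kernel, the existing uvmcopy() call in kfork() only covers addresses below p->sz, so the mapped pages near the top of the address space need this separate pass.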

If you update your test program to fork a child process, it should be able to write a string to the shared memory location… but make sure you do not wait() for the child in the parent process, because otherwise the kernel will panic with the ‘freewalk’ error again. Instead, simply pause(10) and then print out the string in the parent. The child process will become a zombie instead of causing the panic, and the parent should be able to print the string that was copied in by the child!

$ memtest
-> hello world!
-> hello from the child
panic: freewalk: leaf
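
For comparison, it can help to see the behavior we are aiming for on a full-size system. The sketch below is not xv6 code: it uses the host's mmap with MAP_SHARED | MAP_ANONYMOUS (available on Linux, the BSDs, and macOS), and share_with_child is a hypothetical helper name. Unlike our xv6 test at this stage, it can safely wait for the child because the host kernel already cleans up shared mappings.

```c
#define _DEFAULT_SOURCE  /* ask glibc to expose MAP_ANONYMOUS */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define PAGE 4096

/*
 * Hypothetical helper: fork a child that writes msg into a shared
 * anonymous page, then copy what the parent sees into out.
 * Returns 0 on success, -1 on failure.
 */
int
share_with_child(const char *msg, char *out, size_t outlen)
{
  assert(outlen > 0);
  char *page = mmap(NULL, PAGE, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (page == MAP_FAILED)
    return -1;

  pid_t pid = fork();
  if (pid < 0) {
    munmap(page, PAGE);
    return -1;
  }
  if (pid == 0) {
    /* Child: the page starts zero-filled, so the string stays terminated. */
    strncpy(page, msg, PAGE - 1);
    _exit(0);
  }

  /* Parent: wait for the child, then read what it wrote. */
  waitpid(pid, NULL, 0);
  strncpy(out, page, outlen - 1);
  out[outlen - 1] = '\0';
  munmap(page, PAGE);
  return 0;
}
```

If the mapping were private instead of shared, the parent would read back an empty string, which is the same behavior our xv6 processes had before kfork() was modified.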

Memory Layout

We mentioned TRAPFRAME - 2 * PGSIZE earlier, but what is it? This represents a memory location that is almost at the very end of the process address space. Normally in our configuration, the heap will grow starting from low memory addresses, whereas our stack is located near the end of the address space (and, as you may have noticed, does not grow at all). We have to put mapped memory pages somewhere, so we are placing them directly below the stack, at the next lower addresses, and growing them downward toward the heap. This means our memory layout will look something like:


    +--------------------+  0x0 (low)
    | text / data / bss  |
    +--------------------+ 
 |  | heap (sbrk)        |  
 v  |         ...        |
    +--------------------+  p->sz
    |                    |
    |                    |
    |                    |
    +--------------------+  p->mmap
 ^  |         ...        |
 |  | mmap               |
    +--------------------+
    | stack              |
    +--------------------+ 
    | trapframe          |
    +--------------------+ 
    | trampoline         |
    +--------------------+ MAXVA (high)

(Note that this diagram assumes you have added an mmap member to the proc struct in kernel/proc.h.)

However, hard-coding this location is not a good idea. While we’re doing the right thing by determining where TRAPFRAME is, it’s also possible to change the size of the stack (see USERSTACK in kernel/param.h). We need to calculate where mapped pages start when the stack gets set up… and that is in the kexec() function in kernel/exec.c.

Locate the place where the user stack is allocated in kexec(), and initialize p->mmap to hold the starting address for mapped memory pages. It should be the location of TRAPFRAME, minus the size of the stack.

You should now use p->mmap to determine where the next memory mapping will go when your mmap system call is used. Each time you create a new mapping, decrease p->mmap by one page. You’ll be able to know how many mappings have been made based on the current value of p->mmap.
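
The address arithmetic can be checked in isolation. In this sketch the MAXVA, TRAMPOLINE, and TRAPFRAME definitions follow xv6's RISC-V memory layout, USERSTACK is assumed to be 1 page, and it assumes the stack sits directly below the trapframe as in the diagram above. The toy_proc struct and the three toy_ functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE     4096ULL
#define MAXVA      (1ULL << 38)          /* top of the Sv39 user range in xv6 */
#define TRAMPOLINE (MAXVA - PGSIZE)
#define TRAPFRAME  (TRAMPOLINE - PGSIZE)
#define USERSTACK  1                     /* user stack pages (assumed value) */

/* Where mapped pages start: TRAPFRAME minus the size of the stack. */
#define MMAP_BASE  (TRAPFRAME - USERSTACK * PGSIZE)

/* Hypothetical stand-in for the one proc field we care about. */
struct toy_proc {
  uint64_t mmap;  /* lowest address handed out to a mapping so far */
};

/* What kexec() would do once the stack is in place. */
void
toy_exec_init(struct toy_proc *p)
{
  assert(MMAP_BASE % PGSIZE == 0);
  p->mmap = MMAP_BASE;
}

/* Reserve one more page below the existing mappings; return its address. */
uint64_t
toy_next_mapping(struct toy_proc *p)
{
  p->mmap -= PGSIZE;
  return p->mmap;
}

/* Number of mappings made so far, derived from p->mmap alone. */
uint64_t
toy_mapping_count(const struct toy_proc *p)
{
  return (MMAP_BASE - p->mmap) / PGSIZE;
}
```

With a one-page stack, the first mapping lands at TRAPFRAME - 2 * PGSIZE, which matches the hard-coded hint from earlier. A real implementation would probably also want to refuse a new mapping if p->mmap would collide with the heap at p->sz.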

Testing Your Code

At this point, you may want to create a small test program. It should:

Start up
Create three mapped memory pages
Store test strings in each of the pages and print them
Fork a child process
Attempt to read the strings in the child, printing their values
Change the strings! Overwrite the first, append to the second, and leave the third alone
Have the parent wait(0) for its child and print the strings afterward.
The parent should be able to see the modified strings!
Once the previous steps work, update the child process so that it forks and waits for another child, which modifies the third string. Mapception!

Reference Counting

If everything goes well, your code may sort of work, but you may get a usertrap or freewalk: leaf panic after the processes run. This is because the kernel checks to make sure each page mapped by a process gets unmapped and cleaned up (freed) when the process exits. We definitely are not doing that with our mapped pages.

To solve this problem, we need to implement reference counting for kernel memory pages. In short, we’ll maintain a mapping from page numbers to a counter. Normal memory pages will have a reference count of 1, while mapped pages that are shared between processes will have a reference count of 2 or more.

In kernel/kalloc.c, let’s add:

/* We will store a reference count for each physical page here: */
static int ref_count[(PHYSTOP - KERNBASE) / PGSIZE];

/* Convert physical address to index in reference count array */
static inline int
pa2idx(uint64 pa)
{
  return (pa - KERNBASE) / PGSIZE;
}

Now:

When kalloc() allocates a new page, set its reference count to 1
When updating mappings for new processes in kfork(), increment the reference counts for each physical address being mapped.
In kfree(), decrement the reference count and return immediately, unless…
- …the reference count reaches 0. In that case, release the page (fill with junk and add back to the linked list, as was done before).

Now pages will get cleaned up appropriately and mappings stick around until their reference counts reach zero.
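
The counting rules above can be exercised with a small user-space model. This is a sketch under stated assumptions, not the kernel implementation: there are only 8 toy frames, a frame with a count of 0 is treated as free instead of living on a linked list, and toy_kalloc, toy_share, toy_kfree, and toy_free_frames are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE   4096ULL
#define KERNBASE 0x80000000ULL
#define NFRAMES  8                 /* toy amount of physical memory */

/* One reference count per physical frame; 0 means the frame is free. */
static int ref_count[NFRAMES];

static int
pa2idx(uint64_t pa)
{
  return (int)((pa - KERNBASE) / PGSIZE);
}

/* Allocate a free frame and start its count at 1. Returns 0 if none left. */
uint64_t
toy_kalloc(void)
{
  for (int i = 0; i < NFRAMES; i++) {
    if (ref_count[i] == 0) {
      ref_count[i] = 1;
      return KERNBASE + (uint64_t)i * PGSIZE;
    }
  }
  return 0;
}

/* Called when a forked process gets a mapping to an existing frame. */
void
toy_share(uint64_t pa)
{
  assert(ref_count[pa2idx(pa)] > 0);
  ref_count[pa2idx(pa)]++;
}

/* Drop one reference. Returns 1 only when the frame was really released. */
int
toy_kfree(uint64_t pa)
{
  int i = pa2idx(pa);

  assert(ref_count[i] > 0);
  if (--ref_count[i] > 0)
    return 0;   /* someone else still uses this frame */
  return 1;     /* the real kfree() would now junk-fill and relink the page */
}

/* Count the frames nobody references. */
int
toy_free_frames(void)
{
  int n = 0;

  for (int i = 0; i < NFRAMES; i++)
    if (ref_count[i] == 0)
      n++;
  return n;
}
```

Two details the toy model sidesteps are worth keeping in mind for the kernel version. First, freerange() hands every page to kfree() at boot, before any kalloc() has set a count, so that first pass likely needs special handling. Second, the counts are shared between CPUs, so updates should probably happen while holding a lock such as kmem.lock.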

Final Test

You should be able to run the 3-process test program described previously. To prove that your reference counting works, also run freemem before and after the test to demonstrate that no kernel pages were leaked.

Grading and Submission

Once you are finished, check your changes into your OS repo. Then have a member of the course staff take a look at your lab to check it.

To receive 65% credit:

Implement the system call to retrieve free memory

To receive 75% credit:

Complete all previous requirements
Implement basic mapping for a single page and demonstrate that a single process can successfully map and use the memory.

To receive 85% credit:

Complete all previous requirements
Demonstrate that two processes can read and write to the same physical page.

To receive full credit for this lab:

Complete all previous requirements
Demonstrate multiple processes (3+) sharing and modifying mapped memory as described in the test program above
Confirm a successful physical page reference counting implementation by demonstrating that no kernel pages have been leaked after running your test program