Lab 2: System Call Me Maybe

Our previous labs dealt with writing code for user space, which really isn’t much different from any other program you’ve written. Write a program, compile it, run it. Watch it segfault.

However, kernel space is where privileged operations run. If you had to guess, did our previous labs do anything you’d consider privileged? Probably not, but in reality even the simplest “hello world” program interacts with the kernel; getting text to display on the screen requires a display driver or serial interface that user space programs do not have direct control over. Let’s take a look at the journey of a single printf call in xv6:

            User Space              │                Kernel Space
            ----------              │                ------------
                                    │
┌────────┐  ┌────────┐  ┌────────┐  │    ┌───────────────┐  ┌───────────────┐
│printf()│─▶│ putc() │─▶│write() │──┼───▶│  sys_write()  │─▶│  filewrite()  │──┐
└────────┘  └────────┘  └────────┘  │    └───────────────┘  └───────────────┘  │
                                    │ ┌────────────────────────────────────────┘
                                    │ │  ┌───────────────┐  ┌───────────────┐
                                    │ └─▶│consolewrite() │─▶│  uartputc()   │
                                    │    └───────────────┘  └───────────────┘
                                    │
                                    │
                                    │

The act of displaying a single character sends us through several layers of abstraction, ultimately making the hardware do something.
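To make the user space half of that diagram concrete, here is a minimal host-side sketch (meant for Linux or another POSIX system, not xv6 itself) of how a string turns into a series of write() calls. The names emit_char and emit_string are made up for this example; xv6’s own putc() in user/printf.c uses the same one-byte-per-write pattern.

```c
#include <unistd.h>

// Hypothetical helper: hand one byte to the kernel with a single write().
// Returns 1 on success, or -1 on error.
int emit_char(int fd, char c)
{
    return (int) write(fd, &c, 1);
}

// Hypothetical helper: print a string one character at a time.
// Returns the number of bytes written, or -1 on the first failure.
int emit_string(int fd, const char *s)
{
    int count = 0;
    for (; *s != '\0'; s++) {
        if (emit_char(fd, *s) != 1)
            return -1;
        count++;
    }
    return count;
}
```

A program that calls emit_string(1, "hi\n") crosses into the kernel three times, once per character, which is why buffered I/O libraries usually batch their output instead.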

In this lab, we’re going to learn more about system calls, create our own system calls, and build companion user space utilities that will use them.

Part I: Tracing System Calls (on Linux)

On Linux, the strace utility allows us to trace system calls on running programs. We can learn quite a bit about a program just by inspecting its system calls.

Here are a few example usages of strace:

# Trace a run of 'ls':
$ strace ls

# Trace only file-related system calls
$ strace -e trace=file ls

# Get a nice summary of unique system calls used
$ strace -c ls

# Search for a specific system call (stat in this case):
$ strace ls 2>&1 | grep '^stat'
# Note that we search the start of the string (^) because the system call's
# name comes first, followed by its parameters and return value.

When you run a command under strace, such as strace cat, each system call is printed to your terminal as it happens. So the general workflow is: run strace on a command and read the list of system calls it prints. You can run strace on any binary file; if you compile your own C code, strace ./a.out will display the system calls used by your program (most likely the calls are invoked by the C library, not your code directly).
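To see the C library layer mentioned above, the following sketch (Linux-specific, and only an illustration) requests the same system call two ways: through the usual libc wrapper, and by number through the generic syscall() function using a constant from <sys/syscall.h>. The function names pid_via_wrapper and pid_via_raw_syscall are invented for the example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Ask for our process ID through the libc wrapper function.
pid_t pid_via_wrapper(void)
{
    return getpid();
}

// Ask for the same thing by system call number, bypassing the wrapper.
pid_t pid_via_raw_syscall(void)
{
    return (pid_t) syscall(SYS_getpid);
}
```

If you call both functions from main and run the result under strace, you should see a getpid entry for each on a recent glibc; the trace records what reached the kernel, not which C function asked for it.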

For the first part of this lab, you will trace several programs and record the results on a Linux machine. You will need to write about what happens below, so you should create a docs directory inside your OS repo if it doesn’t already exist and create a new file there called syscalls.txt.

  1. First, run a trace on ls. Record all of the unique system calls (just their names) used by ls. To avoid doing a lot of tedious work, automate most of this with a shell pipeline or command line flag (see the man page for strace).

(list the syscalls in your text file as a bulleted list, use ‘*’ before each system call)

  2. How many unique system calls are in your list?

  3. Next, trace several commands you already know and look for system calls that you haven’t seen before (or look up some new commands if you’d like). As you experiment, list any new system calls you find.

Command: cat /etc/passwd
New system calls:
  * mprotect64

(next command goes here)

  4. Take a look at the system calls that your CS 326 OS supports and compare them with the Linux system calls. Which calls overlap, and which do not? Note any overlapping calls in your text file.

Part II: Adding a System Call

Now that we’ve spent some time learning about system calls, it’s time to build our own.

Here’s an overview of the process for adding a new system call. You will probably do this a few times this semester. If you want to simply walk through the process, pick an existing system call (such as uptime()) and note all the locations it appears in the files below:

In the Kernel (/kernel)

In User Space (/user)

To start out, you might want to try adding a “hello world” system call that prints a greeting. You can modify it later to implement the rest of Part II.

Afterward, compile and run your OS to make sure you didn’t miss any steps.
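Whichever files you end up touching, the core bookkeeping is small: a number, a handler function, and a table entry connecting the two. The sketch below is a host-side toy model, not xv6 source; it mimics the table of handlers indexed by system call number that stock xv6-riscv keeps in kernel/syscall.c. SYS_hello, sys_hello, and dispatch are hypothetical names, and 22 is an arbitrary number picked for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// System call numbers for this toy model (SYS_hello is hypothetical).
#define SYS_uptime 14
#define SYS_hello  22

// Handlers take no arguments and return a 64-bit result.
static uint64_t sys_uptime(void)
{
    return 0; // placeholder; a real kernel would return its tick count
}

static uint64_t sys_hello(void)
{
    printf("Hello from the (pretend) kernel!\n");
    return 0;
}

// Table of handlers indexed by system call number; unlisted slots are NULL.
static uint64_t (*syscalls[])(void) = {
    [SYS_uptime] = sys_uptime,
    [SYS_hello]  = sys_hello,
};

#define NSYSCALLS (sizeof(syscalls) / sizeof(syscalls[0]))

// Look up the handler for num and run it; unknown numbers yield -1.
uint64_t dispatch(int num)
{
    if (num > 0 && (size_t) num < NSYSCALLS && syscalls[num] != NULL)
        return syscalls[num]();
    fprintf(stderr, "unknown sys call %d\n", num);
    return (uint64_t) -1;
}
```

In this model, defining SYS_hello and sys_hello but leaving out the table entry makes dispatch(SYS_hello) report an unknown system call, which is a handy symptom to recognize when a step gets skipped.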

Shutting Down

All great operating systems have a way to shut down. Except ours. Try terminating the shell (with kill or CTRL+D), and you’ll see that init simply runs a new shell instance. There’s no halt or shutdown command to be found. No reboot. No poweroff. We need to fix this so that later when you’re really frustrated with your OS you can just shut it off until it starts functioning correctly. (I’ve had limited success with this approach).

Since shutting down involves the hardware, it’s a privileged operation. On Linux and most other UNIX derivatives, it’s even restricted to the root user; try to run halt, reboot, etc. on gojira as a regular user.

The QEMU RISC-V virtual machine we’re using in class has a test interface at memory location 0x100000 that allows us to shut it down. Run your OS in QEMU and press CTRL+A, then C to get the QEMU console. Enter info mtree at the (qemu) prompt to see the memory tree:

memory-region: system
  0000000000000000-ffffffffffffffff (prio 0, i/o): system
    0000000000001000-000000000000ffff (prio 0, rom): riscv_virt_board.mrom
    0000000000100000-0000000000100fff (prio 0, i/o): riscv.sifive.test
    0000000000101000-0000000000101023 (prio 0, i/o): goldfish_rtc
    0000000002000000-0000000002003fff (prio 0, i/o): riscv.aclint.swi
    0000000002004000-000000000200bfff (prio 0, i/o): riscv.aclint.mtimer
    0000000003000000-000000000300ffff (prio 0, i/o): gpex_ioport_window
      0000000003000000-000000000300ffff (prio 0, i/o): gpex_ioport
    0000000004000000-0000000005ffffff (prio 0, i/o): platform bus

If we write 0x5555 to address 0x100000, the VM will shut down. If we write 0x7777, it will reboot. This is an example of memory-mapped I/O (MMIO): the memory and/or registers of the devices on our virtual machine are mapped to addresses in the machine’s physical address space, so the CPU can interact with them using ordinary loads and stores.
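As a rough illustration of the MMIO idea, the sketch below treats an ordinary variable as a stand-in for a device register and stores command values into it through a pointer, the same way a driver would store to a mapped device address. It runs on any host and touches no real hardware; fake_test_reg, test_dev_write, and the TEST_* constant names are invented for this example, while the two values come from the description above.

```c
#include <stdint.h>

// Command values described in the text for the QEMU test device.
#define TEST_SHUTDOWN 0x5555u
#define TEST_REBOOT   0x7777u

// Stand-in for the device register. On the real machine this would be
// the device's address rather than a variable in our program.
volatile uint32_t fake_test_reg = 0;

// Store a command into a device register through a volatile pointer.
void test_dev_write(volatile uint32_t *reg, uint32_t command)
{
    *reg = command;
}
```

From the CPU’s point of view the store looks the same either way; what differs is whether RAM or a device sits behind the address.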

However, there is one problem here: by default, our kernel can’t access the physical address 0x100000 because it is not mapped in its page table. If you try it, your OS will crash.

We will fully explore virtual memory later in the semester. For now, you just need to edit kernel/memlayout.h and kernel/vm.c:

In memlayout.h, you will see a description of the physical memory layout of our machine. Add a new entry for the test interface:

#define VIRT_TEST 0x100000

Then, in vm.c, you need to map this into the kernel’s virtual memory. Inside kvmmake(), you’ll find the mappings. Add one:

kvmmap(kpgtbl, VIRT_TEST, VIRT_TEST, PGSIZE, PTE_R | PTE_W);

Both of these additions need to be commented to explain what you are doing, just like the existing code is!

Again, you’re not expected to fully understand virtual memory for now. We’ll pretend it’s magic and come back to it.

Finally, to make the system shut down:

volatile uint32 *test_dev = (volatile uint32 *) VIRT_TEST;
*test_dev = 0x5555; // 0x5555 shuts down; 0x7777 would reboot instead

If you don’t already know what volatile does, you should try to figure it out and understand why we need it in this situation.

Part III: Real Time Clock

In Part III of the lab, you’re going to do it all over again, except this time on your own. If you look back at the info mtree output, you’ll notice an interesting device: goldfish_rtc. This is the real time clock for our virtual hardware; it reports the current time as a 64-bit UNIX timestamp in nanoseconds, split across two 32-bit registers. Your objective for Part III is to create a system call and user space program that will report the current UNIX timestamp, as provided by the goldfish_rtc device. You are essentially creating a very, very simple driver that interacts with the hardware.

To do this:

  1. Add your system call(s) as usual, making sure to name them something intuitive.
  2. Adjust memory mappings and the model of the machine’s memory layout.
  3. Retrieve the timestamp by making two 32-bit reads: the first at the base address of the goldfish_rtc device, and the second 4 bytes past it. Shift the second reading left by 32 bits and combine it with the first using a bitwise OR to create the final 64-bit timestamp.
    • Note: RISC-V is little endian, so you’ll do something like this: ((uint64) second << 32) | first;
  4. Add a user space utility that calls the system call and prints the current UNIX timestamp.
    • Note: pay attention to the data type you use here. You may also want to convert the raw timestamp from nanoseconds to seconds so that printf can handle it as a regular integer.
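
The arithmetic in steps 3 and 4 can be checked on any machine before wiring it into the kernel. This is a small host-side sketch with made-up names (rtc_combine, rtc_ns_to_sec); the two arguments stand in for the values you would read from the device registers.

```c
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

// Join two 32-bit halves into one 64-bit value. The cast must happen
// before the shift: shifting a 32-bit value by 32 bits is undefined in C.
uint64_t rtc_combine(uint32_t low, uint32_t high)
{
    return ((uint64_t) high << 32) | low;
}

// Convert a nanosecond timestamp to whole seconds (truncating).
uint64_t rtc_ns_to_sec(uint64_t ns)
{
    return ns / NS_PER_SEC;
}
```

For example, rtc_combine(0x1, 0x2) gives 0x200000001. Without the cast to a 64-bit type, the high half would be lost before the OR ever happens.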

Lab Instructions

There’s a lot of information to absorb in this lab, but not a lot of code to write.

  1. Do the strace activity and fill out syscalls.txt
  2. Create a system call (or system calls) for reboot and shut down.
    • Be sure to choose a name that makes sense.
    • See the steps above, and make sure you follow all of them.
  3. Create a user space program (or programs) that uses the system call(s) to reboot or shut down.
  4. Create a system call to read the real time clock and a corresponding user space utility to report its value.
  5. Test the program(s) and make sure they behave correctly.

If you get stuck, don’t be afraid to ask for help. After you’re finished, take a break. You deserve it.

Grading and Submission

To receive 50% credit:

To receive 75% credit:

To receive 85% credit:

To receive full credit for this lab:

Once you are finished, check your changes into your OS repo. Then have a member of the course staff take a look at your lab to check it off.

Check off procedure: