Lab 2: System Call Me Maybe
Our previous labs dealt with writing code for user space, which really isn’t much different from any other program you’ve written. Write a program, compile it, run it. Watch it segfault.
However, kernel space is where privileged operations run. If you had to guess, did our previous labs do anything you’d consider privileged? Probably not, but in reality even the simplest “hello world” program interacts with the kernel; getting text to display on the screen requires a display driver or serial interface that user space programs do not have direct control over. Let’s take a look at the journey of a single printf call in xv6:
             User Space             │            Kernel Space
             ----------             │            ------------
                                    │
┌────────┐  ┌────────┐  ┌────────┐  │    ┌───────────────┐  ┌───────────────┐
│printf()│─▶│ putc() │─▶│write() │──┼───▶│  sys_write()  │─▶│  filewrite()  │──┐
└────────┘  └────────┘  └────────┘  │    └───────────────┘  └───────────────┘  │
                                    │ ┌────────────────────────────────────────┘
                                    │ │  ┌───────────────┐  ┌───────────────┐
                                    │ └─▶│consolewrite() │─▶│  uartputc()   │
                                    │    └───────────────┘  └───────────────┘
                                    │
The act of displaying a single character sends us through several layers of abstraction, ultimately making the hardware do something.
In this lab, we’re going to learn more about system calls, create our own system calls, and build companion user space utilities that will use them.
Part I: Tracing System Calls (on Linux)
On Linux, the strace utility allows us to trace system calls on running programs. We can learn quite a bit about a program just by inspecting its system calls.
Here are a few example usages of strace:
# Trace a run of 'ls':
$ strace ls
# Trace only file-related system calls
$ strace -e trace=file ls
# Get a nice summary of unique system calls used
$ strace -c ls
# Search for a specific system call (stat in this case):
$ strace ls 2>&1 | grep '^stat'
# Note that we search the start of the string (^) because the system call's
# name comes first, followed by its parameters and return value.
When you run a command, such as strace cat, each system call will be printed interactively to your terminal. So the general workflow is: run strace on a command, which will then print a list of system calls. You can run strace on any binary file; if you compile your own C code, strace ./a.out will display the system calls being used by your code (most likely the calls are invoked by the C library, not your code directly).
For the first part of this lab, you will trace several programs and record the results on a Linux machine. You will need to write about what happens below, so you should create a docs directory inside your OS repo if it doesn’t already exist and create a new file there called syscalls.txt.
- First, run a trace on ls. Record all of the unique system calls (just their names) used by ls. To avoid doing a lot of tedious work, automate most of this with a shell pipeline or command line flag (see the man page for strace).
  (list the syscalls in your text file as a bulleted list, use ‘*’ before each system call)
- How many unique system calls are in your list?
- Next, trace several commands you already know and look for new system calls that you haven’t seen before (or look up some new commands if you’d like). List any new system calls you find as you experiment.
Command: cat /etc/passwd
New system calls:
* mprotect64
(next command goes here)
- Take a look at the system calls that your CS 326 OS supports and compare them with the Linux system calls. Which calls overlap, and which do not? Note any overlapping calls in your text file.
Part II: Adding a System Call
Now that we’ve spent some time learning about system calls, it’s time to build our own.
Here’s an overview of the process for adding a new system call. You will probably do this a few times this semester. If you want to simply walk through the process, pick an existing system call (such as uptime()) and note all the locations it appears in the files below:
In the Kernel (/kernel)
- In syscall.h, give our system call a number by adding an entry for it. Use the next available number.
- In syscall.c:
  - Add a function prototype for the system call. These are prefixed with sys_.
  - Add a mapping for it to the syscalls array. This allows us to look up a system call function by its number.
    - This might look odd, but it’s just setting up an array of function pointers.
- Provide an implementation of your system call, prefixed with sys_ as we entered above. The implementations are split across sysfile.c and sysproc.c, with the former containing anything file system related and the latter containing anything process related.
In User Space (/user)
- In usys.pl, add an entry for the name of the user-facing version of the system call, e.g., if you added sys_uptime() then the usys.pl version should be uptime.
- Finally, in user.h, add a function prototype for the user-facing system call function.
To start out, you might want to try adding a “hello world” system call that prints a greeting. You can modify it later to implement the rest of Part II.
Afterward, compile and run your OS to make sure you didn’t miss any steps.
Shutting Down
All great operating systems have a way to shut down. Except ours. Try terminating the shell (with kill or CTRL+D), and you’ll see that init simply runs a new shell instance. There’s no halt or shutdown command to be found. No reboot. No poweroff. We need to fix this so that later when you’re really frustrated with your OS you can just shut it off until it starts functioning correctly. (I’ve had limited success with this approach).
Since shutting down involves the hardware, it’s a privileged operation. On Linux and most other UNIX derivatives, it’s even restricted to the root user; try to run halt, reboot, etc. on gojira as a regular user.
The QEMU RISC-V virtual machine we’re using in class has a test interface at memory location 0x100000 that allows us to shut it down. Run your OS in QEMU and press CTRL+A, then C to get the QEMU console. Enter info mtree at the (qemu) prompt to see the memory tree:
memory-region: system
0000000000000000-ffffffffffffffff (prio 0, i/o): system
0000000000001000-000000000000ffff (prio 0, rom): riscv_virt_board.mrom
0000000000100000-0000000000100fff (prio 0, i/o): riscv.sifive.test
0000000000101000-0000000000101023 (prio 0, i/o): goldfish_rtc
0000000002000000-0000000002003fff (prio 0, i/o): riscv.aclint.swi
0000000002004000-000000000200bfff (prio 0, i/o): riscv.aclint.mtimer
0000000003000000-000000000300ffff (prio 0, i/o): gpex_ioport_window
0000000003000000-000000000300ffff (prio 0, i/o): gpex_ioport
0000000004000000-0000000005ffffff (prio 0, i/o): platform bus
If we write 0x5555 to that memory location, the VM will shut down. If we write 0x7777, it will reboot. This is an example of memory-mapped I/O (MMIO): the memory and/or registers of the devices on our virtual machine are mapped to addresses in the physical address space, alongside main memory, so the CPU can interact with them using ordinary loads and stores.
However, there is one problem here: by default, our kernel can’t access the physical address 0x100000 because it is not mapped in its page table. If you try it, your OS will crash.
We will fully explore virtual memory later in the semester. For now, you just need to edit kernel/memlayout.h and kernel/vm.c:
In memlayout.h, you will see a description of the physical memory layout of our machine. Add a new entry for the test interface:
#define VIRT_TEST 0x100000
Then, in vm.c, you need to map this into the kernel’s virtual memory. Inside kvmmake(), you’ll find the mappings. Add one:
kvmmap(kpgtbl, VIRT_TEST, VIRT_TEST, PGSIZE, PTE_R | PTE_W);
Both of these additions need to be commented to explain what you are doing, just like the existing code is!
Again, you’re not expected to fully understand virtual memory for now. We’ll pretend it’s magic and come back to it.
Finally, to make the system shut down:
volatile uint32 *test_dev = (uint32 *) VIRT_TEST;
*test_dev = 0x5555; // 0x5555 shuts down; writing 0x7777 instead reboots
If you don’t already know what volatile does, you should try to figure it out and understand why we need it in this situation.
Part III: Real Time Clock
In Part III of the lab, you’re going to do it all over again, except this time all on your own. If you look back at the info mtree output, you’ll notice an interesting device: goldfish_rtc. This is the real time clock for our virtual hardware; it reports the current time as a 64-bit UNIX timestamp split across two 32-bit registers. Your objective for Part III is to create a system call and user space program that will report the current UNIX timestamp, as provided by the goldfish_rtc device. You are essentially creating a very, very simple driver that interacts with the hardware.
To do this:
- Add your system call(s) as usual, making sure to name them something intuitive.
- Adjust memory mappings and the model of the machine’s memory layout.
- Retrieve the timestamp by making two 32-bit reads, first at the memory offset for the goldfish_rtc device, and then the second half by adding 4 bytes to the first memory address. Combine the two readings with a bitwise OR to create the final 64-bit timestamp.
  - Note: RISC-V is little endian, so you’ll do something like this:
    ((uint64) second << 32) | first;
- Add a user space utility that calls the system call and prints the current UNIX timestamp.
  - Note: pay attention to the data type you use here. You may also want to convert the raw timestamp from nanoseconds to seconds so that printf can handle it as a regular integer.
Lab Instructions
There’s a lot of information to absorb in this lab, but not a lot of code to write.
- Do the strace activity and fill out syscalls.txt
- Create a system call (or system calls) for reboot and shut down.
- Be sure to choose a name that makes sense.
- See the steps above, and make sure you follow all of them.
- Create a user space program (or programs) that uses the system call(s) to reboot or shut down.
- Create a system call to read the real time clock and a corresponding user space utility to report its value.
- Test the program(s) and make sure they behave correctly.
If you get stuck, don’t be afraid to ask for help. After you’re finished, take a break. You deserve it.
Grading and Submission
To receive 50% credit:
- Finish the strace activity.
To receive 75% credit:
- Complete all previous requirements
- Create system call(s) to handle shut down and reboot
To receive 85% credit:
- Complete all previous requirements
- Implement the user utility (or utilities) to shut down and reboot
To receive full credit for this lab:
- Complete all previous requirements
- Implement a system call and user space program to retrieve the current UNIX timestamp as reported by the hardware (Part III).
Once you are finished, check your changes into your OS repo. Then have a member of the course staff take a look at your lab to check it off.
Check off procedure:
- Show the contents of syscalls.txt
- Start your OS, reboot it with your user space utility
- Report the current UNIX timestamp (at least 2-3 times so we can see it is increasing)
- Shut down your OS with your user space utility
- Show the contents of kernel/sysproc.c