CS6640 Lab5: Virtual Memory
Virtual memory is an essential building block for modern OSes. It offers better programmability, provides memory isolation and protection, and enables more effective use of memory. To implement virtual memory, RISC-V CPUs (and other modern CPUs) use paging plus page tables (i.e., radix trees).
In this lab, you will add virtual memory support to your egos. In particular, you will implement the following functions in earth/cpu_vm.c:
- page_table_map: mapping a virtual address to a physical address for a process
- page_table_switch: switching from one process to another process
- page_table_translate: translating a virtual address to a physical address for a process
- page_table_free: freeing pages in the page table of a process
You will also help syscalls copy messages across different address spaces.
Fetch lab5 from upstream repo
- fetch lab5 branch
  $ git fetch upstream lab5
  <you should see:>
  ...
   * [new branch] lab5 -> upstream/lab5
- create a local lab5 branch
  $ git checkout --no-track -b lab5 upstream/lab5
  <you should see:>
  Switched to a new branch 'lab5'
- confirm you’re on local branch lab5
  $ git status
  <you should see:>
  On branch lab5
  nothing to commit, working tree clean
- rebase your previous labs to the current branch
  <you're now on branch lab5>
  $ git rebase lab4
  You need to resolve any conflict caused by the rebase. Read how to resolve conflict here.
- push branch lab5 to remote repo origin
  $ git push -u origin lab5
check: after rebase/merge, your initial code should have the following:
- use ecall to implement syscalls
- use U-Mode for user applications (privilege level switching in grass/kernel.c and grass/scheduler.c)
Your egos should be able to run. Use ls and echo to confirm everything works fine.
Understanding virtual memory
Virtual memory requires software (OS) and hardware (MMU) co-design. The OS sets up the data structures (page tables) that the hardware reads, and the hardware performs the translation and protection under the hood.
Exercise 0 Read document and turn on virtual memory
- Read satp Register (Ch4.1.11) and Page-Based 32-bit Virtual-Memory Systems (Ch4.3)
- Turn on virtual memory: in Makefile, change the line IFVM=VMOFF to IFVM=VMON.
- Make and run egos. You will see a fatal error:
  [CRITICAL] ------------- Booting -------------
  [SUCCESS] Finished initializing the tty device
  ...
  [CRITICAL] Enter the grass layer
  [INFO] Load kernel process #1: sys_proc
  [INFO] App file size: 0x00001410 bytes
  [INFO] App memory size: 0x00001820 bytes
  [FATAL] page_table_map is not implemented.
Next, you will implement page_table_map.
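As a rough illustration of the Sv32 layout described in Ch4.3, the sketch below splits a 32-bit virtual address into its two 10-bit page-table indices and its 12-bit page offset. The helper names (vpn1, vpn0, page_offset, va_join) are made up for this example and are not part of the egos skeleton.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                     /* 4 KiB pages */
#define VPN_BITS   10u                     /* 1024 entries per page table */
#define VPN_MASK   ((1u << VPN_BITS) - 1u)
#define OFF_MASK   ((1u << PAGE_SHIFT) - 1u)

/* Index into the root (level-1) page table: VA bits 31..22. */
static inline uint32_t vpn1(uint32_t va) { return (va >> 22) & VPN_MASK; }

/* Index into the leaf (level-0) page table: VA bits 21..12. */
static inline uint32_t vpn0(uint32_t va) { return (va >> PAGE_SHIFT) & VPN_MASK; }

/* Byte offset inside the 4 KiB page: VA bits 11..0. */
static inline uint32_t page_offset(uint32_t va) { return va & OFF_MASK; }

/* Rebuild an address from its three parts; handy as a sanity check. */
static inline uint32_t va_join(uint32_t i1, uint32_t i0, uint32_t off) {
    assert(i1 <= VPN_MASK && i0 <= VPN_MASK && off <= OFF_MASK);
    return (i1 << 22) | (i0 << PAGE_SHIFT) | off;
}
```

For example, the address 0x800020a0 splits into root index 512, leaf index 2, and offset 0xa0.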
Creating page tables
You will implement virtual memory in the file earth/cpu_vm.c. The skeleton code in cpu_vm.c has:
- pid_to_pagetable_base[] maintains the page table root for each process.
- pmalloc and pfree are used to allocate and free pages in M-mode. (all addresses are physical)
- fence() blocks until all updates to the page tables have settled.
- setup_identity_region is a helper function that sets virtual addresses to be the same as physical addresses for a contiguous memory region.
Exercise 1 Creating and walking page tables
- Kernel invokes earth->mmu_map (page_table_map) to map a page to a process’s address space. Read library/elf/elf.c:load_app() to see:
  - how earth->mmu_map (page_table_map) is used
  - which virtual addresses are mapped for processes
- In earth/cpu_vm.c, read the comments and implement page_table_map and walk.
- When finished, make and run. You should see:
  ...
  [CRITICAL] Enter the grass layer
  [INFO] Load kernel process #1: sys_proc
  [INFO] App file size: 0x00001410 bytes
  [INFO] App memory size: 0x00001820 bytes
  [SUCCESS] Successfully load process #1: sys_proc
  [FATAL] page_table_switch is not implemented.
  - The sys_proc should be able to load successfully.
  - There is a fatal error about page_table_switch.
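The following is a minimal host-side sketch of a two-level walk and map, assuming the Sv32 PTE layout (PPN in bits 31..10, flags in the low bits). It is not the skeleton's code: sim_mem, sim_alloc, sim_page, sketch_walk, and sketch_map are hypothetical names, and the simulated memory stands in for the pages that pmalloc would return in egos.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define PTE_V 0x1u
#define PTE_R 0x2u
#define PTE_W 0x4u
#define PTE_X 0x8u

/* Hypothetical stand-in for physical memory so the sketch runs on a host. */
#define SIM_BASE  0x80000000u
#define SIM_PAGES 16u
static uint32_t sim_mem[SIM_PAGES][PAGE_SIZE / 4];
static uint32_t sim_next;

/* Hypothetical allocator: returns the "physical" address of a zeroed page. */
static uint32_t sim_alloc(void) {
    assert(sim_next < SIM_PAGES);
    memset(sim_mem[sim_next], 0, PAGE_SIZE);
    return SIM_BASE + PAGE_SIZE * sim_next++;
}

/* Convert a simulated physical page address into a host pointer. */
static uint32_t *sim_page(uint32_t pa) {
    assert(pa >= SIM_BASE && (pa - SIM_BASE) / PAGE_SIZE < SIM_PAGES);
    return sim_mem[(pa - SIM_BASE) / PAGE_SIZE];
}

/* Return a pointer to the leaf PTE for va. If the leaf table does not exist,
 * create it when alloc is nonzero, otherwise return NULL. */
static uint32_t *sketch_walk(uint32_t root_pa, uint32_t va, int alloc) {
    uint32_t *l1 = &sim_page(root_pa)[(va >> 22) & 0x3FFu];
    if (!(*l1 & PTE_V)) {
        if (!alloc) return NULL;
        /* Non-leaf PTE: V set, R/W/X clear, PPN points at the leaf table. */
        *l1 = ((sim_alloc() >> 12) << 10) | PTE_V;
    }
    uint32_t leaf_pa = (*l1 >> 10) << 12;   /* fits in 32 bits in this sketch */
    return &sim_page(leaf_pa)[(va >> 12) & 0x3FFu];
}

/* Map one page: write a valid leaf PTE carrying pa's page number and perm. */
static void sketch_map(uint32_t root_pa, uint32_t va, uint32_t pa, uint32_t perm) {
    assert(va % PAGE_SIZE == 0 && pa % PAGE_SIZE == 0);
    uint32_t *pte = sketch_walk(root_pa, va, 1);
    *pte = ((pa >> 12) << 10) | perm | PTE_V;
}
```

Keeping the walk separate from the map lets the same helper serve later lookups (with alloc set to 0) without creating tables as a side effect.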
Switching address spaces
After implementing page_table_map, the kernel (grass) should be able to build the page table for the first process sys_proc, load the program (elf.c:load_app), and then switch to run sys_proc (in grass.c). However, the final step of switching will fail because page_table_switch has not been implemented.
Exercise 2 Switching address spaces
- In cpu_vm.c, read and implement page_table_switch.
- Make and run. You should see:
  ...
  [SUCCESS] Successfully load process #1: sys_proc
  [SUCCESS] Enter kernel process GPID_PROCESS
  [INFO] Load kernel process #2: sys_file
  [INFO] App file size: 0x000027c0 bytes
  [INFO] App memory size: 0x00002814 bytes
  [FATAL] proc_syscall: got unknown syscall type=0
  - The boot process should load the sys_file.
  - The error is about an unknown syscall type (due to empty syscall args).
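Switching address spaces on Sv32 comes down to pointing satp at a different root page table. The sketch below composes such a satp value from the field layout in Ch4.1.11 (MODE in bit 31, ASID in bits 30..22, PPN in bits 21..0). The names make_satp and install_satp are hypothetical, and whether the lab uses a nonzero ASID is an assumption this example does not settle.

```c
#include <assert.h>
#include <stdint.h>

#define SATP_MODE_SV32  (1u << 31)   /* MODE field: 1 selects Sv32 */
#define SATP_ASID_SHIFT 22u          /* ASID occupies bits 30..22 */
#define SATP_PPN_MASK   0x3FFFFFu    /* PPN occupies bits 21..0 */

/* Compose the satp value that points the MMU at a root page table. */
static uint32_t make_satp(uint32_t root_pa, uint32_t asid) {
    assert((root_pa & 0xFFFu) == 0);  /* root table must be page aligned */
    assert(asid < (1u << 9));         /* Sv32 ASID is at most 9 bits wide */
    return SATP_MODE_SV32 | (asid << SATP_ASID_SHIFT)
         | ((root_pa >> 12) & SATP_PPN_MASK);
}

/* Install the value. This only does real work when compiled for RISC-V;
 * on a host build it is a no-op so the sketch still compiles and runs. */
static void install_satp(uint32_t satp) {
#ifdef __riscv
    __asm__ volatile("csrw satp, %0" :: "r"(satp));
    /* Flush stale translations after changing the address space. */
    __asm__ volatile("sfence.vma zero, zero" ::: "memory");
#else
    (void)satp;
#endif
}
```

With a root table at 0x80001000 and ASID 0, make_satp yields 0x80080001.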
Copying messages across address spaces
After implementing page_table_switch, the first system app sys_proc should be able to run: it will run its main function, spawn the next system app sys_file, and then make the syscall sys_recv(...). This syscall, however, fails due to unknown syscall type=0.
The reason for the unknown syscall is that sys_proc copies its syscall args to a well-known memory location, SYSCALL_ARG, in its own address space. After trapping into the kernel (after ecall), the kernel runs in M-mode, which sees physical memory. Therefore, the kernel needs to copy the syscall args from sys_proc’s SYSCALL_ARG to the kernel’s SYSCALL_ARG.
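To make the copy concrete, here is a host-side sketch of translating a process's virtual address and copying bytes out of its address space one page at a time. It assumes a two-level Sv32 table held in simulated memory; sim_mem, sim_ptr, sketch_translate, and sketch_copy_in are hypothetical names, and returning 0 for an unmapped address is a convention chosen for this example only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define PTE_V 0x1u
#define SIM_BASE  0x80000000u
#define SIM_PAGES 8u

/* Hypothetical simulated physical memory, one row per 4 KiB page. */
static uint32_t sim_mem[SIM_PAGES][PAGE_SIZE / 4];

/* Convert a simulated physical address into a host pointer. */
static uint8_t *sim_ptr(uint32_t pa) {
    assert(pa >= SIM_BASE && pa - SIM_BASE < SIM_PAGES * PAGE_SIZE);
    return (uint8_t *)sim_mem + (pa - SIM_BASE);
}

/* Walk the two-level table; return 0 when va is unmapped. */
static uint32_t sketch_translate(uint32_t root_pa, uint32_t va) {
    uint32_t *root = (uint32_t *)sim_ptr(root_pa);
    uint32_t l1 = root[(va >> 22) & 0x3FFu];
    if (!(l1 & PTE_V)) return 0;
    uint32_t *leaf = (uint32_t *)sim_ptr((l1 >> 10) << 12);
    uint32_t l0 = leaf[(va >> 12) & 0x3FFu];
    if (!(l0 & PTE_V)) return 0;
    return ((l0 >> 10) << 12) | (va & 0xFFFu);
}

/* Copy len bytes from a process's virtual address into a kernel buffer.
 * Adjacent virtual pages may map to unrelated physical pages, so the
 * address is translated again at every page boundary.
 * Returns 0 on success, -1 if any page is unmapped. */
static int sketch_copy_in(uint32_t root_pa, void *dst, uint32_t src_va, uint32_t len) {
    uint8_t *out = dst;
    while (len > 0) {
        uint32_t pa = sketch_translate(root_pa, src_va);
        if (pa == 0) return -1;
        uint32_t chunk = PAGE_SIZE - (src_va & 0xFFFu);  /* bytes left in page */
        if (chunk > len) chunk = len;
        memcpy(out, sim_ptr(pa), chunk);
        out += chunk;
        src_va += chunk;
        len -= chunk;
    }
    return 0;
}
```

If a message is known to fit inside a single page, one translation is enough; the loop only matters for buffers that straddle a page boundary.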
Exercise 3 Addressing syscalls and privilege levels
- handling syscalls
  - In cpu_vm.c, implement page_table_translate, which translates virtual addresses of a process to physical addresses.
  - In grass/syscall.c, search for VMON to see where the kernel needs to copy messages.
  - In grass/syscall.c, implement msgcpy to enable the kernel to copy messages across address spaces.
  - Make and run. You should see:
    ...
    [CRITICAL] Welcome to the egos-2k+ shell!
    [INFO] process 5 killed by exception 13, mtval=0x800020a0
    [INFO] process 5 killed by exception 13, mtval=0x80002094
    ... // repeating
  - The booting process finishes creating all system processes (shell prints welcome). The user apps, however, keep failing.
- handling privilege levels
  - The failure is due to permission violations on page table entries (PTEs) for U-Mode processes.
  - To fix this, fill in TODO in setup_identity_region in earth/cpu_vm.c.
  - When finished, make and run. You should see:
    ...
    [CRITICAL] Welcome to the egos-2k+ shell!
    [FATAL] page_table_free is not implemented.
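Exception 13 in the log above is the RISC-V load page fault. The sketch below is a deliberately simplified model of why a U-mode access can fault even when a PTE is valid: the U bit must also be set. It ignores the A/D bits and the status-register modifiers, and umode_check plus the enum names are hypothetical, not part of egos.

```c
#include <assert.h>
#include <stdint.h>

/* Sv32 PTE flag bits. */
#define PTE_V 0x01u   /* valid */
#define PTE_R 0x02u   /* readable */
#define PTE_W 0x04u   /* writable */
#define PTE_X 0x08u   /* executable */
#define PTE_U 0x10u   /* accessible from U-mode */

enum access { ACC_LOAD = 0, ACC_STORE = 1, ACC_FETCH = 2 };

/* Page-fault exception codes from the RISC-V privileged specification. */
enum { FAULT_NONE = 0, FAULT_FETCH = 12, FAULT_LOAD = 13, FAULT_STORE = 15 };

/* Simplified permission check for a U-mode access that hits a leaf PTE.
 * Returns the exception code the access would raise, or FAULT_NONE. */
static int umode_check(uint32_t pte, enum access acc) {
    static const int      fault[] = { FAULT_LOAD, FAULT_STORE, FAULT_FETCH };
    static const uint32_t need[]  = { PTE_R, PTE_W, PTE_X };
    assert(acc == ACC_LOAD || acc == ACC_STORE || acc == ACC_FETCH);
    if (!(pte & PTE_V) || !(pte & PTE_U) || !(pte & need[acc]))
        return fault[acc];
    return FAULT_NONE;
}
```

In this model, a PTE with V, R, W, and X set but U clear makes every U-mode load return code 13, which matches the symptom in the log; adding PTE_U clears the fault.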
Free page tables
Finally, when a user app exits, the kernel needs to free all its memory. A process’s memory can be found by traversing its page table from the root. You will need to implement page_table_free to free all allocated pages (data pages), as well as the page table itself.
Exercise 4 Free page tables
- In cpu_vm.c, implement page_table_free.
- When finished, make and run. You should see the shell prompt.
- To test for memory leaks, try running a lot of commands, for example: (echo;echo;echo;echo;echo;echo;echo;echo;echo;echo) to see if you run out of memory (you shouldn’t).
- You can tune the memory pressure by changing REMOVED_PAGES in earth/page_alloc.c. The larger REMOVED_PAGES is, the higher the memory pressure and the more likely you are to see a memory leak.
Finally, submit your work
Submitting consists of three steps:
- Executing this checklist:
  - Fill in ~/cs6640/egos/slack/lab5.txt.
  - Make sure that your code builds with no warnings.
- Push your code to GitHub:
  $ cd ~/cs6640/egos/
  $ git commit -am 'submit lab5'
  $ git push origin lab5
  Counting objects: ...
  ....
  To github.com/NEU-CS6640-labs/egos-<YOUR_ID>.git
  c1c38e6..59c0c6e lab5 -> lab5
- Actually commit your lab (with timestamp and git commit id):
  - Get the git commit id of your work. A commit id is a 40-character hexadecimal string. You can obtain the commit id for the last commit by running the command git log -1 --format=oneline.
  - Submit a file named git.txt to Canvas. (there will be an assignment for this lab on Canvas.) The file git.txt contains two lines: the first line is your github repo url; the second line is the git commit id that you want us to grade. Here is an example:
    git@github.com:NEU-CS6640-labs/egos-<YOUR_ID>.git
    29dfdadeadbeefe33421f242b5dd8312732fd3c9
    Notice: the repo address must start with git@github.com:... (not https://...). You can get your repo address on GitHub repo page by clicking the green “Code” button, then choose “SSH”.
  - Note: You can submit as many times as you want; we will grade the last commit id submitted to Canvas. Also, you can submit any commit id in your pushed git history; again, we will grade the commit id submitted to Canvas.
    Notice: if you submit multiple times, the file name (git.txt) changes to git-N.txt where N is an integer and represents how many times you’ve submitted. We will grade the file with the largest N.
NOTE: Ground truth is what and when you submitted to Canvas.
A non-existent repo address or a non-existent commit id in Canvas means that you have not submitted the lab, regardless of what you have pushed to GitHub—we will not grade it. So, please double check your submitted repo and commit id!
The time of your submission for the purposes of tracking lateness is the timestamp on Canvas, not the timestamp on GitHub.
This completes the lab.