OSI Lab4: OS Protections—Exception, Privilege levels, and Memory protection

In previous labs, the OS has no protection method to prevent user programs (like helloworld and ult) from sabotaging the execution of other processes. In this lab, you will implement basic OS protections and better system calls, including:

- using ecall to implement syscalls and handling different exceptions
- running user applications at user level
- implementing physical memory protection (PMP)

You will write a limited amount of code—our reference implementation is only 80 lines. However, most of your effort will go into reading the RISC-V specifications to determine what to write. If you are unsure how to start on your first attempt, that is expected. Understanding detailed and seemingly disorganized information takes time. If you’re clueless, read the specification again the next day—you will likely have an aha moment!

Fetch lab4 from upstream repo

fetch lab4 branch
```
$ git fetch upstream lab4

<you should see:>
...
 * [new branch]      lab4       -> upstream/lab4
```
If you get error: No such remote 'upstream', run git remote add upstream git@github.com:NEU-CS6640-labs/egos-upstream.git.

create a local lab4 branch

$ git checkout --no-track -b lab4 upstream/lab4
<you should see:>
Switched to a new branch 'lab4'

confirm you’re on local branch lab4

$ git status
<you should see:>
On branch lab4
nothing to commit, working tree clean

rebase your previous labs to the current branch
```
<you're now on branch lab4>
$ git rebase lab3
```
You need to resolve any conflicts caused by the rebase. Read how to resolve conflicts here.
Note: rebase gives you a cleaner, linear commit history (compared with git merge), which is easier to understand and makes it easier to trace bugs.
push branch lab4 to remote repo origin
```
$ git push -u origin lab4
```
From now on, you should work on this lab4 branch and push to the origin’s lab4 branch to submit your work. OSI staff will grade your origin:lab4 branch. If you have any updates from previous labs that you want to keep, you should merge or rebase them into the lab4 branch.

Section 1: Syscalls and exceptions

Motivation

Your egos implementation so far has no process isolation and is vulnerable to malicious modifications from user applications. For example, crash1 is a program that allocates a large chunk of memory that exceeds the available memory. Run crash1, and you will see:

➜ /home/cs6640 % crash1
_sbrk: heap grows too large (0x8201740 + 33554456 > 0xa000000)
_sbrk: heap grows too large
[FATAL] fatal exception (pid=6) 7

This is not ideal: we should not panic the entire OS simply because one process allocates too much memory. Instead, the OS should kill the offending process, as most of today’s OSes do.

To implement OS protections, we need better exception handling instead of causing a kernel panic as it does now. As a starting point, we will: (1) implement syscalls with ecall (a RISC-V instruction) and (2) handle different exceptions accordingly.

Implementing synchronous syscalls

The current syscall implementation uses a software interrupt to trap to the kernel. See syscall workflow and read sys_invoke() in grass/syscall.c. During a system call, sys_invoke() triggers a software interrupt, which is handled by trap_handler() in grass/kernel.c. In trap_handler(), the kernel then decides to call either intr_entry() (for handling interrupts) or excp_entry() (for handling exceptions).

Currently, syscalls are handled within intr_entry() because they are triggered by software interrupts. Your next task is to replace the software interrupt with a mechanism designed for synchronous syscalls to trap to the kernel.

Exercise 1 implementing syscalls with ecall

Read ecall and mret (Ch3.3.1 and Ch3.3.2) to understand how ecall works.
In Makefile, change the line SYSCALLFUNC=SOFTTIMER to SYSCALLFUNC=ECALL
Use ecall to trap to the kernel in sys_invoke(), grass/syscall.c.
You will need to write assembly instructions; take a look at handout week04, panel 2
When finished, make and run your code; then, you should see an error message:
[FATAL] fatal exception (pid=1) 9
Hint: take a look at the trap reason table to see what exception “9” is.
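
To read the number in such a message, it helps to recall how the RISC-V privileged spec lays out the mcause register: the top bit says whether the trap is an interrupt, and the remaining bits hold the trap code. The following is a minimal sketch that assumes a 32-bit CPU (interrupt flag in bit 31). The helper names mcause_is_interrupt, mcause_code, and excp_name are hypothetical and are not part of egos.

```c
#include <stdint.h>

/* Assumption: RV32, so the interrupt flag is bit 31 of mcause. */
#define MCAUSE_INTR_BIT 0x80000000u

/* Hypothetical helper: 1 if the trap is an interrupt, 0 if it is an exception. */
static int mcause_is_interrupt(uint32_t mcause) {
    return (mcause & MCAUSE_INTR_BIT) != 0;
}

/* Hypothetical helper: the trap code with the interrupt flag stripped. */
static uint32_t mcause_code(uint32_t mcause) {
    return mcause & ~MCAUSE_INTR_BIT;
}

/* Hypothetical helper: names for a few exception codes defined by the
 * RISC-V privileged spec. The trap reason table lists the full set. */
static const char *excp_name(uint32_t code) {
    switch (code) {
    case 2:  return "illegal instruction";
    case 5:  return "load access fault";
    case 7:  return "store/AMO access fault";
    case 8:  return "ecall from U-mode";
    case 9:  return "ecall from S-mode";
    case 11: return "ecall from M-mode";
    default: return "other";
    }
}
```

For example, excp_name(mcause_code(9)) names the exception printed in the message above, while a value such as 0x80000003 decodes as an interrupt with code 3.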

Handling exceptions

You see the error message above because you don’t have an exception handler yet. Next, you will implement the exception handler in grass/kernel.c.

Exercise 2 handling exceptions

To understand the trap prologue before excp_entry(), read:
trap_entry() in earth/cpu_intr.c,
trap_handler() in grass/kernel.c,
and ctx_entry() in grass/kernel.c.
Implement excp_entry() in grass/kernel.c
Here are some hints:
There are multiple exceptions related to ecall (see the trap reason table).
You should handle them all.
How to handle syscalls?
Read intr_entry() to see how we did that before.
How to kill a process?
Read intr_entry() about killing a process when encountering ctrl+C.
Make sure you understand which “mepc” to return to with “mret”.
When finished, your kernel should boot normally and you should see:
...
➜ /home/cs6640 % crash1
_sbrk: heap grows too large (0x8201740 + 33554456 > 0xa000000)
_sbrk: heap grows too large
[INFO] process 5 killed by exception 7
...
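
The hints above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the reference implementation: struct trap_state, enum excp_action, and excp_dispatch are hypothetical names, and the real excp_entry() works on CSRs and egos data structures instead. The exception codes 8, 9, and 11 (ecall from U-, S-, and M-mode) come from the RISC-V privileged spec, as does the fact that mepc holds the address of the ecall instruction itself when the trap is taken.

```c
#include <stdint.h>

/* ecall exception codes from the RISC-V privileged spec. */
enum {
    EXCP_ECALL_U = 8,
    EXCP_ECALL_S = 9,
    EXCP_ECALL_M = 11,
};

/* Hypothetical outcome of handling one exception. */
enum excp_action { ACTION_SYSCALL, ACTION_KILL };

/* Hypothetical stand-in for the trap state saved for a process. */
struct trap_state {
    uint32_t mepc; /* address of the instruction that trapped */
    uint32_t code; /* exception code read from mcause */
};

/* Classify an exception: every ecall variant is a syscall request,
 * anything else is treated as fatal for the offending process. */
static enum excp_action excp_dispatch(struct trap_state *t) {
    switch (t->code) {
    case EXCP_ECALL_U:
    case EXCP_ECALL_S:
    case EXCP_ECALL_M:
        /* mepc points at the ecall itself. ecall is a 4-byte instruction,
         * so advance mepc to make mret resume at the next instruction
         * instead of re-executing the ecall. */
        t->mepc += 4;
        return ACTION_SYSCALL;
    default:
        return ACTION_KILL;
    }
}
```

In this model, an ecall trapped at address 0x08005000 resumes at 0x08005004, while a store access fault (code 7) leaves mepc untouched and marks the process to be killed.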

Section 2: RISC-V privilege levels

Motivation

Right now, all user apps have the privilege to do whatever the kernel could do. To see this, run:

➜ /home/cs6640 % crash2
[SUCCESS] Crash2 succeeds in running high-privileged instructions

Here, crash2 successfully runs an instruction that only the kernel should be able to execute. This is problematic.

Run user apps in U-Mode

RISC-V processors provide privilege levels and protection mechanisms to enforce isolation and security. A RISC-V CPU usually has three privilege levels:

Machine (M-Mode): the highest privilege level; earth runs in Machine mode.
Supervisor (S-Mode): the second highest privilege level; grass and system processes run in this mode.
User (U-Mode): the most limited level; user applications run in this mode.

Read Introduction to RISC-V privilege levels (Ch1). Pay attention to the encoding of each mode in “Table 1.1”: if you are familiar with x86 ring levels, note that RISC-V’s privilege level 0 is the least privileged, the opposite of x86’s ring 0.

Next, you will run user apps in U-Mode.

Exercise 3 run user apps in U-Mode

Read mstatus.MPP (Ch3.1.6.1) to understand how mret changes the privilege levels.
Fill in ctx_entry() in grass/kernel.c
Fill in proc_yield() in grass/scheduler.c
After you finish, you should see:
➜ /home/cs6640 % crash2
[INFO] process 5 killed by exception 2
...
➜ /home/cs6640 % crash2 yield
[INFO] process 5 killed by exception 2
...
Both crash2 and crash2 yield should be killed.
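
As a reminder of the mechanics, mret switches the CPU to the privilege level stored in mstatus.MPP, which occupies bits 12:11 of mstatus. The sketch below shows only the bit manipulation, using the encodings from the RISC-V privileged spec (U = 0, S = 1, M = 3). The helper names mstatus_set_mpp and mstatus_get_mpp are hypothetical, and the code operates on a plain integer instead of the real CSR, so deciding where and with which level to use it in ctx_entry() and proc_yield() is still up to you.

```c
#include <stdint.h>

/* Privilege-level encodings from the RISC-V privileged spec. */
#define PRIV_U 0u
#define PRIV_S 1u
#define PRIV_M 3u

/* mstatus.MPP occupies bits 12:11. */
#define MSTATUS_MPP_SHIFT 11
#define MSTATUS_MPP_MASK  (3u << MSTATUS_MPP_SHIFT)

/* Hypothetical helper: return mstatus with MPP replaced by priv,
 * leaving every other bit unchanged. */
static uint32_t mstatus_set_mpp(uint32_t mstatus, uint32_t priv) {
    return (mstatus & ~MSTATUS_MPP_MASK) |
           ((priv << MSTATUS_MPP_SHIFT) & MSTATUS_MPP_MASK);
}

/* Hypothetical helper: extract the MPP field from mstatus. */
static uint32_t mstatus_get_mpp(uint32_t mstatus) {
    return (mstatus & MSTATUS_MPP_MASK) >> MSTATUS_MPP_SHIFT;
}
```

For instance, mstatus_set_mpp(0x1888, PRIV_U) clears bits 12:11 and yields 0x0088, so a following mret would drop to U-Mode in this model.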

Section 3: Physical memory protection

Motivation

Once user apps run in U-Mode, user-level programs can no longer execute high-privileged instructions. However, they can still touch all memory, including critical kernel data structures such as the PCB array. To see this, run crash3:

➜ /home/cs6640 % crash3
[SUCCESS] Crash3 succeeds in modifying earth's code
[SUCCESS] Crash3 succeeds in modifying disk contents
[SUCCESS] Crash3 succeeds in corrupting the memory of other processes

You will see that crash3 can modify critical memory.

To address this problem, we need to protect memory from invalid accesses. There are multiple ways to do this. In this lab, you will use a hardware feature provided by RISC-V CPUs, named Physical Memory Protection or PMP. Read physical memory protection (Ch3.7) to understand how PMP works.

Exercise 4 protect memory with PMP

Implement PMP in pmp_init() of earth/cpu_mmu.c
Note that in pmp_init, regions such as “[0x00000000, 0x20000000)” are “start inclusive to end exclusive” (including 0x00000000 and excluding 0x20000000).
After you finish, you should see:
➜ /home/cs6640 % crash3 disk
[INFO] process 5 killed by exception 7
...
➜ /home/cs6640 % crash3 earth
[INFO] process 5 killed by exception 7
Both crash3 disk and crash3 earth should be properly killed.
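
To make the "start inclusive to end exclusive" convention concrete, here is a small sketch of how the RISC-V privileged spec encodes PMP regions on a 32-bit CPU. The bit values of pmpcfg (R, W, X, and the A field) and the TOR and NAPOT address encodings come from the spec. The helper names pmp_tor_addr, pmp_napot_addr, and pmp_tor_match are hypothetical, and pmp_tor_match is a simplified word-granularity model of the hardware check, not egos code. Which regions to configure and with which permissions is left to the exercise.

```c
#include <stdint.h>

/* pmpcfg bits from the RISC-V privileged spec. */
#define PMP_R     0x01u /* read permission */
#define PMP_W     0x02u /* write permission */
#define PMP_X     0x04u /* execute permission */
#define PMP_TOR   0x08u /* A field (bits 4:3) = 1: top of range */
#define PMP_NAPOT 0x18u /* A field (bits 4:3) = 3: naturally aligned power of two */

/* TOR: pmpaddr holds the exclusive end address shifted right by 2.
 * With entry 0, the region starts at address 0. */
static uint32_t pmp_tor_addr(uint32_t end) {
    return end >> 2;
}

/* NAPOT: size must be a power of two >= 8 and base must be aligned to size.
 * The low bits of pmpaddr encode the region size as a run of ones. */
static uint32_t pmp_napot_addr(uint32_t base, uint32_t size) {
    return (base >> 2) | ((size >> 3) - 1);
}

/* Simplified model of a TOR match: prev_pmpaddr is the preceding pmpaddr
 * register (0 for entry 0). The start is inclusive, the end is exclusive. */
static int pmp_tor_match(uint32_t prev_pmpaddr, uint32_t pmpaddr, uint32_t addr) {
    uint32_t word = addr >> 2;
    return word >= prev_pmpaddr && word < pmpaddr;
}
```

With these helpers, the region [0x00000000, 0x20000000) in TOR mode corresponds to pmp_tor_addr(0x20000000), which is 0x08000000. Address 0x1FFFFFFF matches that entry and address 0x20000000 does not.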

Exercise 5 protecting other processes’ memory

Read TODO in apps/user/crash3.c
Read questions in slack/lab4.txt and write down your answers.

Finally, submit your work

Submitting consists of three steps:

Executing this checklist:
- Fill in ~/osi/egos/slack/lab4.txt.
- Make sure that your code builds with no warnings.

Push your code to GitHub:

 $ cd ~/osi/egos/
 $ git commit -am 'submit lab4'
 $ git push origin lab4

 Counting objects: ...
  ....
  To github.com/NEU-CS6640-labs/egos-<YOUR_ID>.git
     c1c38e6..59c0c6e  lab4 -> lab4

Actually submit your lab (with timestamp and git commit id):
1. Get the git commit id of your work. A commit id is a 40-character hexadecimal string. You can obtain the commit id for the last commit by running the command git log -1 --format=oneline.
2. Submit a file named git.txt to Canvas. (there will be an assignment for this lab on Canvas.) The file git.txt contains two lines: the first line is your github repo url; the second line is the git commit id that you want us to grade. Here is an example:
```
 git@github.com:NEU-CS6640-labs/egos-<YOUR_ID>.git
 29dfdadeadbeefe33421f242b5dd8312732fd3c9
```
  Notice: the repo address must start with git@github.com:... (not https://...). You can get your repo address on the GitHub repo page by clicking the green “Code” button and then choosing “SSH”.
3. Note: You can submit as many times as you want; we will grade the last commit id submitted to Canvas. Also, you can submit any commit id in your pushed git history; again, we will grade the commit id submitted to Canvas.
  Notice: if you submit multiple times, the file name (git.txt) changes to git-N.txt where N is an integer and represents how many times you’ve submitted. We will grade the file with the largest N.

NOTE: Ground truth is what and when you submitted to Canvas.

A non-existent repo address or a non-existent commit id in Canvas means that you have not submitted the lab, regardless of what you have pushed to GitHub—we will not grade it. So, please double check your submitted repo and commit id!
The time of your submission for the purposes of tracking lateness is the timestamp on Canvas, not the timestamp on GitHub.

This completes the lab.