OSI Lab4: Syscalls and Protection

In previous labs, the OS had no mechanism to prevent user programs (like helloworld and ult) from sabotaging the execution of other processes. In this lab, you will implement better system calls and basic OS protections, including:

- implementing a new syscall sys_sleep
- using ecall to implement syscalls and handle different exceptions
- running user applications in user-level
- implementing physical memory protection (PMP)

You will implement only a small amount of code. Most of your effort should focus on reading the RISC-V specification and determining the precise behavior required. If you feel uncertain during your initial attempt, that is normal. Understanding detailed and seemingly disorganized information takes time. If you’re clueless, read the specification again the next day—you will likely have an aha moment!

Fetch lab4 from upstream repo

fetch lab4 branch
```
$ git fetch upstream lab4

<you should see:>
...
 * [new branch]      lab4       -> upstream/lab4
```
If you get error: No such remote 'upstream', run git remote add upstream git@github.com:NEU-CS6640-labs/egos-upstream.git.

create a local lab4 branch

$ git checkout --no-track -b lab4 upstream/lab4
<you should see:>
Switched to a new branch 'lab4'

confirm you’re on local branch lab4

$ git status
<you should see:>
On branch lab4
nothing to commit, working tree clean

rebase your previous labs to the current branch
```
<you're now on branch lab4>
$ git rebase lab3
```
You need to resolve any conflicts caused by the rebase. Read how to resolve conflicts here.
Note: rebase gives you a cleaner, linear commit history (compared with git merge), which is easier to understand and makes it easier to trace bugs.
push branch lab4 to remote repo origin
```
$ git push -u origin lab4
```
In the following, you should work on this lab4 branch, and push to the origin’s lab4 branch to submit your work. OSI staff will grade your origin:lab4 branch. If you have any updates you want to keep from previous labs, you should merge/rebase to the lab4 branch.

Section 1: Implementing syscalls

System calls (i.e., syscalls) are the interface between OS kernels and applications. In egos-2k+, there are only two of them implemented so far: sys_send and sys_recv.

Next, you will allow processes to sleep. Currently, if you run loop sleep, you will see:

➜ /home/cs6640 % loop sleep
Sleep for 5 QUANTUM...
[FATAL] sys_sleep is not implemented

Once sleeping is implemented, loop sleep should run to completion.

Exercise 1 Implement sleep

Read the syscall workflow to understand how syscall works.
Implement a new syscall sys_sleep by reading library/syscall/syscall.h, library/syscall/syscall.c and modifying grass/ipc.c, and grass/scheduler.c.
The semantics of sys_sleep(5) is to sleep for at least 5 QUANTUM, after which the process will eventually be woken up. You don’t have to wake the process exactly after 5 QUANTUM (which is hard in the current implementation).
When you finished, you should see:
<Makefile: SCHEDULER=NAIVE>
➜ /home/cs6640 % loop sleep
Sleep for 5 QUANTUM...
<wait a while, then see>
                   ...and waked up
(Optional) Extend your MLFQ implementation to support sleeping processes.
Introduce a dedicated sleep queue to track processes that block.
The kernel must not panic (e.g., FATAL) when no runnable processes exist but sleeping processes remain. Instead, the scheduler should block and resume execution once a sleeping process wakes.
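
The bookkeeping that sleeping implies can be sketched on the host as follows. This is a simplified model under stated assumptions, not the egos-2k+ implementation: the process table, the wake_at field, the ticks counter, and all function names are hypothetical, and the real changes belong in grass/ipc.c and grass/scheduler.c.

```c
#include <assert.h>
#include <stdint.h>

#define NPROC 4   /* hypothetical table size, for illustration only */

enum state { UNUSED, RUNNABLE, SLEEPING };

struct proc {
    enum state st;
    uint64_t   wake_at;   /* tick at which the process may run again */
};

static struct proc table[NPROC];
static uint64_t    ticks;  /* assumed to advance once per timer QUANTUM */

/* Put process `pid` to sleep for at least `quanta` ticks. */
void proc_sleep(int pid, uint64_t quanta) {
    assert(pid >= 0 && pid < NPROC);
    table[pid].st = SLEEPING;
    table[pid].wake_at = ticks + quanta;
}

/* Called on every timer tick: wake sleepers whose deadline has passed. */
void on_tick(void) {
    ticks++;
    for (int i = 0; i < NPROC; i++)
        if (table[i].st == SLEEPING && ticks >= table[i].wake_at)
            table[i].st = RUNNABLE;
}

/* Round-robin pick starting after `cur`; returns -1 if nothing is runnable. */
int pick_next(int cur) {
    for (int k = 1; k <= NPROC; k++) {
        int i = (cur + k) % NPROC;
        if (table[i].st == RUNNABLE) return i;
    }
    return -1;
}
```

A return value of -1 from pick_next() corresponds to the case where only sleeping processes remain; in this model the caller would wait for the next tick and try again rather than treat it as fatal.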

Implementing synchronous syscalls

The current syscall implementation uses a software interrupt to trap to the kernel. Again, see the syscall workflow and read trap() in library/syscall/syscall.c. During a system call, trap() triggers a software interrupt, which is handled by kernel_entry() in grass/kernel.c. In kernel_entry(), the kernel then calls either intr_entry() (for handling interrupts) or excp_entry() (for handling exceptions).

Currently, syscalls are handled within intr_entry() because they are triggered by software interrupts. Your next task is to replace the software interrupt with a mechanism designed for synchronous traps to the kernel.

Exercise 2 implementing syscalls with ecall

Read ecall and mret (Ch3.3.1 and Ch3.3.2) to understand how ecall works.
In Makefile, change the line SYSCALLFUNC=SOFTINT to SYSCALLFUNC=ECALL
Use ecall to trap to kernel in trap(), library/syscall/syscall.c.
You will need to write assembly instructions; take a look at handout week04b, panel 1
When finished, make and run your code; then, you should see an error message:
[FATAL] excp_entry: kernel got exception 11
Hint: take a look at the trap reason table to see what exception 11 is.
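
As a reading aid for the trap reason table, here is a minimal host-side sketch of how an mcause value splits into an interrupt flag and a cause code. It assumes a 32-bit mcause (RV32); the function names are illustrative and do not come from egos-2k+. The exception names follow the mcause table in the RISC-V privileged specification.

```c
#include <assert.h>
#include <stdint.h>

#define MCAUSE_INTR_BIT 0x80000000u  /* top bit of a 32-bit mcause (RV32 assumed) */

/* Returns 1 when mcause reports an interrupt, 0 when it reports an exception. */
int mcause_is_interrupt(uint32_t mcause) {
    return (mcause & MCAUSE_INTR_BIT) != 0;
}

/* Strips the interrupt flag, leaving only the cause code. */
uint32_t mcause_code(uint32_t mcause) {
    return mcause & ~MCAUSE_INTR_BIT;
}

/* Maps an exception's mcause to its name; reserved codes yield "reserved". */
const char *excp_name(uint32_t mcause) {
    static const char *const names[16] = {
        [0]  = "instruction address misaligned",
        [1]  = "instruction access fault",
        [2]  = "illegal instruction",
        [3]  = "breakpoint",
        [4]  = "load address misaligned",
        [5]  = "load access fault",
        [6]  = "store/AMO address misaligned",
        [7]  = "store/AMO access fault",
        [8]  = "environment call from U-mode",
        [9]  = "environment call from S-mode",
        [11] = "environment call from M-mode",
        [12] = "instruction page fault",
        [13] = "load page fault",
        [15] = "store/AMO page fault",
    };
    assert(!mcause_is_interrupt(mcause));  /* this table covers exceptions only */
    uint32_t code = mcause_code(mcause);
    if (code >= 16 || names[code] == 0) return "reserved";
    return names[code];
}
```

Calling excp_name(11) returns the entry for an environment call from M-mode, which is consistent with user apps still running in Machine mode at this point in the lab.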

Section 2: Handling exceptions

Motivation

Your egos-2k+ implementation so far has no proper handling of exceptions. For example, crash1 is a program that allocates a large chunk of memory that exceeds the available memory. Run crash1, and you will see:

<in Makefile, set SYSCALLFUNC=SOFTINT>
➜ /home/cs6640 % crash1
_sbrk: heap grows too large
[FATAL] excp_entry: kernel got exception 7

This is not ideal: the OS should not panic just because one process allocates too much memory. Instead, the OS should kill the offending process, as most of today’s OSes do. To implement OS protections, we need exception handling that does better than the current kernel panic.

Handling exceptions

The reason why you see an error message above is that you don’t have an exception handler yet. Next, you will implement the exception handler in grass/kernel.c.

Exercise 3 handling exceptions

To understand the trap prologue before excp_entry(), read:
trap_entry() in earth/cpu_intr.c,
trap_entry defined in grass/kernel.s,
kernel_entry() defined in grass/kernel.c,
Implement excp_entry() in grass/kernel.c, including
in Makefile, set SYSCALLFUNC=ECALL (use your ecall implementation)
handle syscall in your exception handler (in grass/kernel.c)
kill a user process when encountering exceptions other than ecall
Here are some hints:
there are multiple exceptions about ecall (see the trap reason table).
You should capture them all.
How to handle syscalls?
Read intr_entry() to see how we did that before.
How to kill a process?
Read intr_entry() about killing a process when encountering ctrl+C.
Make sure you understand which “mepc” to return to with “mret”.
When finished, your kernel should boot normally and you should see:
...
➜ /home/cs6640 % crash1
_sbrk: heap grows too large
[INFO] process 5 killed by exception 7
...
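
The hints above can be sketched as a small host-side model of the decision an exception handler makes. The trap_frame struct, the action enum, and the function names are hypothetical; the sketch relies only on two facts from the RISC-V privileged specification: exception codes 8, 9, and 11 are environment calls, and on an ecall mepc holds the address of the ecall instruction itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the CSRs the handler cares about. */
struct trap_frame {
    uint32_t mcause;
    uint32_t mepc;
};

enum action { DID_SYSCALL, DID_KILL };

/* Exception codes 8, 9 and 11 are ecall from U-, S- and M-mode. */
int is_ecall(uint32_t code) {
    return code == 8 || code == 9 || code == 11;
}

/* Models the decision only; a real handler would serve the syscall or
 * tear down the process where the comments indicate. */
enum action excp_dispatch(struct trap_frame *tf) {
    assert((tf->mcause >> 31) == 0);   /* interrupts are handled elsewhere */
    if (is_ecall(tf->mcause)) {
        /* ... serve the syscall here ... */
        tf->mepc += 4;   /* ecall is a 4-byte instruction; resume after it */
        return DID_SYSCALL;
    }
    /* ... free the faulting process and schedule another one ... */
    return DID_KILL;
}
```

Without the mepc adjustment, mret would return to the same ecall and trap again. How the saved mepc interacts with context switching in egos-2k+ is not modeled here.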

Section 3: RISC-V privilege levels

Motivation

Right now, all user apps have the privilege to do whatever the kernel could do. To see this, run:

➜ /home/cs6640 % crash2
[SUCCESS] Crash2 succeeds in running a high-privileged instruction

Here, crash2 successfully runs an instruction that only the kernel should be able to execute. This is problematic.

Run user apps in U-Mode

RISC-V processors provide privilege levels and protection mechanisms to enforce isolation and security. A RISC-V CPU usually has three privilege levels:

Machine (M-Mode): the highest privilege level; egos kernel runs in Machine mode.
Supervisor (S-Mode): the second highest privilege level; egos does not use this level.
User (U-Mode): the most limited level; user applications run in this mode.

Read Introduction to RISC-V privilege levels (Ch1). Pay attention to the encoding of each mode in “Table 1.1”: if you are familiar with x86 ring levels, note that RISC-V’s privilege level 0 is the least privileged, the opposite of x86’s ring 0.

Next, you will run user apps in U-Mode.

Exercise 4 run user apps in U-Mode

Read mstatus.MPP (Ch3.1.6.1) to understand how mret changes the privilege levels.
Fill in kernel_entry() in grass/kernel.c
After you finish, you should see:
➜ /home/cs6640 % crash2
[INFO] process 5 killed by exception 2
...
crash2 should be killed.
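
The effect of mstatus.MPP can be illustrated with plain bit manipulation. The sketch below runs on the host and only computes values; the helper names are hypothetical, and on real hardware the value would be read from and written back to the mstatus CSR. Per the privileged specification, MPP occupies bits 12:11 of mstatus, and mret switches to the privilege level stored there.

```c
#include <assert.h>
#include <stdint.h>

#define MSTATUS_MPP_SHIFT 11
#define MSTATUS_MPP_MASK  (3u << MSTATUS_MPP_SHIFT)  /* bits 12:11 */

/* Privilege level encodings from Table 1.1 of the privileged spec. */
enum priv { PRIV_U = 0, PRIV_S = 1, PRIV_M = 3 };

/* Returns mstatus with MPP replaced by `mode`; all other bits are preserved. */
uint32_t mstatus_set_mpp(uint32_t mstatus, enum priv mode) {
    assert(mode == PRIV_U || mode == PRIV_S || mode == PRIV_M);
    return (mstatus & ~MSTATUS_MPP_MASK) | ((uint32_t)mode << MSTATUS_MPP_SHIFT);
}

/* Extracts the privilege level currently stored in MPP. */
enum priv mstatus_get_mpp(uint32_t mstatus) {
    return (enum priv)((mstatus & MSTATUS_MPP_MASK) >> MSTATUS_MPP_SHIFT);
}
```

For example, mstatus_set_mpp(0x1888, PRIV_U) yields 0x0088: the two MPP bits are cleared and every other bit is left alone, so a subsequent mret would drop to U-Mode.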

Section 4: Physical memory protection

Motivation

Once user apps run in U-Mode, user-level programs can no longer execute high-privileged instructions. However, they can still touch all memory, including critical kernel data structures such as the PCB array. To see this, run crash3:

➜ /home/cs6640 % crash3
[SUCCESS] Crash3 succeeds in modifying earth's code
[SUCCESS] Crash3 succeeds in modifying disk contents
[SUCCESS] Crash3 succeeds in corrupting the memory of other processes
[FATAL] excp_entry: kernel got exception 1

You will see that crash3 can modify critical memory.

To address this problem, we need to protect memory from invalid accesses. There are multiple ways to do this. In this lab, you will use a hardware feature provided by RISC-V CPUs, named Physical Memory Protection or PMP. Read physical memory protection (Ch3.7) to understand how PMP works.

Exercise 5 protect memory with PMP

Implement PMP in pmp_init() of earth/cpu_mmu.c
A running proc should only access its own running memory from 0x80200000 to 0x80400000 (see also library/egos.h)
You need to use NAPOT mode for PMP configuration.
After you finish, you should see:
➜ /home/cs6640 % crash3 disk
[INFO] process 5 killed by exception 7
...
➜ /home/cs6640 % crash3 earth
[INFO] process 5 killed by exception 7
...
➜ /home/cs6640 % crash3 proc
[INFO] process 5 killed by exception 7
All three crash3 variants (disk, earth, and proc) should be killed.
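
The NAPOT address encoding from Ch3.7 can be checked on the host before writing any CSR. The sketch below is only the arithmetic, under the assumption of 32-bit pmpaddr registers; the macro and function names are hypothetical, and which pmpcfg/pmpaddr entries to program in pmp_init() is left to the exercise.

```c
#include <assert.h>
#include <stdint.h>

/* pmpcfg byte layout: R = bit 0, W = bit 1, X = bit 2, A = bits 4:3, L = bit 7. */
#define PMP_R     0x01u
#define PMP_W     0x02u
#define PMP_X     0x04u
#define PMP_NAPOT 0x18u   /* A field = 3 selects NAPOT */

/* Encodes a naturally aligned power-of-two region for a pmpaddr register.
 * `size` must be a power of two of at least 8 bytes, and `base` must be
 * a multiple of `size`. */
uint32_t pmp_napot_addr(uint32_t base, uint32_t size) {
    assert(size >= 8 && (size & (size - 1)) == 0);
    assert((base & (size - 1)) == 0);
    /* pmpaddr holds address bits [33:2]; the trailing ones encode the size. */
    return (base >> 2) | ((size >> 3) - 1);
}
```

For the region named in the exercise, pmp_napot_addr(0x80200000, 0x200000) evaluates to 0x200BFFFF, and a configuration byte granting read, write, and execute in NAPOT mode is PMP_R | PMP_W | PMP_X | PMP_NAPOT, that is, 0x1F.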

Finally, submit your work

Submitting consists of three steps:

Executing this checklist:
- Fill in ~/osi/egos/slack/lab4.txt.
- Make sure that your code builds with no warnings.

Push your code to GitHub:

 $ cd ~/osi/egos/
 $ git commit -am 'submit lab4'
 $ git push origin lab4

 Counting objects: ...
  ....
  To github.com/NEU-CS6640-labs/egos-<YOUR_ID>.git
     c1c38e6..59c0c6e  lab4 -> lab4

Actually commit your lab (with timestamp and git commit id):
1. Get the git commit id of your work. A commit id is a 40-character hexadecimal string. You can obtain the commit id for the last commit by running the command git log -1 --format=oneline.
2. Submit a file named git.txt to Canvas. (there will be an assignment for this lab on Canvas.) The file git.txt contains two lines: the first line is your github repo url; the second line is the git commit id that you want us to grade. Here is an example:
```
 git@github.com:NEU-CS6640-labs/egos-<YOUR_ID>.git
 29dfdadeadbeefe33421f242b5dd8312732fd3c9
```
  Notice: the repo address must start with git@github.com:... (not https://...). You can get your repo address on GitHub repo page by clicking the green “Code” button, then choose “SSH”.
3. Note: You can submit as many times as you want; we will grade the last commit id submitted to Canvas. Also, you can submit any commit id in your pushed git history; again, we will grade the commit id submitted to Canvas.
  Notice: if you submit multiple times, the file name (git.txt) changes to git-N.txt where N is an integer and represents how many times you’ve submitted. We will grade the file with the largest N.

NOTE: Ground truth is what and when you submitted to Canvas.

A non-existent repo address or a non-existent commit id in Canvas means that you have not submitted the lab, regardless of what you have pushed to GitHub—we will not grade it. So, please double check your submitted repo and commit id!
The time of your submission for the purposes of tracking lateness is the timestamp on Canvas, not the timestamp on GitHub.

This completes the lab.