CS6640 Lab3: Kernel scheduler and MLFQ
In this lab, you will explore how to better schedule processes. The current egos-2k+
is using a naive scheduler that schedules the next available process in an array (similar to Round-Robin), which is inefficient for latency-sensitive processes, like shell.
In this lab, your tasks are:
- collecting scheduling metrics (like turnaround time),
- implementing a Multi-level Feedback Queue scheduler,
- and enabling process to sleep.
Your modifications will be within two files grass/scheduler.c
and library/queue2.h
.
Fetch lab3 from upstream repo
Lab3 is pushed to the upstream branch upstream/lab3
. Note that upstream
points to the git@github.com:NEU-CS6640-labs/egos-upstream.git
. Next, you will fetch the branch and create a lab3
branch on your local repo. (note: your private GitHub repo origin
, pointing to git@github.com:NEU-CS6640-labs/egos-<YOUR_ID>.git
).
Here are steps to fetch lab3:
- fetch
lab3
branch$ git fetch upstream lab3 <you should see:> ... * [new branch] lab3 -> upstream/lab3
If you get
error: No such remote 'upstream'
, rungit remote add upstream git@github.com:NEU-CS6640-labs/egos-upstream.git
. - create a local
lab3
branch$ git checkout --no-track -b lab3 upstream/lab3 <you should see:> Switched to a new branch 'lab3'
- confirm you’re on local branch
lab3
$ git status <you should see:> On branch lab3 nothing to commit, working tree clean
update 09/29: make sure your repo has commit
cce552d23125d7e...
. The commit message isthanks Micah for finding the bug
. There was a bug in CPU timer, see details in this github issue. - rebase your previous labs to the current branch
<you're now on branch lab3> $ git rebase lab2
You need to resolve any conflict caused by the rebase. Read how to resolve conflict here.
Note:rebase
will give you a cleaner linear commit history (thangit merge
), which is easier to understand and is easier trace bugs. - push branch
lab3
to remote repoorigin
$ git push -u origin lab3
- In the following, you should work on this
lab3
branch, and push to theorigin
’slab3
branch to submit your work. CS6640 staff will grade yourorigin:lab3
branch, so if you have any updates you want to keep from previous labs, you should merge/rebase to thelab3
branch.
Tune QUANTUM on your machine
Since qemu’s performance highly depends on the machine it runs on, you need to tune the QUANTUM
(defined in library/egos.h
) to receive timer interrupts in a reasonable speed.
Here is how to tune your QUANTUM
:
- fetch
quantum_tuning
branch<in your lab repo> $ git fetch upstream quantum_tuning
- create a local
quantum_tuning
branch$ git checkout --no-track -b quantum_tuning upstream/quantum_tuning
- run
egos-2k+
$ make && make qemu ... [CRITICAL] ------------- Booting ------------- [CRITICAL] This is a simple timer example [CRITICAL] Got a timer interrupt! [CRITICAL] Got a timer interrupt! ...
We hope you see the message “Got a timer interrupt!” 1–3 times per second.
- If the frequency is too high or too low, go to
library/egos.h
and modify the value ofQUANTUM
accrodingly. - Finally, apply your tuned version of
QUANTUM
in yourlab3
branch.
Section 1: Kernel scheduler and scheduling metrics
A. Process organization
egos-2k+
manages processes mainly in file grass/process.h
and grass/process.c
. Below we introduce the basic design of egos-2k+
processes.
Process status
The status is defined in process.h
, including:
PROC_UNUSED,
PROC_LOADING, /* allocated and wait for loading elf binary */
PROC_READY, /* finished loading elf and wait for first running */
PROC_RUNNING,
PROC_RUNNABLE,
PROC_WAIT_TO_SEND,
PROC_WAIT_TO_RECV,
PROC_SLEEPING,
For a scheduler, the status can be classified as:
- the current running process:
PROC_RUNNING
- available for scheduling:
PROC_READY
,PROC_RUNNABLE
- unavaiable for scheduling:
PROC_LOADING
,PROC_WAIT_TO_SEND
,PROC_WAIT_TO_RECV
, andPROC_SLEEPING
(you will implement sleeping processes later)
Process life cycle
A life cycle of a process starts with (1) being allocated (proc_alloc()
in process.c
), then (2) being scheduled to run for multiple times (proc_yield()
in grass/scheduler.c
), and finally (3) being freed (proc_free()
in process.c
).
According, the following functions in scheduler.c
will be called:
proc_on_arrive(int pid)
: when creatingpid
proc_yield()
: when deciding next running processproc_on_stop(int pid)
: when destroyingpid
Later, in your Exercise 1, you will need to collect scheduling information in the above functions.
Process control block (PCB)
Each process is associated with one process control block, namely PCB. PCB contains the necessary information for process to context switch and metadata for syscalls and scheduling. The PCB is the struct process
defined in grass/process.h
struct process{
int pid;
int status;
int receiver_pid; /* used when waiting to send a message */
void *sp, *mepc; /* process context = stack pointer (sp)
* + machine exception program counter (mepc) */
// scheduling attributes
union {
unsigned char chars[64];
unsigned int ints[16];
float floats[16];
unsigned long long longlongs[8];
double doubles[8];
} schd_attr;
};
In this lab, you will design how to use union schd_attr
to save necessary information for your scheduler.
PCB array
To manage PCBs, egos-2k+
use a simple array, struct process proc_set[MAX_NPROCESS]
, defined in grass/kernel.c
. This means egos-2k+
can have at most MAX_NPROCESS
number of processes, then the kernel will run out of empty PCBs.
Process id (pid) vs. index of PCB array
The pid and the index in PCB array has a deterministic 1-on-1 mapping: the pid is the index plus one. For example, the first process, sys_proc
, has the pid=1 and is stored at the proc_set[0]
; the second process, sys_file
, has the pid=2 and is stored at the proc_set[1]
.
By design, the process id starts with 1 and the first four processes are belong to the OSes. Starting from pid=5 is the user’s applications.
Current running process
The scheduler keeps record of which is the current running process: proc_curr_idx
is the index to the PCB array and points to the current running process. We also provide two helper macros, curr_pid
(the current pid) and curr_status
(the current process status). Both are defined in process.h
.
B. Scheduling metrics
In the following, you will calculate four scheduling metrics and print them when a process exits:
- turnaround time: the time between process arrival and stop
- response time: the time between process arrival and the first time being scheduled
- CPU time: the actual CPU time used by the process
- the number of schedules: how many times this process has been scheduled; or, to be specific, how many times the process’s status has been set to
PROC_RUNNING
before exit
Exercise 1 Calculate scheduling metrics
- collect scheduling information in multiple places within
grass/scheduler.c
- calculate the four metrics above
- hints:
- search “[lab3-ex1]” and read the corresponding comments
- you need to plan how to use the 64B
schd_attr
in PCB- you can get the current time by
earth->gettime()
; this is a 64-bit timestamp (inunsigned long long
orm_uint64
)- the 64-bit timestamp is the CPU cycles since CPU starts. It will overflow (from
0xffffffffffffffff
to0x0
). You need to handle the overflow.- we measure the turnaround/response/CPU time in the unit of scheduling time slice
QUANTUM
, defined inegos.h
.- when finished, you should see:
➜ /home/cs6640 ls ./ ../ README [INFO] proc 5 died after 2 yields, turnaround time: 0.29, response time: 0.23, cputime: 0.03 ➜ /home/cs6640 echo [INFO] proc 5 died after 1 yields, turnaround time: 0.14, response time: 0.13, cputime: 0.01
- Note: the actual numbers for turanaround/response/CPU time may vary a lot. This depends on your local environment and your machine.
- update 10/06: if you see “3 yilelds” for
ls
and “2 yields” forecho
, that is also correct.
Section 2: Multi-level Feedback Queue
The current scheduler is inefficient for latency-sensitive applications. To see this, try run loop &
first and then run ls
: (To see what loop
does, check apps/user/loop.c
.)
<run "loop" in the background>
➜ /home/cs6640 loop &
[INFO] process 5 running in the background
➜ /home/cs6640 ls
<you will wait for a while to see the result>
The reason why ls
responses super slow is becuase the scheduler doesn’t priorities system processes (which provide essential services) and short-lived processes (like ls
) over long running jobs (like loop
).
In the following, you will implement Multi-level Feedback Queue (MLFQ) to overcome this shortcoming and make ls
repond faster. If you want to know MLFQ basics, read OSTEP Chapter 8.
You will implement a simple three-level MLFQ:
- There will be three queues: high-/mid-/low-priority queues.
- The scheduler will always run processes in higher priority queues.
- Within each queue, the processes run in round-robin fashion.
- When a process is created, it is placed at the high-priority queue.
- Once a process uses up a
QUANTUM
of CPU time, the scheduler downgrades it to a lower-level queue. So, a process will be placed on the mid-priority queue when its CPU time is > 1 and < 2. - Processes with CPU time > 2 will stay in the low-priority queue until they finish.
Exercise 2 Implement your MLFQ
- update Makefile to start using your MLFQ. Change the first line of Makefile from
SCHEDULER=NAIVE
toSCHEDULER=MLFQ
(this enables macrosMLFQ
in C code).- implement your queue in
library/queue2.h
(you should reuse functions in yourapps/user/queue.h
)- implement
mlfq_init()
,mlfq()
, and other places with#ifdef MLFQ
. You can also search “[lab3-ex2]”.- when finished, you should see
ls
runs much faster:➜ /home/cs6640 loop & [INFO] process 5 running in the background ➜ /home/cs6640 ls <you should see results immediately>
update 10/15: a clarification of “shell will block low-priority processes”
- In our labs,
sys_shell
uses “pulling” to continously check if users have typed anything. So, the process ofsys_shell
is alwaysPROC_RUNNABLE
when waiting for inputs. - Also, as requested, the process of
sys_shell
should have the highest priority. - Therefore, there is a starvation problem: for a long-running process like
loop
, when it leaves the high-queue, it will never be executed again. - To fix this, a scheduler should intentionally pick any runnable process other than
sys_shell
, when the current running process issys_shell
. - Or, another “natrual” approach is to ignore the current running process when picking the next process to run; that is, when a timer interrupt arrives, pick one process from MLFQ to run before putting the current process back to the queues.
- Note: with or without starvation will not affect your grades in lab3. Both implementaitons should pass the tests.
Section 3: Enabling sleeping processes
Next, you will allow processes to sleep. Now, if you run:
➜ /home/cs6640 loop sleep
Sleep for 10 QUANTUM...
...and waked up
<the two lines are printed instantly>
After implementing slepping processes, we would expect the loop
will stall for a while and then print the second line.
Exercise 3 Implement process sleep
- you will implement sleeping processes in
proc_on_sleep()
ofscheduler.c
.- the semantics of
grass->sys_sleep(10)
is to sleep at least 10QUANTUM
and the process will be eventually wake up. You don’t have to wake the process exactly after 10QUANTUM
(which is hard in the current implementation).- the kernel should not panic (calling
FATAL
) when there are sleeping processes but no runnable ones. The scheduler shold wait until one sleeping process is wake up.- When you finished, you should see:
➜ /home/cs6640 loop sleep Sleep for 10 QUANTUM... <wait a while, then see> ...and waked up
Finally, submit your work
Submitting consists of three steps:
- Executing this checklist:
- Fill in
~/cs6640/egos/slack/lab3.txt
. - Make sure that your code build with no warnings.
- Fill in
Push your code to GitHub:
$ cd ~/cs6640/egos/ $ git commit -am 'submit lab3' $ git push origin lab3 Counting objects: ... .... To github.com/NEU-CS6640-labs/egos-<YOUR_ID>.git c1c38e6..59c0c6e lab3 -> lab3
Actually commit your lab (with timestamp and git commit id):
Get the git commit id of your work. A commit id is a 40-character hexadecimal string. You can obtain the commit id for the last commit by running the command
git log -1 --format=oneline
.- Submit a file named
git.txt
to Canvas. (there will be an assignment for this lab on Canvas.) The filegit.txt
contains two lines: the first line is your github repo url; the second line is the git commit id that you want us to grade. Here is an example:git@github.com:NEU-CS6640-labs/egos-<YOUR_ID>.git 29dfdadeadbeefe33421f242b5dd8312732fd3c9
Notice: the repo address must start with
git@github.com:...
(nothttps://...
). You can get your repo address on GitHub repo page by clicking the green “Code” button, then choose “SSH”. - Note: You can submit as many times as you want; we will grade the last commit id submitted to Canvas. Also, you can submit any commit id in your pushed git history; again, we will grade the commit id submitted to Canvas.
Notice: if you submit multiple times, the file name (git.txt
) changes togit-N.txt
whereN
is an integer and represents how many times you’ve submitted. We will grade the file with the largestN
.
NOTE: Ground truth is what and when you submitted to Canvas.
A non-existent repo address or a non-existent commit id in Canvas means that you have not submitted the lab, regardless of what you have pushed to GitHub—we will not grade it. So, please double check your submitted repo and commit id!
The time of your submission for the purposes of tracking lateness is the timestamp on Canvas, not the timestamp on GitHub.
This completes the lab.