week 11b CS6640 03/20 2026 https://naizhengtan.github.io/26spring/ OSI exam review: 1. OS basics and C 2. RISC-V 3. egos 4. context and scheduling 5. trap to kernel 6. isolation and protection 7. device drivers 8. file system --- 1. OS basics and C - What is OS? a piece of software that (a) manages resources and (b) abstracts HW [the two major jobs of an OS] - everything is 0s and 1s * bits and bytes * instructions are 0s and 1s * data: char, int, long long * unsigned vs. signed types - little endian * what is little endian * how 0xdeadbeef looks like in little endian? * configurable in RISC-V machines * we (egos) assume little endian - a program's memory layout * text segment * data segment * stack * heap * Which RISC-V registers point to them? - C pointers * a pointer = a memory address * pointer arithmetic * a pointer of a pointer? -- "void **old_sp" in context switch - common knowledge of C * uninitialized local variables on stack * two ways to access arrays * keywords: * union * static variable * bit field * bit manipulation 10001 & 10000 = 10000 // bit-wise and 10001 | 10000 = 10001 // bit-wise or 10001 ^ 10000 = 00001 // bit-wise xor ~1000 = 0111 // bit-wise not 2. RISC-V - asm(Template : OutputOperands : InputOperands) * Template: a string that is the template for the assembler code. * OutputOperands: the C variables modified by the instructions in the Template. * InputOperands: C expressions read by the instructions in the Template. * for example, int mie asm("csrr %0, mie" : "=r"(mie)); asm("csrw mie, %0" ::"r"(mie | 0x80)); - common RISC-V registers * execution: pc, sp, ra * exceptions: mepc, mtvec, mcause * timer interrupt: mtime, mtimcmp, mstatus, mie - common RISC-V assembly instructions * execution: addi, sw, lw, mv * function calls: call, ret * syscalls: ecall, mret * CSRs: csrr, csrw - calling convention * caller-saved registers: ra, ... * callee-saved registers: sp, ... 3. OS - execution abstraction: processes and threads * difference betwen processes and threads? memory address space (i.e., %satp) * kernel threading vs. user-level threading * cooperative vs. preemptive - OS's view: PCB and TCB * pid/tid * context * program counter (%pc) * stack pointer (%sp) * registers * scheduling * process/thread status * others * memory management * root of page table * file system related * file descriptor table - mutiplexing CPUs: scheduling * classic scheduling algorithms: FCFS, RR, MLFQ - multiplexing memory: virtual memory * segmentation (x86) * PMP (RISC-V) * PT-based virtual memory (x86/ARM/RISC-V) * others - abstracting disk: file system * file-mapping structures ** global versus per-file ** tree-based versus hash-map based * hierachical file systems are dead: hFSD - OS organizations: * monolithic OS * microkernel OS (egos) * exokernel (LibOS) * multikernel 3. egos - a bird view of egos * three layers: earth, grass, and apps * egos (by of Lab7) * earth/grass run in M-Mode * system/user apps run in U-Mode * earth abstracts HW * grass provides scheduling, IPC, and syscalls * apps implement all functionalities - system services * sys_proc (GPID_PROC) * sys_terminal (GPID_TERMINAL) * sys_file (GPID_FILE) * sys_shell (GPID_SHELL) - OS booting CPU jmps to 0x80000000 +-> earth/boot.s:boot_loader +-> earth/boot.c:boot +-> grass/init.c:grass_entry +-> grass.c:main +-> ['mret' to APPS_ENTRY] +-> apps/system/sys_proc.c:main ... +-> apps/system/sys_shell.c:main - kernel ~= 3 handlers * handling interrupts (scheduling) * handling exceptions (killing apps) * handling syscalls (run syscalls) - egos exception handler: trap_entry (grass/kernel.s) [bookkeeping] +-> kernel_entry (grass/kernel.c) +-> excp_entry | +->[your exception handling code] | +->[restore app context] [your privilege level code] (end of kernel entry) | +-> [bookkeeping] mret (end of trap_entry) | +-> [user space] - handling syscalls: sys_send/recv (syscall.c) | ^ +-> trap (syscall.c) [ret] | USER SPACE | -----[trap]-------------------------------------------------|---- | KERNEL | +-> trap_entry (kernel.s) [mret] [switch to kernel stack] [switch to user stack] [save context to stack] [restore context] | | +-> kernel_entry (kernel.c) [ret] [context: stack=>PCB] [context: PCB=>stack] | | ... (intr_entry? excp_entry?) | | | +-> proc_try_syscall (ipc.c) [ret] | | +-> proc_try_send/recv (ipc.c) | +-> sys_yield (scheduler.c) [ret] | | +------>[schedule next proc]----+ - kernel debugging * gdb cmds * see handout: https://naizhengtan.github.io/26spring/notes/handout_w3b.pdf 4. Context and scheduling (a) user-level threading - TCB: thread control block * thread id * status * "main" function pointer * function arguments * context (stack pointer, program counter, general purposed registers) - ULT tasks * manages TCBs * make a new stack for each new thread * scheduling * context switch! (in user-space) - context switch in user space * how it works in egos? ctx_start()/ctx_switch() - cooperative scheduling vs. preemptive scheduling * lab2 ULT is cooperative * one can implement a preemptive ULT by using signals (b) kernel schedulers - PCB: process control block * process id * status * context (sp and mepc) * IPC (sender & receiver) * scheduling attributes - process status * Unused, Ready, Runnable, Running, "Sleeping" * transition diagram -- definition: preemptive scheduler and nonpreemptive scheduler - scheduling metrics * turnaround time * response time * CPU time * yield numbers - scheduling algorithm * round robin (~= original scheduler) * MLFQ (lab3) -- expectation: since you implemented this, you should be able to remember the details. - "loop &; ls" * the performance difference in the above two algorithms, * and why? 5. trap to kernel---interrupts, exceptions, and syscalls (a) timer interrupt - the backbone of handling timer interrupts void handler() { CRITICAL("Got a timer interrupt!"); // (4) reset timer } int main() { CRITICAL("This is a simple timer example"); // (1) register handler() as interrupt handler // (2) set a timer // (3) enable timer interrupt while(1); } - register timer handler * mtvec - set/reset CPU timer * mtime, mtimecmp - enable timer interrupt * mstatus, mie - how to trigger a timer interrupt? * set mtime + mtimecmp accordingly - other interrupts? software interrupt * set the MSIP bit of mip // how we initially implement syscall (b) exceptions - CPU: "I don't know how to progress". - Exceptions are synchronized traps. - RISC-V exception reason table Examples: * read/write invalid memory * run invalid instructions * run privileged instructions in unprivileged mode - handling exceptions: Similar to interrupts. Again, mtvec, mcause, mepc (c) syscalls - Interfaces for user applications to "talk" to kernel. - Three design questions for kernel developers Q1: how to trap to kernel? Q2: what information is transferred between apps and kernel? Q3: how to transfer the information? - how to trap to kernel * interrupt (skeleton code) * ecall (your lab4) 6. isolation and protection (a) privilege levels - defined by CPU * usually two bits, may or may not be visible * in RISC-V, it is invisible (MPP hints the privilege level after mret) * M-mode: 3; S-mode: 1; U-mode: 0 (unlike ring levels from x86) - can be a "virtualization hole" * implementing a hypervisor-egos would be a cool final project. * trap-and-emulate hypervisor * VM: simulation of a computer, accurate enough to run an O/S * running the VM's kernel in U-Mode * ordinary instructions work fine * privileged RISC-V instructions are illegal in U-Mode will cause a trap, to the hypervisor * hypervisor trap handler emulates privileged instruction - a program running in the "high" privilege level can * execute privileged instructions, * touch privileged registers (CSRs), * access memory marked as privileged - how to switch privilege levels * low-to-high: interrupts, exceptions, and ecall * high-to-low: set MPP, then mret (b) memory protection - protection = isolation + access control - the access control problem subj --[access]--> obj * for us subj: instructions + context (privilege levels and others) op: r/w/x obj: memory address - solutions: ACL * segmentation (x86-32) * PMP (RISC-V) * paging - segmentation (x86-32) * protecting granularity: a segment, defined by "base" and "limit" * segments are defined in descriptor tables in memory * access right bits are in descriptors - PMP (RISC-V) * protecting granularity: memory regions and privilege levels * regions are defined by pmpcfg and pmpaddr CSRs * access right bits are in pmpcfg registers * limitation: cannot tell different processes ("crash3 proc") - paging * isolation: different roots of page tables provide different address spaces * access control: -- protecting granularity: pages -- access right bits are in page tables, in particular, page table enries (c) virtual memory - virtual memory translation: VA --[translation]--> PA - virtual memory translation: * paging * page table * RC-32 2-level page table (lab5) - page table walk * split the VA to 10:10:12 * walk from the root of the page table (satp) * find the L1 page table page * locate the page table entry (PTE) for L2 page table page * find the L2 page table page * locate the page table entry (PTE) for data page * find the PA in the data page - page faults * invalid PTE * violating the permission bits on PTEs 7. device driver - four ways of communicating with a device * port-IO * memory mapped IO * interrupt * shared memory (e.g., DMA) - combination of mechanism: {polling, interrupt} X {Programmed IO, DMA} - SDHCI * MMIO register layout * cmds to SD card * single-block reads/writes * multi-block reads/writes 8. File system - what does a FS do? 1. provide persistence (don't go away ... ever) 2. give a way to "name" a set of bytes on the disk (files) 3. give a way to map from human-friendly-names to "names" (directories) (a) persistence comes from HW (sotrage hierarchy) * persistent memory * SSD * HDD * tape * even glass (project Silica) (b) file - file-mapping problem: {file,offset} --[inode]--> disk addrss - file-mapping data structures: * imbalanced tree (UNIX inode) * per-file extent tree * global cuckoo hash table - UNIX inode * meta data * direct pointers * indirect pointers * double indirect pointers * triple indirect pointers ... * why this design? for small files - discussion: per-file mapping versus global mapping - mmap * a syscall * map a chunk of file to memory space * use memory interface to read/write files * collaboration of page cache + page fault (c) directory - hierarchical namespace * easy for human to remember * easy to organize files * single namespace for multiple file systems - "hierarchical file systems are dead" (2009) * "Google is a verb" * search-based fs (d) egos rwfs (lab7) - disk interface * an array of blocks * block_read/write - disk layout * superblock * bitmap * inode array * data blocks - inode data structure * fs_read * fs_write