Week 7.b CS6640 02/20 2026
https://naizhengtan.github.io/26spring/

□ 1. Page fault intro
□ 2. Classic use cases for page fault
□ 3. Device drivers
□ 4. Mechanics of communication

------
Admin:
  - Lab5?

1. page fault intro

* functionalities of virtual memory
  (1) translation: VA -> PA, for better programmability
  (2) protection: isolation + access control
      each process has its own address space
  (3) a level of indirection
      gives the kernel an opportunity to do cool stuff:
      - share a page
      - guard page
      - and more
      all transparent to user applications.

  "Any problem in computer science can be solved with another level of
   indirection."  -- Butler Lampson and David J. Wheeler

  Q: how is this "level of indirection" exercised?
  A: by page faults

* cool things you can do with vm
  - better performance/efficiency
    e.g., on-demand allocation
    e.g., copy-on-write fork
  - new features
    e.g., memory-mapped files

* Q: what is a page fault?
  a type of exception:
  a process accesses (R/W/X) a virtual address and fails
  -- the VA is not mapped to a PA
  -- permission violations

* a page fault is a form of trap (like a system call)

  egos-2k+: fatal error on page fault

  But you don't have to panic! Instead:
    update the page table instead of raising a fatal error
    restart the faulting instruction

  The combination of page faults and page table updates is powerful!

* RISC-V page fault mechanics
  -- three exceptions are related to page faults
     (instruction, load, and store page faults)
  -- exceptions cause controlled transfers to the kernel
  -- information we need at a page fault to do something interesting:
     1) the type of violation that caused the fault
        see the mcause register (instruction, load, or store page fault)
     2) the virtual address that caused the fault
        see the mtval register; the CPU sets it to the faulting address
     3) the instruction and mode where the fault occurred
        pc: mepc
        user vs. kernel: see below

  Q: how to tell if a page fault happened within the kernel?
  A: tell U/K mode apart, for example, by the stack pointer

  Q: how to tell what the problem behind a page fault is?
  A: by walking the page table

* example: on-demand page allocation

* motivation
  [a demo of malloc]
  Q: What do you expect to see?
  [page fault!]

* lazy/on-demand page allocation

* sbrk() is old fashioned; applications often ask for more memory than
  they need
  - for example, they allocate for the largest possible input,
    but will typically use less
  and if they ask for much, sbrk() could be expensive
  - for example, if all memory is in use, they have to wait until the
    kernel has evicted some pages to free up memory
  sbrk() allocates memory that may never be used.

* modern OSes allocate memory lazily
  adjust brk (in library/libc/malloc.c), but don't allocate;
  allocate physical memory only when the application needs it:
  when the application uses that memory, it causes a page fault
  on a page fault:
    allocate memory
    resume at the faulting instruction
  may use less memory: if never used, no fault, no allocation
  spreads the cost of allocation over the page faults instead of
  paying upfront in sbrk()

* [a demo of crash3]
  Q: What do you expect to see?

* [demo: implementing on-demand page allocation]
  Q: How to fix this problem? (mimicking a normal OS)
  [mapping VA->PA]

* a note on lab5 debugging
  * mepc
  * mtval

2. classic use cases

* copy-on-write fork
  [ask students: what is fork?]

* observation: fork copies all pages from the parent,
  but fork is often immediately followed by exec

* idea:
  share the address space between parent and child
  on a page fault, make a copy of the page and map it read/write
  need to refcount physical pages

  Q: how to decide if a page is COW or just read-only?
  A: many possible solutions; an elegant one:
     use the extra available system bits (RSW) in the PTEs

  In practice, implementing COW is non-trivial:
  see https://lwn.net/Articles/849638/

* overcommitting memory: use more virtual memory than there is
  physical memory

* observation: an application may need more memory than there is
  physical memory

* idea:
  think of memory as a cache for the disk
  page in and page out pages of the address space transparently

* works when the working set fits in physical memory
  most popular replacement strategy: least recently used (LRU)
  the A(ccess) bit in the PTE helps the kernel implement LRU

* replacement policies
  a huge topic in the OS literature

* memory-mapped files

* idea: allow access to files using loads and stores
  can easily read and write parts of a file
  e.g., don't have to change the offset using the lseek system call

* Unix systems provide a system call for memory-mapped files:
    void *mmap(void *addr, size_t length, int prot, int flags,
               int fd, off_t offset);

* the kernel pages in pages of a file on demand;
  when memory is full, it pages out pages of the file that are not
  frequently used

* page faults for user applications

* many useful kernel tricks use page faults;
  allow user apps to do such tricks too

* Linux: read "mmap/munmap" and "sigaction"

* a JVM example

* DSM: distributed shared memory

* idea: allow processes on different machines to share virtual memory
  gives the illusion of physical shared memory, across a network
  - replicate pages that are only read
  - invalidate copies on write

* Virtual memory is still evolving

  Recent changes in Linux
    KPTI to handle the Meltdown side channel
    (https://en.wikipedia.org/wiki/Kernel_page-table_isolation)

  Somewhat recent changes
    support for 5-level page tables (57 address bits!)
    support for ASIDs

  Less recent changes
    support for large pages
    NX (No eXecute); cf. RISC-V's PTE_X flag

[acknowledgement: Frans Kaashoek]

3. Device drivers

General:
  [show picture: CPU, Mem, I/O, connected to a BUS]

  Device drivers in general solve a software engineering problem ...
  [draw a picture: different devices have different shapes,
   and drivers fit them into the kernel]

  A driver exposes a well-defined interface to the kernel, so that the
  kernel can make comparatively simple read/write calls or whatever.
  For example: read, write, open, close, ...
  This abstracts away nasty hardware details so that the kernel doesn't
  have to understand them.
  When you write a driver, you are implementing this interface, and also
  calling functions that the kernel itself exposes.

  - Drivers: pieces of code talking to devices.

  Q: can I use a GPU driver from NVIDIA for AMD's GPU?
  Q: can I use NVIDIA's Windows driver for Linux?

4. Mechanics of communication between CPU and I/O devices

  -- lots of details
  -- fun to play with
  -- registers that do different things when read vs. written

  [draw some registers in devices, with status and data]

  CPU/device interaction
  (one can think of this as kernel/device interaction, since user-level
  processes classically do not interact with devices directly.)

  Q: if you were the I/O designer, how would you design the
     communication between the CPU and the device?

  (a) explicit I/O instructions
      [sometimes called port I/O, or port-mapped I/O (PMIO)]

      x86 instructions: outb, inb, outw, inw
      operands: an I/O address space (separate from the memory address
      space)

      [show slides]

  (b) memory-mapped I/O (MMIO)

      The physical address space is mostly ordinary RAM.
      But low-memory addresses (<1MB), sometimes called "DOS
      compatibility memory", actually refer to other things.
      You as a programmer read/write these addresses using loads and
      stores. But they aren't "real" loads and stores to memory. They
      turn into other things: reading device registers, sending
      instructions, reading/writing device memory, etc.

      -- the interface is the same as the interface to memory
         (load/store)
      -- but it does not behave like memory
         + reads and writes can have side effects
         + read results can change due to external events

      Example: writing to VGA or CGA memory makes things appear on the
      screen.
      To avoid confusion: this is not the same thing as virtual memory;
      here we are talking about *physical* addresses.
      --> is this an abstraction that the OS provides to others, or an
          abstraction that the hardware provides to the OS?
          [the latter]

      [if you're interested, check out these slides:
       https://opensecuritytraining.info/IntroBIOS_files/Day1_00_Advanced%20x86%20-%20BIOS%20and%20SMM%20Internals%20-%20Motivation.pdf]

      * Here is a 32-bit PC's physical memory map:

        +------------------+  <- 0xFFFFFFFF (4GB)
        |      32-bit      |
        |  memory mapped   |
        |     devices      |
        |                  |
        /\/\/\/\/\/\/\/\/\/\
        /\/\/\/\/\/\/\/\/\/\
        |                  |
        |      Unused      |
        |                  |
        +------------------+  <- depends on amount of RAM
        |                  |
        | Extended Memory  |
        |                  |
        +------------------+  <- 0x00100000 (1MB)
        |     BIOS ROM     |
        +------------------+  <- 0x000F0000 (960KB)
        | 16-bit devices,  |
        |  expansion ROMs  |
        +------------------+  <- 0x000C0000 (768KB)
        |   VGA Display    |
        +------------------+  <- 0x000A0000 (640KB)
        |                  |
        |    Low Memory    |
        |                  |
        +------------------+  <- 0x00000000

      [Credit to Frans Kaashoek, Robert Morris, and Nickolai Zeldovich
       for this picture]

  (c) interrupts

      Hardware can send "signals" to CPUs. These are interrupts.
      One example: the timer interrupt (for scheduling)

  (d) through memory

      Both the CPU and the device see the same memory, so they can use
      shared memory to communicate.
      --> usually, synchronization between the CPU and the device
          requires lock-free techniques, plus device-specific contracts
          ("I will not overwrite memory until you set a bit in one of my
          registers telling me to do so.")
      --> as usual, need to read the manual