Week 11.b
CS 5600 03/22 2023

1. Last time
2. mmap
3. I/O architecture

----------------------------------

1. Last time

  - meltdown and spectre
    - prev kernel mapping
    - current kernel mapping

  - page fault
    -- mechanics: collaboration of CPU and kernel

  - "paging" (overcommitting memory): cost and thrashing
    [make this brief]

  A. cost

    --What does paging from the disk cost?

    --let's look at average memory access time (AMAT)

    --AMAT = (1-p)*memory access time + p * page fault time,
      where p is the prob. of a page fault.

        memory access time ~ 100ns
        SSD access time ~ 1 ms = 10^6 ns

    --QUESTION: what does page fault probability (p) need to be to
      ensure that paging hurts performance by less than 10%?

        1.1*t_M = (1-p)*t_M + p*t_D
        p = .1*t_M / (t_D - t_M) ~ 10^1 ns / 10^6 ns = 10^{-5}

      so only one access out of 100,000 can be a page fault!!

    --basically, page faults are super-expensive (good thing the
      machine can do other things during a page fault)

    Concept is much larger than OSes: need to pay attention to the
    slow case if it's really slow and common enough to matter.

  B. thrashing

    [The points below apply to any caching system, but for the sake
    of concreteness, let's assume that we're talking about page
    replacement in particular.]

    What is thrashing?

      Processes require more memory than system has

      Specifically, each time a page is brought in, another page,
      whose contents will soon be referenced, is thrown out

    Example:

    --one program touches 50 pages (each equally likely); only have
      40 physical page frames

    --If we have enough physical pages, 100ns/ref

    --If we have too few physical pages (40 pages), assume every 5th
      reference leads to a page fault

    Question: if the SSD latency is 1ms, how many times slowdown do
    we have?

    --4refs x 100ns and 1 page fault x 1ms for SSD I/O

    --this gets us 5 refs per (1ms + 400ns) ~ 0.2ms/ref = 2,000x
      slowdown!!!
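[Sketch, not from the handout: the arithmetic in A and in the thrashing example above can be checked with a few lines of C. The function names amat_ns and max_fault_prob are made up for illustration; all times are in nanoseconds, and "every 5th reference faults" is modeled as p = 0.2.]

```c
#include <assert.h>

/* AMAT = (1-p)*t_mem + p*t_fault, with all times in ns.
 * (Hypothetical helper name; the formula is the one in the notes.) */
double amat_ns(double p, double t_mem_ns, double t_fault_ns)
{
    assert(p >= 0.0 && p <= 1.0);
    return (1.0 - p) * t_mem_ns + p * t_fault_ns;
}

/* Largest fault probability p such that AMAT <= slowdown * t_mem.
 * Obtained by solving slowdown*t_M = (1-p)*t_M + p*t_D for p. */
double max_fault_prob(double slowdown, double t_mem_ns, double t_fault_ns)
{
    assert(slowdown >= 1.0 && t_fault_ns > t_mem_ns);
    return (slowdown - 1.0) * t_mem_ns / (t_fault_ns - t_mem_ns);
}
```

With t_M = 100 and t_D = 1e6, max_fault_prob(1.1, 100, 1e6) comes out at about 1e-5, and amat_ns(0.2, 100, 1e6) / 100 comes out at about 2,000, matching the two results above.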
    --What we wanted: virtual memory the size of disk with access
      time the speed of physical memory

    --What we have here: memory with access time roughly that of an
      SSD (0.2 ms/mem_ref compared to 1 ms/SSD_access)

    As stated earlier, this concept is much larger than OSes: need to
    pay attention to the slow case if it's really slow and common
    enough to matter.

2. mmap

  --a syscall; a cool way to bring some ideas together.

  --recall some syscalls:

      fd = open(pathname, mode)
      write(fd, buf, sz)
      read(fd, buf, sz)

  --we've seen fds before, but what's an fd in the kernel?
    --an index into a table maintained by the kernel on behalf of
      the process

  --syscall: [see slides]

      void* mmap(void* addr, size_t len, int prot, int flags,
                 int fd, off_t offset);

  --means, roughly, "map the specified open file (fd) into a region
    of my virtual memory (close to addr, or at a kernel-selected
    place if addr is 0), and return a pointer to it"

    NOTE: the "disk image" here is the file we've mmap()'ed, not the
    process's usual backing store. The idea is that mmap() lets the
    programmer "inject" pages from a regular file on disk into the
    process's backing store (which would otherwise be part of a swap
    file).

  --after this, loads and stores to addr[x] are equivalent to
    reading and writing to the file at offset+x.

  --why is this cool?

    - example: mmap enables copying a file to stdout without
      transferring data to user space
      [see handout]

      Question: mmap vs. normal read/write, which is faster? by how
      much?

      [Answer:
        --mmap is faster
        --"by how much" depends on the file size and the underlying
          machine.
        --for a ~1G file on one of my machines (M1 MacBook Air), mmap
          is 13.5x faster.
      ]

      NOTE: the process never itself dereferences a pointer to
      memory containing file data.
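[Sketch, not the handout's code: one way the file-to-stdout copy described above could look. The helper name copy_file_via_mmap is hypothetical, and MAP_PRIVATE with PROT_READ is an assumed choice for a read-only copy. Note that the code only hands the mapped address to write(); it never loads from it.]

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the file at `path` to `out_fd` by mapping it
 * and passing the mapping to write(). Returns bytes written, -1 on error. */
long copy_file_via_mmap(const char *path, int out_fd)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    char *addr = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close() */
    if (addr == MAP_FAILED)
        return -1;

    size_t done = 0;
    while (done < len) {            /* write() may write fewer bytes than asked */
        ssize_t n = write(out_fd, addr + done, len - done);
        if (n < 0) {
            munmap(addr, len);
            return -1;
        }
        done += (size_t)n;
    }
    munmap(addr, len);
    return (long)done;
}
```

Calling copy_file_via_mmap(argv[1], STDOUT_FILENO) from a main() would behave like a minimal cat. The sketch does not retry on EINTR and maps the whole file at once, which is fine for illustration but not for very large files on a 32-bit address space.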
      NOTE: this saves two sets of memory-to-memory copies
      (kernel-to-user, user-to-kernel), versus the "naive" solution
      of read()ing into a buffer in user space, and then write()ing

      [Also, a well-tuned buffer cache manages which file pages are
      kept in RAM, rather than leaving the app developer to have to
      explicitly try to manage that (and potentially have the OS page
      replacement algorithm underneath make conflicting decisions).]

    - other examples:

      - reading big files. map the whole thing, rely on the paging
        mechanism to bring the needed pieces into memory as necessary

      - shared data structures, when flag is MAP_SHARED

      - file-based data structures:
        - load data from file, update it, write it back
        - this is implemented entirely with loads/stores

    Question: how does the OS ensure that it's only writing back
    modified pages?

  --how's mmap implemented?!

    (answer: through virtual memory, with the VA being addr [or
    whatever the kernel selects] and the PA being what?
    answer: the physical address storing the given page in the
    kernel's buffer cache).

  --have to deal with eviction from buffer cache, so kernel will
    need a data structure that maps from:

      Phys page --> {list of (proc,va) pairs}

    note that the kernel needs this data structure anyway: when a
    page is evicted from RAM, the kernel needs to be able to
    invalidate the given virtual address in the page table(s) of the
    process(es) that have the page mapped.

3. I/O architecture

  general: [draw picture: CPU, Mem, I/O, connected to BUS]

  --lots of details.
  --fun to play with.
  --registers that do different things when read vs. written.

  [draw some registers in devices, with status and data]

  CPU/device interaction (can think of this as kernel/device
  interaction, since user-level processes classically do not interact
  with devices directly).

  A.
     Mechanics of communication

    (a) explicit I/O instructions
        [sometimes called port I/O, port-mapped IO (PMIO)]

        instructions: outb, inb, outw, inw
        operands: IO address space (separate from memory address space)

        examples:

        (i) reading keyboard input.

          --I/O ports (PS/2):

              IO Port   Access Type   Purpose
              0x60      Read/Write    Data Port
              0x64      Read          Status Register
              0x64      Write         Command Register

              [https://wiki.osdev.org/%228042%22_PS/2_Controller]

          --keyboard keycode
            --handout uses "Scan Code Set 1"
            --differentiate "pressed" and "released" by the most
              significant bit:

                0x1E: A pressed
                0x9E: A released
                (notice: the difference is 0x80, namely binary: b10000000)

            Question: how many keys can be mapped to the key code?
            [answer: 128. A key code is 1 byte, which gives 256 (2^8)
            key codes; each key needs two codes (pressed and
            released), which leaves 128 keys.]

          --more keys? a mode change:
            --two bytes received, starting with 0xE0
            --a lot of multimedia keys, like volume up/down
            --see "Scan Code Set 1" for more details (link below)
              [https://wiki.osdev.org/PS/2_Keyboard#Scan_Code_Set_1]

          an implementation of reading a key stroke:
            keyboard_readc();
            [see handout]

        (ii) setting blinking cursor.
          see handout
            console_show_cursor();

    (b) memory-mapped I/O

[continue next time]

[Acknowledgments: Mike Walfish, David Mazieres, Mike Dahlin, Brad Karp]
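[Sketch, not the handout's keyboard_readc(): the pressed/released encoding from 3.A(a)(i) can be decoded with a mask on the most significant bit. All names below are hypothetical, and the code assumes a single-byte Scan Code Set 1 value has already been read from the data port; the port read itself is not shown.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCANCODE_RELEASE_BIT 0x80u  /* MSB set: key released */
#define SCANCODE_EXT_PREFIX  0xE0u  /* first byte of a two-byte code */

/* True if `sc` is the 0xE0 prefix that announces a two-byte scan code. */
bool scancode_is_extended_prefix(uint8_t sc)
{
    return sc == SCANCODE_EXT_PREFIX;
}

/* True if the single-byte scan code `sc` reports a key release.
 * The 0xE0 prefix also has its MSB set, so callers must filter it first. */
bool scancode_is_release(uint8_t sc)
{
    assert(!scancode_is_extended_prefix(sc));
    return (sc & SCANCODE_RELEASE_BIT) != 0;
}

/* The 7-bit key number shared by the press and release codes of one key. */
uint8_t scancode_key(uint8_t sc)
{
    return (uint8_t)(sc & 0x7Fu);
}
```

For the example in the notes, scancode_key(0x1E) and scancode_key(0x9E) both give 0x1E (the A key), and only 0x9E is reported as a release.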