Week 10.a
CS 5600 03/21 2022

On the board
------------
1. Last time
2. x86-64: addresses
    - virtual
    - physical
3. x86-64: page table structures
4. Practice

------------------------------------------------------------------

Admin:
 - Midterm
    -- private challenges: come to my office hour on Wed
    -- public challenge: vote if there are disagreements
    -- highest: 98 (out of 100)
 - Lab2
    -- grades released on Canvas
    -- come to us if there is something wrong
 - Lab3
    -- will release by the end of today
    -- concurrent KV-store
 - talk on Friday, 11AM
    -- interesting topic on misconfiguration
    -- if you're interested in systems research, you should attend

---------

1. Last time

  Virtual Memory
    +--> Paging
       +--> multilevel page table
          +--> x86-64 page table

  Virtual Memory: VA --> PA

  page table conceptually implements a map from VPN --> PPN

  NOTE: VPN and PPN need not (and do not, in our case study) have the
  same number of bits

  review:
    top bits index into page table. contents at that index are the PPN.
    bottom bits are the offset. not changed by the mapping
    physical address = PPN followed by offset
                       (i.e., (PPN << 12) + offset for 4KB pages)

  Multilevel page table

  --idea: represent the page table as a tree
    ...
    root node has pointers to other nodes
    children point to pages

  --the tree is sparse; example:

    [expand the example from last time]

    Given page size is 4KB, say we want to map 2MB of physical memory
    at virtual addresses 0,...,2^{21}-1

    48 bits: 9 9 9 9 (VPN) | 12 (offset)
             the last (bottom) level points to physical pages.

    Question: what would the tree look like?

      2MB = 4KB * 512
      => we need 512 pages

      one page can have 512 pointers
      (WHY? we will study this today, for now assume this is true)

      [draw this on board]

      then, we have one L1 PT page, one L2 PT page, one L3 PT page,
      and one L4 PT page

    NOTICE: enormous address space, but we've used very few physical
    resources (just 512 + 4 physical pages)

  -- Question: if we need to map 2GB of memory, how many PT pages do we need?
      2GB = 1024 * 2MB = 1024 * 512 * 4KB
      the last-level PT (L4) has 1024 pages
      the second-to-last level (L3) has 2 pages (1024 pointers, 512 per page)
      the first two levels (L1, L2) have 1 page each
      in total, we need 1028 PT pages (= 1 + 1 + 2 + 1024)

  -- Question: if the PT is fully mapped (meaning it maps all 2^{48}
     bytes of memory), how large would the PT be?

        1   +   512   +   512^2   +   512^3    PT pages
      (L1)     (L2)       (L3)        (L4)

     Notice that the size of the L4 PT is equivalent to the "array"
     (in the sky) we talked about last time.

     The point is that if memory is fully mapped, a multi-level PT
     doesn't save memory. But that is very unlikely to happen; in the
     normal (sparse) case a multi-level PT helps.

2. x86-64: addresses

  the x86-64 architecture is 64-bit: registers and addresses are 64 bits wide

  VIRTUAL ADDRESSES

    on currently-available x86-64 machines, only 48 bits "matter".
    (conclusion: not all 64-bit patterns correspond to meaningful
    virtual addresses)

    Bit patterns that are valid addresses are called _canonical addresses_.

    A canonical address has all 0s or all 1s in the upper 16 bits
    (bits 63 through 48); they have to match whatever bit 47 is.

    [see 3.3.7.1 in the Intel software developer's manual]

    Result: address space is 2^{48} bytes = 256 TB

    [
    Another way to look at it:
    The x86-64 architecture divides canonical addresses into two groups,
    low and high.
    Low canonical addresses range from 0x0000'0000'0000'0000 to
    0x0000'7FFF'FFFF'FFFF.
    High canonical addresses range from 0xFFFF'8000'0000'0000 to
    0xFFFF'FFFF'FFFF'FFFF.
    Considered as signed 64-bit numbers, all canonical addresses range
    between -2^47 and 2^47-1.
    ]

    [Intel 5-level paging:
      --extends virtual addresses from 48 bits to 57 bits
      --increases the addressable virtual memory from 256 TB to 128 PB
      --implemented in the Ice Lake processors, and Linux kernel 4.14
    ]

  PHYSICAL ADDRESSES

    52 bits

    Question: why 52?
      see handout panel 3, 4
      [answer: 40 bits (PPN in the PTE) + 12 bits (offset within a page)]

    Means a single machine can address up to 2^{52} bytes = 4 PB of
    physical memory.
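    [a minimal C sketch of the address-width arithmetic in this section;
    not from the handout. The names is_canonical and addressable_bytes
    are made up for illustration, and the check assumes 48-bit virtual
    addressing (i.e., no 5-level paging).]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if va is canonical under 48-bit virtual
 * addressing, i.e., bits 63..48 are all copies of bit 47. */
bool is_canonical(uint64_t va)
{
    uint64_t upper = va >> 47;   /* bits 63..47: 17 bits in total */
    return upper == 0 || upper == 0x1FFFF;
}

/* Hypothetical helper: number of bytes addressable with `bits`
 * address bits. Only meaningful for bits < 64. */
uint64_t addressable_bytes(unsigned bits)
{
    assert(bits < 64);
    return (uint64_t)1 << bits;
}
```

    With these definitions, is_canonical(0x00007FFFFFFFFFFF) holds but
    is_canonical(0x0000800000000000) does not; addressable_bytes(48) is
    256 TB and addressable_bytes(52) is 4 PB.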
    of course, if the machine only has 16 GB (say), then physical
    addresses will (roughly speaking) only have 34 bits that matter,
    and thus the top 18 (=52-34) bits of physical addresses will
    generally be zero

    [NOTE: this is a simplification, owing to the "physical memory
    map"; however, we will not encounter that too much in this class.]

  MAPPING

    have to map a 48-bit number (virtual address) to a 52-bit number
    (physical address), at the granularity of ranges of 2^{12} bytes

3. x86-64: page table structures

  ** walk through the handout

    %cr3 holds the address of the top-level directory (L1 page table)

    Question: is that address a physical address or a virtual address?
      [answer: it is a physical address. hardware needs to be able to
      follow the page table structure.]

  ** An example:

    [walk through the handout]

    What if the OS wants to map a process's virtual address 0x0202000 to
    physical address 0x3000 and make it accessible to user-level but
    read-only?

    what do the page structures look like?

    solution:
      take off the bottom 12 bits of offset
      vpn = 0x0202.
      write it out in bits:

          0....0       000000001      000000010
        (18 0 bits)   (L3 index 1)   (L4 index 2)

      L1 (0th entry) --> L2 (0th entry) --> L3
                                            ...........
                                            ...........
                                            ...........
                                            ...........  [entry 1]  (points to PGTABLE)
                                            ...........

      PGTABLE
             <40 bits>
      |0x00'0000'0003 | U=1,W=0,P=1|  [entry 2]
      |               |            |  [entry 1]
      |               |            |  [entry 0]
      ______________________________

    [see Intel reference manual for more.
     Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3a
     https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.pdf
    ]

  ** PTE: bunch of bits

    includes
      dirty (set by hardware)
      accessed (set by hardware)
      present (set by OS)
      cache disabled (set by OS)
      write through (set by OS)

    what will happen if the present bit is 0 but a program accesses the memory?
      [answer: page fault; we will study it later.]

    what do the U/S and R/W bits do?
      --are these for the kernel, the hardware, what?
      --who is setting them? what is the point?
      (OS is setting them to indicate protection; hardware is enforcing them)

      -- what if the permission is violated?
         [answer: again, page fault]

  ** Large pages:

    Can get 2MB (resp. 1GB) pages on x86-64:
      each L3 (resp. L2) page table entry now points to the page
      instead of another page table
      + page tables smaller, less page table walking
      - more wasted memory

    to enable this, set bit 7 (PS) in the entry

    example: set the PS bit in an L3 entry
      result is 2MB pages
      page walking is L1, L2, L3; no L4 page tables

4. Practice

  [skipped]
  A. memory that different PTEs can address

    --Question: how much memory can one L1 page table entry address?

    --answer: each entry in the L1 page table corresponds to 512 GB of
      virtual address space ("corresponds to" means "selects the
      next-level page tables that actually govern the mapping").

      for others:
      --each entry in the L2 page table corresponds to 1 GB of virtual address space
      --each entry in the L3 page table corresponds to 2 MB of virtual address space
      --each entry in the L4 page table corresponds to 1 page (4 KB) of virtual address space

    --Question: so how much virtual memory is each L4 page *table*
      responsible for translating? 4KB? 2MB? 1GB?
      [answer: 2MB (= 512 entries * 4KB)]

    --each page table itself consumes 4KB of physical memory, i.e.,
      each one of these fits on a page

  B. Allocating memory

    [from cs61, 2018]
    https://cs61.seas.harvard.edu/site/2018/Section4/

    What is the minimum number of physical pages required on x86-64 to
    make the following allocations? Draw an example page table mapping
    for each scenario (start from scratch each time).

      1 byte of memory
        = [5 phys pages: 4 PT pages + 1 data page]

      1 allocation of size 2^12 bytes of memory
        = [5 phys pages]

      2^9 allocations of size 2^12 bytes of memory each
        = [512 + 4 = 516 phys pages]

      [skipped]
      2^9 + 1 allocations of size 2^12 bytes of memory each
        = [512 + 4 + (1 + 1) = 518 phys pages]

      2^18 + 1 allocations of size 2^12 bytes of memory each
        = [1 (L1) + 1 (L2) + 2 (L3) + (2^9 + 1) (L4) + (2^18 + 1) (the memory)
           = 262,662 phys pages]

  C.
page table walk

    x86 page table: translate a VA to a PA

    Practice:

    -- This is the standard x86 32-bit two-level page table structure
       (not x86-64; we use 32-bit for simplicity).

    -- The permission bits of page directory entries and page table
       entries are set to 0x7.
       (what does 0x7 mean?
        answer: page present, read-write, and user-mode; see today's
        handout, week10.a.
        This means that the virtual addresses are valid, and that user
        programs can read (load) from and write (store) to the virtual
        address.)

    -- The memory pages are listed below. On the left side of the pages
       are their addresses.
       (For example, the address of the "top-left" memory block
       (4 bytes) is 0xf0f02ffc, and its content is 0xf00f3007.)

       %cr3: 0xffff1000

                  +------------+            +------------+
       0xf0f02ffc | 0xf00f3007 | 0xff005ffc | 0xbebeebee |
                  +------------+            +------------+
                  |    ...     |            |    ...     |
                  +------------+            +------------+
       0xf0f02800 | 0xff005007 | 0xff005800 | 0xf00f8000 |
                  +------------+            +------------+
                  |    ...     |            |    ...     |
                  +------------+            +------------+
       0xf0f02000 | 0xffff5007 | 0xff005000 | 0xc5201000 |
                  +------------+            +------------+

                  +------------+            +------------+
       0xffff1ffc | 0xd5202007 | 0xffff5ffc | 0xdeadbeef |
                  +------------+            +------------+
                  |    ...     |            |    ...     |
                  +------------+            +------------+
       0xffff1800 | 0xef005007 | 0xffff5800 | 0xff005000 |
                  +------------+            +------------+
                  |    ...     |            |    ...     |
                  +------------+            +------------+
       0xffff1000 | 0xf0f02007 | 0xffff5000 | 0xc5202000 |
                  +------------+            +------------+

    -- What's the output of the following C excerpt?

         int *ptr1 = (int *) 0x0;
         printf("%x\n", *ptr1);

         // this will be your homework
         // int *ptr2 = (int *) 0x200ffc;
         // printf("%x %x\n", *ptr1, *ptr2);

       [Note: %x in printf means printing out the integer in
       hexadecimal format.]
    Answer: "c5202000"
      (that is, the value 0xc5202000; %x does not print a "0x" prefix)

    In particular, here is the page table walk:

      0x0 => [0][0][0] (10 bits, 10 bits, 12 bits)
      [note: in x86-64, 0x0 would be split as
       [9 bits, 9 bits, 9 bits, 9 bits, 12 bits]]

      (%cr3) -> 0xffff1000 (L1 PT)
                   +--[index:0]-> 0xf0f02000 (L2 PT)
                                     +--[index:0]-> 0xffff5000 (data page) + 0 (offset)
                                                       +--[PA]-> 0xffff5000

      The content of PA 0xffff5000 is 0xc5202000.

      Why "content"?
      because the C code "*ptr1" means _dereferencing_ the pointer
      "ptr1", namely fetching the memory content pointed to by "ptr1"
      (pointer = an address).

      --note: all addresses followed during this walk (%cr3, the
        page-table entries, and the final PA) are physical addresses.
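
    [a minimal C sketch of the walk above, written as a tiny simulator;
    not from the handout. Assumptions: struct word, phys_mem,
    phys_read32, translate, and load32 are made-up names; physical
    memory is modeled as a short list holding only the three words this
    walk touches (other words from the listing can be added the same
    way); only the present bit is checked, U/S and R/W are ignored.]

```c
#include <stddef.h>
#include <stdint.h>

#define PTE_P 0x1u   /* present bit (bit 0 of a PDE/PTE) */

/* One 4-byte word of simulated physical memory. */
struct word { uint32_t pa; uint32_t val; };

/* Only the words needed to translate VA 0x0, copied from the listing. */
static const struct word phys_mem[] = {
    { 0xffff1000u, 0xf0f02007u },   /* L1 PT, entry 0 */
    { 0xf0f02000u, 0xffff5007u },   /* L2 PT, entry 0 */
    { 0xffff5000u, 0xc5202000u },   /* data page, offset 0 */
};

/* Read a 4-byte word at physical address pa; 0 on success, -1 if unknown. */
static int phys_read32(uint32_t pa, uint32_t *out)
{
    for (size_t i = 0; i < sizeof phys_mem / sizeof phys_mem[0]; i++) {
        if (phys_mem[i].pa == pa) {
            *out = phys_mem[i].val;
            return 0;
        }
    }
    return -1;
}

/* Two-level 32-bit walk: VA = [10-bit L1 index][10-bit L2 index][12-bit offset].
 * Returns 0 and stores the PA on success, -1 on a (simulated) page fault. */
int translate(uint32_t cr3, uint32_t va, uint32_t *pa_out)
{
    uint32_t l1_index = va >> 22;
    uint32_t l2_index = (va >> 12) & 0x3ffu;
    uint32_t offset   = va & 0xfffu;
    uint32_t pde, pte;

    /* Each entry is 4 bytes, so entry i lives at table base + 4*i. */
    if (phys_read32((cr3 & ~0xfffu) + 4u * l1_index, &pde) != 0 || !(pde & PTE_P))
        return -1;
    if (phys_read32((pde & ~0xfffu) + 4u * l2_index, &pte) != 0 || !(pte & PTE_P))
        return -1;

    *pa_out = (pte & ~0xfffu) | offset;   /* PPN bits followed by the offset */
    return 0;
}

/* What "*ptr" does conceptually: translate the VA, then fetch the content. */
int load32(uint32_t cr3, uint32_t va, uint32_t *val_out)
{
    uint32_t pa;
    if (translate(cr3, va, &pa) != 0)
        return -1;
    return phys_read32(pa, val_out);
}
```

    Calling translate(0xffff1000, 0x0, &pa) follows the same three steps
    as the walk above and yields pa = 0xffff5000; load32 then fetches
    the word stored there, 0xc5202000.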