HW5
released: 11/9, 21:00

> Answer the following questions.
> Submit your answers to Canvas assignments. There is an entry for this homework.
>
>
> 1. Bit-addressable memory
>
> As we mentioned in class, x86 is byte-addressable, which supports accessing
> individual bytes. For example, addresses 0x0 and 0x1 differ by 1 byte.
> Imagine you're designing a new CPU architecture, x97-64, which uses bit-addressable
> addresses, meaning a program can directly access each bit;
> addresses 0x0 and 0x1 differ by 1 bit.
>
> 1.a. Suppose you still use paging with a page size of 4KB.
>      What's the offset length needed to access all *bits* in a page?
>      Write down the offset length.

Answer: 15 (4KB = 4096 bytes = 4096 * 8 bits = 2^15 bits)

> 1.b. If x97-64 still uses 40 bits to represent the physical page number (PPN),
>      how many bits does a physical address have? (was 52 bits for x86-64)

Answer: 55 (= 40 + 15)

> 1.c. Given a 40-bit PPN, how many bits does a page table entry need?
>      (was 64 bits in x86-64)

Answer: still 64 bits, because paging maps VPN => PPN and the length of the PPN hasn't changed.

> 1.d. Will this bit-addressable design affect the L1/L2/L3/L4 index length in one
>      virtual address? (was 9 bits each in x86-64)
>      Why or why not? Explain in 1-2 sentences.

Answer: still 9 bits. The PTE length doesn't change (why? see the above question),
so the number of PTEs in one page doesn't change (4KB / 8 bytes = 512 = 2^9),
hence the index length required doesn't change.

>
> 2. Simulate CPU and walk page tables
>
> -- This is the standard x86 32-bit two-level page table structure
>    (not x86-64; we use 32-bit for simplicity).
> -- The permission bits of page directory entries and page table entries are set to 0x7.
>    (what does 0x7 mean?
>     answer: page present, read-write, and user-mode; see handout week8.b.
>     This means that the virtual addresses are valid, and that user programs
>     can read (load) from and write (store) to the virtual address.)
>
> -- The memory pages are listed below.
>    On the left side of the pages are their addresses.
>    (For example, the address of the "top-left" memory block (4 bytes) is
>    0xf0f02ffc, and its content is 0xf00f3007.)
>
>    %cr3: 0xffff1000
>
>              +------------+             +------------+             +------------+             +------------+
>   0xf0f02ffc | 0xf00f3007 |  0xff005ffc | 0xbebeebee |  0xffff1ffc | 0xd5202007 |  0xffff5ffc | 0xdeadbeef |
>              +------------+             +------------+             +------------+             +------------+
>              | ...        |             | ...        |             | ...        |             | ...        |
>              +------------+             +------------+             +------------+             +------------+
>   0xf0f02800 | 0xff005007 |  0xff005800 | 0xf00f8000 |  0xffff1800 | 0xef005007 |  0xffff5800 | 0xff005000 |
>              +------------+             +------------+             +------------+             +------------+
>              | ...        |             | ...        |             | ...        |             | ...        |
>              +------------+             +------------+             +------------+             +------------+
>   0xf0f02000 | 0xffff5007 |  0xff005000 | 0xc5201000 |  0xffff1000 | 0xf0f02007 |  0xffff5000 | 0xc5202000 |
>              +------------+             +------------+             +------------+             +------------+
>
> Question:
>
> 2.a. Split the 32-bit virtual address "0x00200ffc" into L1 index (10 bits), L2 index
>      (10 bits), and offset (12 bits).
>      Write them down in _decimal_ numbers:

Answer:
  L1 index: 0
  L2 index: 512
  offset: 4092

>
> 2.b. When accessing virtual address "0x00200ffc" using the above %cr3,
>      which L1/L2 page tables are used?
>      Write down the L1/L2 page table starting addresses (namely, the physical
>      address of the first byte on these pages).

Answer:
  L1 page table addr: 0xffff1000
  L2 page table addr: 0xf0f02000
  data page addr: 0xff005000
(%cr3 points to the L1 page table. L1 entry 0, at 0xffff1000, holds 0xf0f02007,
so the L2 page table starts at 0xf0f02000. L2 entry 512, at
0xf0f02000 + 512 * 4 = 0xf0f02800, holds 0xff005007, so the data page starts
at 0xff005000.)

> 2.c. What's the output of the following code?
>      (hint: (1) because this is x86-32, there are 1024 PT entries in a PT
>      page (4KB = 32 bits x 1024); (2) notice the L2 index in question 2.a.)
>
>      #include <stdio.h>
>      int main() {
>          int *ptr2 = (int *) 0x00200ffc;
>          printf("%x\n", *ptr2);
>      }

Answer: 0xbebeebee
(the 4 bytes at physical address 0xff005000 + 0xffc = 0xff005ffc; note that
"%x" prints the digits without the "0x" prefix, i.e., bebeebee)

> 2.d. Copy the above code to a ".c" file, compile, and run.
>      What do you see? And why? (explain in 1 sentence)

Answer: You should get a segfault (it's very unlikely you will see something else)
because 0x200ffc is an invalid address.
(why? A process uses only a tiny portion of its entire address space (2^48 bytes => 256TB!).
When you pick an address at random, it is very likely that its page hasn't been
mapped to anything. Accessing it triggers a page fault, and the kernel, which has
no way to resolve the fault, kills the process.)

>
> 3. TLB and page faults
>
> Assume that the assembly code below is executed after a context switch. Make
> the following additional assumptions:
>
> -- The TLB is flushed (emptied) after a context switch (this is how x86 works).
>    (hint: the instruction TLB is flushed as well.)
> -- Suppose all data pages (i.e., 0x200000, 0x300000) are stored on disk when instruction 0x500 is executing.
> -- There is no prefetching.
>
>  [context switch]
>  0x500 movq 0x200000, %rax    # move data in address 0x200000 to register %rax
>  0x508 incq %rax              # add one to %rax
>  0x510 movq %rax, 0x300000    # move register %rax to memory location 0x300000
>
> Answer the following questions:
>
> 3.a. How many TLB misses will happen, and for which pages?

Answer: 3
  +1 (when fetching code, for page 0x0)
  +2 (both 0x200000 and 0x300000)
Again, notice that the TLB works at page granularity. So the TLB will learn the
instruction page mapping (for the page [0x0,0xfff]) when missing on 0x500.

> 3.b. How many page faults will happen, and for which pages?

Answer: 2
  +2 (both 0x200000 and 0x300000)
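
The counting in 3.a and 3.b can be double-checked with a small page-granularity
simulation. This is only a sketch under stated assumptions, not a model of real
hardware: the names `page_set`, `run_trace`, and `hw5_q3` are made up for
illustration; the TLB is assumed large enough to never evict; the code page
(page 0x0) is assumed to be resident in memory, since the question only puts the
data pages on disk; and each page is counted once, so the re-executed access
after a page fault is not counted as a second TLB miss.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* 4KB pages: virtual page number = address >> 12 */
#define MAX_PAGES  16  /* more than enough for this tiny trace */

/* A small set of virtual page numbers (hypothetical helper type). */
struct page_set {
    uint64_t vpn[MAX_PAGES];
    size_t count;
};

struct counts {
    int tlb_misses;
    int page_faults;
};

static bool set_contains(const struct page_set *s, uint64_t vpn) {
    for (size_t i = 0; i < s->count; i++) {
        if (s->vpn[i] == vpn) {
            return true;
        }
    }
    return false;
}

static void set_add(struct page_set *s, uint64_t vpn) {
    if (set_contains(s, vpn)) {
        return;
    }
    assert(s->count < MAX_PAGES);
    s->vpn[s->count++] = vpn;
}

/*
 * Replay a list of memory accesses (instruction fetches and data accesses).
 * The TLB starts empty, as after the context switch in question 3.
 * `resident` is passed by value, so the caller's set is left untouched.
 */
struct counts run_trace(const uint64_t *addrs, size_t n, struct page_set resident) {
    struct page_set tlb = { .count = 0 };
    struct counts c = { 0, 0 };

    for (size_t i = 0; i < n; i++) {
        uint64_t vpn = addrs[i] >> PAGE_SHIFT;
        if (set_contains(&tlb, vpn)) {
            continue;                  /* TLB hit: mapping is cached and valid */
        }
        c.tlb_misses++;                /* TLB miss: walk the page table */
        if (!set_contains(&resident, vpn)) {
            c.page_faults++;           /* page is on disk: kernel brings it in */
            set_add(&resident, vpn);
        }
        set_add(&tlb, vpn);            /* TLB learns the mapping for this page */
    }
    return c;
}

/* The access sequence of question 3, in program order. */
struct counts hw5_q3(void) {
    static const uint64_t trace[] = {
        0x500, 0x200000,  /* fetch first movq, then load from 0x200000 */
        0x508,            /* fetch incq */
        0x510, 0x300000,  /* fetch second movq, then store to 0x300000 */
    };
    /* Assumption: only the code page (VPN 0x0) is in memory at the start. */
    struct page_set resident = { .vpn = { 0x0 }, .count = 1 };
    return run_trace(trace, sizeof trace / sizeof trace[0], resident);
}
```

Calling `hw5_q3()` from a `main` returns `tlb_misses == 3` and `page_faults == 2`
under these assumptions, matching the answers above. The fetches at 0x508 and
0x510 hit in the TLB because they fall on the same page as 0x500.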