Week 11.a CS6640 11/14 2023
https://naizhengtan.github.io/23fall/

1. page fault intro
2. on-demand allocation
3. classic use cases

---

Admin:
  - final oral exam
  - midterm

[went through Midterm]

1. page fault intro

* cool things you can do with vm
  - Better performance/efficiency
    e.g., on-demand allocation
    e.g., copy-on-write fork
  - New features
    e.g., memory-mapped files

* functionalities of virtual memory
  (1) translation: VA -> PA, for better programmability
  (2) protection: isolation + access control
      each process has its own address space
  (3) a level of indirection:
      virtual memory gives the kernel the opportunity to do cool stuff
      - share a page
      - guard pages
      - and more, all transparent to user applications

  "Any problem in computer science can be solved with another level of
   indirection."
      -- Butler Lampson and David J. Wheeler

  Q: how is this "level of indirection" implemented?
  A: by page faults

* what is a page fault?
  a type of exception:
  a process accesses (R/W/X) a virtual address and fails, because
  -- the VA is not mapped to a PA, or
  -- the access violates permissions

* a page fault is a form of trap (like a system call)
  egos-2k+: fatal error on page fault
  But you don't have to panic! Instead:
  - update the page table instead of raising a fatal error
  - restart the faulting instruction
  The combination of page faults and page table updates is powerful!

* RISC-V page fault mechanics
  -- three exceptions are related to page faults;
     exceptions cause controlled transfers to the kernel
  -- information we might need at a page fault to do something interesting:
     1) the type of violation that caused the fault
        see the mcause register (instruction, load, or store page fault)
     2) the virtual address that caused the fault
        see the mtval register; the CPU sets it to the faulting address
     3) the instruction and mode where the fault occurred
        pc: mepc
        user vs. kernel: see below

  Q: how to tell if a page fault happened within the kernel?
  A: tell U/K mode by the stack pointer

  Q: how to tell what the problem behind a page fault is?
  A: by walking the page table
2. on-demand page allocation

* Motivation
  [a demo of malloc_large]
  Q: What do you expect to see?
  [explain how malloc works
   - check library/libc/malloc.c
   - read build/debug/helloworld.lst]

* lazy/on-demand page allocation

* sbrk() is old-fashioned; applications often ask for more memory than
  they need
  - for example, they allocate for the largest possible input,
    but will typically use less
  if they ask for too much, sbrk() could be expensive
  - for example, if all memory is in use, the caller has to wait until
    the kernel has evicted some pages to free up memory
  sbrk() allocates memory that may never be used.

* modern OSes allocate memory lazily
  allocate physical memory only when the application actually needs it:
  - adjust brk (in library/libc/malloc.c), but don't allocate
  - when the application uses that memory, it triggers a page fault
  - on the page fault, allocate physical memory
  - resume at the faulting instruction
  benefits:
  - may use less memory: if a page is never touched, there is no fault
    and no allocation
  - spreads the cost of allocation over the page faults instead of
    paying it upfront in sbrk()

* [a demo of malloc_loop]
  Q: What do you expect to see?
  Q: Why does malloc fail after allocating two pages?
  [read library/elf/elf.c]

* [demo: implementing on-demand page allocation]
  Q: How to fix this problem? (mimicking a normal OS)
  [mapping VA->PA]

* a note on lab5: debugging is important
  [turn on displaybitmap]

3. classic use cases [skipped]

* copy-on-write fork

* observation: fork copies all pages from the parent,
  but fork is often immediately followed by exec

* idea: share the address space between parent and child
  - on a page fault, make a copy of the page and map it read/write
  - need to refcount physical pages

  Q: how to decide if a page is COW or just read-only?
  A: many possible solutions; an elegant one: use the extra bits
     available for software (RSW) in PTEs

  In practice, implementing COW is non-trivial:
  see https://lwn.net/Articles/849638/

* overcommitting memory: use virtual memory larger than physical memory

* observation: an application may need more memory than there is
  physical memory

* idea: think of memory as a cache of the disk
  page-in and page-out pages of the address space transparently

* works when the working set fits in physical memory
  - most popular replacement strategy: least-recently used (LRU)
  - the A(ccess) bit in the PTE helps the kernel implement LRU

* replacement policies: a huge topic in the OS literature

* memory-mapped files

* idea: allow access to files using loads and stores
  - can easily read and write parts of a file
  - e.g., no need to change the offset using the lseek system call

* Unix systems: a system call for memory-mapped files:
    void *mmap(void *addr, size_t length, int prot, int flags,
               int fd, off_t offset);

* the kernel pages-in pages of a file on demand;
  when memory is full, it pages-out pages of the file that are not
  frequently used

* page faults for user applications

* many useful kernel tricks use page faults;
  allow user apps to do such tricks too

* Linux: read "mmap/munmap" and "sigaction"

* a JVM example

* DSM: distributed shared memory

* idea: allow processes on different machines to share virtual memory
  gives the illusion of physically shared memory, across a network
  - replicate pages that are only read
  - invalidate copies on write

* virtual memory is still evolving

  Recent changes in Linux:
  - KPTI to handle the Meltdown side channel
    (https://en.wikipedia.org/wiki/Kernel_page-table_isolation)

  Somewhat recent changes:
  - support for 5-level page tables (57 address bits!)
  - support for ASIDs

  Less recent changes:
  - support for large pages
  - NX (No eXecute) -- PTE_X flag

[acknowledgement: Frans Kaashoek]