Week 2
CS4973/CS6640
01/13 2025
https://naizhengtan.github.io/25spring/

□ 0. Admin, lab1, and lab2
□ 1. Review: threading
□ 2. Context switches in user-space
□ 3. [skipped] Cooperative vs. Preemptive multithreading in user-level
□ 4. [skipped] Kernel debugging
□ 5. Memory layout in egos
□ 6. gdb
---

0. Admin:
  - check if new joiners
  - office hours
  - how's your lab1?

  - suggestions/questions from students
    -- PATH (Sid)
       [explain a bit about how PATH works; starting from "ls"]
       -- $ PATH=$(getconf PATH)
       -- $ PATH=""
       -- revert osi-env location in PATH from env.sh
    -- why is 0x2000_0000 not accessible? (Dani)
       [explain a bit about how qemu works; take a look at earth.lds]

  - lab2 setup:
    -- lab2 will be released by today
    -- draw upstream, origin, local

  - background:
    -- malloc/free
    -- registers

1. Review: threading

  - Review: processes and kernel-managed threads?
   (We refer to that as "kernel-level threading.")

    [draw process and threads]

    Two views of processes: OS kernel and apps

    from kernel, PCB (process control block):
    - its view of memory (for x86, %cr3)
    - its view of CPU (registers)
    - (file descriptor table)
    - (other metadata)

    TCB (thread control block):
    - thread's view of CPU (registers)

    [draw a memory layout of two threads]
    Recall basic segments of a memory layout:
      stack, code, heap, data

    Q: what C program "components" do the four carry?
       -- stack: local variables
       -- code: code
       -- heap: malloc'ed memory
       -- data: global variables

    Q: are all four absolutely necessary? Can we get rid of some?
       What's the minimum set of segments that a program needs to run?

      A: a program can run with only the stack and code segments;
      it does not need a heap, data, or anything else.

    -- basically same as a process, except two threads in the same
    process have the same address space
     (in x86, that means having the same %cr3)

    -- kernel threads are always preemptive

  - We can also have *user*-level threading, in which the kernel is
   completely ignorant of the existence of threading.

              [draw picture]

           T1     T2     T3
               thr package
                  OS
                  H/W

   -- in this case, the threading package is the layer of software that
   maintains the array of TCBs (thread control blocks)

   Responsibilities of threading package:
     1. manage threads (TCBs)
     2. create threads (allocating stacks)

        Q: does threading package need to allocate heap for new threads?
        [A: No, threads share heap.]

        Q: does threading package need to allocate code sections for new threads?
        [A: No, threads share code.]

     3. scheduling
     4. context switch! (in user-space)

   -- user-level threading can be non-preemptive (cooperative) or
    preemptive. we'll look at both.

    --but first, let's take a look at context switches in user-space.

2. Context switches in user-space

  * context in processes and threads

   Let's again start from process.

   - What is execution "context" for a process?
    Informally, the context represents the state of a process that,
    if altered unexpectedly, would disrupt its execution.

     Q: which of these are part of a process's context?
      A. %pc
      B. %sp
      C. stack
      D. heap
      E. other processes' memory
      [Answer: A, B, C, D]

   A canonical process "context" really is about two things:
     - switching the view of memory (address space; x86's %cr3)
     - switching the registers (including stack pointer, meaning switching the stack)

   Now, consider threads:

     Q: which of these are part of a thread's context?
      A. %pc
      B. %sp
      C. stack
      D. heap
      E. other processes' memory
      [Answer: A, B]

   The memory isn't relevant for user-level threading.

  * context switch is switching between two contexts

   How does a context switch work in your lab2?
     [see handout]

   - background RISC-V assembly I

   - go through a ctx_switch

     (i) go through high-level description
         [draw two stacks]

     (ii) go through the code

      Q: what is "a0"? what is "0(a0)"?
      A: first argument (see calling convention);
         dereference the first argument (which is "void** old_sp")

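      Q: when ctx_switch() finishes, it runs "ret"; where does it return to?
      A: to wherever the restored "ra" points: the instruction after
         the ctx_switch() call made earlier by the thread being
         switched to.

      Q: why does the code save only the callee-saved registers?
      A: ctx_switch() is an ordinary function call, so the compiler has
         already saved any live caller-saved registers in the caller.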

   - Now, ctx_switch() assumes the other stack has "context" (the registers).
     How to create such a "context" on stack?

     Answer is using "ctx_start".
     [See handout panel 4]


3. [skipped] Cooperative vs. Preemptive multithreading in user-level

    Q: (in handout panel4) if function pointer f points to:
         void foo() {while(1);}
       What will happen?
       [Answer: foo() never yields, so the entire process is stuck
        in an infinite loop]

    --This is also called *non-preemptive multithreading*.

    --It means that a context switch takes place only at well-defined
    points: when the thread calls thread_yield()

    --A bummer: what if the IO interface is synchronous?
      Not much we can do...
      Similar things happen when using blocking syscalls.

   Q: How can we build a user-level threading package that does context
   switches at any time?

    Need to arrange for the package to get interrupted.

    How? Signals!

    Deliver a periodic timer interrupt or signal to the thread
    scheduler. When the scheduler gets the signal, it swaps out
    the running thread and runs another one.

    Makes programming with user-level threads more complex -- all the
    complexity of programming with kernel-level threads, but few of the
    advantages (except perhaps performance from fewer system calls).

    in practice, systems aren't usually built this way, but sometimes it
    is what you want (for example, if you're simulating some OS-like
    thing inside a process, and you want to simulate the non-determinism
    that arises from hardware timer interrupts).

    A larger point: signals are instructive, and are used for many
    things. What a signal is really doing is abstracting a key hardware
    feature: interrupts.

    So this is an example of the fact that the OS's job is to give
    a user-space process the illusion that it's running on something
    like a machine, by creating abstractions. In this example, the
    abstraction is the signal, and the thing that it's abstracting is an
    interrupt.

    [if you are interested, check out:
      $ man signal
      $ man ualarm
      $ man makecontext
    ]

  Q: what is a good abstraction for a running app/program?
     VMs, containers, processes, kernel-threads, user-level threading

[Acknowledgments: Mike Walfish]

4. [skipped] Kernel debugging

  Q: how do you debug a normal C program?

  * "normal" C program
    + you do not need to understand hardware details (like CPU)
    + you have clear error messages
    + you do not have to worry about touching important memory
      (the program will be killed)
    + you do not use addresses directly
    + you have a nice address space containing your program only
    + you have a lot of tools (like IDE)

  * kernel programming
    - you need to understand hardware details (like CPU)
    - you have semi-clear error messages (if you know CPU)
    - you have to worry about touching important memory
      (the kernel will write something to there and later crash)
    - you sometimes need to use addresses directly
    - you do not have a nice address space
    - you have limited yet powerful tools

  * ULT labs? a mix of the two
    + you do not need to understand hardware details (like CPU)
    - you have semi-clear error messages (if you know CPU)
    - you have to worry about touching important memory
    + you do not use addresses directly
    - you do not have a nice address space
    - you have limited yet powerful tools

  * some debugging principles
    - "die" sooner rather than later
    - use "ASSERT" a lot
    - use "printf" but don't trust "printf" entirely
    - binary-printf is still useful
    - use "static analysis" more often (e.g., "git diff")

  * common "error messages":
    accessing invalid memory
       w/ and w/o virtual memory
       [example: invalid memory]

5. Memory layout in egos

  Talking about memory...

  * egos design:
    - earth: abstract hardware (like interrupt, memory, disk)
    - grass: provide services (like process, scheduler, syscall)
    - apps:
       -- system apps: provide basic functionalities (like fs, shell)
       -- user apps: normal apps

  * memory layout of CPU sifive_e
    [see handout]

  * check .lst files
    [take a look at
      earth.lst,
      grass.lst,
      sys_shell.lst, ult.lst]

  * where are stacks defined?
     apps/app.S
     earth/earth.S

  * grass and earth interfaces
    library/egos.h

6. gdb

  Q: Have you used gdb or debuggers before?

  * how to run
    You need two shells:

      sh1> make qemu-gdb

      sh2> riscv64-unknown-elf-gdb

  * egos gdb setup

     qemu <--[port:6640]--> gdb
    [stop]
                          [connect]
                          (gdb) c
  [start to run]
       ...
                          <Ctrl-C>
    [stop]
                          (gdb) quit
    [continue]

  * gdb basics

    Scenario 1: "what's wrong?"
    - run to failure
    - use gdb to see the final status

    Scenario 2: "I suspect this is wrong"
    - set a breakpoint
    - continue the egos
    - run until the breakpoint
    - single step running & monitoring

  * play with gdb
     [example: stack problem]

  Q: why have the breakpoints been triggered multiple times?

    because...
    ...so far, all apps share the same memory layout...
    ...we don't have virtual address space for each process,...
    ...which will be your future labs

    [explain a bit about processes and proc_curr_idx in grass/kernel.c]

  Q: wait...if all apps are using the same chunks of memory,
     how can egos run them?

    A: whenever it schedules a new app, the kernel saves the previous
       app's memory and copies the new app's memory
       to the corresponding memory addresses.

  * gdb on egos
    - initial symbols: earth and grass
    - add new symbol file:
      "(gdb) add-symbol-file build/release/helloworld.elf"
    - breakpoints at apps' region may be triggered by other apps
    - so...disambiguate the same breakpoints:
         (gdb) display/d proc_curr_idx
          [see "proc_curr_idx" in grass/kernel.c]
    - (there are timer interrupts)

  * show how to use:
    -- show navigation: walk through code both in C and asm
    -- show variable values: show "i" in "foo()"
    -- calling convention: show arguments before calling "printf"
    -- show register values: show $a4 value when calling "foo()"

  * a potential lab2 bug: stack problem
     [example: stack problem]
     -- the bug is that a wrong stack pointer is given to ctx_start().
        Note that the return value of malloc is the lowest address of the
        allocated block; the stack grows downward, so the stack pointer
        should not start there.