Week 3.b
CS6640
01/23 2026
https://naizhengtan.github.io/26spring/

□ 0. admin
□ 1. normal vs. kernel debugging
□ 2. Memory layout in egos
□ 3. gdb
----

0. admin

  -- lab2 released
  -- new TA

1. normal vs. kernel debugging

  A) Your debugging experience?

    Q: When you debug, what usually takes more time?
      (e.g., finding where the bug is, or understanding why it happens?)

    Q: What's your routine to debug a normal program?

    Q: What is one debugging habit you think helps the most?

    // * What is the first thing you try when a program behaves incorrectly?
    //   (add prints, use a debugger, read the code, or search online?)

    // * What makes a bug hard to debug? What is the hardest bug you've ever debugged?
    //   (unclear error messages, large codebase, nondeterministic behavior, or lack of tools?)

    // * When debugging, do you reason top-down (from high-level behavior) or bottom-up (from variables and state)?
    //   Which feels more effective?

    // * How does debugging your own code differ from debugging someone else's code?

  B) normal vs. kernel debugging

  * "normal" C program
    + you do not need to understand hardware details (like CPU)
    + you have clear error messages
    + you do not have to worry about touching important memory
      (the program will be killed)
    + you do not use addresses directly
    + you have a nice address space containing your program only
    + you have a lot of tools (like IDE)

  * kernel programming
    - you need to understand hardware details (like CPU)
    - you have semi-clear error messages (if you know CPU)
    - you have to worry about touching important memory
      (the kernel will write something there and crash later)
    - you sometimes need to use addresses directly
    - you do not have a nice address space
    - you have limited yet powerful tools

  * ULT labs? a mix of the two
    - you need to understand hardware details (like CPU)
    - you have semi-clear error messages (if you know CPU)
    - you have to worry about touching important memory
    + you do not use addresses directly
    - you do not have a nice address space
    - you have limited yet powerful tools

  * some debugging principles
    - "die" earlier rather than later
    - use "assert" a lot
    - use "printf" but don't trust "printf" entirely
    - binary search with "printf" is still useful
    - use "static analysis" more often (e.g., "git diff")

  C) examples

  * common "error messages":
    accessing invalid memory
    [example: invalid memory]

    Q: what to expect on macOS?
       w/ and w/o virtual memory

    Q: how about on egos?

  * heap explosion
    [example: exploding the heap]

    Q: what do you expect to happen in a real OS?

    Q: what do you expect to happen on egos?
    [see where the error message comes from; library/libc/malloc.c]
    [circle back about heap]

2. Memory layout in egos

  Talking about memory...

  * memory layout of egos-2k+
    [see handout]

  * memory layout of your ULT
    -- code + data
       [see build/debug/ult.lst]
    -- where are stacks defined?
       [see apps/app.s]
    -- where is the heap?
       [see library/elf/app.lds]

  * check other .lst files
    [see egos.lst, sys_shell.lst]

  * egos design:
    - earth: abstract hardware (like interrupt, memory, disk)
    - grass: provide services (like process, scheduler, syscall)
    - apps:
       -- system apps: provide basic functionalities (like fs, shell)
       -- user apps: normal apps

    [show egos organization]


  Q: How does it work when sys_shell and ult run concurrently?
     (given the same memory space)

  * grass and earth interfaces
    library/egos.h

3. gdb

  * how to run
    You need two shells:

      sh1> make qemu-gdb

      sh2> riscv-none-elf-gdb

  * egos gdb setup

     qemu <--[port:6640]--> gdb
    [stop]
                          [connect]
                          (gdb) c
  [start to run]
       ...
                          <Ctrl-C>
    [stop]
                          (gdb) quit
    [continue]

  * gdb basics

    Scenario 1: "what's wrong?"
    - run to failure
    - use gdb to see the final status

    Scenario 2: "I suspect this is wrong"
    - set a breakpoint
    - continue egos
    - run until the breakpoint
    - single-step and monitor the state

    [again, example: invalid memory]

  Q: why are the breakpoints triggered multiple times?

    because...
    ...so far, all apps share the same memory layout...
    ...we don't have virtual address space for each process,...
    ...which will be your future labs

  Q: wait...if all apps are using the same chunks of memory,
     how can egos run them?

    A: whenever scheduling a new app, the kernel saves the previous app's
       memory and copies the new app's memory
       to the corresponding memory addresses.

  * gdb on egos
    - initial symbols: egos
    - add new symbol file:
      "(gdb) add-symbol-file build/release/ult.elf"
    - breakpoints at apps' region may be triggered by other apps
      [a demo]
    - there are timer interrupts

  * [skipped] a tricky bug: wrong stack pointer
     [example: stack problem]