Week 5.a
CS3650
02/05 2024
https://naizhengtan.github.io/24spring/

1. x86-64 assembly (cont'd)
2. Stack frames
3. Implementation of processes
4. Context switch intro
-----


* Where we are:

  +-------+              +----------+
  |source |              |executable|           +-------+
  | code  |--[compile]-->| file     |--[exec]-->|process|
  +-------+              +----------+           +-------+
                         |<--HERE-->|


1. x86-64 assembly (cont'd)

     syntax:
         movq PLACE1, PLACE2
     means "move 64-bit quantity from PLACE1 to PLACE2". the places
     are usually registers or memory addresses, and can also be
     immediates (constants).

     pushq %rax   is equivalent to:
                 [  subq $8, %rsp
                    movq %rax, (%rsp) ]


     popq %rax    is equivalent to:
                 [  movq (%rsp), %rax
                    addq $8, %rsp     ]


     call 0x12345  [ pushq %rip
                     movq $0x12345, %rip]


     ret           [ popq %rip ]

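   --above we see how call and ret interact with the stack
     (the bracketed "pushq %rip" / "popq %rip" describe the effect;
     they are not instructions you can write yourself)
   --call: pushes the return address (the address of the instruction
     after the call) on the stack, then sets %rip to the target
   --ret: updates %rip by loading it with the stored stack value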
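
   [a minimal sketch in C of the bracketed expansions above, with the
    stack simulated as an array. The struct `machine` and the helpers
    `push64`, `pop64`, `do_call`, `do_ret` are hypothetical names made up
    for illustration; real hardware does each of these in one instruction.]

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy model: memory is an array of 64-bit words; rsp is a byte offset. */
struct machine {
    uint64_t mem[STACK_WORDS];
    uint64_t rsp;   /* byte offset into mem; the stack grows down */
    uint64_t rip;   /* address of the next instruction to execute */
};

void machine_init(struct machine *m, uint64_t start_rip) {
    m->rsp = sizeof m->mem;   /* empty stack: rsp just past the last word */
    m->rip = start_rip;
}

/* pushq v:  subq $8, %rsp ; movq v, (%rsp) */
void push64(struct machine *m, uint64_t v) {
    assert(m->rsp >= 8);                  /* toy overflow check */
    m->rsp -= 8;
    m->mem[m->rsp / 8] = v;
}

/* popq:  movq (%rsp), result ; addq $8, %rsp */
uint64_t pop64(struct machine *m) {
    assert(m->rsp + 8 <= sizeof m->mem);  /* toy underflow check */
    uint64_t v = m->mem[m->rsp / 8];
    m->rsp += 8;
    return v;
}

/* call target: push the return address, then jump.
   next_insn is the address of the instruction following the call. */
void do_call(struct machine *m, uint64_t target, uint64_t next_insn) {
    push64(m, next_insn);
    m->rip = target;
}

/* ret: pop the saved return address into rip */
void do_ret(struct machine *m) {
    m->rip = pop64(m);
}
```

   [after do_call() followed by do_ret(), both rsp and rip are back to
    where the caller expects them, which is the whole point of the pairing.]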

   [want to learn more about x86 assembly code?
    check out: https://cs.brown.edu/courses/cs033/docs/guides/x64_cheatsheet.pdf ]


2. Stack frames

    [draw the stack frame]

                      |
       +------------+ |
       | ret %rip   | /
       +============+
%rbp-> | saved %rbp | \
       +------------+ |
       |            | |
       |   local    | \
       | variables, |  >- current function's stack frame
       |  call-     | /
       | preserved  | |
       |   regs,    | |
       | etc.       | /
%rsp-> +------------+/

    %rbp and %rsp

    **a real example: see handout**

    caller func (f)            callee func (g)

      saving registers
      call g   -----------+
                          +----> # prologue
                                 do things
                                 # epilogue
                          +----- ret
      restore registers <-+


    [go through the handout line by line]

    // the points here are:

    - prologue & epilogue: note that the epilogue for f (starting on line 49)
    does the reverse of the prologue, thus restoring the stack to
    how it was before.

    -  Calling a function requires agreement between caller and callee
    about how arguments are passed, and which of them is responsible for
    saving and restoring registers.

    - In an executing program, the stack is partitioned into a set of
    stack frames, one for each function. The stack frame for the current
    function starts at the base pointer and extends down to the stack
    pointer. 

        ** Stack frames are how function scope in languages like C is
        actually implemented -- allowing each function invocation to
        refer to different variables with the same name.  In other
        words, the programmer thinks they are writing a function with
        local variables; the compiler has arranged to implement that
        with stack frames.
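
        [a small sketch of "same name, different variable": every live
         invocation of a recursive function has its own copy of the
         local. The names `descend`, `n_copy`, and `local_addr` are made
         up for this example.]

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DEPTH 4

/* Address of the local `n_copy` in each invocation of descend(). */
static uintptr_t local_addr[MAX_DEPTH];

/* Each invocation gets its own stack frame, and so its own `n_copy`. */
int descend(int n) {
    assert(n >= 0 && n < MAX_DEPTH);
    volatile int n_copy = n;              /* lives in this invocation's frame */
    local_addr[n] = (uintptr_t)&n_copy;   /* remember where it lives */
    int below = (n > 0) ? descend(n - 1) : 0;
    return below + n_copy;                /* our own n_copy, untouched by callees */
}
```

        [calling descend(3) records four different addresses, one per
         frame; on x86-64 the deeper calls typically have lower
         addresses because the stack grows down.]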

    - de-mystifying pointers: a pointer (like "int* foo") is an
    address. that's it. repeat: a pointer is an address. that
    address can be:
        - on the stack
        - on the heap
        - in the text section of the program
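
    [a minimal sketch that collects one address of each kind listed
     above. `struct addrs`, `collect_addrs`, and `sample_function` are
     hypothetical names; the addresses are stored as integers only so
     that they can be compared or printed.]

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct addrs {
    uintptr_t on_stack;   /* address of a local variable */
    uintptr_t on_heap;    /* address returned by malloc() */
    uintptr_t in_text;    /* address of a function's code */
};

static int sample_function(void) { return 42; }

struct addrs collect_addrs(void) {
    struct addrs a;
    int local = 0;
    int *heap = malloc(sizeof *heap);
    assert(heap != NULL);

    a.on_stack = (uintptr_t)&local;            /* stack */
    a.on_heap  = (uintptr_t)heap;              /* heap */
    a.in_text  = (uintptr_t)&sample_function;  /* text section */

    free(heap);
    return a;
}
```

    [printing the three values with "%p"-style output on a typical
     Linux machine shows three widely separated regions of the address
     space; the exact numbers vary from run to run.]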

    - because of how stack frames work, it's unequivocally a bug
    to use a pointer into a stack frame whose function has already
    returned (for example, returning the address of a local variable).
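
    [a sketch of the bug and two common ways around it. The buggy
     version is left inside a comment on purpose, since running it is
     undefined behavior. All function names here are made up for
     illustration.]

```c
#include <assert.h>
#include <stdlib.h>

/* BUG (kept as a comment): returns the address of a local whose
   frame is popped as soon as the function returns.

   int *make_counter_bad(void) {
       int counter = 7;
       return &counter;
   }
*/

/* OK: the caller owns the storage; the callee only fills it in.
   The caller's frame is still live while the callee runs. */
void make_counter_in(int *out) {
    assert(out != NULL);
    *out = 7;
}

/* OK: heap storage outlives the callee's frame.
   The caller is responsible for calling free(). */
int *make_counter_heap(void) {
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = 7;
    return p;
}
```

    [note the direction: handing a callee a pointer into the caller's
     (still live) frame is fine; handing a caller a pointer into the
     callee's (already popped) frame is the bug.]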

    - Unix calling conventions:

      --how could function main() know what happened in f()?
        --for example, has %rdx changed?

      --specifically, what happens to a function's state, that is, the registers,
      when a function is called? they might need to be saved, or not.

      --purely a matter of convention in the compiler, **not** hardware
      architecture

      --on x86-64, the first six integer/pointer *arguments* are passed in
      registers, in order: %rdi, %rsi, %rdx, %rcx, %r8, %r9 (more than six?
      then the rest go on the stack). And the *return value* is passed from
      callee to caller in %rax.

      --call-preserved and call-clobbered registers
        -- one calling convention:
        call-preserved registers are RBP, RBX, RSP, R12, R13, R14, and R15

      --Question: is %rax call-clobbered or call-preserved?
        [answer: call-clobbered because return value is stored in %rax so the
        callee needs to modify it.]
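
      [a small sketch, assuming the System V AMD64 convention that Linux
       uses: a function with eight integer arguments, annotated with
       where each argument travels. `sum8` is a made-up example.]

```c
#include <assert.h>

/* Under the System V AMD64 convention, integer arguments are assigned
   to registers in this order, and the rest go on the stack:
     a -> %rdi   b -> %rsi   c -> %rdx
     d -> %rcx   e -> %r8    f -> %r9
     g, h -> stack (placed there by the caller)
   The result travels back to the caller in %rax. */
long sum8(long a, long b, long c, long d,
          long e, long f, long g, long h) {
    return a + b + c + d + e + f + g + h;
}
```

      [compiling a file containing this with "gcc -O0 -S" and reading
       the generated assembly is a quick way to see the register and
       stack traffic for yourself.]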

-------------
Admin

  * don't have lottery today; a student's story

-------------

3. Implementation of processes

   Briefly cover the OS's view:

   - process control block (PCB)

     -----------------
     |   process id  |
     |   state       |   (ready, running, waiting, etc.)
     |  open files   |   (0:stdin, 1:stdout, 2:stderr)
     | VM structures |   (will talk about in memory part)
     |   registers   |
     |   .....       |  (signal mask, terminal, priority, ...) 
     -----------------

     [show slide; a simplified version]
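
     [the box above, sketched as a C struct. This is a toy: the field
      names, `MAX_OPEN_FILES`, and `pcb_init` are assumptions made for
      illustration and do not match any real kernel's definition.]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_OPEN_FILES 16

enum proc_state { PROC_READY, PROC_RUNNING, PROC_WAITING, PROC_ZOMBIE };

/* A subset of the register state, for illustration only;
   a real kernel saves the full register state of the process. */
struct saved_regs {
    uint64_t rip, rsp, rbp, rbx, r12, r13, r14, r15;
};

struct pcb {
    int               pid;
    enum proc_state   state;
    int               open_files[MAX_OPEN_FILES]; /* index = fd; -1 = unused */
    void             *vm;       /* placeholder for the VM structures */
    struct saved_regs regs;     /* filled in when the process is switched out */
};

void pcb_init(struct pcb *p, int pid) {
    assert(p != NULL);
    memset(p, 0, sizeof *p);
    p->pid = pid;
    p->state = PROC_READY;
    /* fds 0, 1, 2 start out open (stdin, stdout, stderr); the stored
       value stands in for an entry in a hypothetical open-file table */
    for (int fd = 0; fd < MAX_OPEN_FILES; fd++)
        p->open_files[fd] = (fd <= 2) ? fd : -1;
}
```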

   - process id
       --for example, the return value of fork() in the parent process

   - process states: real Linux process states include
       --running/runnable
       --interruptible sleep
       --uninterruptible sleep
       --zombie
       --stopped
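
   [a minimal sketch tying together the process id and the zombie
    state, using the POSIX calls fork(), waitpid(), and _exit().
    `spawn_and_reap` is a hypothetical helper name.]

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`; return the exit status the
   parent collects, or -1 on error. Between the child's exit and the
   parent's waitpid(), the child is a zombie: it has finished running,
   but its entry is kept so the parent can read the exit status. */
int spawn_and_reap(int code) {
    assert(code >= 0 && code <= 255);

    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
        _exit(code);            /* child: fork() returned 0 */

    /* parent: fork() returned the child's process id */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```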

   Question: Linux has "zombie"; why no "orphan" state?

   point out that during scheduling a core switches between processes,
   using a mechanism that we have not seen yet. will discuss the
   mechanism next.


4. Context switch intro

   --motivation: one CPU can run one process at a time; how to run multiple
     processes "at the same time" (multiplexing)?

   --context switch: OS stops the running process and 
      switches to another ready process.

   [draw switching between P1 and P2]

        P1             OS            P2
         |
    [trap to kernel]
         +------------>+
                       |
               [save P1 context]
               [choose P2 to run]
              [restore P2 context]
                       |
                       +------------->+
                                      |
                                     ...
                                [trap to kernel]
                       +<-------------+
                       |
               [save P2 context]
               [choose P1 to run]
              [restore P1 context]
                       |
         +<------------+
         |
        ...
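
   [a user-level imitation of the picture above, using the
    getcontext/makecontext/swapcontext calls from <ucontext.h>
    (assuming Linux with glibc). This is NOT how a kernel does it: here
    each "process" hands control back voluntarily, whereas a real OS
    regains control through a trap or interrupt. All names other than
    the ucontext calls are made up for the sketch.]

```c
#include <assert.h>
#include <ucontext.h>

#define STACK_BYTES (64 * 1024)

static ucontext_t sched_ctx, p1_ctx, p2_ctx;
static char p1_stack[STACK_BYTES], p2_stack[STACK_BYTES];
static char trace[16];      /* records who ran, in order */
static int trace_len;

static void record(char c) {
    assert(trace_len < (int)sizeof trace - 1);
    trace[trace_len++] = c;
}

/* Body of each toy "process": do one step of work, then give up the CPU. */
static void proc_body(int id) {
    ucontext_t *self = (id == 1) ? &p1_ctx : &p2_ctx;
    for (int step = 0; step < 2; step++) {
        record(id == 1 ? 'A' : 'B');
        swapcontext(self, &sched_ctx);  /* save self, resume the scheduler */
    }
}

static void proc_init(ucontext_t *ctx, char *stack, int id) {
    getcontext(ctx);
    ctx->uc_stack.ss_sp = stack;        /* each process gets its own stack */
    ctx->uc_stack.ss_size = STACK_BYTES;
    ctx->uc_link = &sched_ctx;          /* where to go if the body returns */
    makecontext(ctx, (void (*)(void))proc_body, 1, id);
}

/* The "OS": alternate between P1 and P2 for two rounds. */
const char *run_two_procs(void) {
    trace_len = 0;
    proc_init(&p1_ctx, p1_stack, 1);
    proc_init(&p2_ctx, p2_stack, 2);
    for (int round = 0; round < 2; round++) {
        swapcontext(&sched_ctx, &p1_ctx);   /* save scheduler, restore P1 */
        swapcontext(&sched_ctx, &p2_ctx);   /* save scheduler, restore P2 */
    }
    trace[trace_len] = '\0';
    return trace;
}
```

   [the loop counter `step` lives on each process's own stack, so each
    one picks up exactly where it left off; the recorded order of
    execution is "ABAB".]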

   --some points

      -- P1 and P2 have no idea they've been interrupted and switched out.

      -- OS (scheduler) decides which process to run next
         (but how does it make this decision? we will see in scheduling)

      -- if context switches happen frequently enough, users will feel
         P1 and P2 are running "at the same time"

   --context switching has a cost

     [draw two processes and kernel; switching from one to the other]

     --CPU time in kernel
     --save and restore registers
     --switch address spaces
     --indirect costs
       --TLB flushes (and the misses that follow), processor cache, OS
       caches (e.g., buffer caches)

     --result: more frequent context switches will lead to worse
     throughput (higher overhead)
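
     [a back-of-the-envelope model of that result: if a process runs
      for q microseconds between switches and each switch costs s
      microseconds, the fraction of CPU time spent on useful work is
      q / (q + s). The function name and the numbers in the note below
      are hypothetical, and the model ignores the indirect costs.]

```c
#include <assert.h>

/* Fraction of CPU time doing useful work, given the run time between
   switches (quantum_us) and the direct cost of one switch (switch_us). */
double useful_fraction(double quantum_us, double switch_us) {
    assert(quantum_us > 0.0 && switch_us >= 0.0);
    return quantum_us / (quantum_us + switch_us);
}
```

     [with a made-up switch cost of 10 us, switching every 90 us leaves
      90% of the CPU for useful work, while switching every 10 us
      leaves only 50%.]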