Week 6.a
CS5600
2/13 2023
https://naizhengtan.github.io/23spring/

0. Last time
1. Mutex
2. Condition variables
3. Semaphores
4. Monitors and standards
5. Advice for concurrent programming
--------------------------------------------

0. Last time

    --protecting critical sections.

        --want lock()/unlock() or enter()/leave() or acquire()/release()
        --lots of names for the same idea

        --in each case, the semantics are that once the thread of
        execution is executing inside the critical section, no other
        thread of execution is executing there

    --implementing critical sections

        --"easy" way, assuming a uniprocessor machine: 

            enter() --> disable interrupts
            leave() --> reenable interrupts

            [convince yourself that this provides mutual exclusion]

        --bakery algorithm (see below)

        --we will study other implementations (locks) later.


1. Mutexes

    -- Mutex : mutual exclusion object

    -- Usage:

        mutex_t m
        mutex_init(mutex_t* m)
        acquire(mutex_t* m)
        release(mutex_t* m)
        ....

        --pthread implementation:
         pthread_mutex_init(), pthread_mutex_lock(), ...
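
    A minimal sketch of the usage pattern above, written against the
    pthreads calls. The names counter_worker() and run_counter_demo()
    are made up for illustration; the comments show which generic call
    (mutex_init/acquire/release) each pthreads call stands in for.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS_PER_THREAD 100000

static pthread_mutex_t counter_mutex;  /* protects counter */
static long counter;

/* Each worker bumps the shared counter inside a critical section. */
static void *counter_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&counter_mutex);    /* acquire(&m) */
        counter++;                             /* critical section */
        pthread_mutex_unlock(&counter_mutex);  /* release(&m) */
    }
    return NULL;
}

/* Hypothetical driver: runs the workers and returns the final count. */
long run_counter_demo(void) {
    pthread_t threads[NUM_THREADS];
    int rc;

    counter = 0;
    rc = pthread_mutex_init(&counter_mutex, NULL);  /* mutex_init(&m) */
    assert(rc == 0);
    for (int i = 0; i < NUM_THREADS; i++) {
        rc = pthread_create(&threads[i], NULL, counter_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
    }
    pthread_mutex_destroy(&counter_mutex);
    return counter;
}
```

    Without the lock/unlock pair, the increments from different threads
    could interleave and some updates could be lost.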

    --using critical sections
      [see handout]

        --linked list example

        --bounded buffer example

    --why are we doing this?

        --because *atomicity* is required if you want to reason about
        the correctness of concurrent code

        --atomicity requires mutual exclusion aka a solution to critical
        sections

        --mutexes provide that solution

        --once you have mutexes, you don't have to worry about
        arbitrary interleavings. whole critical sections are still
        interleaved, but those are much easier to reason about than
        interleavings of individual operations.

        --why? because of _invariants_.

            examples of invariants:

            "list structure has integrity"

            "'count' reflects the number of entries in the buffer"

        the meaning of lock.acquire() is that if and only if you get
        past that line, it's safe to violate the invariants.

        the meaning of lock.release() is that right _before_ that line,
        any invariants need to be restored.

        the above is abstract.

        let's make it concrete:

            invariant: "list structure has integrity"

            so protect the list with a mutex

            only after acquire() is it safe to manipulate the list
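
    A sketch of that concrete case, assuming a singly linked list with
    a length field (this is not the handout's code; struct list,
    list_push(), list_drain(), and run_list_demo() are illustrative
    names). The invariant here is "length equals the number of nodes
    reachable from head"; it is broken only between acquire and release.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define PUSHERS 4
#define PUSHES_EACH 1000

struct node {
    int value;
    struct node *next;
};

struct list {
    pthread_mutex_t mutex;  /* protects head and length */
    struct node *head;
    int length;  /* invariant: number of nodes reachable from head */
};

void list_init(struct list *l) {
    pthread_mutex_init(&l->mutex, NULL);
    l->head = NULL;
    l->length = 0;
}

/* Pushes value on the front. Returns 0 on success, -1 if malloc fails. */
int list_push(struct list *l, int value) {
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;

    pthread_mutex_lock(&l->mutex);    /* past here, OK to break invariant */
    n->next = l->head;
    l->head = n;                      /* invariant broken: length is stale */
    l->length++;                      /* invariant restored */
    pthread_mutex_unlock(&l->mutex);
    return 0;
}

/* Walks the list under the mutex, frees every node, returns how many. */
int list_drain(struct list *l) {
    int walked = 0;
    pthread_mutex_lock(&l->mutex);
    while (l->head != NULL) {
        struct node *n = l->head;
        l->head = n->next;
        free(n);
        walked++;
    }
    assert(walked == l->length);      /* length matched the real node count */
    l->length = 0;
    pthread_mutex_unlock(&l->mutex);
    return walked;
}

static void *pusher(void *arg) {
    struct list *l = arg;
    for (int i = 0; i < PUSHES_EACH; i++) {
        int rc = list_push(l, i);
        assert(rc == 0);
    }
    return NULL;
}

/* Hypothetical driver: concurrent pushes, then count what got linked. */
int run_list_demo(void) {
    struct list l;
    pthread_t threads[PUSHERS];

    list_init(&l);
    for (int i = 0; i < PUSHERS; i++)
        pthread_create(&threads[i], NULL, pusher, &l);
    for (int i = 0; i < PUSHERS; i++)
        pthread_join(threads[i], NULL);
    int walked = list_drain(&l);
    pthread_mutex_destroy(&l.mutex);
    return walked;
}
```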

    --Question: by the way, why aren't we worried about *processes*
      trashing each other's memory?

      [answer: because the OS, with the help of the hardware,
      arranges for two different processes to have isolated memory
      space. such isolation is one of the uses of virtual memory,
      which we will study in a few weeks.]


2. Condition variables

    A. Motivation

        --producer/consumer queue 

        --very common paradigm. also called "bounded buffer":

            --producer puts things into a shared buffer
            --consumer takes them out
            --producer must wait if buffer is full; consumer must
              wait if buffer is empty
            --shows up everywhere
            --Soda machine: producer is delivery person, consumers
                are soda drinkers, shared buffer is the machine
            --OS implementation of pipe()
            --DMA buffers

    --producer/consumer queue using mutexes (see handout, 2a)

        --what's the problem with that?

        --answer: a form of *busy waiting* 

          --analogy: the next person keeps knocking on the door
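
    A rough sketch of what a mutex-only producer/consumer can look
    like. This is an assumption about the general shape, not a copy of
    handout panel 2a; the buffer layout and run_busy_wait_demo() are
    illustrative. Note the unlock/yield/lock loops: a thread that
    cannot make progress keeps re-checking instead of sleeping.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define BUFFER_SIZE 4
#define NUM_ITEMS 1000

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int buffer[BUFFER_SIZE];
static int count, in, out;  /* all protected by mutex */

static void *producer(void *arg) {
    (void)arg;
    for (int i = 1; i <= NUM_ITEMS; i++) {
        pthread_mutex_lock(&mutex);
        while (count == BUFFER_SIZE) {
            /* Full: drop the mutex so the consumer can run, then
               come straight back and check again (busy waiting). */
            pthread_mutex_unlock(&mutex);
            sched_yield();
            pthread_mutex_lock(&mutex);
        }
        assert(count < BUFFER_SIZE);
        buffer[in] = i;
        in = (in + 1) % BUFFER_SIZE;
        count++;
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

static void *consumer(void *arg) {
    long *sum = arg;
    for (int i = 0; i < NUM_ITEMS; i++) {
        pthread_mutex_lock(&mutex);
        while (count == 0) {
            /* Empty: same knock-on-the-door loop as the producer. */
            pthread_mutex_unlock(&mutex);
            sched_yield();
            pthread_mutex_lock(&mutex);
        }
        *sum += buffer[out];
        out = (out + 1) % BUFFER_SIZE;
        count--;
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

/* Hypothetical driver: returns the sum of everything consumed. */
long run_busy_wait_demo(void) {
    pthread_t p, c;
    long sum = 0;

    count = in = out = 0;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, &sum);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return sum;
}
```

    The code is correct (mutual exclusion holds), but a waiting thread
    burns CPU time re-checking a condition that only another thread
    can change.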

    --It is convenient to break synchronization into two types:

        --*mutual exclusion*: allow only one thread to access a given
        set of shared state at a time

        --*scheduling constraints*: wait for some other thread to do
        something (finish a job, produce work, consume work, accept
        a connection, get bytes off the disk, etc.)


    B. Usage

    --API

        --void cond_init (Cond *, ...); 
        --Initialize

        --void cond_wait(Cond *c, Mutex* m);
        --Atomically unlock m and sleep until c signaled 
        --Then re-acquire m and resume executing 

        --void cond_signal(Cond* c);
        --Wake one thread waiting on c

        [in some pthreads implementations, the analogous
        call wakes *at least* one thread waiting on c. Check
        the documentation (or source code) to be sure of the
        semantics. But, actually, your implementation shouldn't
        change since you need to be prepared to be "woken" at
        any time, not just when another thread calls signal().
        More on this below.]

        --void cond_broadcast(Cond* c);
        --Wake all threads waiting on c
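
    A small sketch of how the generic API above lines up with the
    pthreads calls, assuming a one-shot "event" that several threads
    wait for. struct event, event_wait(), event_set(), and
    run_event_demo() are illustrative names, not part of any library.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NUM_WAITERS 3

struct event {
    pthread_mutex_t mutex;    /* the Mutex passed to cond_wait */
    pthread_cond_t happened;  /* the Cond */
    bool flag;                /* the condition being waited for */
    int woken;                /* how many waiters got past event_wait() */
};

void event_init(struct event *e) {
    pthread_mutex_init(&e->mutex, NULL);
    pthread_cond_init(&e->happened, NULL);           /* cond_init */
    e->flag = false;
    e->woken = 0;
}

void event_wait(struct event *e) {
    pthread_mutex_lock(&e->mutex);
    while (!e->flag)
        pthread_cond_wait(&e->happened, &e->mutex);  /* cond_wait(c, m) */
    e->woken++;
    pthread_mutex_unlock(&e->mutex);
}

void event_set(struct event *e) {
    pthread_mutex_lock(&e->mutex);
    e->flag = true;
    /* cond_broadcast(c): every waiter should proceed. Using
       pthread_cond_signal() here would wake (at least) one waiter. */
    pthread_cond_broadcast(&e->happened);
    pthread_mutex_unlock(&e->mutex);
}

static void *waiter(void *arg) {
    event_wait(arg);
    return NULL;
}

/* Hypothetical driver: returns how many waiters were released. */
int run_event_demo(void) {
    struct event e;
    pthread_t threads[NUM_WAITERS];

    event_init(&e);
    for (int i = 0; i < NUM_WAITERS; i++)
        pthread_create(&threads[i], NULL, waiter, &e);
    event_set(&e);
    for (int i = 0; i < NUM_WAITERS; i++)
        pthread_join(threads[i], NULL);
    int woken = e.woken;
    pthread_cond_destroy(&e.happened);
    pthread_mutex_destroy(&e.mutex);
    return woken;
}
```

    The demo works whether a waiter arrives before or after
    event_set(): a late waiter sees flag already true and never sleeps.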

    C. Important points

        (1) We MUST use "while", not "if". Why?

        --Because we can get an interleaving like this:

            --The signal() puts the waiting thread on the ready list
            but doesn't run it

            --That now-ready thread is ready to acquire() the mutex
            (inside cond_wait()).

            --But a *different* thread (a third thread: not the
            signaler, not the now-ready thread) could acquire() the
            mutex, work in the critical section, and invalidate
            whatever condition was being checked

            --Our now-ready thread eventually acquire()s the
            mutex...

            --...with no guarantees that the condition it was
            waiting for is still true

        --Solution is to use "while" when waiting on a condition
        variable

        --DO NOT VIOLATE THIS RULE; doing so will (almost always)
        lead to incorrect code

            --NOTE: NOTE: NOTE: There are two ways to understand
            while-versus-if:
            (a) It's the 'while' condition that actually guards the program.
            (b) There's simply no guarantee when the thread comes out of
            wait that the condition holds. 

        (2) cond_wait releases the mutex and goes into the waiting
        state atomically, in one function call (see panel 2b of the
        handout).

            --QUESTION: Why?

        --Answer: if the release and the wait were two separate steps,
        a thread could get stuck waiting. For example:

            Producer: while (count == BUFFER_SIZE)
            Producer: release()
            Consumer: acquire()
            Consumer: .....
            Consumer: cond_signal(&nonfull)
            Producer: cond_wait(&nonfull)

        --Producer will never hear the signal!
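
    Both points can be seen in one small sketch, assuming a shared
    item count with several "taker" threads (take_item(), add_item(),
    and run_take_demo() are illustrative names). The wait sits in a
    "while" loop, and pthread_cond_wait() does the unlock-and-sleep as
    a single step.

```c
#include <assert.h>
#include <pthread.h>

#define TAKERS 3
#define ITEMS_PER_TAKER 500

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;
static int count;        /* items available; invariant: count >= 0 */
static int taken_total;  /* items removed so far */

static void take_item(void) {
    pthread_mutex_lock(&mutex);
    while (count == 0) {
        /* Point (2): unlocking and going to sleep happen as one
           atomic step, so a signal sent by a thread holding the
           mutex cannot fall between our check and our sleep. */
        pthread_cond_wait(&nonempty, &mutex);
    }
    /* Point (1): with "if" instead of "while", another taker could
       grab the item between the signal and our wakeup, and this
       assert could fail. */
    assert(count > 0);
    count--;
    taken_total++;
    pthread_mutex_unlock(&mutex);
}

static void add_item(void) {
    pthread_mutex_lock(&mutex);
    count++;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&mutex);
}

static void *taker(void *arg) {
    (void)arg;
    for (int i = 0; i < ITEMS_PER_TAKER; i++)
        take_item();
    return NULL;
}

/* Hypothetical driver: the caller produces, TAKERS threads consume. */
int run_take_demo(void) {
    pthread_t threads[TAKERS];

    count = 0;
    taken_total = 0;
    for (int i = 0; i < TAKERS; i++)
        pthread_create(&threads[i], NULL, taker, NULL);
    for (int i = 0; i < TAKERS * ITEMS_PER_TAKER; i++)
        add_item();
    for (int i = 0; i < TAKERS; i++)
        pthread_join(threads[i], NULL);
    assert(count == 0);
    return taken_total;
}
```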


    Discussion: thread_cond_wait implementation
    [skipped]

    --pthread implementation: 
    https://code.woboq.org/userspace/glibc/nptl/pthread_cond_wait.c.html

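        thread_cond_wait(...) {

            release lock;      /* this release and the start of the
                                  wait below must look atomic to
                                  signalers; see point (2) above */

            do {
                wait-for-signal;
            } while (!signal);

            acquire lock;      /* may block again: another thread may
                                  hold the lock when we wake up */
        }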


4. Monitors and standards

    Monitors = mutex + condition variables

    --High-level idea: an object (as in object-oriented systems)

    --in which methods do not execute concurrently; and

    --that has one or more condition variables

    --More detail

    --Every method call starts with acquire(&mutex), and ends with
    release(&mutex)

    --Technically, these acquire()/release() are invisible to the
    programmer because it is the programming language (i.e., the
    compiler+run-time) that is implementing the monitor

        --So, technically, a monitor is a programming language
        concept

        --But the technical definition isn't hugely useful because no
        programming languages in widespread usage have true monitors

        --Java has something close: a class in which every method is
        set by the programmer to be "synchronized" (i.e., implicitly
        protected by a mutex)

        --Not exactly a monitor because there's nothing forcing
        every method to be synchronized

        --And we can *use* mutexes and condition variables to
        implement our own manual versions of monitors, though we
        have to be careful

    --Given the above, we are going to use the term "monitor" more
    loosely to refer to both the technical definition and also a
    "manually constructed" monitor, wherein:

        --all method calls are protected by a mutex (that is, the
        programmer inserts those acquire()/release() on entry and
        exit from every procedure *inside* the object)

        --synchronization happens with condition variables whose
        associated mutex is the mutex that protects the method calls

    --In other words, we will use the term "monitor" to refer to the
    programming conventions that you should follow when building
    multithreaded applications

    --Example: see handout week6.b, panel 1
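
    For readers without the handout, here is a sketch of a "manually
    constructed" monitor in C with pthreads. It is an illustration
    under assumptions, not the handout's code: struct my_buffer,
    buffer_put(), buffer_get(), and run_buffer_demo() are made-up
    names. Every method begins with a lock and ends with an unlock,
    and both condition variables are tied to that same mutex.

```c
#include <assert.h>
#include <pthread.h>

#define BUFFER_SIZE 4
#define NUM_ITEMS 100

struct my_buffer {
    pthread_mutex_t mutex;    /* protects every field below */
    pthread_cond_t nonfull;   /* producers wait here */
    pthread_cond_t nonempty;  /* consumers wait here */
    int items[BUFFER_SIZE];
    int count, in, out;
};

void buffer_init(struct my_buffer *b) {
    pthread_mutex_init(&b->mutex, NULL);
    pthread_cond_init(&b->nonfull, NULL);
    pthread_cond_init(&b->nonempty, NULL);
    b->count = b->in = b->out = 0;
}

void buffer_put(struct my_buffer *b, int item) {
    pthread_mutex_lock(&b->mutex);                  /* method entry */
    while (b->count == BUFFER_SIZE)
        pthread_cond_wait(&b->nonfull, &b->mutex);
    b->items[b->in] = item;
    b->in = (b->in + 1) % BUFFER_SIZE;
    b->count++;
    pthread_cond_signal(&b->nonempty);              /* lock still held */
    pthread_mutex_unlock(&b->mutex);                /* method exit */
}

int buffer_get(struct my_buffer *b) {
    pthread_mutex_lock(&b->mutex);                  /* method entry */
    while (b->count == 0)
        pthread_cond_wait(&b->nonempty, &b->mutex);
    int item = b->items[b->out];
    b->out = (b->out + 1) % BUFFER_SIZE;
    b->count--;
    pthread_cond_signal(&b->nonfull);               /* lock still held */
    pthread_mutex_unlock(&b->mutex);                /* method exit */
    return item;
}

/* Thread code only calls methods; it never touches the mutex. */
static void *producer(void *arg) {
    for (int i = 1; i <= NUM_ITEMS; i++)
        buffer_put(arg, i);
    return NULL;
}

/* Hypothetical driver: one producer thread; the caller consumes. */
long run_buffer_demo(void) {
    struct my_buffer b;
    pthread_t p;
    long sum = 0;

    buffer_init(&b);
    pthread_create(&p, NULL, producer, &b);
    for (int i = 0; i < NUM_ITEMS; i++)
        sum += buffer_get(&b);
    pthread_join(p, NULL);
    pthread_cond_destroy(&b.nonfull);
    pthread_cond_destroy(&b.nonempty);
    pthread_mutex_destroy(&b.mutex);
    return sum;
}
```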

    B. Standards & why?

      - RULES:
       --acquire/release at beginning/end of methods
       --hold lock when doing condition variable operations
       --always use "while", not "if", to check the condition being
         waited on

      - why?

        --Mike Dahlin stands on the desk when proclaiming the standards

        --see Mike D's "Programming With Threads"

        --You are required to follow this document

        --You will lose points (potentially many!) on
        the exam if you stray from these standards

        --Note that in his example in section 4, there needs to be
        another line:

        --right before mutex->release(), he should have:
            assert(invariants hold)

        --the primitives may seem strange, and the rules may seem
        arbitrary: why one thing and not another?

        --there is no absolute answer here

        --**However**, history has tested the approach that we're
        using. If you use the recommended primitives and follow
        their suggested use, you will find it easier to write
        correct code
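
    A short sketch of the first rule together with the suggested
    assert right before release. The two-account example is
    hypothetical (struct accounts, accounts_transfer(), and
    run_accounts_demo() are illustrative names); the invariant is that
    the two balances always add up to TOTAL.

```c
#include <assert.h>
#include <pthread.h>

#define TOTAL 1000L
#define TRANSFERS 10000

struct accounts {
    pthread_mutex_t mutex;
    long a, b;  /* invariant: a + b == TOTAL */
};

void accounts_init(struct accounts *acc) {
    pthread_mutex_init(&acc->mutex, NULL);
    acc->a = TOTAL;
    acc->b = 0;
}

/* Moves amount from a to b (a negative amount moves the other way). */
void accounts_transfer(struct accounts *acc, long amount) {
    pthread_mutex_lock(&acc->mutex);   /* acquire at start of method */
    acc->a -= amount;                  /* invariant temporarily violated */
    acc->b += amount;                  /* invariant restored */
    assert(acc->a + acc->b == TOTAL);  /* assert(invariants hold) */
    pthread_mutex_unlock(&acc->mutex); /* release at end of method */
}

static void *mover(void *arg) {
    for (int i = 0; i < TRANSFERS; i++)
        accounts_transfer(arg, (i % 2 == 0) ? 1 : -1);
    return NULL;
}

/* Hypothetical driver: two threads shuffle money; returns a + b. */
long run_accounts_demo(void) {
    struct accounts acc;
    pthread_t t1, t2;

    accounts_init(&acc);
    pthread_create(&t1, NULL, mover, &acc);
    pthread_create(&t2, NULL, mover, &acc);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    long total = acc.a + acc.b;
    pthread_mutex_destroy(&acc.mutex);
    return total;
}
```

    The assert costs little and turns a silently broken invariant into
    an immediate, debuggable failure.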

    --For now, just take the recommended approaches as a given,
    and use them for a while. If you can come up with something
    better after that, by all means do so!

    --But please remember three things:

        a. lots of really smart people have thought really hard
        about the right abstractions, so a day or two of
        thinking about a new one or a new use is unlikely to
        yield an advance over the best practices.

        b. the consequences of getting code wrong can be
        atrocious. see for example:

        http://www.nytimes.com/2010/01/24/health/24radiation.html
        http://sunnyday.mit.edu/papers/therac.pdf
        http://en.wikipedia.org/wiki/Therac-25

        c. people who tend to be confident about their abilities
        tend to perform *worse*, so if you are confident you are
        a Threading and Concurrency Ninja and/or you think you
        truly understand how these things work, then you may
        wish to reevaluate.....

        --Dunning-Kruger effect
          --https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect

    C. The Commandments

        --RULE:

        --acquire/release at beginning/end of methods

        --RULE:

        --hold lock when doing condition variable operations

        --Some people
            [for example, Andrew Birrell in this paper:
                http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-35.pdf ]
        will say: "for experts only, no need to
        hold the lock when signaling". IGNORE THIS. Putting the signal
        outside the lock is only a small performance optimization, and
        it is likely to lead you to write incorrect code.

        --Different styles of monitors:

          --Hoare-style: signal() immediately wakes the waiter

          --Hansen-style and what we will use:
          signal() eventually wakes the waiter. Not an immediate transfer

        --RULE:

        --a thread that is in wait() must be prepared to be restarted at
        any time, not just when another thread calls "signal()".

        --why? because the implementor of the threads and condition
        variables package *assumes* that the user of the threads package
        is doing while(){wait()}.


5. Advice for concurrent programming

    A. Top-level piece of advice: SAFETY FIRST.

    --Locking at coarse grain is easiest to get right, so do
    that (one big lock for each big object or collection of
    them)

    --Don't worry about performance at first

    --In fact, don't even worry about liveness at first

        --In other words, don't view deadlock as a disaster

    --Key invariant: make sure your program never does the wrong thing

    B. More detailed advice: design approach

    --Here's a four-step design approach:

        1. Getting started:

         1a. Identify units of concurrency. Make each a thread with
         a go() method or main loop. Write down the actions a thread
         takes at a high level.  

         1b. Identify shared chunks of state. Make each shared
         *thing* an object. Identify the methods on those objects,
         which should be the high-level actions made *by* threads
         *on* these objects. Plan to have these objects be protected by mutexes.

         1c. Write down the high-level main loop of each thread. 

        Advice: stay high level here. Don't worry about synchronization 
        yet. Let the objects do the work for you. 

        Separate threads from objects. The code associated with a
        thread should not access shared state directly (and so there
        should be no access to locks/condition variables in the
        "main" procedure for the thread). Shared state and
        synchronization should be encapsulated in shared objects. 

        --QUESTION: how does this apply to the example on the
        handout?
        --separate loops for producer(), consumer(), and
        synchronization happens inside MyBuffer.

        Now, for each object: 

        2. Write down the synchronization constraints on the
        solution. Identify the type of each constraint: mutual
        exclusion or scheduling. For scheduling constraints, ask,
        "when does a thread wait"?

        --NOTE: usually, the mutual exclusion constraint is
        satisfied by the fact that we're programming with
        monitors.

        --QUESTION: how does this apply to the example on the
        handout?
            --Only one thread can manipulate the buffer at a time
            (mutual exclusion constraint)
            --Producer must wait for consumer to empty slots if all
            full (scheduling constraint)
            --Consumer must wait for producer to fill slots if all
            empty (scheduling constraint)

        3. Create a lock or condition variable corresponding to each 
        constraint 

        --QUESTION: how does this apply to the example on the
        handout?
            --Answer: need a lock and two condition variables.
            (lock was sort of a given from the fact of a monitor).

        4. Write the methods, using locks and condition variables for 
        coordination  
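
    To see the four steps on a problem other than the handout's, here
    is a sketch that assumes a coordinator thread must wait until
    several workers finish. All names (struct latch, latch_arrive(),
    latch_wait(), run_latch_demo()) are invented for this example; the
    comments tie each piece back to a step.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 4

/* Step 1b: the shared object; its methods are what threads do to it. */
struct latch {
    pthread_mutex_t mutex;    /* step 3: mutual exclusion constraint */
    pthread_cond_t all_done;  /* step 3: scheduling constraint */
    int remaining;            /* workers that have not arrived yet */
};

void latch_init(struct latch *l, int n) {
    pthread_mutex_init(&l->mutex, NULL);
    pthread_cond_init(&l->all_done, NULL);
    l->remaining = n;
}

/* Step 4: called by a worker when its job is finished. */
void latch_arrive(struct latch *l) {
    pthread_mutex_lock(&l->mutex);
    assert(l->remaining > 0);
    l->remaining--;
    if (l->remaining == 0)
        pthread_cond_broadcast(&l->all_done);
    pthread_mutex_unlock(&l->mutex);
}

/* Step 4, using step 2's answer to "when does a thread wait?":
   while some worker has not arrived. */
void latch_wait(struct latch *l) {
    pthread_mutex_lock(&l->mutex);
    while (l->remaining > 0)
        pthread_cond_wait(&l->all_done, &l->mutex);
    pthread_mutex_unlock(&l->mutex);
}

struct worker_arg {
    struct latch *latch;
    int id;
    int result;  /* written by one worker, read only after latch_wait() */
};

/* Steps 1a/1c: worker main procedure. No locks or condition
   variables appear here; the latch object does that work. */
static void *worker(void *arg) {
    struct worker_arg *w = arg;
    w->result = w->id * w->id;  /* the "job" */
    latch_arrive(w->latch);
    return NULL;
}

/* Steps 1a/1c: coordinator. Returns the sum of the workers' results. */
int run_latch_demo(void) {
    struct latch latch;
    struct worker_arg args[NUM_WORKERS];
    pthread_t threads[NUM_WORKERS];
    int sum = 0;

    latch_init(&latch, NUM_WORKERS);
    for (int i = 0; i < NUM_WORKERS; i++) {
        args[i] = (struct worker_arg){ &latch, i + 1, 0 };
        pthread_create(&threads[i], NULL, worker, &args[i]);
    }
    latch_wait(&latch);  /* blocks until every worker has arrived */
    for (int i = 0; i < NUM_WORKERS; i++)
        sum += args[i].result;
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(threads[i], NULL);
    pthread_cond_destroy(&latch.all_done);
    pthread_mutex_destroy(&latch.mutex);
    return sum;
}
```

    As in the producer/consumer case, one lock plus one condition
    variable per scheduling constraint falls out of steps 2 and 3.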


[thanks to David Mazieres, Mike Dahlin, and Mike Walfish
for content in portions of this lecture.]