Week 5.a CS5600 10/05 2021 https://naizhengtan.github.io/21fall/ 1. Last time 2. Condition variable, continued 3. Monitors and standards 4. Advice 5. Practice with concurrent programming -------------------------------------------- Admin - Lab2 shell, released -- not much code, but a lot to absorb; start earlier -- a common feeling: "I kind of know how it works, but...not sure how to implement" -- read instructions carefully and multiple times; many hints - midterm -- talking to Khoury operation team -- (likely) 10/22 (Fri, week 7) -- where? unclear -- (likely) midterm review, a day before -- where? unclear - survey: -- attendees: 50% -- hard: 55% -- suffering: 20% (80% for neural + enjoyable) -- hard & enjoyable : 20% -------------------------------------------- 1. Last time -- concurrency world is dangerous -- critical section provides atomicity and helps us keep invariants -- two types of synchronization -- mutual exclusion -- scheduling constraints -- Mutexes -- acquire(): entering critical section -- release(): leaving critical section -- notice: -- two mutexes do not provide exclusion -- a thread re-acquiring the same mutex: deadlock -- Condition variables -- motivation (1 soda producer and 100 soda consumers) -- concept 2. Condition variables, --APIs --void cond_init (Cond *, ...); --Initialize --void cond_wait(Cond *c, Mutex* m); --Atomically unlock m and sleep until c signaled --Then re-acquire m and resume executing --void cond_signal(Cond* c); --Wake one thread waiting on c --void cond_broadcast(Cond* c); --Wake all threads waiting on c --two important points (1) We MUST use "while", not "if". Why? --Because we can get an interleaving like this: --The signal() puts the waiting thread on the ready list but doesn't run it --That now-ready thread is ready to acquire() the mutex (inside cond_wait()). --But a *different* thread (a third thread: not the signaler, not the now-ready thread) could acquire() the mutex, work in the critical section, and now invalidates whatever condition was being checked --Our now-ready thread eventually acquire()s the mutex... --...with no guarantees that the condition it was waiting for is still true --Solution is to use "while" when waiting on a condition variable --DO NOT VIOLATE THIS RULE; doing so will (almost always) lead to incorrect code --NOTE: NOTE: NOTE: There are two ways to understand while-versus-if: (a) It's the 'while' condition that actually guards the program. (b) There's simply no guarantee when the thread comes out of wait that the condition holds. (2) cond_wait releases the mutexes and goes into the waiting state in one function call (see panel 2b of the handout week4a). --QUESTION: Why? --Answer: can get stuck waiting. Producer: while (count == BUFFER_SIZE) Producer: release() Consumer: acquire() Consumer: ..... Consumer: cond_signal(&nonfull) Producer: cond_wait(&nonfull) --Producer will never hear the signal! Discussion: thread_cond_wait implementation -- cond_wait() implementation --pthread implementation: https://code.woboq.org/userspace/glibc/nptl/pthread_cond_wait.c.html thread_cond_wait(...) { release lock; do { wait-for-signal; } while (!signal) acquire lock; } -- Semaphores --Advice: don't use these. We're mentioning them only for completeness and for historical reasons: they were the first general-purpose synchronization primitive, and they were the first synchronization primitive that Unix supported. --Reasons we recommend against their use: --semaphores are dual-purpose (for mutual exclusion and scheduling constraints), so hard to read code and hard to get code right --semaphores have hidden internal state --(In this class, we require you to use condition variables and mutexes; semaphores will be considered incorrect.) 3. Monitors and standards Monitors = mutex + condition variables --High-level idea: an object (as in object-oriented systems) --in which methods do not execute concurrently; and --that has one or more condition variables --More detail --Every method call starts with acquire(&mutex), and ends with release(&mutex) --Technically, these acquire()/release() are invisible to the programmer because it is the programming language (i.e., the compiler+run-time) that is implementing the monitor --So, technically, a monitor is a programming language concept --But technical definition isn't hugely useful because no programming languages in widespread usage have true monitors --Java has something close: a class in which every method is set by the programmer to be "synchronized" (i.e., implicitly protected by a mutex) --And we can *use* mutexes and condition variables to implement our own manual versions of monitors, though we have to be careful --Given the above, we are going to use the term "monitor" more loosely to refer to both the technical definition and also a "manually constructed" monitor, wherein: --all method calls are protected by a mutex (that is, the programmer inserts those acquire()/release() on entry and exit from every procedure *inside* the object) --synchronization happens with condition variables whose associated mutex is the mutex that protects the method calls --In other words, we will use the term "monitor" to refer to the programming conventions that you should follow when building multithreaded applications --Example: see handout week5.a, panel 1 B. Standards & why? - RULES (3/6): [read all six rules from Mike Dahlin's writeup] --acquire/release at beginning/end of methods --hold lock when doing condition variable operations --always use "while" to check invariants, not "if" - why? --Mike Dahlin stands on the desk when proclaiming the standards --see Mike D's "Programming With Threads" --You are required to follow this document --You will lose points (potentially many!) on the exam if you stray from these standards --Note that in his example in section 4, there needs to be another line: --right before mutex->release(), he should have: assert(invariants hold) --the primitives may seem strange, and the rules may seem arbitrary: why one thing and not another? --there is no absolute answer here --**However**, history has tested the approach that we're using. If you use the recommended primitives and follow their suggested use, you will find it easier to write correct code --For now, just take the recommended approaches as a given, and use them for a while. If you can come up with something better after that, by all means do so! --But please remember three things: a. lots of really smart people have thought really hard about the right abstractions, so a day or two of thinking about a new one or a new use is unlikely to yield an advance over the best practices. b. the consequences of getting code wrong can be atrocious. see for example: http://sunnyday.mit.edu/papers/therac.pdf http://en.wikipedia.org/wiki/Therac-25 c. people who tend to be confident about their abilities tend to perform *worse*, so if you are confident you are a Threading and Concurrency Ninja and/or you think you truly understand how these things work, then you may wish to reevaluate..... --Dunning-Kruger effect https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect C. The Commandments --RULE: --acquire/release at beginning/end of methods --RULE: --hold lock when doing condition variable operations --Some people may say: "no need to hold the lock when signaling". IGNORE THIS. Putting the signal outside the lock is only a small performance optimization, and it is likely to lead you to write incorrect code. --Different styles of monitors: --Hoare-style: signal() immediately wakes the waiter --Hansen-style and what we will use: signal() eventually wakes the waiter. Not an immediate transfer --RULE: --a thread that is in wait() must be prepared to be restarted at any time, not just when another thread calls "signal()". (so always use "while") --why? because the implementor of the threads and condition variables package *assumes* that the user of the threads package is doing while(){wait()}. --Question: Can we replace SIGNAL with BROADCAST, given our monitor semantics? (Answer: yes, always.) Why? --while() condition tests the needed invariant. program doesn't progress pass while() unless the needed invariant is true. --result: spurious wake-ups are acceptable.... --...which implies you can always wakeup a thread at any moment with no loss of correctness.... --....which implies you can replace SIGNAL with BROADCAST [though it may hurt performance to have a bunch of needlessly awake threads contending for a mutex that they will then acquire() and release().] --Question: Can we replace BROADCAST with SIGNAL? --Answer: not always. --Example: --memory allocator --threads allocate and free memory in variable-sized chunks --if no memory free, wait on a condition variable --now posit: --two threads waiting to allocate chunks of memory --no memory free at all --then, a third thread frees 10,000 bytes --SIGNAL alone does the wrong thing: we need to awaken both threads 4. Advice for concurrent programming A. Top-level piece of advice: SAFETY FIRST. --Locking at coarse grain is easiest to get right, so do that (one big lock for each big object or collection of them) --Don't worry about performance at first --In fact, don't even worry about liveness at first --In other words don't view deadlock as a disaster --Key invariant: make sure your program never does the wrong thing B. More detailed advice: design approach [We will use handout week5.a as a case study.....] --Here's a four-step design approach: 1. Getting started: 1a. Identify units of concurrency. Make each a thread with a go() method or main loop. Write down the actions a thread takes at a high level. 1b. Identify shared chunks of state. Make each shared *thing* an object. Identify the methods on those objects, which should be the high-level actions made *by* threads *on* these objects. Plan to have these objects be protected by mutexes. 1c. Write down the high-level main loop of each thread. Advice: stay high level here. Don't worry about synchronization yet. Let the objects do the work for you. Separate threads from objects. The code associated with a thread should not access shared state directly (and so there should be no access to locks/condition variables in the "main" procedure for the thread). Shared state and synchronization should be encapsulated in shared objects. --QUESTION: how does this apply to the example on the handout? --separate loops for producer(), consumer(), and synchronization happens inside MyBuffer. Now, for each object: 2. Write down the synchronization constraints on the solution. Identify the type of each constraint: mutual exclusion or scheduling. For scheduling constraints, ask, "when does a thread wait"? --NOTE: usually, the mutual exclusion constraint is satisfied by the fact that we're programming with monitors. --QUESTION: how does this apply to the example on the handout? --Only one thread can manipulate the buffer at a time (mutual exclusion constraint) --Producer must wait for consumer to empty slots if all full (scheduling constraint) --Consumer must wait for producer to fill slots if all empty (scheduling constraint) 3. Create a lock or condition variable corresponding to each constraint --QUESTION: how does this apply to the example on the handout? --Answer: need a lock and two condition variables. (lock was sort of a given from the fact of a monitor). 4. Write the methods, using locks and condition variables for coordination D. More advice 1. Don't manipulate synchronization variables or shared state variables in the code associated with a thread; do it with the code associated with a shared object. --Threads tend to have "main" loops. These loops tend to access shared objects. *However*, the "thread" piece of it should not include locks or condition variables. Instead, locks and CVs should be encapsulated in the shared objects. --Why? (a) Locks are for synchronizing across multiple threads. Doesn't make sense for one thread to "own" a lock. (b) Encapsulation -- details of synchronization are internal details of a shared object. Caller should not know about these details. "Let the shared objects do the work." --Common confusion: trying to acquire and release locks inside the threads' code (i.e., not following this advice). Bad idea! Synchronization should happen within the shared objects. Mantra: "let the shared objects do the work". --Note: our first example of condition variables -- handout w4.a, item 2b -- doesn't actually follow the advice, but that is in part so you can see all of the parts working together. 2. Different way to state what's above: --You want to decompose your problem into objects, as in object-oriented style of programming. --Thus: (1) Shared object encapsulates code, synchronization variables, and state variables (2) Shared objects are separate from threads 5. Practice with concurrent programming --we **guarantee** to test concurrent programming in this course --Producer/consumer problem doesn't have "complex conditions" for accessing the shared object (the shared buffer); let's see a slightly complicated example. --an example: workers interact with a database --motivation: banking, airlines, etc. --readers never modify database --writers read and modify data --using only a single mutex lock would be overly restrictive. Instead, want --many readers at the same time --only one writer at a time --let's follow the concurrency advice ..... 1. Getting started a. what are units of concurrency? [readers/writers] b. what are shared chunks of state? [database] c. what does the main function look like? --read() check if writers (wait until no writers) access DB --write() check if readers or writers (wait until no readers or writers) access DB 2. and 3. Synchronization constraints and objects --only one thread manipulates reader/writer numbers at a time; only one thread modifies DB at a time. NOTE: **this does not mean only one thread in the DB at a time** (mutex) --reader can access DB when no writers (condition: okToRead) --writer can access DB when no other readers or writers (condition: okToWrite) 4. write the methods --inspiration required: int AR = 0; // # active readers int AW = 0; // # active writers int WR = 0; // # waiting readers int WW = 0; // # waiting writers --see handout w5.a for the code --QUESTION: why not just hold the lock all the way through "Execute req"? (Answer: the whole point was to expose more concurrency, i.e., to move away from exclusive access.) --QUESTION: what if we had shared locks? The implementation of shared locks is given on the handout [read handout w5.a, panel 3; HW3] [thanks to David Mazieres and Mike Dahlin for content in portions of this lecture.]