week 11a
CS6640
03/17 2026
https://naizhengtan.github.io/26spring/

□ 1. on the difficulty of concurrency
□ 2. challenges beyond interleaving
□ 3. consistency model
---

1. On the difficulty of concurrency

    [cont'ed from last time]

    --why is concurrency hard?

        *** Hard to reason about all possible interleavings
        *** Source of non-determinism
        *** (deeper reason: human brain is "single-threaded")

    --handout:

    panel 1:
    Question: what can go wrong?
        1a:  x = 1 or x = 2.
        1b:  x = 13 or x = 25.
        1c:  x = 1 or x = 2 or x = 3 

            say x is at mem location 0x5000

            f is "x = x+1;", which might compile to:

            lw   t0, 0x5000    # load x into register t0
            addi t0, t0, 1     # t0 = t0 + 1
            sw   t0, 0x5000    # store result back to x

            g is "x = x+2;", which might compile to:

            lw   t1, 0x5000    # load x into register t1
            addi t1, t1, 2     # t1 = t1 + 2
            sw   t1, 0x5000    # store result back to x


    panel 2: incorrect list structure
    Question: what can go wrong?

       both threads will insert new List_elem as head
       one List_elem will be lost (later threads cannot access it)


    points:

       --all of these are called **race conditions**;
                not all of them are errors, though

       --all of these can happen on one-CPU machine
         (multi-core case can be worse; will see soon)

   --What are you going to do when debugging concurrency code?

       --worst part of errors from race conditions is that a program
       may work fine most of the time but only occasionally show
       problems. why?  (because the instructions of the various threads
       or processes or whatevever get interleaved in a
       non-deterministic order.)

       --and it's worse than that because inserting debugging code may
       change the timing so that the bug doesn't show up

2. challenges beyond interleaving

    --hardware makes the problem even harder

      [look at handout panel 3; what is the correct answer?]

      [draw these cases on board]

      [answer: "it depends on the hardware"]

    Question: why? how come these happen?
      [will answer soon]

    ...before that, what occurred in the previous examples?
    ...those executions correspond to interleavings that humans can reason about...
    ...this property is called sequential consistency.

    (sequential consistency not always in effect,
     which is the short answer to the question)

    --sequential consistency means:
        or what I call, "formal definition of how normal human being
        thinks of concurrency"

      --if a memory has _sequential consistency_, it means that:

          a concurrent program running on multiple CPUs
            looks as if
          the same program (with the same number of threads)
          running on a single CPU.

        or, more formally,

          there always _exists_ a sequential order of memory operations
          that is _equivalent_ to what the concurrent program produces
          running on multiple CPUs.

      - On multiple CPUs, we can get "interleavings" _that are
      impossible on single-CPU machines_. In other words, the number
      of interleavings that you have to consider is _worse_ than
      simply considering all possible interleavings of sequential code.

      - explain why sequential consistency is not held:

        -- CPUs -> Memory            (too slow)
        -- CPUs -> Caches -> Memory  (Write stall: invalidating cache requires a long time)
        -- CPUs -> StoreBuffers -> Caches -> Memory

        -- more...
          -- out-of-order execution
          -- speculative execution

       [draw:

          +-----+           +-----+
          | CPU |           | CPU |
          |     |           |     | [later: StoreBuffer]
          |     |           |     | [later: cache]
          +--+--+           +--+--+
             |                 | 
             +-------+---------+
                     |
                     v
               +------------+
               |   memory   |
               +------------+

       ]

      - To whom interested in knowing more about the
        details of memory consistency:
         --http://sadve.cs.illinois.edu/Publications/computer96.pdf
         --Paul E. McKenney, "Memory Barriers: a Hardware View for Software Hackers"

    --when you program, most of the time,
      you can assume sequential consistency in your mental model;
      better---if you use synchronization primitives properly.

  - in fact, checking sequential consistency (SC) is NP-hard

  - multi-core CPU architecture
    [show slides]

  - But it's not just hardware:

    -- Compiler may play a role as well.
    -- here is an example:

        int x = 0;

        void* foo(void* arg) {
            for (int i = 0; i < 1000000; i++) {
                x = x + 1;
            }
            return NULL;
        }

    Q: if foo() is run on two threads,
      what can be the possible output?

    Q: Can we see "1000000"?
    [YES! because a compiler may optimize foo() as

      """
       x = x + 1000000;
      """

      The two threads may interleave and have:
      (tmp1/2 are registers in CPU, x in memory)

       tmp1 = x               // thread 1
       tmp1 = tmp1 + 1000000  // thread 1 (tmp1 = 1000000)
       tmp2 = x               // thread 2 (tmp2 = 1000000)
       tmp2 = tmp2 + 1000000  // thread 2
       x = tmp1               // thread 1
       x = tmp2               // thread 2
    ]

  -- point: concurrency is HARD

3. consistency model

  -- formal specification of correctness for concurrent operations

  -- a core trade-off:

     strong consistency
     (execution order approximates a human-expected sequential schedule)
        vs.
    performance
    (system throughput and latency )

  -- hierachical of different consistency models
    See: https://jepsen.io/consistency/models

  -- checking them is an active research field
    See: https://sccp-workshop.github.io/sccp26/index.html