Week 13.a
CS 5600 11/30 2021

On the board
------------
1. Last time
2. Crash recovery: journaling
3. Security intro
4. Authentication

-------------------------------------------------------

Admin

--final exam: 12/14, Tue, 13:35
--lab4 errata (constantly updating):
  https://naizhengtan.github.io/21fall/notes/fix/fix04.html

-------

1. Last time

--mkdir
  --five blocks to update
    --bitmap
    --parent dir's inode
    --parent dir's data block
    --dir's inode
    --dir's data block
--crash recovery
  --ad-hoc: fsck
  --COW fs

2. Crash recovery: journaling

-- Copy-on-write showed that crash consistency is achievable when modifications **do not** modify (or destroy) the current copy.

   Golden rule of atomicity, per Saltzer-Kaashoek: "never modify the only copy"

-- Problem: copy-on-write carries significant write and space overheads. What do we do if we have a small disk?

-- Idea: borrow from how transactions are implemented in databases.
   -- a transaction: a set of operations that either all happen together or none of them happen

-- Core idea: treat file system operations as transactions. Concretely, this means that after a crash, failure recovery ensures that:
   * committed file system operations are reflected in on-disk data structures;
   * uncommitted file system operations are not visible after crash recovery.

-- Core mechanism: record enough information to finish applying committed operations (*redo operations*) and/or to roll back uncommitted operations (*undo operations*). This information is stored in a redo log or an undo log. We discuss both in detail next.

--concept: commit point: the point at which there is no turning back

--actions always look like this:
    --first step
    ....            [can back out, leaving no trace]
    --commit point
    ....            [completion is inevitable]
    --last step

--Question: what is the commit point when buying a house?

--Question: what is the commit point in the copy-on-write fs?
  [answer: when the uberblock is updated]

-- Redo logging

   * Used by Ext3 and Ext4 on Linux; we discuss it in that context.

   * The log is a fixed-length ring buffer placed at the beginning of the disk (see handout week12a).

   * Basic operations (a code sketch follows these steps):

     Step 1: planning
       The file system computes what an operation would change. For instance, creating a new file involves changes to directory inodes; appending to a file involves changes to the file's inode and data blocks.

     Step 2: begin txn
       The file system computes where in the log it can write this transaction, and writes a transaction begin record there (TxnBegin in the handout). This record contains a transaction ID, which must be unique. The file system **does not** need to wait for this write to finish and can immediately proceed to the next step.

     Step 3: write to journal
       The file system writes to the log a record (or records) detailing all the changes it computed in Step 1. It **must** then wait for these log records and the TxnBegin record (Step 2) to finish being written to disk.

     Step 4: commit txn
       Once the TxnBegin record and all the log records from Step 3 have been written, the system writes a transaction end record (TxnEnd in the handout). This record contains the same transaction ID as was written in Step 2, and the transaction is considered committed once the TxnEnd record has been successfully written to disk.

     Step 5: checkpointing
       Once the TxnEnd record has been written, the file system asynchronously performs the actual file system changes; this process is called **checkpointing**. While the system is free to perform checkpointing whenever it is convenient, the checkpoint rate dictates the size of the log that the system must reserve.
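   To make the five steps concrete, here is a minimal sketch of a redo log in
   Python, using an in-memory dict as the "disk". TxnBegin/TxnEnd follow the
   handout's names; everything else (the RedoLog class, the record format) is
   invented for illustration and is not Ext3/4's actual on-disk layout.
   apply_committed() implements the recovery scan described below.

       class RedoLog:
           def __init__(self):
               self.log = []      # the journal (a ring buffer on a real disk)
               self.disk = {}     # block number -> contents ("in-place" structures)
               self.next_txn = 0

           def begin(self):                        # Step 2: no need to wait
               txn_id = self.next_txn
               self.next_txn += 1
               self.log.append(("TxnBegin", txn_id))
               return txn_id

           def journal_write(self, txn_id, blockno, data):
               self.log.append(("Write", txn_id, blockno, data))   # Step 3

           def commit(self, txn_id):
               # Step 4: a real fs first waits for TxnBegin and all Write
               # records to be durable; TxnEnd reaching disk is the commit point
               self.log.append(("TxnEnd", txn_id))

           def apply_committed(self):
               # Used both for checkpointing (Step 5) and for crash recovery:
               # scan from the START of the log and apply only transactions
               # with a matching TxnEnd; re-applying is safe (idempotent).
               committed = {r[1] for r in self.log if r[0] == "TxnEnd"}
               for rec in self.log:
                   if rec[0] == "Write" and rec[1] in committed:
                       _, _, blockno, data = rec
                       self.disk[blockno] = data

   For example, appending to a file might journal two blocks:

       fs = RedoLog()
       t = fs.begin()
       fs.journal_write(t, 7, b"updated inode")
       fs.journal_write(t, 42, b"new data block")
       fs.commit(t)
       # a crash here loses nothing: recovery re-applies txn t
       fs.apply_committed()
       assert fs.disk[42] == b"new data block"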
--Question: which step is the commit point?
  [answer: Step 4; why? see recovery below]

* Crash recovery:

  During crash recovery, the file system reads through the log, determines the set of **committed** operations, and then applies them. Observe that:
  -- the file system can determine whether a transaction committed by matching transaction IDs in TxnBegin and TxnEnd records;
  -- it is safe to apply the same redo log multiple times.

  Operationally, when the system is recovering from a crash, it does the following:

    Step 1: The file system starts scanning from the beginning of the log.
    Step 2: Every time it finds a TxnBegin entry, it searches for a corresponding TxnEnd entry.
    Step 3: If matching TxnBegin and TxnEnd entries are found -- indicating that the transaction committed -- the file system applies (checkpoints) the changes.
    Step 4: Recovery is complete once the entire log has been scanned.

  Note: for redo logs, file systems generally begin scanning from the **start of the log**.

--Now, let's revisit crashes during the five steps above. Convince yourself that we are fine if the fs crashes at any moment.

[skip in class]
* What to log?
  Logging can double the amount of data written to disk. To improve performance, Ext3 and Ext4 let users choose what to log.
  * The default is to log only metadata. The idea is that many people are willing to accept data loss/corruption after a crash, but keeping metadata consistent is important: if metadata is inconsistent, the FS may become unusable because its data structures no longer have integrity.
  * Users can change settings to force data to be logged along with the metadata. This incurs additional overhead, but prevents data loss on a crash.

-- Undo logging

   * Not used in isolation by any file system.
   * Key idea: the log records how to roll back any changes made to data. (See the sketch after these steps.)

   Mechanically, during normal operation:

     Step 1: begin txn
       Write a TxnBegin entry to the log.

     Step 2: write to journal and checkpoint
       For each operation, write instructions for how to undo any updates made to a block. These instructions might include the block's original contents. In-place changes to the block may be made as soon as these instructions have been persisted.

     Step 3: finish checkpointing
       Wait for the in-place changes (what we referred to as checkpointing) to finish for all blocks.

     Step 4: commit txn
       Write a TxnEnd entry to the log, thereby committing the transaction.
       *Note*: this implies that if a transaction is committed, then all of its changes have already been written to the actual data structures of the file system.

   During crash recovery:

     Step 1: Scan the log to find all uncommitted transactions: those with a TxnBegin entry but no TxnEnd entry.

     Step 2: For each such transaction, check whether its undo entries are valid. This is usually done with a checksum.
       Why do we need this? A crash might occur before an undo entry has been completely written. If so, then (by the procedure above) the actual changes corresponding to that undo entry have not been written to disk either, so ignoring the entry is safe. On the other hand, trying to undo using a partially written entry might corrupt data, so using the entry would be **unsafe**.

     Step 3: Apply all valid undo entries, restoring the disk to a consistent state.

   Note: for undo logs, the log is generally scanned backwards from the **end of the log**.
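   Under the same made-up in-memory model as the redo sketch above, here is a
   minimal sketch of undo logging; the record format and the crc32 checksum
   are illustrative choices, not any real file system's format.

       import zlib

       class UndoLog:
           def __init__(self):
               self.log = []      # the journal
               self.disk = {}     # block number -> contents
               self.next_txn = 0

           def begin(self):                          # Step 1
               txn_id = self.next_txn
               self.next_txn += 1
               self.log.append(("TxnBegin", txn_id))
               return txn_id

           def write_block(self, txn_id, blockno, data):
               old = self.disk.get(blockno, b"")
               # Step 2: persist the undo record (old contents + checksum)
               # BEFORE the in-place write; the checksum lets recovery detect
               # partially written undo entries
               self.log.append(("Undo", txn_id, blockno, old, zlib.crc32(old)))
               self.disk[blockno] = data             # in-place update

           def commit(self, txn_id):
               # Steps 3-4: only after ALL in-place writes are durable may we
               # write TxnEnd; that write is the commit point
               self.log.append(("TxnEnd", txn_id))

           def recover(self):
               committed = {r[1] for r in self.log if r[0] == "TxnEnd"}
               # scan BACKWARDS, rolling back uncommitted transactions
               for rec in reversed(self.log):
                   if rec[0] == "Undo" and rec[1] not in committed:
                       _, _, blockno, old, csum = rec
                       if zlib.crc32(old) == csum:   # skip torn entries
                           self.disk[blockno] = old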
   * Advantage: changes can be checkpointed to disk as soon as the undo log has been updated. This is beneficial when the buffer cache is small.
   * Disadvantage: a transaction is not committed until all dirty blocks have been flushed to their in-place targets.

-- Redo logging vs. undo logging

   This is just a recap of the advantages and disadvantages.

   **Redo logging**
   * Advantage: a transaction can commit without all in-place updates (writes to actual disk locations) having completed; updating the journal is sufficient. Why is this useful? In-place updates might be scattered all over the disk, so the ability to delay them can improve performance.
   * Disadvantage: a transaction's dirty blocks must be kept in the buffer cache until the transaction commits and all of its journal entries have been flushed to disk. This can increase memory pressure.

   **Undo logging**
   * Advantage: a dirty block can be written to disk as soon as its undo-log entry has been flushed to disk. This reduces memory pressure.
   * Disadvantage: a transaction cannot commit until all of its dirty blocks have been flushed to disk. This imposes additional constraints on the disk scheduler and might result in worse performance.

   --a trade-off between memory usage (buffer-cache size) and txn commit time

-- Combining redo and undo logging

   * Done by NTFS. (How NTFS uses the two logs is detailed at the end of these notes.)
   * Goals:
     - allow dirty buffers to be flushed as soon as their associated journal entries are written (this can reduce memory pressure when necessary);
     - allow transactions to commit as soon as logging is done (this gives the system greater flexibility when scheduling disk writes).
   * Why? NTFS was designed at a time when the same operating system ran both on machines with very little memory (8-32MB) and on "big-iron" servers with lots of memory (1GB+). Combining the two logging schemes was an attempt to get the best of both worlds.

3. Security intro

Question: which option would you choose?
  option 1: a 50% chance of losing 500 dollars
  option 2: a 0.1% chance of losing all the money you have
[expected answer: option 2, because the probability is so low that it seems unlikely to happen. BUT the consequences are devastating!]
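For concreteness, a quick expected-value check (the $100,000 savings figure below is made up for illustration):

    loss1 = 0.5 * 500        # option 1: $250 expected loss
    loss2 = 0.001 * 100_000  # option 2: $100 expected loss, if you have $100,000
    breakeven = 250 / 0.001  # = $250,000: only above this wealth does option 2
                             # have the larger expected loss

So option 2 can even be the "rational" choice in expectation; what makes it dangerous is the catastrophic worst case.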
--this reflects why many security measures have lower priority than other features
--and why some security-enhanced systems have never been widely deployed

Also, security is a pervasive design issue, and it is context sensitive. Before declaring a system "secure" or "insecure", there are a number of questions to make clear:
-- what are the security assumptions? (are you assuming the entire OS is trustworthy and bug-free?)
-- how strong are the adversaries? (computational resources + what they can/cannot do)
-- what entities are to be protected? (data? code? execution? or what?)
-- ...
[an analogy is "performance": it is hard to call a system performant without specifying the workload, the resources used, the performance target, and many other factors.]

In fact, security is a broad topic that includes:
(copied from the S&P'22 call for papers, https://www.ieee-security.org/TC/SP2022/cfpapers.html)
- Applied cryptography [think of Lab1]
- Attacks with novel insights, techniques, or results [we will learn an old-school attack: buffer overflow]
- Authentication, access control, and authorization [we will see Authentication]
- Blockchains and distributed ledger security
- Cloud computing security
- Cyber physical systems security
- Distributed systems security
- Economics of security and privacy
- Embedded systems security
- Formal methods and verification
- Hardware security
- Hate, Harassment, and Online Abuse
- Intrusion detection and prevention
- Machine learning and computer security
- Malware and unwanted software
- Network security
- Operating systems security (*)
- Privacy-enhancing technologies, anonymity, and censorship
- Program and binary analysis
- Protocol security
- Security and privacy metrics
- Security and privacy policies
- Security architectures
- Security foundations
- Systems security
- Usable security and privacy
- Web security
- Wireless and mobile security/privacy

--In OS security, there are three main goals:
  * confidentiality: preventing data or other information from being obtained by unauthorized parties
  * integrity: keeping data and executions in the "correct" state, free from tampering by adversaries
  * availability: keeping the system functioning when valid users require service

--an example: [with some possible security topics]
  calculating "1 + 2"
  And you want...
  // regarding confidentiality:
  ...to prevent other processes/users from seeing the result
     [memory isolation, file system permissions, cryptography]
  ...to prevent the OS from seeing the computation
     [Trusted Execution Environments (e.g., Intel SGX), homomorphic encryption]
  ...(if you've run many programs) to retrieve one result without letting others know which result you fetched
     [Private Information Retrieval, Oblivious RAM]
  // regarding integrity:
  ...to prevent the data ("1", "2", and the result "3") from being modified
     [storage integrity, memory integrity]
  ...to prevent the computation from being tampered with
     [execution integrity: replication, attestation, and verifiable computation]
  // regarding availability:
  ...to prevent adversaries from disrupting the calculation and making the machine unavailable (denial-of-service attack)
     [DoS and DDoS defenses]

4. Authentication

Authentication is the process of verifying one's identity.

Approach 1: password
--more broadly, this is based on something that the user **knows**
  (other examples: security questions, PINs, ...)

Passwords were originally deployed in the 1960s for access to time-shared mainframe computers.

--plaintext passwords stored in files
  --attack: read the file
--hashed passwords (assumption: the hash function cannot be inverted)
  --attack: rainbow table attack
    --pre-compute hashes for all possible strings
    --find the users' password hashes in the rainbow table
    --return the plaintext passwords
--hashed and salted passwords (in 1979, by Robert Morris and Ken Thompson; see the sketch below)
  --pair each password with a "salt" (a random number, say 128 bits)
  --store the salted hash [= hash(password + salt)]
  --the password file contains: the salted hash and the salt
  --Question: why is the rainbow table attack ineffective in this case?
    [answer: because a comprehensive rainbow table would have to be 2^128 times larger than the original rainbow table!]
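A minimal sketch of salted hashing in Python. SHA-256 stands in for the hash function here; real password stores should use a deliberately slow password-hashing function (e.g., bcrypt, scrypt, or Argon2) to make brute force expensive.

    import os
    import hashlib

    def store_password(password: str):
        salt = os.urandom(16)    # 128-bit random salt, fresh per user
        salted_hash = hashlib.sha256(password.encode() + salt).digest()
        return salted_hash, salt # the password file stores both

    def check_password(candidate: str, salted_hash: bytes, salt: bytes) -> bool:
        # recompute the hash with the stored salt and compare
        return hashlib.sha256(candidate.encode() + salt).digest() == salted_hash

A per-user salt means a precomputed rainbow table would have to cover every possible (password, salt) pair, which is what inflates it by a factor of 2^128.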
However, here is the password status quo:
--empirical estimates suggest that over 40% of sites store passwords unhashed
--plaintext passwords: RockYou and Tianya
--hashed but unsalted: LinkedIn
--improperly hashed: Gawker
[J. Bonneau and S. Preibusch. The password thicket: technical and market failures in human authentication on the web. WEIS 2010.]

--Unix login (classic version)
  1. A privileged login process asks for the username, which is echoed on the screen.
  2. The login process asks for the password, which is not echoed.
  3. The login process checks the username and password against the password file.
     -- check the salted password hash
  4. On success, login forks and execs a shell with the user's id, and switches to the user's home directory.

  Questions:
  (a) why is the password not echoed on the screen?
  (b) on failure, should login tell the user whether the username or the password was incorrect?
  (c) on failure, should it take longer to reject a wrong username?
  [next time]

Approach 2: authentication based on what you **have**
Approach 3: authentication based on what you **are**

---

* How does NTFS work with undo/redo logs?

* Basic operations:

  Step 1: The file system computes what an operation would change. For instance, creating a new file involves changes to directory inodes; appending to a file involves changes to the file's inode and data blocks.

  Step 2: The file system computes where in the log it can write this transaction, and writes a transaction begin record there (TxnBegin in the handout). This record contains a transaction ID, which must be unique. The file system **does not** need to wait for this write to finish and can immediately proceed to the next step.

  Step 3: The file system writes both a redo log entry and an undo log entry for each of the changes it computed in Step 1. These live together in the log. The file system can begin making in-place changes (checkpointing) the moment this undo+redo information has been written.

  Step 4: Once the TxnBegin record and all the log records from Step 3 have been written, the system writes a transaction end record (TxnEnd in the handout). This record contains the same transaction ID as was written in Step 2, and the transaction is considered committed once the TxnEnd record has been successfully written to disk.

  Step 5: As in the redo-logging case, the file system asynchronously continues to checkpoint/perform in-place writes whenever convenient.

* Recovery
  For crash recovery, the file system first goes through the log, finds all committed transactions, and uses their redo entries to apply the committed changes. It then scans the log backwards, finds all uncommitted transactions, and uses their undo entries to roll back any in-place updates. (A sketch follows.)
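  Combining the two earlier sketches, here is a sketch of the redo+undo scheme
  under the same in-memory model (the record format and names are again
  invented for illustration, not NTFS's actual log format).

      import zlib

      class CombinedLog:
          def __init__(self):
              self.log = []      # the journal
              self.disk = {}     # block number -> contents
              self.next_txn = 0

          def begin(self):                          # Step 2
              txn_id = self.next_txn
              self.next_txn += 1
              self.log.append(("TxnBegin", txn_id))
              return txn_id

          def write_block(self, txn_id, blockno, new):
              old = self.disk.get(blockno, b"")
              # Step 3: redo (new) and undo (old) information live together
              # in one record; the checksum lets recovery skip torn records
              self.log.append(("Write", txn_id, blockno, old, new,
                               zlib.crc32(old + new)))
              self.disk[blockno] = new   # in-place write allowed immediately

          def commit(self, txn_id):
              # Step 4: the commit point; no need to wait for in-place writes
              self.log.append(("TxnEnd", txn_id))

          def recover(self):
              committed = {r[1] for r in self.log if r[0] == "TxnEnd"}
              # pass 1: redo committed transactions, scanning forwards
              for rec in self.log:
                  if rec[0] == "Write" and rec[1] in committed:
                      _, _, blockno, old, new, csum = rec
                      if zlib.crc32(old + new) == csum:
                          self.disk[blockno] = new
              # pass 2: undo uncommitted transactions, scanning backwards
              for rec in reversed(self.log):
                  if rec[0] == "Write" and rec[1] not in committed:
                      _, _, blockno, old, new, csum = rec
                      if zlib.crc32(old + new) == csum:
                          self.disk[blockno] = old

  This gets both properties: dirty blocks can be flushed as soon as their log
  record is durable (like undo logging), and the transaction commits as soon
  as TxnEnd is durable (like redo logging).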