Week 15.a
CS 5600
04/24 2022

On the board
------------

0. Lab3
1. Last time
2. Access control (Unix)

----------------------------------------------------

Admin

--Final review this Friday online; stay tuned

--Lab4's grading is the test cases you have.

--Lab3
  --how grading works
  --some bummers, working on it
  --Lab3 regrading session (TBD)

--Lab3 challenge
  --statistics
  --KV-store implementation
  --how you handle concurrency


-----

1. Last time: authentication

  Last time:
  Approach 1: password
    -- authenticated by *what you know*

  Approach 2: authenticating based on what you have

    --idea: something the user has can prove identity,
      for example, ID card (without photo),
      security token, smart card, ...

    --NEU's two factor authentication

  Approach 3: authentication by what you are

   --idea: unique biological features or behaviors can identify a person,
     for example, fingerprints, DNA, Apple face id, ...

   -- examples of using all three approaches:

    NEU's two factor authentication + iPhone (face id)
    OR
    NEU's two factor authentication + Android (fingerprint)


  As an aside, a question:
  which option would you choose?
       option 1: a 50% chance of losing 500 dollars
       option 2: a 0.1% chance of losing all the money you have

       [expected answer: option 2, because the probability is so low that it is
       unlikely to happen. BUT the consequence is devastating!]

        --this is why security measures often get lower priority
        than other features

        --and some security-enhanced systems have never been widely deployed


2. Access control (Unix)

  The problem of access control:

    A subject accesses an object. Should the OS allow or deny the access?

    subjects: users, processes, or any other actors

    objects: files, devices, or any other resources

    (different abstractions will give you different subjects/objects).

    There are two common approaches:
      --access control list (ACL)
      --capability-based

    Both are used in today's OSes.

    At a high level:

    An ACL is usually associated with objects.
      When a subject accesses an object, the
      system checks whether the subject is in the object's access list.

    A capability is usually associated with subjects.
      When a subject accesses an object, the
      system checks whether the subject holds a capability for the object.
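
    [sketch, not from the lecture: the two checks side by side. The
    struct names, field names, and PERM_* constants are made up for
    illustration; real systems store this information very differently.
    The point is only where the list lives: with the object (ACL) or
    with the subject (capability).]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { PERM_READ = 1, PERM_WRITE = 2 };   /* hypothetical permission bits */

/* ACL: the list lives with the object and names subjects. */
struct acl_entry { int subject_id; int perms; };
struct object {
    int id;
    const struct acl_entry *acl;
    size_t acl_len;
};

/* Capability: the list lives with the subject and names objects. */
struct capability { int object_id; int perms; };
struct subject {
    int id;
    const struct capability *caps;
    size_t caps_len;
};

/* ACL check: search the object's list for the subject. */
bool acl_allows(const struct object *obj, int subject_id, int want)
{
    assert(obj != NULL);
    for (size_t i = 0; i < obj->acl_len; i++) {
        if (obj->acl[i].subject_id == subject_id)
            return (obj->acl[i].perms & want) == want;
    }
    return false;   /* subject not listed: deny by default */
}

/* Capability check: search the subject's list for the object. */
bool cap_allows(const struct subject *subj, int object_id, int want)
{
    assert(subj != NULL);
    for (size_t i = 0; i < subj->caps_len; i++) {
        if (subj->caps[i].object_id == object_id)
            return (subj->caps[i].perms & want) == want;
    }
    return false;   /* no capability held: deny by default */
}
```

    Both functions answer the same question; they differ in which side
    you must consult (and therefore which side you must protect).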


 A. Intro to Unix's access control

    * UIDs and GIDs

    UIDs are historically unsigned 16-bit integers (0-65535);
    modern systems such as Linux use 32 bits.

    UNIX keeps the mapping between usernames and UIDs in the file /etc/passwd.
      [try "$ cat /etc/passwd"]

    see your UID by:
      $ id <username>
      and you can get your username by "$ whoami"
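
    [sketch, not from the lecture: the same lookup from C. getuid(2)
    and getpwuid(3) are standard POSIX calls; the helper names
    uid_to_name and print_my_identity are made up for illustration.]

```c
#define _XOPEN_SOURCE 700   /* expose POSIX.1-2008 declarations */
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the username for `uid` into `buf`.
 * Returns 0 on success, -1 if the UID has no entry in the user
 * database or the name does not fit in `buf`. */
int uid_to_name(uid_t uid, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    struct passwd *pw = getpwuid(uid);   /* typically backed by /etc/passwd */
    if (pw == NULL || strlen(pw->pw_name) >= len)
        return -1;
    strcpy(buf, pw->pw_name);
    return 0;
}

/* Print "uid=<n>(<name>)" for the calling process, loosely like id(1). */
void print_my_identity(void)
{
    char name[256];
    uid_t uid = getuid();
    if (uid_to_name(uid, name, sizeof name) == 0)
        printf("uid=%lu(%s)\n", (unsigned long)uid, name);
    else
        printf("uid=%lu\n", (unsigned long)uid);
}
```

    Note that the kernel only ever deals in numeric UIDs; the name is a
    user-space convenience.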

    special user: uid 0, called root, treated specially by
    the kernel as administrator

        uid 0 has all permissions: can read any file, do anything

        certain ops only root can do:
        --binding to ports less than 1024
        --change current process's user or group ID
        --mount or unmount file systems
        --opening raw sockets (so you can do something like ping remote machines,
        for example)
        --set clock
        --halt or reboot machine
        --change UIDs (so login program needs to run as root)

    GIDs are also historically 16-bit integers.
    A group represents a group of users.
      [see all groups by "$ cat /etc/group"]

    * processes have a user ID and one or more group IDs

      when a process runs, it is associated with UID/GIDs
        [see them by "$ ps -l"]

    * files and directories are access-controlled.

        you saw this in Lab4 (recall "mode" in inode)

        system stores with each file who owns it.

        where's the info stored? (answer: inode.)
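
    [sketch, not from the lecture: a simplified version of the classic
    owner/group/other check, using the owner and mode that stat(2)
    reads out of the inode. The names may_access, path_allows, and
    WANT_* are made up. Assumptions: a single group per process, and
    root always wins. Real kernels also consult supplementary groups,
    capabilities, and ACLs.]

```c
#define _XOPEN_SOURCE 700   /* expose POSIX.1-2008 declarations */
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

enum { WANT_R = 4, WANT_W = 2, WANT_X = 1 };   /* same layout as one rwx triple */

/* Simplified Unix permission check (assumption: one GID per process). */
bool may_access(uid_t uid, gid_t gid,
                uid_t file_uid, gid_t file_gid, mode_t mode, int want)
{
    assert((want & ~7) == 0);
    if (uid == 0)
        return true;               /* simplification: root bypasses the check */

    int bits;
    if (uid == file_uid)
        bits = (mode >> 6) & 7;    /* owner triple */
    else if (gid == file_gid)
        bits = (mode >> 3) & 7;    /* group triple */
    else
        bits = mode & 7;           /* other triple */
    return (bits & want) == want;
}

/* Fetch owner and mode from the file's inode, then apply the check.
 * Returns 1 if allowed, 0 if denied, -1 if stat() fails. */
int path_allows(const char *path, uid_t uid, gid_t gid, int want)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return may_access(uid, gid, st.st_uid, st.st_gid, st.st_mode, want) ? 1 : 0;
}
```

    Only one triple is ever consulted: an owner with mode 0044 is
    denied read even though group and other would be allowed.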

    [draw figure of a Unix system with UIDs]

      [running processes]
      User (uid=1000) --> login (uid=0)
                            | (check username/passwd)
                            |
                            +----> shell (uid=1000)
                                     |
                                     +--> vim (uid=1000)
                                     +--> gcc (uid=1000)
                                     +--> chrome (uid=1000)

      [fs]
        / (owner:uid=0)
        |
        +-->home (owner:uid=0)
             |
             +--> user (owner: uid=1000)
                   |
                   +-> ...

      note: devices are abstracted as files in Unix, so they are
      access-controlled in the same manner.

    Notice that login is authentication (our last topic).
    Questions:
      (a) why is the password not echoed on the screen?
      (b) if login fails, should it tell the user whether the username or the password was incorrect?
      (c) if login fails, should it take longer to reject a wrong username?


 B. Setuid

    So far so good, are we done?

    Question: how can users update their password stored in "/etc/passwd"?
    Can a user "vim /etc/passwd" and modify it?
    [answer: of course not! The file also contains other users' information.]

    --Some legitimate actions require more privs than the user's UID grants
        --E.g., how should users change their passwords?
        --Passwords are stored in root-owned /etc/passwd and /etc/shadow files

    --going to go into a bit of detail. 
      why? because setuid/setgid are the sole means on Unix to *raise* a
      process's privilege level

        --Solution:  Setuid/setgid programs

           idea: a way for root -- or another user -- to delegate its
           ability to do something.

        --special "setuid" bit in the permissions of a file

        --Run with privileges of file's owner or group

        --Each process has _real_ and _effective_ UID/GID

        -- _real_ is user who launched setuid program

        -- _effective_ is owner/group of file, used in access checks

        --for setuid programs, on exec() of binary, kernel sets
        effective uid = file uid
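
    [sketch, not from the lecture: how a program can observe its real
    and effective UIDs and give the elevated one back. getuid(2),
    geteuid(2), and setuid(2) are standard; the helper names are made
    up. Caveat: setuid(getuid()) is a permanent drop only when the
    effective UID is root; a binary that is setuid to a non-root user
    keeps its saved set-user-ID and can switch back.]

```c
#define _XOPEN_SOURCE 700   /* expose POSIX.1-2008 declarations */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* True when the process runs with someone else's privileges,
 * i.e. real UID (who launched it) != effective UID (used in checks). */
bool running_setuid(void)
{
    return getuid() != geteuid();
}

/* Set the effective UID back to the real UID.
 * Returns 0 on success, -1 on failure. */
int drop_privileges(void)
{
    uid_t real = getuid();
    if (setuid(real) != 0)
        return -1;
    assert(geteuid() == real);   /* never continue with stale privileges */
    return 0;
}

/* Print both IDs, e.g. for debugging a setuid binary. */
void report_ids(void)
{
    printf("real uid=%lu, effective uid=%lu\n",
           (unsigned long)getuid(), (unsigned long)geteuid());
}
```

    A careful setuid program does its one privileged operation and then
    calls something like drop_privileges() as early as possible.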

    --Examples:

        --/usr/bin/passwd : change a user's passwd. User needs to be
        able to run this, but only root can modify the password file.

        --/bin/su: change to new user ID if correct password is typed.

            $ ls -l `which passwd`
            -rwsr-xr-x 1 root root 63736 Jul 27  2021 /usr/bin/passwd

            $ ls -l `which su`
            -rwsr-xr-x 1 root root 63568 Jan 10  2021 /usr/bin/su

        [note the 's' in "-rwsr-xr-x"]

        --Obviously need to own file to set the setuid bit
        --Need to own file and be in group to set setgid bit

    --Have to be EXTREMELY careful when writing setuid code

        --Here's an example for intuition

            Imagine you leave your terminal unattended, and some other
            user ("attacker") sits down and types:

            $ cp /bin/sh /tmp/break-acct
            $ chmod 4755 /tmp/break-acct

            the leading 4 sets the setuid bit.
            the 755 means  "rwxr-xr-x"

            Question: what will happen if the attacker (or anyone else)
            later runs:

            $ /tmp/break-acct -p

            result: attacker now has a shell with your privileges and
            can do anything you can do (read your private files, remove
            them, overwrite them, etc.). in fact anyone on the system
            can run break-acct to get the same effect (since it's
            world-executable).

            More generally, imagine that you are writing a program on a
            shared system, you are the owner, and you set the setuid bit
            on it.

            What you are doing is letting that program run with *your*
            privileges.

        --Of course that was an attack. Sometimes people intentionally
        install setuid-root binaries. When you do that, as a system
        administrator or packager, you have to be extremely careful. 

            You're saying in essence that everyone on the system should
            be able to run the binary with root's privileges.

        --Fundamental reason you need to be careful: very difficult
        to anticipate exactly how and in what environment the code
        will be run....yet when it runs, it runs with *your*
        privileges (where "your" equals "root" or "whoever set the
        setuid bit on some code they wrote")

        --NOTE: Attackers can run setuid programs any time (no need
        to wait for root to run a vulnerable job)

        --FURTHER NOTE: Attacker controls many aspects of program's
        environment

 C. Example attacks that exploit setuid

    --running a program with setuid assumes that the functionality of the
    program is limited such that "no harm" can be done by an attacker.

      for example, "passwd" only changes the password of whoever calls
      this program...

      ...AND it authenticates users with old passwords.

      There are no other things that "passwd" can do.

    --So, if all programs have limited functions, what can go wrong?

    --here are two examples.

    (1) Close fd 2 before execing program

        -- passwd program (pseudocode):

          fd = open("/etc/passwd")
          ask user the old password
          check
          ask user the new password
          write(fd, new_password)

        --attack: close fd 2 before execing program
          (recall fork-exec separation: the attacker's code runs
          between fork and exec, so it controls which fds are open)

        --notice: the setuid program then opens the passwd file
        ("/etc/passwd"). normally this would be fd=3, but because
        fd 2 was closed, open() returns the lowest free fd, which is 2.

        --then, the program later hits an error
        and does fprintf(stderr, "some error msg").

        --result: stderr is fd 2, so the error message goes into the
        password file!

        --fix: for setuid programs, kernel will open dummy fds
        for 0,1,2 if not already open
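
        [sketch, not from the lecture: the same defense done in user
        space at the top of main(), for programs that cannot rely on
        the kernel doing it. It leans on the fact that open() returns
        the lowest free descriptor. ensure_std_fds is a made-up name.]

```c
#define _XOPEN_SOURCE 700   /* expose POSIX.1-2008 declarations */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Make sure fds 0, 1, and 2 are open, pointing any that were closed
 * at /dev/null. Returns 0 on success, -1 if /dev/null cannot be opened. */
int ensure_std_fds(void)
{
    for (;;) {
        int fd = open("/dev/null", O_RDWR);
        if (fd < 0)
            return -1;
        if (fd > STDERR_FILENO) {
            /* 0, 1, and 2 were all taken: nothing left to plug. */
            close(fd);
            assert(fcntl(STDERR_FILENO, F_GETFD) != -1);
            return 0;
        }
        /* fd is 0, 1, or 2: it had been closed by the caller.
         * Leave it open on /dev/null and look for the next hole. */
    }
}
```

        After this runs, a later open("/etc/passwd") can no longer land
        on fd 2, so stray error messages go to /dev/null instead.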


    (2) a program called "preserve" installed as setuid root; used by
    old editors (like the old vi) to make a backup of files in a
    root-accessible directory.

        --preserve program (pseudocode):
            save a copy to some file
            system("/bin/mail")

        [it does this to send email to notify the user that there is
        a backup, for example after a crash/restart]

        --"system" uses the shell to parse its argument

        --now if IFS (internal field separator) is set to "/" before
        running vi, then we get the following:

            --vi forks and execs /usr/lib/preserve (IFS is still set
            to '/', but exec() call doesn't care)

            --preserve invokes system("/bin/mail"), but this causes
            shell to parse the arguments as:
                bin mail

            --which means that if the attacker locally had a
            malicious binary called 'bin', then that binary could
            do:

                cd /homes/mydir/bin 
                cp /bin/sh ./sh 
                chown root sh  # this succeeds because 'bin' is running as root
                chmod 4755 sh  # this succeeds because 'bin' is running as root

            (the leading 4 means "set the setuid bit")

            --result is that there is now a copy of the shell
            executable that is owned by root and setuid root

            --anyone who runs this shell has a root shell on the
            machine

    --Question: how to fix this problem?

    --shell has to ignore IFS if the shell is running as
    root or if EUID != UID.
        (in general, all shell environment is dangerous when EUID != UID)
        (also, "preserve" should not have been setuid root;
        there should have been a special user/group just for this
        purpose.)

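    --also, modern systems refuse to honor the setuid bit on scripts
    (Linux ignores it in the kernel), and shells such as bash reset
    EUID to UID at startup unless invoked with -p.
    (the issue there is a bit different, but it is related.)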
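
    [sketch, not from the lecture, and not the historical fix: a
    privileged program can sidestep the shell entirely by replacing
    system() with fork + execve, an absolute path, and a fixed
    environment. Then IFS, PATH, and friends from the attacker never
    reach a parser. run_without_shell and the contents of clean_env are
    made up for illustration.]

```c
#define _XOPEN_SOURCE 700   /* expose POSIX.1-2008 declarations */
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at absolute path argv[0] with a fixed, minimal
 * environment and no shell in between.
 * Returns the child's exit status, 127 if exec failed, -1 on error. */
int run_without_shell(char *const argv[])
{
    /* Nothing is inherited from the (possibly hostile) caller. */
    static char *const clean_env[] = { "PATH=/usr/bin:/bin", "IFS= \t\n", NULL };

    assert(argv != NULL && argv[0] != NULL && argv[0][0] == '/');

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(argv[0], argv, clean_env);   /* no word splitting happens */
        _exit(127);                         /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    In the "preserve" example this would mean calling something like
    run_without_shell with argv = { "/bin/mail", ..., NULL } instead of
    system("/bin/mail").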

    More reading about the setuid bit and the classic example above:
         http://web.deu.edu.tr/doc/oreily/networking/puis/ch05_05.htm


 D. TOCTTOU attacks (time-of-check-to-time-of-use)

    --very common attack

    --say there's a setuid program that needs to log events to a
    file, specified by the caller. The code might look like this,
    where logfile is from user input

        fd = open(logfile, O_CREAT|O_WRONLY|O_TRUNC, 0666);

    --what's the problem?

        --setuid program shouldn't be able to write to file that user
        can't. thus:

        if (access(logfile, W_OK) < 0)
            return ERROR;
        fd = open(logfile, ....)

        [note: access checks against the real UID (not the effective
          UID), meaning "(assuming I'm a setuid binary) can the user who
          invoked me read/write/execute this file?"]

        should fix it, right?

        NO!

    --here's the attack........

        attacker runs setuid program, passing it "/tmp/X"

        ---------------------------------------------------
        setuid program                   attacker
        ---------------------------------------------------

                                         creat("/tmp/X");

        check access("/tmp/X")
        --> OK
                                         unlink("/tmp/X");
                                         symlink("/etc/passwd", "/tmp/X")

        open("/tmp/X")

        ---------------------------------------------------

    --from the BSD man pages:
        "access() is a potential security hole and should never be
        used."

    --the issue is that access check and open are non-atomic

    --to fix this, have to jump through hoops: manually traverse
    paths. check at each point that the dir you're in is the one you
    expected to be in (i.e., that you didn't accidentally follow a
    symbolic link). maybe check that path hasn't been modified
        also need to use APIs that are relative to an opened directory
        fd:

            -- openat, renameat, unlinkat, symlinkat, faccessat
            -- fchown, fchownat, fchmod, fchmodat, fstat, fstatat

        Or

            (make the operations atomic)
            Wrap groups of operations in OS transactions

            --Microsoft supports transactions on Windows Vista and
            newer
            https://msdn.microsoft.com/en-us/library/windows/desktop/bb986748%28v=vs.85%29.aspx

            --research papers:

            http://www.fsl.cs.sunysb.edu/docs/valor/valor_fast2009.pdf
            http://www.sigops.org/sosp/sosp09/papers/porter-sosp09.pdf
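
    [sketch, not from the lecture: one way to avoid the access()/open()
    gap for the logfile example. Instead of asking "could the user
    open this?" and then opening as root, the program temporarily
    switches its effective UID to the real UID, so the kernel's own
    check inside open() is the only check. It also opens relative to a
    directory fd and refuses symlinks with O_NOFOLLOW.
    open_log_as_caller is a made-up name. Limits: O_NOFOLLOW only
    guards the last path component of `dir`, so a full solution would
    still walk the path one component at a time.]

```c
#define _XOPEN_SOURCE 700   /* expose POSIX.1-2008 declarations */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Open (creating if needed) regular file `name` inside directory `dir`
 * for appending, with the invoking user's privileges.
 * Returns an fd, or -1 with errno set. */
int open_log_as_caller(const char *dir, const char *name)
{
    assert(dir != NULL && name != NULL);
    if (strchr(name, '/') != NULL) {      /* must be a single component */
        errno = EINVAL;
        return -1;
    }

    uid_t saved_euid = geteuid();
    if (seteuid(getuid()) != 0)           /* act as the real user for now */
        return -1;

    int fd = -1;
    int dirfd = open(dir, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
    if (dirfd >= 0) {
        /* Permission check and open happen in one system call. */
        fd = openat(dirfd, name,
                    O_WRONLY | O_CREAT | O_APPEND | O_NOFOLLOW, 0600);
        struct stat st;
        if (fd >= 0 && (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode))) {
            close(fd);                    /* refuse devices, FIFOs, etc. */
            fd = -1;
            errno = EINVAL;
        }
        int e = errno;
        close(dirfd);
        errno = e;
    }

    int saved_errno = errno;
    if (seteuid(saved_euid) != 0) {       /* could not restore: fail closed */
        if (fd >= 0)
            close(fd);
        return -1;
    }
    errno = saved_errno;
    return fd;
}
```

    In the attack timeline above, the attacker's symlink swap no longer
    helps: the open either fails because of O_NOFOLLOW or is judged
    with the attacker's own permissions.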


 E. The confused deputy and capability

    * motivating example (the original story)

      [Norm Hardy. The Confused Deputy (or why capabilities might have been
      invented). ACM SIGOPS Operating Systems Review, 1988]

      the setup:
        --the author's company ran timesharing services.
          (this was the mainframe era, when people timeshared an expensive computer.)
        --they had a compiler which
           a) took user input about where to write debugging logs (for example, "home/user/logs")
           b) needed to write logs to a file "sysx/stat", and hence had write permission to "sysx/"
        --meanwhile, user bills were stored in "sysx/bill"

      a bummer:
        --some users figured out the setup and...
          ...asked the compiler to use the file name "sysx/bill" for its debugging output...
          ...which ruined the bills of course.

      Question:
        Who is to blame? Which component went wrong in this process?

      Superficially, the problem is
        giving the compiler write permission to the **entire** "sysx/".
        There are certain files (like "sysx/stat") that the compiler should
        work on, but there are others (like "sysx/bill") that it should not touch.

      deeper:
        "The fundamental problem is that the compiler runs with authority
        stemming from two sources. (That's why the compiler is a confused deputy.)"
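
      [sketch, not from the paper or the lecture: the capability-style
      way out, approximated with Unix file descriptors. The caller
      opens the debug file itself, under its own authority, and hands
      the deputy an fd instead of a name. The deputy never resolves a
      caller-chosen path, so its own write permission on "sysx/" is
      never exercised on the caller's behalf. write_debug_log is a
      made-up name.]

```c
#define _XOPEN_SOURCE 700   /* expose POSIX.1-2008 declarations */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Deputy-side routine: write `msg` to whatever the caller handed over.
 * The fd both designates the file and carries the authority to write
 * it; no path name chosen by the caller is ever opened here.
 * Returns 0 on success, -1 on failure. */
int write_debug_log(int caller_fd, const char *msg)
{
    assert(msg != NULL);
    size_t len = strlen(msg);
    ssize_t n = write(caller_fd, msg, len);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

      A user who cannot open "sysx/bill" for writing simply has no fd
      to pass in, so there is nothing for the deputy to be confused
      about.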


    * capability

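    additionally, Linux's capability system (`man 7 capabilities`)
    splits root's privileges into separate units that can be granted
    to or withheld from a process independently. A process that has
    not been granted a capability cannot perform the corresponding
    privileged operations.
    (note: despite the name, these are privilege bits, not the
    object capabilities of Hardy's paper.)

    for example: the ability to change UIDs is a capability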

    """
    CAP_SETUID
      * Make arbitrary manipulations of process UIDs (setuid(2), setreuid(2), setresuid(2), setfsuid(2));
      * forge UID when passing socket credentials via UNIX domain sockets;
      * write a user ID mapping in a user namespace (see user_namespaces(7)).
    """
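
    [sketch, not from the lecture: checking whether the current process
    holds CAP_SETUID without linking against libcap, by reading the
    "CapEff:" line of /proc/self/status (Linux only). CAP_SETUID is
    bit 7 in <linux/capability.h>; the helper names are made up.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { CAP_SETUID_BIT = 7 };   /* value of CAP_SETUID on Linux */

/* True if capability number `cap` is set in bitmask `mask`. */
bool mask_has_cap(uint64_t mask, int cap)
{
    assert(cap >= 0 && cap < 64);
    return ((mask >> cap) & 1u) != 0;
}

/* Parse one line of the form "CapEff:\t<hex digits>".
 * Returns true and stores the mask on success. */
bool parse_capeff_line(const char *line, uint64_t *mask)
{
    static const char key[] = "CapEff:";
    const char *digits = line + sizeof key - 1;
    if (strncmp(line, key, sizeof key - 1) != 0)
        return false;
    char *end;
    unsigned long long v = strtoull(digits, &end, 16);
    if (end == digits)
        return false;          /* no hex digits after the key */
    *mask = v;
    return true;
}

/* Read the effective capability mask of the calling process. */
bool read_effective_caps(uint64_t *mask)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return false;
    char line[256];
    bool found = false;
    while (!found && fgets(line, sizeof line, f) != NULL)
        found = parse_capeff_line(line, mask);
    fclose(f);
    return found;
}
```

    A typical use would be: call read_effective_caps(&m), then test
    mask_has_cap(m, CAP_SETUID_BIT) before attempting setuid(2).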

 F. Thoughts / editorial

    --at a high level, the real issue is that the correct version of the
    code is way harder to write than the incorrect version:
        --correct version has to traverse path manually
        --be super-careful when running as setuid

    --cannot just blame application writers; must also blame the
    interfaces with which they're presented.

    --rules are incoherent. not clear how permissions compose

    --for all that, Unix security is actually quite inflexible:

        --can't pass privileges to other processes
        --can't have multiple privileges at once
        --not a very general mechanism (cannot create a user or group
        unless root)

[thanks to Mike Walfish, David Mazieres, Nickolai Zeldovich, Robert Morris]