week 11
cs4973/cs6640
03/17 2025
https://naizhengtan.github.io/25spring/

□ admin: final project
□ 1. intro to fs
□ 2. Unix files
□ 3. fs namespace
□ 4. egos-fs preview
□ admin: exam
------

1. Intro to file systems

   Q: what does a FS do?

     1. provide persistence (don't go away ... ever)

     2. give a way to "name" a set of bytes on the disk (files)

     3. give a way to map from human-friendly-names to "names" (directories)

   --persistence comes from hardware:

   * classic persistent storage
     SSD/disk/tape (yes, we are still using tapes)

     --disk/SSD are the first thing we've seen that (a) doesn't go away;
     and (b) we can modify (BIOS ROM, hardware configuration, etc.
     don't go away, but we weren't able to modify these things).
     two implications here:

       (i) we're going to have to put all of our important state on
       the disk

       (ii) we have to live with what we put on the disk! scribble
       randomly on memory --> reboot and hope it doesn't happen
       again. scribble randomly on the disk --> now what? (answer:
       in many cases, we're hosed.)

   * new persistent storage:
     --glass:
       Project Silica: https://www.microsoft.com/en-us/research/project/project-silica/

     --persistent memory (see slides)

   --disk/ssd abstraction
     an array of blocks
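
     [sketch of the "array of blocks" abstraction. Assumptions, not
      from the lecture: a 512-byte block size, an in-memory list
      standing in for the device, and the hypothetical class name
      BlockDevice. The point is only that the unit of read/write is
      a whole block, addressed by its index.]

```python
BLOCK_SIZE = 512  # assumed block size in bytes


class BlockDevice:
    """A disk modeled as a fixed-length array of equally sized blocks."""

    def __init__(self, num_blocks):
        # Every block starts out zero-filled.
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read(self, block_no):
        # Reads always return one whole block.
        return self.blocks[block_no]

    def write(self, block_no, data):
        # Writes must cover one whole block; there are no partial writes.
        if len(data) != BLOCK_SIZE:
            raise ValueError("must write exactly one whole block")
        self.blocks[block_no] = bytes(data)


disk = BlockDevice(num_blocks=8)
disk.write(3, b"x" * BLOCK_SIZE)
```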

   --a note about abstracting machines
     -- CPU: scheduler
     -- Memory: virtual memory
     -- disk: file system

2. Files

    --what is a file?
        --answer from user's view: a bunch of named bytes on the disk
        --answer from FS's view: collection of disk blocks

    Q: what does a file do?

    --big job of a FS: map name and offset to disk blocks

                                 FS
                   {file,offset} --> disk address

        this is called file mapping.

    --the problem: "file mapping"
                 (file, offset) -> disk addr
      "70% of the time spent on file mapping" from paper
       [Neal, Ian, Gefei Zuo, Eric Shiple, Tanvir Ahmed Khan, Youngjin Kwon,
       Simon Peter, and Baris Kasikci. "Rethinking File Mapping for Persistent
       Memory.", FAST'21]

    --Unix inode: how Unix does file mapping

        [draw on board]

        permissions
        times for file access, file modification, and inode-change
        link count (# directories containing file)
        ptr 1  --> data block
        ptr 2  --> data block
        ptr 3  --> data block
        .....
        ptr 11  --> indirect block 
                      ptr --> 
                      ptr --> 
                      ptr --> 
                      ptr -->
                      ptr -->
        ptr 12 --> indirect block
        ptr 13 --> double indirect block
        ptr 14 --> triple indirect block

    This is just an imbalanced tree.
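
    [sketch of how an offset selects a path through the inode's
     pointers. Assumptions, not from the lecture: 4096-byte blocks,
     4-byte pointers (so 1024 pointers per indirect block), 10 direct
     pointers, and one pointer each for the single, double, and triple
     indirect blocks (a simplification of the drawing above). `locate`
     is a hypothetical helper; it reports the pointer path only and
     does not touch a real disk.]

```python
BLOCK_SIZE = 4096       # assumed bytes per block
NUM_DIRECT = 10         # assumed number of direct pointers
PTRS_PER_BLOCK = 1024   # assumed: 4096-byte block / 4-byte pointer


def locate(offset):
    """Return (level, indexes): which kind of pointer covers the byte at
    `offset`, and which slot to follow at each level of the tree."""
    blk = offset // BLOCK_SIZE  # logical block number within the file
    if blk < NUM_DIRECT:
        return ("direct", [blk])
    blk -= NUM_DIRECT
    if blk < PTRS_PER_BLOCK:
        return ("indirect", [blk])
    blk -= PTRS_PER_BLOCK
    if blk < PTRS_PER_BLOCK ** 2:
        return ("double", [blk // PTRS_PER_BLOCK, blk % PTRS_PER_BLOCK])
    blk -= PTRS_PER_BLOCK ** 2
    if blk < PTRS_PER_BLOCK ** 3:
        return ("triple", [blk // PTRS_PER_BLOCK ** 2,
                           (blk // PTRS_PER_BLOCK) % PTRS_PER_BLOCK,
                           blk % PTRS_PER_BLOCK])
    raise ValueError("offset beyond the maximum file size")
```

    The imbalance is visible here: small offsets resolve through a
    direct pointer, while far offsets must first read up to three
    indirect blocks before reaching the data block.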

   --Unix fs: (file, offset) -[inode]-> disk addr
    Why is the inode designed this way in Unix?
    "fs is a disk data structure":
      -- granularity: block
      -- reading/writing a block is expensive
      -- sequential access is faster than random access

   Q: How does the Unix fs read/write files?

          app
         /   \
        /     \
      mmap   read/write
     ---------------------
        \     /  [kernel]
         \   /
      page cache
           |
          disk

   --introduce mmap using an example
     * implementing mmap could be an interesting final project
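
     [a possible mmap example. This is an assumption: the notes do not
      record which example was used in class. It uses Python's standard
      `mmap` module; `demo_mmap` is a hypothetical name. The file is
      changed with a plain memory store instead of a write() call, and
      a later read() on the same file observes the change.]

```python
import mmap
import os
import tempfile


def demo_mmap():
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello, file system\n")
        # Length 0 maps the whole (non-empty) file, readable and writable.
        with mmap.mmap(fd, 0) as m:
            first = bytes(m[:5])   # reading the file is a memory load
            m[0:5] = b"HELLO"      # updating it is a memory store
            m.flush()              # ask for the change to be written back
        # The ordinary read() path sees what was stored through the mapping.
        os.lseek(fd, 0, os.SEEK_SET)
        after = os.read(fd, 5)
    finally:
        os.close(fd)
        os.unlink(path)
    return first, after
```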

   --Unix-style file mapping worked well in the past, when we had
      single-core CPUs,
      slow IO,
      sequential access faster than random access,
      block-granularity reads/writes,...
    ...but do these still hold today?

 Q: assume that memory is persistent,
    then do we need a fs?
    what would that fs look like?


3. fs namespace

  * the question under-the-hood:

    Q: Suppose you have 1,000 books at home. How would you organize them so that
       you can easily find the book you want?

    Q: Suppose you are a librarian with 1,000 books. How would you organize the books
       so that when a guest comes, you can find the book for them?

    Q: Is there any difference in how you organize the books in the two cases?

    [these apply to file systems
      books: files
      your organization system: namespace
    ]

    * a book is an extent-based read-only fs
      --book pages: disk (persistent)
      --articles/sections: files
      --table of contents: dirs

  * "boring" hierarchical namespace

    -- used since CTSS (1960s), and Unix picked it up and used it nicely

    -- structure like:
        [draw: 
                      "/"
           bin/     dev/     tmp/    usr/
         ls, grep    ...
        ]

    -- How to look up a file, say "/bin/ls"?
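
       [sketch of one way to answer this: start at the root and resolve
        one path component at a time. Assumptions, not from the lecture:
        directories are modeled as name -> inode-number tables, the
        inode numbers are made up, and `lookup` is a hypothetical
        helper that handles absolute paths only.]

```python
ROOT_INO = 1  # assumed inode number of "/"

# Hypothetical inode table mirroring the drawing above.
inodes = {
    1: {"type": "dir", "entries": {"bin": 2, "dev": 3, "tmp": 4, "usr": 5}},
    2: {"type": "dir", "entries": {"ls": 6, "grep": 7}},
    3: {"type": "dir", "entries": {}},
    4: {"type": "dir", "entries": {}},
    5: {"type": "dir", "entries": {}},
    6: {"type": "file"},
    7: {"type": "file"},
}


def lookup(path):
    """Walk the hierarchy from the root; return the inode number for `path`."""
    if not path.startswith("/"):
        raise ValueError("only absolute paths are handled in this sketch")
    ino = ROOT_INO
    for name in path.split("/"):
        if not name:  # skip the empty strings produced by "/" separators
            continue
        node = inodes[ino]
        if node["type"] != "dir":
            raise NotADirectoryError(path)
        if name not in node["entries"]:
            raise FileNotFoundError(path)
        ino = node["entries"][name]  # one directory read per component
    return ino
```

       Each component costs a directory read, so the work grows with
       the depth of the path.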

  * is the hierarchical namespace the only option? 
    No, take a look at databases.
    comparison: file system vs. databases
      databases: structured data
      fs: unstructured data

  * "Hierarchical file systems are dead" (2009)
    Margo Seltzer and Nicholas Murphy, HotOS'09

    -- hFAD argues,
       the hierarchical namespace worked well in the past.
       "The situation, however, has evolved"

    i) storage size grows
      -- 1992: 300MB disk
      -- 2009: 300GB disk (the paper's time, 17 years later)
      //-- 2023: 22TB disk & 8TB SSD (14 years later than paper)
      -- 2025: 36TB disk (from Seagate) & 8TB SSD (16 years later than paper)
      -- 2030(?): 1000TB SSD
         [https://www.techradar.com/news/1000tb-ssds-could-become-mainstream-by-2030-as-samsung-plans-1000-layer-nand]

    ii) "..they [file sizes] have not increased by the same margin."
       larger space => 
          file size wasn't growing that fast => 
            more files => 
              harder to manage
        [Q: does logic flow?]

    iii) "Google is a verb"
      users care about what they want, not where it lives
      [a sharp observation]
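
    [back-of-the-envelope arithmetic for item i), using only the disk
     sizes listed there. Assumptions: decimal units (1 MB = 10^6
     bytes) and a constant year-over-year growth factor;
     `annual_growth` is a hypothetical helper.]

```python
MB, GB, TB = 10**6, 10**9, 10**12  # decimal units (an assumption)


def annual_growth(old_bytes, new_bytes, years):
    """Constant yearly factor that turns old_bytes into new_bytes."""
    return (new_bytes / old_bytes) ** (1 / years)


# 1992 -> 2009: 300MB -> 300GB, a 1000x increase over 17 years
g_1992_2009 = annual_growth(300 * MB, 300 * GB, 2009 - 1992)

# 2009 -> 2025: 300GB -> 36TB, a 120x increase over 16 years
g_2009_2025 = annual_growth(300 * GB, 36 * TB, 2025 - 2009)
```

    The first factor comes out to roughly 1.5x per year and the second
    to roughly 1.35x per year, so disk capacity kept growing after the
    paper, but more slowly.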

   * Why is the hierarchical namespace a problem?
     -- files are siloed (no need for a global namespace)
        Sounds familiar?
          this is very much like the smartphone's logic
          (notice that this paper is from 2009; the iPhone debuted in 2007)
     -- walking the hierarchy (performance is bad without a cache)
     -- concurrency (this is fundamental)

   * hFAD design

     idea: instead of a hierarchical namespace,
           use a tagged, search-based namespace

     [see Fig]
     -- based on an object-based storage device (OSD)
     -- naming (via index) and accessing (via obj store)
     -- "An object is named by one or more tag/value pairs."

     Q: how about updates? deleting a file needs to update multiple
        indexes at the same time.

        Note: an index is a data structure that accelerates data retrieval,
        at the cost of more expensive updates and more space.
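
        [sketch of a tagged, search-based namespace and its update
         cost. Assumptions, not from the paper: the hypothetical class
         TagStore, an in-memory dict as the object store, and one
         index entry per tag/value pair. It is not hFAD's actual
         design, only an illustration of why a delete touches several
         indexes.]

```python
from collections import defaultdict


class TagStore:
    def __init__(self):
        self.objects = {}               # object id -> (data, tags)
        self.index = defaultdict(set)   # (tag, value) -> set of object ids
        self.next_id = 0

    def put(self, data, tags):
        oid = self.next_id
        self.next_id += 1
        self.objects[oid] = (data, dict(tags))
        for pair in tags.items():       # one index update per tag/value pair
            self.index[pair].add(oid)
        return oid

    def find(self, **query):
        # Naming is a search: intersect the id sets of every requested pair.
        sets = [self.index.get(pair, set()) for pair in query.items()]
        return set.intersection(*sets) if sets else set(self.objects)

    def delete(self, oid):
        # Deleting one object must update every index entry that names it.
        _, tags = self.objects.pop(oid)
        for pair in tags.items():
            self.index[pair].discard(oid)
```

        An object tagged with k pairs costs k index updates on both
        put and delete, which is the update overhead the note above
        refers to.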

   * DISCUSSION: search-based fs vs. hierarchical fs
     real question: human-readable vs. machine-readable