lab4: CS3650 File System

In lab4, you will implement a simplified Unix-like file system, called fs3650 (CS3650 File System).

fs3650 uses the FUSE (File system in User SpacE) library, which allows us to implement a file system in user space. This makes the development much easier than in kernel space. FUSE exposes a few file system function interfaces that need to be instantiated and your task is to fill in some of these function bodies.

As always, there is not too much code to write (if you organize your code nicely); instead, a lot of the work of the lab is understanding the system that you’ve been given.

Section 0: Getting started

Click the GitHub lab4 link on Canvas homepage to create your lab4 clone on GitHub.
Start your VM and open the VM’s terminal.
Clone the lab4 repo to your VM:
```
$ cd ~
$ git clone git@github.com:NEU-CS3650-labs/lab4-<Your-GitHub-Username>.git lab4
```
Note that the repo address git@github.com:... can be obtained by going to GitHub repo page (your cloned lab4), clicking the green “Code” button, then choosing “SSH”.

Check contents:

$ cd ~/lab4; ls

// you should see:
diskfmt.py  fs3650.h  gen-disk1.py  homework.c  lab4fuse.c  Makefile  memcrc.py  misc.c  print-disk.py

Install required packages (Ubuntu):
```
 sudo apt install libfuse3-dev
```
(note: if you’re not using Ubuntu but other Linux distributions, please Google and install the corresponding packages in your platform.)

Exercise 0 the first glance of the unimplemented fs3650.

Here’s an example to show how fs3650 works with FUSE, where test.img is a disk image that CS3650 staff created:
 $ cd ~/lab4
 $ make          // you should see:
 gcc -Wall -pedantic -g   -c -o homework.o homework.c
 ...

 $ mkdir fs     // create a directory to serve as a mount point

 $ df fs        // see what file system fs is associated with
 Filesystem     1K-blocks    Used Available Use% Mounted on
 /dev/sda1        7092728 4536616   2172780  68% /

 $ ls fs        // notice, "fs" is empty

 $ ./lab4-fuse -image test.img fs    // mount test.img at fs

 $ df fs        // this ends up with an error because you haven't implemented fs3650.
                // but notice that fs's file system is different.
 df: fs: Operation not supported

 $ ls fs        // similarly, "ls" reports an error as well
 ls: cannot access 'fs': Operation not supported

 $ ls -l        // you will see that fs is in an unknown state
 ...
 d????????? ? ?      ?            ?            ? fs
 ...

 $ fusermount -u fs  // now unmount fs

 $ ls fs        // and "fs" is an empty dir again
Note that in the above example, after we run lab4-fuse, the kernel is actually dispatching all file system calls that ls and df make to your fs3650 implementation. Given that you haven’t implemented these functions, fs3650 returns error messages that these operations are not supported. Later, when fusermount is run, fs3650 is unmounted from fs, and then all I/O operations under fs return to being serviced normally by the kernel.
Below, we will use $ make mount and $ make umount to mount and umount fs3650 to fs (see Makefile for details).

Section 1: CS3650 File System (fs3650)

How you should use fs3650 descriptions: below are the design of fs3650 (which we study in class) and a bunch of facts about fs3650. Here is how you should use them:

First, you should skim all the contents before you start coding.
After skimming, you should have a basic understanding of how things work and have an impression of what information is where.
Later, when something is unclear or uncertain, you should come back and study the corresponding part carefully.

fs3650 disk interface
File system format
Superblock
Inode
File mode
Directory

We also provide some useful tips and how FUSE (the underlying framework connecting fs3650 and Linux) works in the apprendices:

Appendix 1: fs3650 tips
Appendix 2: FUSE

fs3650 disk interface

The fs3650 disk is abstracted as an array of blocks. The size of each block is 4KB (4096 bytes or FS_BLOCK_SIZE). You can read disk blocks via a simple interface:

int block_read(void *buf, int lba, int nblks);

The buf is a pointer to the output memory - blocks read by this function will be stored here. The lba is the block id (starting from 0) that you want to read. The nblks is the number of blocks you will read.

File system format

The format of fs3650 is as follows:

the first block (block 0) is the superblock;
the second block (block 1) is the root directory;
other blocks are blocks that contain either inodes or data.

   +-------+----------+------------------------+
   | super | root dir |     data blocks ...    |
   | block |   inode  |                        |
   +-------+----------+------------------------+
 block     0          1         2          3 ...

Superblock

The superblock is the first block in the file system, and contains the information needed to find the rest of the file system structures. The following C structure (see fs3650.h) defines the superblock:

struct fs_super {
    uint32_t magic;
    uint32_t disk_size;    /* in blocks */
    uint32_t root_inode;   /* block number */

    /* pad out to an entire block */
    char pad[FS_BLOCK_SIZE - 2 * sizeof(uint32_t)];
};

The magic is a number to distinguish fs3650 from other file systems.
disk_size indicates the size of the disk, in blocks.
root_inode tells where the root (/) is located (in fs3650, this is inode 1).

Note that uint32_t is a standard C type found in the <stdint.h> header file, and refers to an unsigned 32-bit integer. (similarly, uint16_t, int16_t and int32_t are unsigned/signed 16-bit ints and signed 32-bit ints)

Inodes

fs3650 simplifies the standard Unix inode.

ERROR: the figure is missing

In particular, inode is defined as follows (see also fs3650.h):

  struct fs_inode {
      uint16_t uid;      /* file owner */
      uint16_t gid;      /* group */
      uint32_t mode;     /* type + permissions (see below) */
      uint32_t ctime;    /* last status change time */
      uint32_t mtime;    /* modification time */
      int32_t  size;     /* size in bytes */
      uint32_t ptrs[FS_BLOCK_SIZE/4 - 5]; /* inode = 4096 bytes */
  };

Each inode corresponds to a file or directory. In fs3650, inodes are uniquely identified by their inode number, which is equal to their block id. Again, fs3650 uses block ids as inode numbers. (This is different from the Unix fs and makes fs3650 simpler.) The root directory is always found at inode 1; inode 0 (the super block) is invalid and can be used as a NULL value.

Hint: a path lookup always begins with inode 1, the root directory.

File mode

The FUSE APIs (and Linux internals in general) mash together the concept of fs object’s type (file/directory/device/symlink…) and its permissions. The result is called the file mode (the attribute mode in the fs_inode above), and looks like this:

  |<-- S_IFMT --->|           |<-- user ->|<- group ->|<- world ->|
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | F | D |   |   |   |   |   | R | W | X | R | W | X | R | W | X |
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Since it has multiple 3-bit fields, it is commonly displayed in base 8 (octal). For example, permissions allowing RWX for everyone (rwxrwxrwx) are encoded as 777 (an octal number). Note that, in C, octal numbers are indicated by a leading 0, e.g., 0777, so the expression 0777 == 511 is true.

The F and D bits correspond to 0100000 and 040000, indicating if the inode is a regular file (the F bit is set) or a directory (the D bit is set).

A side note: since the demise of 36-bit computers in the early 70s, Unix permissions are about the only thing in the world that octal is used for.

There are a bunch of macros (in <sys/stat.h>) that you can use for looking at the mode; you can see the official documentation with the command man inode, but the important ones are:

S_ISREG(m) - is it (i.e. the inode with mode m) a regular file?
S_ISDIR(m) - is it a directory?
S_IFMT - bitmap for inode type bits.
You can get just the inode type with the expression m & S_IFMT, and just the permission bits with the expression m & ~S_IFMT. (note that ~ is bitwise NOT - i.e. 0s become 1s and 1s become 0s)

Part A: fs3650 implementation

Now, get ready to code.

Exercise 1 implement fs_getattr.

fs_getattr is used to get the attributes of a file/directory.
Read instructions in homework.c and implement fs_getattr.
Now test your implementation:
$ make
$ make umount // if you have mounted fs3650
$ make mount
$ ls -l fs/file.1
// you should see:
-rwxrwxrwx 1 1001 1002 900 May  4  1976 fs/file.1
Notes:
If your code crashes (like touching invalid memory), you will see an error message when executing: ls: cannot access 'fs': Transport endpoint is not connected.
You can ignore struct fuse_file_info *fi (this applies to later exercises as well); it is a FUSE data structure, see details here.
The ctime and mtime in fs3650’s inodes are seconds since 1970/1/1.
To understand the underlying disk image, run $ python print-disk.py test.img to see the details of the filesystem described by test.img and its disk block information.

Now, you can get the details of a file (like fs/file.1). However, you cannot list files within your fs3650 yet. Run ls fs. You will see ls: reading directory 'fs': Operation not supported. Next, you will implement fs_readdir, allowing your fs3650 to read directories.

Exercise 2 implement fs_readdir.

fs_readdir enumerates entries in a directory (think of ls).
Read instructions in homework.c and implement fs_readdir.
Now test your implementation:
$ make
$ make umount // if you have mounted fs3650
$ make mount
$ ls fs
// you should see:
dir  dir1  dir2  dir-with-long-name  file.1  file.2

$ ls -l fs/dir
// you should see:
-rwxrwxrwx 1 1001 1002   100 May  3  1976 file1
-rwxr-xr-x 1 1001 1002  1200 May  4  1976 file2
-rwxrwxrwx 1 1001 1002 10111 May  4  1976 file3

So far, your fs3650 works for ls. But, it doesn’t yet support reading files. Try $ cat fs/file.1. You will see cat: fs/file.1: Operation not supported.

Exercise 3 implement file reading, fs_read

implement fs_read: reading data from a file.

after implementing, you should be able to cat files:

$ make
$ make umount // if you have mounted fs3650
$ make mount
$ cat fs/file.1
// you should see (a random string):
>FBQF58mSqeK<B"8KcMvE[OScSCp4;$o]"...

$ cat fs/dir/file1
// you should see (another random string):
&_y5.?P8r[V4=p-7w9LJks%|YU1]...

After finishing Part A, now you have a read-only file system! You can mount fs3650 and cd into it. You can perform all read operations like cd, ls, and cat. Your fs3650 should just work (it should not ) while walking through it.

Finally, submit your part A

Submitting consists of three steps:

Executing this checklist:
- Fill in ~/lab4/slack.txt with (1) your name, (2) your NUID, (3) slack hours you used, and (4) acknowledgements.
- update (03/27) please create the slack.txt.
- Make sure that your code builds with no warnings.
  note: we will apply a 10% penalty to the compilation warnings you have.
- Make sure you have added (git add) all files that you created (if any).

Push your code to GitHub:

 $ cd ~/lab4
 $ git commit -am 'submit lab4'
 $ git push origin 

 Counting objects: ...
  ....
  To ssh://github.com/NEU-CS3650-labs/lab4-<username>.git
   7337116..ceed758  main -> main

Actually submit your lab via Gradescope:
- Navigate to https://www.gradescope.com/ and click on log in.
- Select login with “School Credentials” and select “Northeastern University”.
- Enter Northeastern SSO login information and you should be able to log in to your gradescope account.
- Now, on Canvas, go to the CS3650 course and click on “Gradescope 1.3” from the left navigation bar. You will then be asked to accept the course invitation after which you can access the course on Gradescope.
- On Gradescope, select the lab/assignment you wish to submit and click on “Upload Submission”.
- You will then be asked to upload a zip file consisting of the files that the lab/assignment specifies.
  Note: you can either zip all the files within your lab folder, or zip your lab folder; its name must start with “lab4-“ (this is supposed to be your GitHub repo name, lab4-<username>). If you zip a folder named, for example, “mysubmit”, then Gradescope will complain about it and your submission might not work.
- After uploading the zip file, the autograder will evaluate your submission and provide a score for it.
- After the manual grading process is performed by the TAs, your final score for the lab/assignment will be released.

This completes the lab4 part A.

Part B: testing fs3650

Read Lab4 Part B: Testing CS5600 File System and finish the exercise.

Appendix 1: fs3650 tips

[A] mode

File type constants (see also man 2 stat)

S_IFMT - mask for type bits
S_IFREG - regular file
S_IFDIR - directory

A lot of times it’s easier to use:

S_ISDIR(mode) - true if it’s a directory
S_ISREG(mode) - true if it’s a regular file

[B] error code

When encountering errors, you need to return error codes. The comments in fs3650.c will tell you which errors are ought to return from which methods. Here is a list of error codes that you will use:

ENOENT - file/dir not found
ENOTDIR - not a directory
EISDIR - is a directory
EINVAL - invalid parameter (see comments in fs3650.c)
EOPNOTSUPP - not implemented yet

Note that you can use strerror(e) to get a string representation of an error code. Note: you’ll need to use the positive error code for strerror() (e.g., ENOENT), whereas you’ll need to return the negative version in code (e.g., return -ENOENT;).

[C] fs3650 design choices

Here are the design choices that fs3650 made:

fs3650 is read-only.
fs3650 uses a block (4KB) as an inode.
fs3650 puts all block pointers in the inode. An inode can hold 1019 32-bit block pointers (see fs_inode in fs3650.h), for a max file size of ~4MB.
Directories are not nested more than 20 times. For example, there is no /dir1/dir2/dir3/.../dir21/.
Directories only have one data block (or only inode.ptrs[0] contains valid block ids).
fs3650 does not support links.

[D] implementation hints

1. reuse code. There will be a lot of code sharing between different fs_* functions. You should factor out the code that is used multiple times. It’s bad to duplicate the same piece of code in multiple functions (sometimes known as “copy-paste bugs”; see a research paper about this topic here). An obvious drawback is that you’ll find it hard to fix bugs—fixing one bug in some of the copies does not fix the same bug in other copies!

2. performance. Don’t worry about performance. The simplest possible implementation is okay, but an incorrect implementation is NOT okay.

3. path translation Assume you’ve split your path, and you have a count (e.g., int pathlen) and an array of strings (e.g., char **pathv). As an example, for a path /home/cs3650, you will have pathlen=2, pathv={"home","cs3650"}.

To find the file pointed by /home/cs3650, you need to translate this path (i.e., pathlen and pathv) into an inode number. You should:

go to the root dir (the inode in block 1)
try to find dir “home” (i.e., pathv[0]) in the root dir
then, fetch the dir “/home”’s inode
try to find file “cs3650” (i.e., pathv[1]) in the dir “/home”
finally, return the inode number of the file “cs3650”, or -ENOENT/-ENOTDIR if there was an error.

Appendix 2: FUSE

The file system that we will build is implemented as a user-level process. This file system’s storage will be a file (which is created by us called test.img) that lives in the normal file system of your vm. Much of your code will treat this file as if it were a disk.

This entire arrangement (file system implemented in user space with arbitrary choice of storage) is due to software called FUSE (File system in User SpacE).

In order to really understand what FUSE is doing, we need to take a brief detour to describe VFS. Linux (like Unix) has a layer of kernel software called VFS; conceptually, every file system sits below this layer, and exports a uniform interface to VFS. (You can think of any potential file system as being a “driver” for VFS; VFS asks the software below it to do things like “read”, “write”, etc.; that software fulfills these requests by interacting with a disk driver, and interpreting the contents of disk blocks.)

The purpose of this architecture is to make it relatively easy to plug a new file system into the kernel: the file system writer simply implements the interface that VFS is expecting from it, and the rest of the OS uses the interface that VFS exports to it. In this way, we obtain the usual benefits of pluggability and modularity.

FUSE is just another “VFS driver”, but it comes with a twist. Instead of FUSE implementing a disk-based file system (the usual picture), it responds to VFS’s requests by asking a user-level process (which is called a “FUSE driver”) to respond to it. So the FUSE kernel module is an adapter that speaks “fuse” to a user-level process (and you will be writing your code in this user-level process) and “VFS” to the rest of the kernel.

Meanwhile, a FUSE driver can use whatever implementation it wants. It could store its data in memory, across the network, on Jupiter, whatever. In the setup in this lab, the FUSE driver will interact with a traditional Linux file (as noted above), and pretend that this file is a sector-addressable disk.

The FUSE driver registers a set of callbacks with the FUSE system (via libfuse and ultimately the FUSE kernel module); these callbacks are things like read, write, etc. A FUSE driver is associated with a particular directory, or mount point. The concept of mounting was explained in OSTEP 39 (see 39.17). Any I/O operations requested on files and directories under this mount point are dispatched by the kernel (via VFS, the FUSE kernel module, and libfuse) to the callbacks registered by the FUSE driver.

To recap all of the above: the file system user interacts with the file system roughly in this fashion:

When the file system user, Process A, makes a request to the system, such as listing all files in a directory via ls, the ls process issues one or more system calls (stat(), read(), etc.).
The kernel hands the system call to VFS.
VFS finds that the system call is referencing a file or directory that is managed by FUSE.
VFS then dispatches the request to FUSE, which dispatches it to the corresponding FUSE driver (which is where you will write your code).
The FUSE driver handles the request by interacting with the “disk”, which is implemented as an ordinary file. The FUSE driver then responds, and the responses go back through the chain.

lab4 FUSE Diagram

Acknowledgments

This lab is created by Peter Desnoyers, with modifications by the CS3650 staff. FUSE introduction is borrowed from Mike Walfish’s cs202 lab instructions.