Lab2: CS3650 Shell
In this lab, you will learn how a shell is built. You will improve (or reinforce) your shell-using skills. You will also gain experience with C programming (you will interact with critical constructs, such as pointers and strings). Along the way, you will use fork(), a system call that we will discuss intensively in lectures.
A shell parses a command line, and then runs (executes) that command line. One can also think of GUIs as shells, in which case the “command line” is expressed by the user’s mouse clicks, for example. We’ve given you the skeleton, and all of the parsing code, for a simple shell, sh3650. You will fill in the logic for executing the command line: you will implement support for executing internal and external commands and I/O redirection.
Some notes:
- There is not much code to write, but there is a lot to absorb. We observed from prior semesters that students are eager to write code sometimes before understanding what to write! Again, doing labs is supposed to be a learning (instead of evaluating) process. You need to study what you’re asked to do first.
- Please read lab instructions carefully, and expect to come back and read this page many times when working on your Lab2.
- In the instructions, we will mention commands (like ls), syscalls (like chdir), and C library functions (like sprintf) that might be new to you. Please use man (you will see what this is later), Google, or ChatGPT to figure out what they are.
- We recommend beginning this lab early (again, early is often earlier than you think).
Section 0: Getting started
- Click the GitHub Lab2 link on Canvas homepage to create your Lab2 clone on GitHub.
- Start your VM and open the VM’s terminal.
- Clone Lab2 repo to VM:
$ cd ~
$ git clone git@github.com:NEU-CS3650-labs/lab2-<Your-GitHub-Username> lab2
Note that the repo address git@github.com:... can be obtained by going to the GitHub repo page (your cloned lab2), clicking the green “Code” button, then choosing “SSH”.
- Check contents:
$ cd ~/lab2
$ ls
// you should see:
Makefile parser.c parser.h sh3650.c slack.txt
Part 1: Shell commands and constructs (warm-up)
It will be much easier to do the coding work in this lab if you have some familiarity with shells in general; this is for two reasons. First, comfort with shells makes you more productive. Second, if you have a good handle on what a shell is supposed to do, the coding work will make more sense. This portion of the lab is intended to provide some of that background (but some of the background you will have to acquire by “playing around” with the shell on your computer). In this part of the lab, we will interact with the installed shell on your system (rather than the source code that you retrieved above). We will be assuming the bash shell, which is the default shell on both of the development platforms in this class.
A. Basic functionality
Run a cmd
A shell is a program whose main purpose is to run other programs. Two of the key workhorses in a shell are fork() and execve(). Here is a simple shell command:
$ ls -a
The shell parses this command into two arguments, ls and -a. The ls argument names the binary (executable program) that should be executed. So the shell forks a child process to execute ls with those two arguments: the first argument is the binary itself, ls (yes, the ls program will see its name as an input), and the second argument, -a, is what we provide to the binary. The ls program has a simple job: it prints all file names in the current working directory to the console. (ls -a will show all files, including hidden ones.) Meanwhile, the parent process (the shell) waits for the child to finish; when it does, the parent returns to read another command.
You may be interested in a reasonable tutorial for Unix shells. You can find others by searching for, e.g., “shell tutorial” on Google. Let us know if you find one you really like.
Internal commands
In the above example, ls is a program on your file system, which you can find with $ which ls (this shows where the ls command is located in your file system). In addition to running programs from the file system, shells have internal commands (also known as builtin commands) that provide functionality that could not be obtained otherwise. Three internal commands that our shell will implement are cd, pwd, and exit.
The cd command changes the shell’s current directory, which is the default directory that the shell uses for files. So cd dir changes the current directory to dir. (You can think of the current directory as the directory that shows up by default when you use an “Open” or “Save” dialog in a GUI program.) Of course, files can also be manipulated using absolute pathnames, which do not depend on the current directory; /home/studentname/lab2/parser.c is an example. The pwd command shows the current working directory.
There may also come a time when you would like to leave your shell; the exit command instructs the shell to exit with a given status. (exit alone exits with status 0.)
(Why are cd and exit part of the shell instead of standalone programs?)
Exit status and $?
A command finishes with an exit status. You can think of the exit status as a command’s “return value”. If a command accomplishes its function successfully, that command generally exits with status 0, by calling exit(0). (This is also what happens when a program runs off the end of its main function.) But if there is an error, most commands will exit with status 1. For example, the cat command will exit with status 0 if it reads its files successfully, and 1 otherwise:
$ cat parser.c
... // exit status 0
$ echo $?
0
$ cat donotexist.txt
cat: donotexist.txt: No such file or directory // exit status 1
$ echo $?
1
The special variable $? in bash contains the exit value of the previous command.
Input/output redirection
Each program has standard input, standard output, and standard error file descriptors, whose numbers are 0, 1, and 2, respectively. The ls program writes its output to the standard output file descriptor. Normally this is the same as the shell’s standard output, which is the terminal (your screen). But the shell lets you redirect these file descriptors to point instead to other files. For example:
$ ls > files.txt
This command doesn’t print anything to the screen. But let’s use the cat program, which reads a file and prints its contents to standard output, to see what is in files.txt:
$ cat files.txt
// you should see a list of file names of the current directory
The > filename operator redirects standard output, < filename redirects standard input, and 2> filename redirects standard error. (The syntax varies from shell to shell; we generally follow the syntax of the Bourne Again Shell, or bash.)
B. Advanced features
Backgrounding
You can also execute a command in the background with the & operator. Normally, the shell will not read a new command until the previous command has exited. But the & operator tells the shell not to wait for the command.
$ echo foo &
$ foo
Note: foo is printed on top of the next shell prompt.
Command separator
Shells offer several ways to chain commands together. For example, the ; operator (also called “command separator”) says “do one command, then do another”. This shell command prints two lines:
$ echo foo ; echo bar
// you should see:
foo
bar
Conditional chaining
Instead of always executing commands in sequence like ; does, && and || allow you to conditionally execute commands based on their exit status: && says “execute the command on the right only if the command on the left exited with status 0”. And || says “execute the command on the right only if the command on the left exited with status NOT equal to 0”. For example:
// suppose files.txt exists and contains "foo"
$ cat files.txt && echo "files.txt exists!"
foo
files.txt exists!
// suppose NULL.txt doesn't exist
$ cat NULL.txt && echo "NULL.txt exists!"
cat: NULL.txt: No such file or directory // Note: does not run echo!
$ cat files.txt || echo "files.txt does not exist."
foo
$ cat NULL.txt || echo "NULL.txt does not exist."
cat: NULL.txt: No such file or directory
NULL.txt does not exist.
Pipe
Finally, the pipe operator | sends the output of one command to the input of another. For example:
$ echo foo | rev
oof
Note: rev reverses a string.
Another example:
$ echo -e "foo\nbar" | shuf -n 1
// the output can be either foo or bar
Note: you have to install shuf first. (How? See command not found.)
Some useful commands
You may find the following commands particularly useful for testing your shell. Find out what they do by reading their manual pages. Be creative with how you combine these!
- cat (print one or more files to standard output)
- echo (print arguments to standard output)
- true (exit with status 0)
- false (exit with status 1)
- sleep (wait for N seconds then exit)
- sort (sort lines)
Part 2: Implementing sh3650
For simplicity, our shell will only support:
- the cd, pwd, and exit built-in commands
- external programs, like ls and rev
- redirection of external program input and output
- shell variable $? (but not others)
Other features like backgrounding, conditional chaining, and pipe are not supported.
At various points in this description you are given instructions to refer to the “man page” for a system call or library function; please do so in a terminal window at that point. Note that much of the contents of a man page can be ignored. The most important parts for what we are doing are (1) the list of include files to use and (2) the arguments and return value. (the “RETURN VALUE” section is often near the end of a long man page)
We also provide a series of appendices that are useful:
- [A] Command Line Tokenizer explains how the parser works in our shell.
- [B] ASCII characters explains how a char type is interpreted as a human-understandable character for our shell.
- [C] Testing and Debugging your shell gives advice on thoroughly testing your shell, which hints at how we will eventually (after the lab deadline) grade your lab.
Note: when you submit, the autograder on Gradescope contains only a subset of the final test cases. This means that getting a full score at submission time doesn’t necessarily mean you will get full credit in the end. We suggest you go through the testing advice we give.
Section 1: Signals
If your shell is interactive, you’ll want it to ignore the signal sent by ^C (that is, hold Ctrl and press C), so that you can quit out of a running program without terminating the shell:
signal(SIGINT, SIG_IGN); /* ignore SIGINT=^C */
Later when you use fork to create a subprocess, you’ll want to set it back to its default in that subprocess, so you can terminate a running command:
signal(SIGINT, SIG_DFL);
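To see the two settings interact, here is a small stand-alone sketch; it is not part of the skeleton. The helper name child_dies_parent_survives is made up for illustration, and raise(SIGINT) stands in for the user pressing ^C. The parent ignores SIGINT like the shell, while the child restores the default action and is terminated by the signal.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not part of the skeleton): the parent ignores
 * SIGINT the way the shell does, the child restores the default action and
 * then sends itself SIGINT. Returns 1 if the child was killed by SIGINT
 * while the parent survived, 0 if not, and -1 if fork or waitpid failed.
 */
int child_dies_parent_survives(void)
{
    signal(SIGINT, SIG_IGN);          /* shell: ignore ^C */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        signal(SIGINT, SIG_DFL);      /* child: ^C terminates again */
        raise(SIGINT);                /* stands in for the user pressing ^C */
        _exit(0);                     /* not reached if SIGINT killed us */
    }

    raise(SIGINT);                    /* ignored, so the parent keeps running */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGINT;
}
```

The child inherits the "ignore" setting across fork, which is why it has to restore SIG_DFL explicitly before running a command.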
Exercise 1 disable ^C
- edit file sh3650.c and add the line that disables the signal in the main function
- make and run the shell
$ make
...
$ ./sh3650
sh3650>
- when you type ^C, the shell won’t exit
- it will exit properly on end of file (i.e. when you type ^D, which indicates end-of-file on the Unix terminal. Again, ^D means holding Ctrl and pressing D)
Section 2: Internal commands
As introduced, internal commands are commands like cd, pwd, and exit that are contained within the shell, literally built in. This is either for performance reasons (internal commands execute faster than external commands, which usually require forking) or because a particular builtin needs direct access to the shell internals.
Note:
- the command line tokenizer is described below, in the section Command Line Tokenizer
- you can compare strings for equality using strcmp (“man 3 strcmp”), which returns zero if two strings are equal.
(question: why does cd have to be implemented as a built-in command rather than an executable run in a separate process? exit?)
cd
For the cd command you will use the chdir system call (“man 2 chdir”) to change to the indicated directory. With no arguments you should use the value of the HOME environment variable, i.e. getenv("HOME").
Note that cd can fail two ways:
- wrong number of arguments: print "cd: wrong number of arguments\n" to standard error - use fprintf(stderr, ...
- chdir fails: print "cd: %s\n", strerror(errno) to standard error
In both cases set status to 1, and set it to 0 otherwise.
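The rules above could be sketched as a function like the following. This is one possible shape, not the reference solution: the name builtin_cd is hypothetical, it assumes argv[0] is "cd", and the message printed when HOME is unset is an assumption, since the instructions do not cover that case.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: argv[0] is "cd", argv[1] (if present) is the target.
 * Returns the status the shell should record: 0 on success, 1 on failure.
 */
int builtin_cd(int argc, char **argv)
{
    if (argc > 2) {
        fprintf(stderr, "cd: wrong number of arguments\n");
        return 1;
    }

    /* No argument: fall back to the HOME environment variable. */
    const char *dir = (argc == 2) ? argv[1] : getenv("HOME");
    if (dir == NULL) {
        /* Assumption: the handout does not say what to do if HOME is unset. */
        fprintf(stderr, "cd: HOME not set\n");
        return 1;
    }

    if (chdir(dir) != 0) {
        fprintf(stderr, "cd: %s\n", strerror(errno));
        return 1;
    }
    return 0;
}
```

Returning the status instead of storing it in a global keeps the function easy to test on its own.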
pwd
pwd will use the getcwd system call (“man 2 getcwd”) to get the current directory, passing it a buffer of PATH_MAX bytes, and print the result. You can assume getcwd always succeeds and set status to 0.
exit
exit takes zero or 1 argument; with more than 1 it prints "exit: too many arguments" to stderr and sets status=1. With 0 arguments it calls exit(0); with a single argument it calls exit(atoi(arg)), using atoi (“man 3 atoi”) to convert the argument from a string to an integer.
Exercise 2 implement cd, pwd, and exit
- implement the three internal commands
- make and test your implementation:
  - run make, run your shell ./sh3650, and test:
    - pwd: does it print out the right current directory? does it fail if you give it arguments?
    - cd to directories that exist (e.g. cd /tmp), check with pwd
    - cd to non-existent directory, check (a) error message, (b) still in same directory
    - exit: does it work correctly with 0, 1, >1 argument? Try exiting with an arbitrary non-zero status and verify using the $? variable in your normal shell:
$ ./sh3650
sh3650> exit 5
$ echo $?
5
- hints:
  - here is a list of useful library functions and syscalls: strcmp, chdir, getcwd, atoi, exit
  - factor cd, pwd, and exit into individual functions that each take argc and argv as arguments. Maybe return status as the return value, but more on that later.
Now that you’ve implemented your first commands, make sure that it ignores empty command lines without complaining or crashing.
Section 3: External commands
If a command isn’t an internal command, it’s an external one: you’ll fork a sub-process; in the child process you’ll use exec to run the command, while the parent will use wait to wait until it’s done.
After fork() (“man 2 fork”) you’ll want to do the following:
- re-enable ^C (see section 1: signals)
- use the execvp library function (“man 3 execvp”) to exec the indicated command
From the man page:
int execvp(const char *file, char *const argv[]);
The first argument is the executable name, while the second is the argv array to be passed to the newly loaded program. Instead of providing an argument count, the argv array is terminated with a NULL pointer. For example, if one runs $ ls /home, the following arguments are fed into execvp:
argv-> +------+
| *--|---->"ls"
+------+
| *--|---->"/home"
+------+
| NULL |
+------+
execvp will load the executable ls (possibly under /usr/bin/ls) and pass it argc=2, argv={“ls”, “/home”}.
(question - how does execvp know where to find the executable ls?)
The command line parser I’ve given you makes sure that the argv[] array is terminated with a NULL pointer, so you can just pass it to execvp: execvp(argv[0], argv);
If execvp fails, you should print a message to standard error, "%s: %s\n", argv[0], strerror(errno), and then exit with EXIT_FAILURE. (question: why do you have to exit here, rather than returning?)
In the parent process you’ll need to wait for the child process to finish, using waitpid, and get its exit status (i.e. the argument passed to exit()). It’s ok to copy and paste the following code without fully understanding it:
// the variable "pid" should be the child process's process id
int status;
do {
    waitpid(pid, &status, WUNTRACED);
} while (!WIFEXITED(status) && !WIFSIGNALED(status));
int exit_status = WEXITSTATUS(status);
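Putting the pieces of this section together, the whole fork/exec/wait sequence might look roughly like the sketch below. It is an illustration, not the reference solution: the name run_and_wait is hypothetical, and returning 1 when fork fails or the child is killed by a signal is an assumption, since the instructions do not cover those cases. The child uses _exit so that it does not flush stdio buffers copied from the parent.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] with the NULL-terminated argv and
 * return the child's exit status, or 1 if the child could not be created
 * or did not exit normally (an assumption, see the text above).
 */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        fprintf(stderr, "fork: %s\n", strerror(errno));
        return 1;
    }

    if (pid == 0) {                      /* child */
        signal(SIGINT, SIG_DFL);         /* re-enable ^C */
        execvp(argv[0], argv);           /* returns only on failure */
        fprintf(stderr, "%s: %s\n", argv[0], strerror(errno));
        _exit(EXIT_FAILURE);
    }

    int status;                          /* parent */
    do {
        if (waitpid(pid, &status, WUNTRACED) < 0)
            return 1;
    } while (!WIFEXITED(status) && !WIFSIGNALED(status));

    /* WEXITSTATUS is only meaningful when the child exited normally. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
}
```

For example, passing {"ls", "/home", NULL} runs ls on /home and returns whatever status ls exits with.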
Exercise 3 implement executing external cmds
- implement the mentioned fork, execvp, and waitpid
- run make, then test:
  - successful commands, e.g. ls, ls /tmp, etc.
  - unsuccessful ones, e.g. this-is-not-a-command
  - ^C handling: run sleep 5 and verify you can kill it with ^C and return to your shell.
- hint:
  - FACTORING: we suggest that you factor out the code which forks and execs, and put it in a separate function from where you call waitpid.
- Debugging:
  - You may find the gdb command set follow-fork-mode child useful (see the gdb documentation).
  - Also the strace -f command can be very useful, although verbose: e.g. here’s a selection of the 140 lines it prints out for my (Peter) shell. (note that fork in Linux is actually implemented using a system call called clone)
$ echo ls | strace -f ./sh3650
execve("./sh3650", ["./sh3650"], 0xffffed0a7ba8 /* 24 vars */) = 0
brk(NULL) = 0xaaaadecaa000
...
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process 119667 attached
, child_tidptr=0xffff9934bf50) = 119667
[pid 119667] set_robust_list(0xffff9934bf60, 24 <unfinished ...>
...
[pid 119667] execve("/usr/local/sbin/ls", ["ls"], 0xffffe3f4c0a8 /* 24 vars */ <unfinished ...>
...
Section 4: The $? special shell variable
The basic shell has a number of built-in variables, listed under “Special Parameters” in the man page (man sh); we implement only one of these, $?, which expands to the exit status of the most recent command.
To implement this you can just use sprintf to print the exit status into a buffer (e.g. define and use char qbuf[16]), and then go through your array of tokens, find any which compare equal to $?, and replace them with a pointer to that buffer.
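The substitution just described might look like the sketch below. The name expand_status and its parameters are hypothetical; it uses snprintf rather than sprintf so that the buffer size is checked, which is a choice of this sketch and not something the instructions require.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the decimal text of `status` into qbuf, then
 * replace every token that is exactly "$?" with a pointer to qbuf.
 * qbuf must stay valid for as long as argv is in use.
 */
void expand_status(char **argv, int status, char *qbuf, size_t qbuf_len)
{
    snprintf(qbuf, qbuf_len, "%d", status);

    for (int i = 0; argv[i] != NULL; i++) {
        if (strcmp(argv[i], "$?") == 0)
            argv[i] = qbuf;
    }
}
```

With tokens {"echo", "$?", NULL} and a status of 7, the second token becomes "7". Tokens that merely contain $?, such as x$?, are left alone, because only whole-token matches are replaced.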
Exercise 4 implement $?
- implement $? in your sh3650.c
- make and test:
$ ./sh3650
sh3650> false
sh3650> echo $?
1
sh3650> ./sh3650
// notice: below we're in another sh3650 shell
sh3650> exit 5
// now we're back to the first sh3650 shell
sh3650> echo $?
5
Section 5: File redirection
As mentioned in input/output redirection, people can redirect inputs and outputs to files rather than the console. For example, $ ls > foo.txt will redirect the output of ls to the file foo.txt instead of printing it on your console. Internally, this works as follows:
- the shell calls open (a syscall, see “man 2 open”) on the required file foo.txt;
- it calls dup2 (a syscall, see “man 2 dup2”) to copy foo.txt’s file descriptor onto standard output (which is 1);
- it calls close (a syscall, see “man 2 close”) on foo.txt’s original file descriptor because we don’t need it anymore (now, file descriptor 1 points to foo.txt).
For your implementation, you should scan the shell’s input tokens for “>” and “<”, and replace standard input and output appropriately. Note that “<” (or “>”) may be followed by zero, one, or multiple words before “>” (“<”) or end of line:
- zero words: don’t redirect
- more than one: redirect to the first one
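One way to scan the tokens is sketched below. All names are hypothetical, and the sketch makes one assumption the instructions do not state: if the same operator appears more than once with a file name, the later occurrence wins. Check that against the behavior you are expected to implement.

```c
#include <stddef.h>
#include <string.h>

/* True if the token is one of the two redirection operators. */
static int is_redirect(const char *tok)
{
    return strcmp(tok, "<") == 0 || strcmp(tok, ">") == 0;
}

/*
 * Hypothetical helper: find "<" and ">" in the NULL-terminated token
 * array, store the chosen file names in *in_path and *out_path (NULL means
 * "don't redirect"), and cut argv at the first operator so that what is
 * left is just the command and its arguments.
 */
void split_redirects(char **argv, const char **in_path, const char **out_path)
{
    *in_path = NULL;
    *out_path = NULL;

    int cut = -1;                         /* index of the first operator */
    for (int i = 0; argv[i] != NULL; i++) {
        if (!is_redirect(argv[i]))
            continue;
        if (cut < 0)
            cut = i;

        /* Zero words after the operator: no redirect.
           One or more words: use the first one. */
        char *next = argv[i + 1];
        if (next != NULL && !is_redirect(next)) {
            if (argv[i][0] == '<')
                *in_path = next;
            else
                *out_path = next;
        }
    }

    if (cut >= 0)
        argv[cut] = NULL;                 /* command ends before the operator */
}
```

For the tokens sort < in.txt extra > out.txt, this leaves argv as just sort, with in.txt as the input file and out.txt as the output file.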
If you’ve factored out a “launch” function which takes an argv pointer and file descriptors for stdin and stdout, you can make a “wrapper” for it which checks for file redirection and replaces the appropriate file descriptors if necessary. (make sure you close any file descriptors that aren’t needed)
Exercise 5 implement redirection
- implement the mentioned redirection in sh3650.c
- hints:
  - here are some useful syscalls: open, dup2, close
  - for open, these flags might be useful: O_RDONLY, O_CREAT|O_TRUNC|O_WRONLY (updated 01/29; previously O_CREAT|O_RDWR), 0777
- Debugging:
  - you can use the lsof command to list open file descriptors, to make sure you’re not leaking. E.g. from another terminal:
$ ps aux |grep sh3650
pjd 118403 0.0 0.0 2196 780 pts/4 S+ 01:29 0:00 ./sh3650
$ lsof -a -d 0-999 -p 118403
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sh3650 118403 pjd 0u CHR 136,4 0t0 7 /dev/pts/4
sh3650 118403 pjd 1u CHR 136,4 0t0 7 /dev/pts/4
sh3650 118403 pjd 2u CHR 136,4 0t0 7 /dev/pts/4
(the man page for lsof is horrible. The command means to list all open “normal” files, i.e. with file descriptors 0-999 (-d 0-999), AND (-a) are open in a specific process (-p 118403).)
  - Just like before, the strace -f command can be quite useful.
$ echo 'ls | cat' | strace -f ./sh3650
execve("./sh3650", ["./sh3650"], 0xffffeccb9e08 /* 24 vars */) = 0
... 240 more lines...
Finally, submit your work
Submitting consists of three steps:
- Executing this checklist:
  - Fill in ~/lab2/slack.txt with (1) your name, (2) your NUID, (3) slack hours you used, and (4) acknowledgements.
  - Make sure that your code builds with no warnings. Note: we will apply a 10% penalty for compilation warnings.
  - Make sure you have added (git add) all files that you created (if any).
- Push your code to GitHub:
$ cd ~/lab2
$ git commit -am 'submit lab2'
$ git push origin
Counting objects: ...
....
To ssh://github.com/NEU-CS3650-labs/lab2-<username>.git
7337116..ceed758 main -> main
- Actually submit your lab via Gradescope:
  - Navigate to https://www.gradescope.com/ and click on log in.
  - Select login with “School Credentials” and select “Northeastern University”.
  - Enter your Northeastern SSO login information and you should be able to log in to your Gradescope account.
  - Now, on Canvas, go to the CS3650 course and click on “Gradescope 1.3” in the left navigation bar. You will then be asked to accept the course invitation, after which you can access the course on Gradescope.
  - On Gradescope, select the lab/assignment you wish to submit and click on “Upload Submission”.
  - You will then be asked to upload a zip file consisting of the files the lab/assignment specifies.
Note: you can either zip all files within your lab folder, or zip your lab folder, whose name must start with “lab2-” (this is supposed to be your GitHub repo name, lab2-<username>). If you zip a folder named, for example, “mysubmit”, Gradescope will complain.
  - After you upload the zip file, the autograder will evaluate your submission and provide a score.
  - After the manual grading process is performed by the TAs, your final score for the lab/assignment will be released.
This completes the lab.
Appendix
A. Command Line Tokenizer
The simplest way of tokenizing a line in C is to use the strtok library function, or the slightly less horrible strsep, which overwrite whitespace characters to split a line into multiple strings. An example: start with the line "ls | cat", zero out the whitespace characters:
['l']['s'][' ']['|'][' ']['c']['a']['t'][ 0 ]
-> ['l']['s'][ 0 ]['|'][ 0 ]['c']['a']['t'][ 0 ]
and keep pointers to the beginning of each region of non-whitespace characters:
argv[] ['l']['s'][ 0 ]['|'][ 0 ]['c']['a']['t'][ 0 ]
+-----+ ^ ^ ^
| *--|----------+ | |
+-----+ | |
| *--|-------------------------+ |
+-----+ |
| *--|-----------------------------------+
+-----+
| ... |
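The in-place approach pictured above can be sketched with strtok as follows. The function name tokenize_in_place and the max_tokens parameter are made up for this illustration; the parser you are given works differently.

```c
#include <string.h>

/*
 * Hypothetical helper: split `line` in place on spaces, tabs, and
 * newlines. Stores up to max_tokens - 1 pointers in argv, followed by a
 * NULL terminator, and returns the number of tokens stored.
 * max_tokens must be at least 1.
 */
int tokenize_in_place(char *line, char **argv, int max_tokens)
{
    int n = 0;

    /* strtok overwrites each delimiter it finds with a 0 byte. */
    for (char *tok = strtok(line, " \t\n");
         tok != NULL && n < max_tokens - 1;
         tok = strtok(NULL, " \t\n"))
        argv[n++] = tok;

    argv[n] = NULL;
    return n;
}
```

Given "ls | cat" this yields the three tokens ls, | and cat. Given a line with no whitespace at all, the whole line comes back as a single token.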
Problem: this breaks when you don’t have any whitespace, like "ls|cat".
The parser you’re given handles this by copying the input string into a second buffer, rather than modifying it in place:
input string: buffer:
['l']['s']['|']['c']['a']['t'][ 0 ] [ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ]
output:
argv[] -> ['l']['s'][ 0 ]['|'][ 0 ]['c']['a']['t'][ 0 ]
+-----+ ^ ^ ^
| *--|------------------------------+ | |
+-----+ | |
| *--|---------------------------------------------+ |
+-----+ |
| *--|-------------------------------------------------------+
+-----+
| 0 | <- terminated with NULL pointer (see arg formats in "man 3 execvp")
+-----+
| ... |
The skeleton code you’re given shows an example of how to use it.
For a “real” shell you’d probably use a tokenizer and parser based on the standard compiler tools lex and yacc, creating an abstract syntax tree of linked “token” objects. That’s far too complicated for this assignment, so we have a simple tokenizer that does a pretty good job of splitting simple lines with redirection symbols and single and double quotes, and returns pointers to strings rather than more complex structures.
The parser is not guaranteed to be bug-free, but your code will only be tested against the cases we have tested.
B. ASCII characters
By default C uses 8-bit characters and the basic ASCII character set (which defines only the values 0 through 127), rather than the much larger Unicode character set used in today’s user interfaces. To see the actual character set, we can print out a string containing the bytes 1 through 255, with a 256th byte as the null terminator:
$ cat > test.c <<EOF
#include <stdio.h>
int main(void) {
    char buf[256];
    /* fill buf with 1..255; the last value, 256, wraps to 0 and terminates the string */
    for (int i = 0, c = 1; i < 256; i++)
        buf[i] = c++;
    printf("%s", buf);
}
EOF
$ gcc test.c
$ ./a.out | od -A d -t c
You should see the following - note that offsets (left column) are in decimal, while non-printing characters are printed in octal, which no one uses anymore. (“od” = “octal dump”)
The “missing” character at the end of the second line is actually a space, ' ', and there are several backslash-style escaped characters, of which the only ones we care about are \n (newline) and sometimes \t (tab).
0000000 001 002 003 004 005 006 \a \b \t \n \v \f \r 016 017 020
0000016 021 022 023 024 025 026 027 030 031 032 033 034 035 036 037
0000032 ! " # $ % & ' ( ) * + , - . / 0
0000048 1 2 3 4 5 6 7 8 9 : ; < = > ? @
0000064 A B C D E F G H I J K L M N O P
0000080 Q R S T U V W X Y Z [ \ ] ^ _ `
0000096 a b c d e f g h i j k l m n o p
0000112 q r s t u v w x y z { | } ~ 177 200
0000128 201 202 203 204 205 206 207 210 211 212 213 214 215 216 217 220
0000144 221 222 223 224 225 226 227 230 231 232 233 234 235 236 237 240
0000160 241 242 243 244 245 246 247 250 251 252 253 254 255 256 257 260
0000176 261 262 263 264 265 266 267 270 271 272 273 274 275 276 277 300
0000192 301 302 303 304 305 306 307 310 311 312 313 314 315 316 317 320
0000208 321 322 323 324 325 326 327 330 331 332 333 334 335 336 337 340
0000224 341 342 343 344 345 346 347 350 351 352 353 354 355 356 357 360
0000240 361 362 363 364 365 366 367 370 371 372 373 374 375 376 377
C. Testing and Debugging your shell
In order to test your submission properly, you need to think like someone who is trying to break it. In other words, there are two types of tests you’ll want to write:
- basic 1+1=2 tests, verifying that each of the (few) functions your shell performs are working
- diabolical tests that try to provoke your code into using NULL pointers and crashing
Note: for some of these tests you may be running the sh3650 executable a whole bunch of times. The -fsanitize=address compile option causes it to take about a second to start up, but you’ll probably save more time in the long run if you keep it enabled.
Below we give some advice on how to test each section of your code:
Test cases for Exercise 1
Start your shell in interactive mode, verify that ^C doesn’t kill it.
Test cases for Exercise 2
Note that a lot of these tests mention checking the value of $? inside your shell - obviously you’ll have to defer this until implementing $?. (and you’ll need external commands, so you can use echo $? to see its value)
But at this stage you can test $? outside of your shell, verifying that you called exit() with the right argument.
$ ./sh3650 <<EOF
exit 5
EOF
$ echo $? // to see if it is 5
Here are some test cases you should test:
- empty line: should not crash, should prompt for next command (in interactive mode)
- end-of-file/^D: A control-D character on the console should cause your shell to exit gracefully, i.e. with $? set to 0
- cd/pwd:
  - no argument: cd with no argument should change (as reported by pwd) to your home directory, $HOME. Both should set $? to zero.
  - valid arg: cd /tmp (or other real directory) should work, as reported by pwd, set $? to zero
  - invalid: cd /not-a-directory should print cd: No such file or directory and set $? to 1
  - extra args: cd a b should print cd: wrong number of arguments and set $? to 1
- exit
  - exit should exit, with $? set to 0
  - exit 7 (or whatever) should exit, with $? set to 7
  - exit 1 2 should print exit: too many arguments and set $? to 1
Test cases for Exercise 3
- verify that you can run a few simple commands - e.g. ls, echo a b c, /bin/ls etc.
- check that it fails correctly on bogus commands, e.g.
$ this-is-not-a-command
this-is-not-a-command: No such file or directory
$ ./this-is-not
./this-is-not: No such file or directory
Test cases for Exercise 4
- Go back to the tests for Exercise 2 and verify the status after each command
- verify that you handle commands returning a status of 0, 1, and another value, and report a status of 1 for command not found:
sh3650> true
sh3650> echo $?
0
sh3650> false
sh3650> echo $?
1
sh3650> sh -c 'exit 7'
sh3650> echo $?
7
sh3650> not-a-command
not-a-command: No such file or directory
sh3650> echo $?
1
Test cases for Exercise 5
You’ll need to test the “normal” cases (i.e., input and output redirection work for a single command), and some abnormal cases:
- multiple > file or < file for a single command
- one and multiple ‘>’ or ‘<’ without files
- ‘> file’, ‘< file’, ‘>’ and ‘<’ (and combinations) on a line with no commands
Finally, you need to check that you’re not “leaking” open file descriptors. The easiest way to do this is to run your shell in one terminal window, running a number of commands with redirected I/O, then go to another window, find the process ID of your program with ps, and use the lsof utility to list its open files:
$ ps aux |grep sh3650
cs3650 58863 0.0 0.0 2196 1152 pts/1 S+ 15:52 0:00 ./sh3650
cs3650 58865 0.0 0.0 8492 2048 pts/3 S+ 15:52 0:00 grep --color=auto sh3650
$ lsof -p 58863
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sh3650 58863 cs3650 cwd DIR 259,2 4096 918097 /home/cs3650/cs3650-f23/hw1
sh3650 58863 cs3650 rtd DIR 259,2 4096 2 /
sh3650 58863 cs3650 txt REG 259,2 107968 917589 /home/cs3650/cs3650-f23/hw1/sh3650
sh3650 58863 cs3650 mem REG 259,2 1641496 2359916 /usr/lib/aarch64-linux-gnu/libc.so.6
sh3650 58863 cs3650 mem REG 259,2 187776 2359751 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
sh3650 58863 cs3650 0u CHR 136,1 0t0 4 /dev/pts/1
sh3650 58863 cs3650 1u CHR 136,1 0t0 4 /dev/pts/1
sh3650 58863 cs3650 2u CHR 136,1 0t0 4 /dev/pts/1
Those three lines at the bottom are the three file descriptors: 0, 1 and 2, i.e. standard input, output, and error. The ‘u’ means they’re open for read+write, and the actual “file” is a terminal device, /dev/pts/1. If you have a bunch of higher-numbered file descriptors listed, you’re leaking them.
Acknowledgments
This lab was created by Peter Desnoyers. Lab instruction Part 1 is adapted from Mike Walfish’s cs202 lab instructions; Part 2 is borrowed from Peter’s prior CS5600 shell lab.