CS5600 Lab1: C review

This lab will give you practice in the C programming language. For those who know C, this will reinforce your prior experience; for others, this is a good opportunity for you to learn C. The overall comfort you’ll gain (or reinforce) with C language and the Linux computing environment will be helpful for future labs.

Lab1 has three main sections:

Implementing a simple cipher.
Implementing a linked queue.
Implementing a ciphered queue using the cipher and the queue.

Challenge: there are two challenges in Lab1. They are optional. If you successfully finish them, you will get extra credits.

Section 0: C, text editor, and cloning Lab1

C programming language

All of the CS5600 projects will be done in C, for two main reasons. First, some of the things that we want to implement require direct manipulation of hardware state and memory, which are operations that are naturally expressed in C. Second, C are widely used languages (still!), so learning them is a useful thing to do in its own right.

If you are interested in why C looks like it does, we encourage you to look at Ritchie’s history. Here, perhaps, is the key quotation: “Despite some aspects mysterious to the beginner and occasionally even to the adept, C remains a simple and small language, translatable with simple and small compilers. Its types and operations are well-grounded in those provided by real machines, and for people used to how computers work, learning the idioms for generating time- and space-efficient programs is not difficult. At the same time the language is sufficiently abstracted from machine details that program portability can be achieved.”

If you want to master C, one book is (probably) enough:

The C programming language (second edition), Brian W. Kernighan and Dennis M. Ritchie. Prentice Hall, Inc., 1988. ISBN: 0-13-110362-8.

Here is another useful link:

C Tutorial: https://www.cprogramming.com/tutorial/c-tutorial.html

A good text editor is essential

If you’re using a good editor on your desktop (such as Sublime Text or Atom), then you can continue using them. If you are not doing so already, begin using a powerful text editor. We recommend vim (or emacs). vim is already installed on the pre-built VM. If you want to install emacs, do: sudo apt-get install -y emacs.

Then, to edit:

$ vim myfile.txt
// or
$ emacs myfile.txt

Most of our instructions below will be with reference to vim, so you may wish to use that one.

To get started, work through either this vim tutorial or the emacs tutorial that is built into emacs. (From within emacs, you can pull up the tutorial by typing C-h t. This is read as “press ctrl-h, then let go, then press t”).

Then spend some time just navigating and editing sample text (perhaps doing this lab, perhaps writing notes to your friends that you later copy into email). You may feel at first that these editors are slowing you down (“how do I exit vim???”; you’re not alone, see a hilarious blog post here), but comfort and fluency with them will pay off in the long term. It’s worth slowing down and “learning proper technique.” You’ll find that within a day or so, you’ll have the required keystrokes internalized, and within a few days, you’ll probably be faster at common editing tasks.

Cloning the Lab1 repository

Click the GitHub Lab1 link on Canvas homepage to create your Lab1 clone on GitHub.
Start your VM and open the VM’s terminal.
(M1 users: start QEMU and ssh to the QEMU VM; recall here)
Clone the Lab1 repo to VM:
```
$ cd ~
$ git clone git@github.com:NEU-CS5600-21fall/lab1-<Your-GitHub-Username>.git lab1
```
Note that the repo address git@github.com:... can be obtained by going to GitHub repo page (your cloned lab1), clicking the green “Code” button, then choose “SSH”.
- Why “21fall” in the repo address? This is an accident. See why in a Piazza question here.

Check contents:

$ cd ~/lab1
$ ls -F

// you should see:
Makefile        caesar_main.c           lab1-shell-APIs.txt     queue_main.c
caesar.c        ciphered_queue_main.c   queue.c                 slack.txt
caesar.h        examples/               queue.h                 tests/

We provide test cases for Lab1. The unit tests are written in Python (python3). If you don’t have python3 installed, run $ sudo apt install python3.

Note: all the shell commands below happen within the folder ~/lab1/.

Section 1: Ciphers

Encryption is a method that encodes a normal message (called plaintext) into a secret code (called ciphertext) that hides the original message’s true meaning. To encode a message, a cipher includes a variable, called key, that makes a cipher’s output unique.

The Caesar Cipher (try it online here) is the earliest and simplest cipher. It takes any integer value as a key and replaces each letter of the plaintext with a different letter of the alphabet which is key positions further away in the alphabet. For example, if key = 3, then A is encoded by D, B is encoded by E, … X is encoded by A, Y is encoded by B, and Z is encoded by C.

Read and understand file caesar.h and caesar.c, then compile the code by:

$ make
gcc -o caesar -Wall -pedantic -fstack-protector-all -fsanitize=address caesar.c caesar_main.c -I.
gcc -o queue -Wall -pedantic -fstack-protector-all -fsanitize=address queue.c queue_main.c -I.
gcc -o ciphered_queue -Wall -pedantic -fstack-protector-all -fsanitize=address caesar.c queue.c ciphered_queue_main.c -I.

Run the compiled program to see the results:

$ ./caesar

// You should see the outputs:
Usage: ./caesar [encode|decode] <number> <string>

Exercise 1 finish encode(...).

Fill in function encode(...) that encodes the string plaintext using the Caesar cipher. It shifts characters by the given number of positions.
You can assume that the input text only includes uppercased letters, lowercased letters, and numbers. These three categories (uppercased letters, lowercased letters, and digits) form three “Caesar Cipher loops”: with a shift of one position, 'Z' => 'A', 'z' => 'a', and '9' => '0'.
Test your code (compile and run):
$ make
$ ./caesar encode 1 ABC
  
// the correct output should be
BCD

Exercise 2 finish decode(...).

Fill in the function decode(...) that decodes the ciphertext string using the Caesar cipher by shifting the characters back by key positions.
Test your code.
NOTE: whenever you modify the code, you need to compile the code (i.e., make) for the update to take effect.
$ make
$ ./caesar decode 1 BCD
  
// the correct output should be
ABC

Exercise 3 capture illegal characters.

You might have noticed that we made an assumption about the inputs (they are either letters or numbers), but what if the inputs are not what we want?
Update your encode(...) function to produce an error message when encountering illegal characters:
if the input string contains any character that is not [A-Z | a-z | 0-9], then function encode(...) returns “ILLCHAR” (a string) as the output.
does your code properly handle the case that the input string is shorter than “ILLCHAR”? (for example, “+” as input)
Test your code.
$ make
$ ./caesar decode 1 +
  
// the correct output should be
ILLCHAR

Now, with all functions implemented, you should test your caesar with more test cases (that we provide) and figure out if your code has bugs.

$ make test1

// If you pass all test cases, you should see:
[RUN] Project0Runner.decode_alphabet_lower_upper
[OK] Project0Runner.decode_alphabet_lower_upper
[RUN] Project0Runner.decode_alphanumeric
[OK] Project0Runner.decode_alphanumeric
[RUN] Project0Runner.decode_numbers
[OK] Project0Runner.decode_numbers
[RUN] Project0Runner.encode_alphabet_lower_upper
[OK] Project0Runner.encode_alphabet_lower_upper
[RUN] Project0Runner.encode_alphanumeric
[OK] Project0Runner.encode_alphanumeric
[RUN] Project0Runner.encode_numbers
[OK] Project0Runner.encode_numbers
All test cases passed!!!

If you fail some test cases, our script will provide the failed input, the expected output, and your output (which should be different from the expected output). You can use the information for debugging.

Section 2: Linked Queue

In operating systems, queues are used widely. For example, processes are represented using a specific data structure that contains all the information related to the process (identifier, etc). The scheduler keeps track of processes to schedule using a linked queue.

Read and understand file queue.h and queue.c. The type node_t represents a node in a queue that has both a pointer (node_t *) to the next node and a piece of data (char[]). The type queue_t represents a queue with a head and a tail pointing to the first and the last node in the queue.

Compile the code and run:

$ make
$ ./queue

// you should see
Usage: ./queue <file>

Read queue_main.c and understand how the program works. The program queue needs a file as input. Let’s feed the program with an example file called ./examples/q_case.txt:

$ ./queue ./examples/q_case.txt

// you should see:
cmd: enqueue ABC
cmd: dequeue
cmd: enqueue BDE
cmd: dequeue
cmd: dequeue
cmd: enqueue CS
cmd: enqueue 5600
finally: 

A couple of things to keep in mind:

You will have to allocate memory for new nodes (by using malloc).
Don’t forget to free the memory that you allocate!
Use $ man <function-name> to check useful C functions. (See usage of man by $ man man!)
You are free to add other structures or functions in your implementation.
C does not have classes, it is a procedural language, but it does have struct (see queue.h) for grouping together several data items into a new type. Types can be given new names using typedef. By convention, they are usually named ending with a _t.

Exercise 4 implement queue.c

Fill in the function enqueue(...) that adds elements to the end of the queue.
Fill in the function void * dequeue(...) that removes and returns the element at the front of the queue. Notice that the return value is a general pointer (void *) and you are free to return whatever you want. For example, possible options include returning the node, returning the data in the node, etc.
Implement free_queue(...).
Implement print_queue(...).

Exercise 5 fill in queue_main.c

Read the “section 3” of lab1-shell-APIs.txt carefully. This document explains the expected input-output format of Lab1. It is crucial to follow the APIs because TA’s grading code interacts with your code via these APIs. If you code does not follow them, the grading tools will think your code is incorrect (hence losing points).
Fill in the function process_line(...) in queue_main.c.

Test your queue implementation by running make test2. If you pass all test cases, you should see:

$ make test2

[RUN] Project2Runner.empty_enqueue_test
[OK] Project2Runner.empty_enqueue_test
[RUN] Project2Runner.large_enqueue_op_test
[OK] Project2Runner.large_enqueue_op_test
[RUN] Project2Runner.regular_pass_test
[OK] Project2Runner.regular_pass_test
[RUN] Project2Runner.regular_print_test
[OK] Project2Runner.regular_print_test
All test cases passed!!!

Section 3: Ciphered Queue

In this section, you will combine the Caesar cipher and the queue to build a ciphered queue, in which nodes’ data may be encrypted. For the ciphered queue, you will reuse your previous Caesar cipher and queue implementations.

Read file ciphered_queue_main.c, compile, and run.

$ make
$ ./ciphered_queue

// you should see:
Usage: ./ciphered-queue <file>

// run with an example file
$ ./ciphered_queue ./examples/cq_case.txt

// you should see:
cmd: encode_enqueue 1 ABC
cmd: dequeue
cmd: encode_enqueue 1 ABC
cmd: dequeue_decode 1
cmd: encode_enqueue 1 A+C
cmd: dequeue_decode 1
cmd: enqueue CS
cmd: enqueue 5600
cmd: encode_enqueue 1 CS
cmd: encode_enqueue 1 5600
finally: 

Exercise 6 complete ciphered_queue_main.c.

Read the “section 4” of lab1-shell-APIs.txt carefully. Again, it is important to strictly follow the input-output specifications.
Implement function encode_enqueue(...) and dequeue_decode(...).
Fill in function process_line2(...).

A clarification of two erroneous cases. There are two “erroneous” cases (illegal characters and invalid commands). Your program ciphered_queue should behave differently:

when meeting an invalid command (e.g., “enqueue” with no arguments), your program should ERROR-and-stop: (1) print “ERROR\n” on screen and (2) quit gracefully (remember to free memory!)
when the plaintext for encode_enqueue is invalid (e.g., encode_enqueue 1 A+C; notice “+” is an illegal character), you should store something special in the node. What we (the grading script) expect is to see an “ILLCHAR” as an output when the program dequeue or dequeue_decode the node. (your program should not stop in this scenario.)

Test your code ciphered_queue by running make test3. If you pass all test cases, you should see:

$ make test3

[RUN] Project3Runner.empty_enqueue_test
[OK] Project3Runner.empty_enqueue_test
[RUN] Project3Runner.large_enqueue_op_test
[OK] Project3Runner.large_enqueue_op_test
[RUN] Project3Runner.regular_pass_test
[OK] Project3Runner.regular_pass_test
All test cases passed!!!

Make sure you pass all the test cases before submitting Lab1 (otherwise, you will lose points). Our grading scripts will have more test cases.

Challenges

There are two challenges for Lab1. Challenge 1 and 2 are 5 and 15 points, respectively (out of 100 of Lab1).

Challenge I support larger data string.

(Optional) In our current implementation, the data string size in a node is limited to 127 bytes (why not 128? see here). Can you support strings with no length limits? That is however long a data string is, you queue can save it.
Note: increasing 128 to 256, or any other large numbers, doesn’t work. (which is also a waste of memory!)

Challenge II add undo to the ciphered queue.

(Optional) Add a new commands undo to your implementation of ciphered queue.
undo applies to the status of the queue. It cancels the effectiveness of the last command (encode_enqueue, dequeue_decode, enqueue, or dequeue).
multiple, say n, undo will cancel the most recent n commands.
If there is no command to undo, undo will do nothing.
Here is an example. Given an input file:
encode_enqueue 1 ABC
encode_enqueue 1 DEF
dequeue_decode 1
undo
dequeue_decode 1
undo
undo
the output (on screen) should be:
ABC    // output of the first dequeue_decode
ABC    // output of the second dequeue_decode
finally: [BCD]   // "encode_enqueue 1 DEF" is undoed

Finally, submit your work

Submitting consists of three steps:

Executing this checklist:
- Fill in slack.txt with (1) your name, (2) your NUID, (3) slack hours you used, and (4) if you have taken challenges.
  Note: you have to specify that you finished the challenges. Otherwise, we will assume you didn’t.
- Make sure that your code build with no warnings.
- Make sure you have added (git add) all files that you created.

Push your code to GitHub:

 $ cd ~/lab1
 $ git commit -am 'submit lab1'
 $ git push origin 

 Counting objects: ...
  ....
  To ssh://github.com/NEU-CS5600-21fall/lab1-<username>.git
   7337116..ceed758  main -> main

Actually commit your lab (with timestamp and git commit id):
1. Get the git commit id of your work. A commit id is a 40-character hexadecimal string. You can obtain the commit id for the last commit by running the command git log -1 --format=oneline.
2. Paste both your git repo address and the commit id (the hexadecimal string) to Canvas. In Canvas, there will be an assignment for this lab. You should paste the git repo address and the commit id in two lines:
```
 git@github.com:NEU-CS5600-21fall/lab1-<username>.git
 29dfda9c788fade33421f242b5dd1ff5295fd3c9
```
  Notice: the repo address must start with git@github.com:... (not https://...). You can get your repo address on GitHub repo page by clicking the green “Code” button, then choose “SSH”.
3. Note: You can submit as many times as you want; we will grade the last commit id submitted to Canvas. Also, you can submit any commit id in your pushed git history; again, we will grade the commit id submitted to Canvas.

NOTE: Ground truth is what and when you submitted to Canvas.

A non-existent repo address or a non-existent commit id in Canvas means that you have not submitted the lab, regardless of what you have pushed to GitHub—we will not grade it. So, please double check your submitted repo and commit id!
The time of your submission for the purposes of tracking lateness is the timestamp on Canvas, not the timestamp on GitHub.

This completes the lab.

Acknowledgments

This lab is created by Peter Desnoyers, with modifications by the CS5600 staff. Parts of the lab instructions are borrowed from Mike Walfish’s CS202 lab instructions.