Week 2.b
CS3650 01/17 2024
https://naizhengtan.github.io/24spring/

1. A life cycle of a program
2. Why C?
3. C basics
4. C pointers

---------------------

Recap:
  - bits, bytes, and ints
    [show the xkcd]
  - crash course of computer organization
    -- cpu: a loop of loading and executing instructions
       [A simple CPU for section 3: https://piazza.com/class/lr5c8wm87f62yp/post/11 ]
    -- memory: an array of bytes
    -- disk: an array of blocks

1. a life cycle of a program

  [DRAW PICTURE:

                                            loader
    HUMAN ---> SOURCE CODE --> EXECUTABLE -----------> PROCESS

           vi          gcc          as          ld          loader
    HUMAN ---> foo.c ---> foo.s ----> foo.o ---> a.out ----> process

    NOTE: 'ld' is the name of the linker. it stands for 'linkage editor'.]

  * demo: a compilation

    $ gcc hello.c -o hello      // generate everything

    If we want to see each step:

    $ gcc -S hello.c            // => hello.s, assembly file
    $ as hello.s -o hello.o     // => generate binary object file
    $ ld hello.o -o hello \
        -syslibroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk \
        -L/usr/local/lib -lSystem \
        /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/15.0.0/lib/darwin/libclang_rt.osx.a
                                // => generate the executable binary
    [note: this is on my laptop, macos; doesn't apply to yours.]

2. why C?

  - history
      C was designed by Dennis Ritchie for building UNIX

  - good for low-level programming
      easy mapping between C and machine instructions
      easy mapping between C types and hardware structures
      e.g., set bit flags in hardware registers of a device

  - minimal runtime
      easy to port to another hardware platform
      direct access to hardware

  - explicit memory management
      no garbage collector
      in complete control of memory management

  - efficient: compiled (no interpreter)
      compiler compiles C to assembly

  - popular for building kernels, system software, etc.
      good support for C on almost any platform

  - it is both amazing and depressing, as we still use C for systems programming

  - good reasons why not C?
      easy to write incorrect code
      easy to write code that has security vulnerabilities

  - using high-level language to implement OS?
    It was (?) a hot topic:
    -- using Go: https://pdos.csail.mit.edu/papers/biscuit:login19.pdf (OSDI'2018)
    -- using Rust: https://www.yecl.org/publications/boos2020osdi.pdf (OSDI'2020)
    [given time, might circle back to Biscuit.]

3. C basics

  I will mechanically go through the basics.
  Note that even if I were to spend 5hrs on this,
  you won't learn much until you play with C on your own.

  today, learn by examples

  A. Control flow
    - branches: if, else, else if, switch
    - loops:
      -- while(...){}, do{}while(...), for(;;){}
      -- break; continue
      -- goto [exists but don't use]

  B. functions and scope
    - function definition:
        int foo(int x, int y) {
          ...
          return ret;
        }
    - variable scopes:
      -- local variable (in function)
      -- global variable (everywhere)

  C. types & operators
    - basic types: char, int, float, double
      [see an example]
    - assignment: =
    - arithmetic operators: +, -, *, /, %
    - relational operators: <, <=, >, >=, ==, !=
    - logical operators: &&, ||
    - precedence and associativity (tricky)
      [see handout]

    - Example: A + B * C * D
      => (1) tmp = B * C
         (2) tmp2 = tmp * D
         (3) result = A + tmp2

    - Q: what are the steps of "A[B]->C"?
      [answer: (1) tmp=A[B], (2) result=tmp->C ]

    - Q: what are the steps of "(int)A[B]"?
      [answer: (1) tmp=A[B], (2) result=(int)tmp ]

    - Q: what are the steps of "A=B==C"?
      [see an example]
      [answer: (1) tmp=B==C, (2) A=tmp ]

4. C pointers

  - a pointer = a memory address
      every variable has a memory address (e.g., p = &i),
      so each variable can be accessed through its pointer (e.g., *p)
      a pointer can itself be a variable (e.g., int *p) and thus has a memory address, etc.
  - get the address of a variable x: "&x"

  - demo: a pointer and a pointer to a pointer

    an example:
        char val = 'a';
        char *ptr = &val;
        char **ptrptr = &ptr;

    "ptrptr" points to a pointer, "ptr", which in turn points to a char, "val"

      addr0 +-------+
            | addr1 |   // <-- ptrptr
            +-------+

      addr1 +-------+
            | addr2 |   // <-- ptr
            +-------+

      addr2 +-------+
            |  'a'  |   // <-- val
            +-------+

    Q: What do you expect to see:

        #include <stdio.h>

        int main() {
          char val = 'a';
          char *ptr = &val;
          char **ptrptr = &ptr;
          // %p expects a void *, so cast the pointers before printing
          printf(" val=%c\n ptr=%p => *ptr=%c\n ptrptr=%p => *ptrptr=%p => **ptrptr=%c\n",
                 val, (void *)ptr, *ptr, (void *)ptrptr, (void *)*ptrptr, **ptrptr);
          return 0;
        }

      [you should try it yourself.]

    Q: Are the results the same across multiple runs? Why?
      [Answer: not necessarily; each time, the program may be loaded into a different part of memory]

    Q: Why is a pointer to a pointer useful?
      Answer: many cases, for example, modifying a pointer within a function

  - pointer arithmetic

    [demo:
        #include <stdio.h>

        int main() {
          char *c = (char *)0x08200000;
          int *i = (int *)0x08200000;
          // print the addresses only; nothing is dereferenced here
          printf("%p %p\n", (void *)(c+1), (void *)(i+1));
          return 0;
        }
    ]

    Q: what are the values of c+1 and i+1?
      [try it yourself]

    Q: what do you expect to see when dereferencing the pointer? like:
         printf("%c\n", *c);
      [try it yourself]