Week 1.b CS5600 1/11 2023 https://naizhengtan.github.io/23spring/
0. course structure
1. OS history
2. Processes
3. Process's view of memory (and registers)
4. Crash course in x86-64 assembly
---------------------------------------------------------------------------

0. course structure [from last time]

- components of the course:
  --lectures
  --labs
  --exams
  --reading
  --homeworks

- lab questions:
  --we expect you to think the problem through first, then ask
  --"Here is my code. It doesn't work. Please debug." won't work.
  --if you get a reply from a TA, and then send email 20 minutes later
    asking a closely related question, that's probably not great either.

- exams:
  -- midterm and final
  -- closed book
  -- will cover lectures, homeworks, readings, and labs

- reading & homework
  --you may find that lectures do not repeat the readings. That's
    intentional (in fact, that's the point).
  [draw a figure about depth versus width of knowledge; point out the
   relations between textbooks, lectures, and labs]

- integrity policies:
  --Here are some questions:
      * Looking at a classmate's solution and then coding it up yourself
        afterward?
      * Showing your code to a classmate who has questions?
      * Modifying code that you find on StackOverflow?
      * Modifying code for a similar assignment that you find on GitHub?
    The correct answer: ALL of these are ruled out by the policy.
  --Please see the policy page, and let me say here:
  --The collaboration and academic integrity policy is real.
  --Please make sure that you've really thought through your question on
    your own before you ask for help.
  --Exams will have questions about labs; and "If there are inexplicable
    discrepancies between exam and lab performance, we will overweight the
    exam, and possibly interview you." (see the policy page)

1. A brief history of OSes

--We'll begin the story with Unix, but operating systems go back earlier.
--why OSes exist at all: people were tired of operating machines by hand
--Unix: 1969 to early 1970s (from Ken Thompson and Dennis Ritchie at Bell Labs)
  --goal was to have a simple system that runs on cheap hardware: the PDP-7
  --hard to port to other machines
  --eventually, Thompson and Ritchie decided they needed a new programming
    language to write Unix in
  --so C was born
--Unix took over the world.
  --its abstractions are still in use everywhere
  --(which is arguably depressing)
--Separate strand: in the 1970s, at Xerox PARC, people were trying to build
  the personal computer...
  --eventually led to the graphical interfaces and idioms we have today
    (windowing systems, mouse, menus)
  --it's hard today to imagine why the "personal computer" was
    revolutionary, but keep in mind that when the PARC team started,
    telling someone that you were going to build a personal computer was
    like telling them that you would have a personal nuclear reactor
--Monolithic kernel (Linux) vs. microkernel vs. exokernel (1995) [skipped]
--Virtualization and the cloud; unikernels?
  -- Disco (1997), VMware
--multi-core machines; the Scalable Commutativity Rule
--the XPU era
  -- CPU + GPU/DPU/NPU/IPU
  -- CXL (Compute Express Link)
--many exciting hardware breakthroughs:
  -- RDMA
  -- ~750 MB of CPU L3 cache
  -- persistent memory
  [skipped]
[Makimoto's Wave and apps vs. hardware]

2. Processes

--key abstraction: process
--motivation: you want your computer to be able to do multiple things at once:
  --you might find it annoying to write code without being able to listen
    to music or download files at the same time.
  --or, if we have a computer with multiple users, they all need to get
    things done simultaneously (for example, login.khoury.northeastern.edu).
--another motivation: resource efficiency:
  --example #1: increase CPU utilization:

        you:   --->|wait for input|---->|wait for input|
        gcc:       ---------------->

    (top line: an interactive task, mostly waiting; bottom line: gcc
    computing while the interactive task waits)
    [this is called "overlapping I/O and computation"]
  --example #2: reduce latency:

        A needs 80 s of CPU; B needs 20 s.

        Run A to completion, then B:
            A-----------> B -->        B finishes at t = 100 s.

    If instead we run A and B concurrently, alternating in 10 s slices,
    B gets its 20 s of CPU by t = 40 s, so B finishes much sooner (and A
    still finishes at t = 100 s).
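To make "doing multiple things at once" concrete before we define processes
properly, here is a minimal C sketch using the POSIX fork() call, which
creates a second process; the two processes then run concurrently. (The
"music player" and "compiler" labels are just invented stand-ins for this
illustration.)

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();          /* create a second process */
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            /* child process: stands in for, say, the music player */
            printf("child  (pid %d): playing music...\n", (int)getpid());
        } else {
            /* parent process: stands in for, say, the compiler */
            printf("parent (pid %d): compiling...\n", (int)getpid());
            waitpid(pid, NULL, 0);   /* wait for the child to finish */
        }
        return 0;
    }

The OS decides when each of the two processes runs; neither needs to know
about the other. How the OS arranges this is the "OS's point of view"
discussed below.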
--process from scratch, an example

  [DRAW PICTURE:

                                              loader
   HUMAN --> SOURCE CODE --> EXECUTABLE ---------------> PROCESS

           vi         gcc        as          ld           loader
   HUMAN ----> foo.c ----> foo.s ----> foo.o ----> a.out ----> process

   NOTE: 'ld' is the name of the linker; it stands for 'linkage editor'.]

--classical definition of a process: an instance of a running program
--examples: browser, text editor, word processor, PDF viewer, image
  processor, photo editor, messaging app
--a process can be understood in two ways:

  a. from the **process's** point of view
     this class. high-level: the process sees an abstract machine
     Today: we will use "a process's view of the world" to do several things:
       - demystify function scope
       - demystify pointers
       - describe how programmers get certain kinds of work out of the system
  b. from the **OS's** point of view
     meaning: how does the OS implement, or arrange to create, the
     abstraction of a process?
     we will deprioritize this for now, and come back to it later in the
     course.

3. Process's view of memory

Background: before we even get to processes, recall the basic elements of a
machine (a computer).

Q: what are the basic elements in a machine?
[later: plan a review session about computer organization]

a. CPU core (a processor), which consists of:

   * Some execution units (like ALUs) that perform computation in response
     to _instructions_ (addq, subq, xorq, ...). Arithmetic and logical
     instructions typically take one _processor cycle_, or less.

   * A small number of registers, which execution units can read very
     quickly (in under a cycle). There are many types of registers. For
     today, we only need to think about two kinds:

     ** _General-purpose_ registers. There are 16 of these on the machines
        we are considering (the x86-64 architecture): RAX, RBX, RCX, RDX,
        RSI, RDI, R8-R15, RSP, and RBP.

        RSP is special: it is the _stack pointer_; more on this below.
        RBP is the base pointer (in Intel terminology); we will often call
        this the _frame pointer_.

     ** Special-purpose registers. The only one we consider here is RIP.
        This is the instruction pointer (also known as the program
        counter). It points to the *next* instruction that the processor
        will execute, assuming there is no branch or jump to another
        location.

   * Registers live in the CPU (they are not part of memory).

   [handout]

b. Memory, which takes more time to access than registers (usually a few to
   several hundred cycles). [In reality, there is a "memory hierarchy"
   built from caches, but for today, we will just think of memory as a
   homogeneous resource.]

c. Peripherals (disks, GPUs, ...)

[draw CPU + Mem on board]

- how a program runs (a toy sketch of this loop in C follows below):
  -- "code" (the executable binary) is stored in memory
  -- the CPU fetches an instruction from memory
     [where is the current instruction? it's the one %RIP points to]
  -- the CPU decodes the instruction and executes it
  -- %RIP advances to the next instruction (on x86-64, instructions have
     variable length, so this is really "%RIP += size of the current
     instruction"; jumps, calls, and rets instead set %RIP directly)
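To make the fetch-decode-execute loop concrete, here is a toy sketch in C.
This is not x86: it is a made-up three-instruction machine, and all names
here (mem, ip, acc, the instruction set) are invented for this example. The
variable ip plays the role of %RIP; in this toy machine every instruction
has size 1, so "advance" really is ip = ip + 1.

    #include <stdio.h>

    enum { HALT, INC, PRINT };     /* a made-up instruction set */

    int main(void) {
        int mem[] = { INC, INC, PRINT, HALT };  /* "code" stored in memory */
        int ip  = 0;               /* instruction pointer (plays %RIP) */
        int acc = 0;               /* a single "register" */

        for (;;) {
            int inst = mem[ip];    /* fetch the instruction ip points to */
            ip = ip + 1;           /* advance past the fetched instruction */
            switch (inst) {        /* decode and execute */
            case INC:   acc = acc + 1;        break;
            case PRINT: printf("%d\n", acc);  break;
            case HALT:  return 0;
            }
        }
    }

Running it prints 2: the loop fetched two INCs, a PRINT, and a HALT, exactly
the cycle described above.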
---------------------------------------------------------------------------
Admin:
- lab lateness policy; lab extensions
  -- slack hours (120 hrs)
  -- penalty: -1% per hour late, capped at -50%
- ground-truth timestamp and code? how are labs graded?
- a sad story from a prior CS5600 student
---------------------------------------------------------------------------

Today, our focus is on registers and memory.

Three aspects to a process:

(i). Each process "has its own registers." What does that mean?

    It means that while the process is executing (and from the perspective
    of the programmer who wrote the program that became the process), the
    process is using the registers mentioned above. This is part of what we
    mean when we say that a process has the impression that it's executing
    on an abstract machine.

    (The reason this should not be taken for granted is that a given
    physical machine will generally have multiple processes sharing the
    CPU, and each has "its own" registers, yet each process has no
    knowledge of the others. The mechanisms by which the hardware and OS
    conspire to give each process this isolated view are outside of our
    current scope; we'll come back to this when we study context switches.
    It relates to process/OS control transfers, discussed at the end.)

    (What your book calls "direct execution" is the fact that, while the
    process is executing, it's using the actual CPU's true registers.)

(ii). Each process has its own view of memory, which contains:

    * the ".text" segment: memory used to store the program itself
    * the ".data" segment: memory used to store global variables
    * the memory used by the heap, from which the programmer allocates
      using malloc()
    * the memory used for the stack, which we will talk about in more
      detail below

    The process (really, the developer of the program) thinks of memory as
    one contiguous array:

        [ text/code | data | heap -->        <-- stack ]

    (A sketch that prints an address from each of these regions appears at
    the end of these notes.)

(iii). For a process, very little else is actually needed, but a modern
    process does have a lot of associated information:
    --- signal state, signal mask, priority, whether it is being debugged,
        etc., etc.

4. Crash course in x86-64 assembly

[write on board]

syntax:

    movq PLACE1, PLACE2

        means "move the 64-bit quantity at PLACE1 to PLACE2". The places
        are usually registers or memory addresses; the source (PLACE1) can
        also be an immediate (a constant). At most one of the two places
        can be a memory address.

    pushq %rax

        equivalent to:
        [ subq $8, %rsp
          movq %rax, (%rsp) ]

[stop here; will pick up from here next time]

    popq %rax

        [ movq (%rsp), %rax
          addq $8, %rsp ]

    call 0x12345

        [ pushq %rip
          movq $0x12345, %rip ]
        (conceptually; %rip cannot actually be named as a movq operand,
        but this is the effect)

    ret

        [ popq %rip ]

--above we see how call and ret interact with the stack
--call: pushes the old %rip (which at that point already holds the return
  address, i.e., the address of the instruction after the call), then sets
  %rip to the target
--ret: updates %rip by loading it with the value stored on the stack
--(a toy C simulation of this stack discipline appears at the end of these
  notes)

[want to learn more about x86 assembly? check out:
 https://cs.brown.edu/courses/cs033/docs/guides/x64_cheatsheet.pdf]
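To make the memory picture in section 3(ii) concrete, here is a minimal C
sketch (the variable names a_global, on_stack, and on_heap are invented for
illustration) that prints one address from each region. On a typical
Linux/x86-64 process you should see the code address lowest, then data,
then heap, with the stack much higher; the exact values vary from run to
run (e.g., due to address-space randomization).

    #include <stdio.h>
    #include <stdlib.h>

    int a_global = 42;                  /* lives in the ".data" segment */

    int main(void) {
        int on_stack = 0;                         /* lives on the stack */
        int *on_heap = malloc(sizeof *on_heap);   /* lives on the heap */

        printf("text  (code):   %p\n", (void *)main);
        printf("data  (global): %p\n", (void *)&a_global);
        printf("heap:           %p\n", (void *)on_heap);
        printf("stack:          %p\n", (void *)&on_stack);

        free(on_heap);
        return 0;
    }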
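To make the pushq/popq/call/ret expansions in section 4 concrete, here is a
toy C simulation of the stack discipline (everything here, including the
names stack_mem and rsp, is invented for illustration). The array plus
pointer stand in for memory and %rsp; note that the stack grows toward
lower addresses, just as "subq $8, %rsp" does on real hardware.

    #include <stdint.h>
    #include <stdio.h>

    /* Toy model: 64 slots of "memory" for the stack; rsp plays %rsp.
       The stack grows toward lower addresses, as on real x86-64. */
    uint64_t stack_mem[64];
    uint64_t *rsp = &stack_mem[64];     /* initially: empty stack */

    void pushq(uint64_t v) {            /* subq $8, %rsp; movq v, (%rsp) */
        rsp = rsp - 1;
        *rsp = v;
    }

    uint64_t popq(void) {               /* movq (%rsp), v; addq $8, %rsp */
        uint64_t v = *rsp;
        rsp = rsp + 1;
        return v;
    }

    int main(void) {
        uint64_t rip = 0x1000;          /* pretend %rip: next instruction */

        pushq(rip);                     /* call 0x12345: save the return */
        rip = 0x12345;                  /* address, then jump to target  */

        /* ... the callee would run here ... */

        rip = popq();                   /* ret: pop saved addr into %rip */
        printf("returned to %#llx\n", (unsigned long long)rip);
        return 0;
    }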