Week 4.b CS 5600 02/01 2023
https://naizhengtan.github.io/23spring/

1. Last time
2. Scheduling intro
3. Scheduling disciplines
     FIFO
     SJF
     RR
     Prio
     MLFQ
     Lottery
4. Vote for scheduling algorithms

---------------------------------------

- HW2: updates
  -- problem: different OSes implement things differently
- TA

1. Last time

 - wrap up process
   -- PCB
   -- context switch (high level)

 [explain the differences between "understand" vs. "internalize"]

2. Scheduling intro

 High-level problem: the operating system has to decide which process
 (or thread) to run.

 A. When scheduling decisions happen:

    [draw process transition diagram]

                        (iii) interrupt               exit (iv)
     [new] --> [ready] <---------------- [running] ----------> [terminated]
                 ^    ----------------->  |
                 |    scheduler dispatch  |
                 |                        | I/O or wait (i)
     I/O completion or signal (ii)        v
                 +-------- [waiting] <----+

    scheduling decisions take place when a process:
      (i)   switches from the running to the waiting state
      (ii)  switches from the waiting to the ready state
      (iii) switches from the running to the ready state
      (iv)  exits

    preemptive scheduling: can happen at all four points
    nonpreemptive scheduling: happens at points (i) and (iv) only
      (this is the definition of nonpreemptive scheduling)

 B. What are the metrics and criteria?

    timeline  ----v---------v--------------v---->
              arrival   1st execution   completion

    --turnaround time
        time for each process to complete (completion - arrival)
    --waiting/response/output time: there are variations on this
      definition, but they all capture the time spent waiting for
      something to happen rather than the time from launch to exit
        --response time: time between when a job enters the system and
          when it starts executing; following the class's textbook, we
          will call this "response time" (1st execution - arrival)
          [some texts call this "waiting time"]
        --output time: time from a request to the first response (e.g.,
          from key press to character echo); we will call this
          "output time" [some texts call this "response time"]
    --system throughput
        # of processes that complete per unit time
    --fairness
        different possible definitions:
        --freedom from starvation
        --all users get equal time on the CPU
        --CPU share proportional to the resources needed
        --etc.

3. Scheduling disciplines

 - the scheduling game (recall):
   -- 1 CPU
   -- multiple processes: P1, P2, P3, ...
   -- each with an arrival time and a running time
   -- ignore context switch costs, unless switches happen too frequently
   -- assume no I/O
      (completely unrealistic, but it lets us build intuition as a warmup)
   -- scheduling output: a sequence of scheduling decisions

 - let's do some intellectual exercises!

 A. FCFS/FIFO

    --run each job until it's done
    --a case:

        process   arrival       running
        P1        0             24
        P2        0+\epsilon    3
        P3        0+2\epsilon   3

    --schedule: [ P1 P2 P3 ]
    --average turnaround time?
        (24 + 27 + 30) / 3 = 27
    --average response time?
        (0 + 24 + 27) / 3 = 17
    --observe: scheduling P2, P3, P1 would drastically reduce the
      average turnaround time

    Question: if the arrival sequence is P2, P3, P1, what will the
    average turnaround time and response time be?

    --advantages of FCFS:
      --simple
      --no starvation
      --few context switches
    --disadvantage:
      --short jobs get stuck behind long ones
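The FCFS numbers above are easy to check mechanically. Below is a small
sketch (the function and variable names are my own, and the \epsilon
arrivals are rounded to 0):

```python
# Sketch: compute FCFS average turnaround and response time for a list
# of jobs given in arrival order. Epsilon arrivals are treated as 0.
def fcfs_metrics(jobs):
    """jobs: list of (arrival, running), already in arrival order.
    Returns (avg turnaround, avg response)."""
    t = 0
    turnaround, response = [], []
    for arrival, running in jobs:
        start = max(t, arrival)                # wait for CPU and arrival
        response.append(start - arrival)       # 1st execution - arrival
        t = start + running
        turnaround.append(t - arrival)         # completion - arrival
    n = len(jobs)
    return sum(turnaround) / n, sum(response) / n

print(fcfs_metrics([(0, 24), (0, 3), (0, 3)]))   # -> (27.0, 17.0)
```

The Question above can be checked the same way, by calling
fcfs_metrics([(0, 3), (0, 3), (0, 24)]).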
 B. SJF and STCF

    --SJF: shortest job first
        schedule the shortest job first
    --STCF: shortest time-to-completion first
        the preemptive version of SJF: if a job arrives with a shorter
        time to completion than the remaining time of the current job,
        immediately preempt the CPU and give it to the new job
    --idea:
      --get short jobs out of the system
      --big (positive) effect on short jobs; small (negative) effect on
        long jobs
    --example:

        process   arrival   running
        P1        0         7
        P2        2         4
        P3        4         1
        P4        5         4

      preemptive (STCF) schedule:

        time:  0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
               P1 P1 P2 P2 P3 P2 P2 P4 P4 P4 P4 P1 P1 P1 P1 P1

    --Question: what is the average turnaround time?
        Answer: (16 + 5 + 1 + 6) / 4 = 7
    --Question: what is the average response time?
        Answer: (0 + 0 + 0 + 2) / 4 = 0.5
    --we defer advantages and disadvantages until after we relax the
      "no I/O" assumption

    let's decide we also care about response time (as defined by our
    book: the duration between when a job arrives and when it is first
    scheduled)

 C. Round-robin (RR)

    --add a timer
    --preempt the CPU from long-running jobs once per time slice (also
      called a quantum)
    --after its time slice, a job goes to the back of the ready queue
    --most OSes do something of this flavor (see MLFQ below)
    --notice that our office hours use RR!
    --example (with a slice of 1 unit of time):

        process   arrival   running
        P1        0         50
        P2        0         50

      --schedule: P1 P2 P1 P2 ...
      --average turnaround time: (99 + 100) / 2 = 99.5
      --Question: average response time?
          (0 + 1) / 2 = 0.5
    --advantages:
      --fair allocation of the CPU across jobs
      --low average response time when job lengths vary
      --good for output time if there is a small number of jobs
    --disadvantages:
      --what if jobs are the same length?
        --in the example: 2 jobs of 50 time units each, a slice of 1
        --average turnaround time: 99.5 (vs. 75 for FCFS)
    --how to choose the slice size?
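One way to build intuition for the slice-size trade-off is to simulate
RR with an explicit context-switch cost. This is a sketch, not the
course's implementation; the cost model and all names are my own
assumptions:

```python
# Sketch: round-robin with all jobs arriving at t=0 and a fixed cost
# paid on every context switch. Returns the average turnaround time.
from collections import deque

def rr_avg_turnaround(runtimes, slice_len, switch_cost):
    remaining = list(runtimes)
    finish = [0.0] * len(runtimes)
    ready = deque(range(len(runtimes)))
    t = 0.0
    while ready:
        i = ready.popleft()
        run = min(slice_len, remaining[i])
        t += run
        remaining[i] -= run
        if remaining[i] > 0:
            ready.append(i)          # unfinished: back of the ready queue
        else:
            finish[i] = t            # job i completes at time t
        if ready:
            t += switch_cost         # pay for the switch to the next job
    return sum(finish) / len(finish)

# a slice of 1 with zero switch cost reproduces the 99.5 computed above
print(rr_avg_turnaround([50, 50], 1, 0))     # -> 99.5
# with a nonzero switch cost, compare slice sizes
for q in (1, 10, 50):
    print(q, rr_avg_turnaround([50, 50], q, 0.1))
```

With a nonzero switch cost, tiny slices waste time on switching, while
a slice of 50 degenerates to FCFS for this workload.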
      --want it much larger than the context switch cost
      --pick it too small, and we spend too much time context switching
      --pick it too large, and response time suffers (extreme case: the
        system reverts to FCFS)
      --a typical time slice is 10-100 milliseconds; context switches
        are usually microseconds or tens of microseconds (maybe
        hundreds)

 D. Priority

    --priority scheme: give every process a number (set by the
      administrator), and give the CPU to the process with the highest
      priority (which might be the lowest or the highest number,
      depending on the scheme)
    --can be done preemptively or non-preemptively
    --example:

        process     arrival   running
        P1 (high)   0         10
        P2 (low)    0         5

      --schedule: P1 then P2
      --turnaround time: (10 + 15) / 2 = 12.5
      --response time: (0 + 10) / 2 = 5
    --if we swap the priorities of P1 and P2 (P2 is high; P1 is low):
      --turnaround time: (5 + 15) / 2 = 10
      --response time: (0 + 5) / 2 = 2.5
    --it is generally a bad idea to use strict priority, because of
      starvation of low-priority tasks
      (note: SJF is priority scheduling where the priority is the
      predicted next CPU running time)
    --the solution to this starvation is to increase a process's
      priority as it waits

 E. Multi-level feedback queue (MLFQ)

    --combines RR and Priority
    --three ideas:
      --*multiple queues, each with a different priority* (32 queues,
        for example)
      --round-robin within each queue (flush a priority level before
        moving on to the next one)
      --feedback: a process's priority changes, for example based on
        how little or how much it has used the CPU
        --you can design different feedback mechanisms
        --one example: "downgrade" a process after 5 units of running
          time

    --example (with a slice of 1 unit of time, and 2 units of running
      time to change priority, i.e., "downgrade"):

        process   arrival   running
        P1        0         4
        P2        0         4
        P3        4         4

      schedule:

        time:  0  1  2  3  4  5  6  7  8  9  10 11
               P1 P2 P1 P2 P3 P3 P1 P2 P3 P1 P2 P3

    --advantages:
      --approximates STCF
    --disadvantages:
      --can't donate priority
      --not very flexible
      --not good for real-time and multimedia
      --can be gamed: a user puts in meaningless I/O to keep a job's
        priority high

 F. Lottery and stride scheduling

    [citations:

     C. A. Waldspurger and W. E. Weihl. Lottery Scheduling: Flexible
     Proportional-Share Resource Management. Proc. USENIX Symposium on
     Operating Systems Design and Implementation, November 1994.
     http://www.usenix.org/publications/library/proceedings/osdi/full_papers/waldspurger.pdf

     C. A. Waldspurger and W. E. Weihl. Stride Scheduling:
     Deterministic Proportional-Share Resource Management. Technical
     Memorandum MIT/LCS/TM-528, MIT Laboratory for Computer Science,
     June 1995.
     http://www.psg.lcs.mit.edu/papers/stride-tm528.ps

     Carl A. Waldspurger. Lottery and Stride Scheduling: Flexible
     Proportional-Share Resource Management. Ph.D. dissertation,
     Massachusetts Institute of Technology, September 1995. Also
     appears as Technical Report MIT/LCS/TR-667.
     http://waldspurger.org/carl/papers/phd-mit-tr667.pdf
    ]

    Issue lottery tickets to processes:
    -- let p_i have t_i tickets
    -- let T be the total # of tickets, T = \sum t_i
    -- the chance of winning the next slice is t_i / T
    -- note that lottery tickets are not used up; they are more like
       season tickets

    --example (with a slice of 1 unit of time):

        process        arrival   running
        P1 (t_1=20)    0         20
        P2 (t_2=10)    0         10

      --schedule (many possibilities; for example,
        P2 P2 P2 P2 ... P1 P1 P1 ...)

    --Question: what are the expected turnaround times for (P1, P2)?
        A. (20, 30)
        B. (30, 20)
        C. (30, 30)

        Answer: C

        expected turnaround time
          = (running time) / (expected fraction of the CPU per slice)
        P1: 20 / (20/30) = 30
        P2: 10 / (10/30) = 30

    --controls the long-term average proportion of the CPU for each
      process
    --can also group processes hierarchically for control:
      --subdivide lottery tickets
      --can model tickets as currency, so there can be an exchange rate
        between real currencies (money) and lottery tickets
    --lots of nice features:
      --deals with starvation (have one ticket --> will make progress)
      --don't have to worry that adding one high-priority job will
        starve all others
      --adding/deleting jobs affects all jobs proportionally (T gets
        bigger or smaller)
      --can transfer tickets between processes: highly useful if a
        client is waiting for a server; the client can donate tickets
        to the server so it can run
        --note the difference between donating tickets and donating
          priority: with donated tickets, the recipient amasses more
          until it runs; with donated priority, there is no difference
          between one process donating and 1000 processes donating
    --many other details:
      --ticket inflation for processes that don't use their whole slice
        --use fraction f of the slice --> inflate tickets by 1/f until
          the process next gets the CPU
    --disadvantages:
      --latency is unpredictable
      --expected error is somewhat high
        --for those comfortable with probability: this winds up being a
          binomial distribution; variance n*p*(1-p) --> standard
          deviation proportional to \sqrt(n), where p is the fraction
          of tickets owned and n is the number of quanta
    --in reaction to these disadvantages, Waldspurger and Weihl
      proposed *stride scheduling*:
      --basically, a deterministic version of lottery scheduling;
        less randomness --> less expected error
      --see OSTEP (chap 9) for details
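The core of lottery scheduling is just a weighted random draw per
slice. A minimal sketch using the ticket counts from the example above
(the code structure and names are my own, not from the paper):

```python
# Sketch of a lottery draw: pick a winning ticket uniformly at random;
# a process's long-run CPU share approaches t_i / T.
import random

def lottery_pick(tickets, rng):
    """tickets: {pid: ticket count}. Returns the winning pid."""
    total = sum(tickets.values())
    winner = rng.randrange(total)      # draw a ticket number in [0, total)
    for pid, t in tickets.items():
        if winner < t:
            return pid
        winner -= t

rng = random.Random(4)                 # fixed seed for repeatability
tickets = {"P1": 20, "P2": 10}
wins = {"P1": 0, "P2": 0}
for _ in range(30_000):
    wins[lottery_pick(tickets, rng)] += 1
print(wins["P1"] / 30_000)             # ~ 20/30 = 0.667
```

Over many slices, P1's share converges to 20/30 of the CPU, which is
exactly the expected fraction used in the turnaround computation above.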
4. Vote for your favorite scheduling algorithm

 Candidates: FIFO, STCF, RR, Prio, MLFQ, and Lottery

 Winners (by votes):
   "Best Turnaround Time" winner: STCF
   "Best Response Time" winner: RR
   "Best Fairness" winner: MLFQ
   "Most popular algorithm" winner: MLFQ

 Cheng's choices:
   "Best Turnaround Time" winner: STCF
   "Best Response Time" winner: RR
   "Best Fairness" winner: Lottery
   "Most popular algorithm" winner: Lottery
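A quick sanity check on the turnaround-time winner: rerunning the FCFS
example from section 3.A in shortest-job-first order. This is a sketch
with my own names; all jobs are assumed to arrive at t = 0, so SJF and
STCF coincide:

```python
# Compare FIFO order vs. shortest-job-first order on the example from
# section 3.A (runtimes 24, 3, 3; arrivals all treated as t=0).
def avg_turnaround(runtimes_in_order):
    t, total = 0, 0
    for r in runtimes_in_order:
        t += r            # this job completes at time t
        total += t        # arrival is 0, so turnaround == completion
    return total / len(runtimes_in_order)

print(avg_turnaround([24, 3, 3]))          # FIFO: P1 P2 P3 -> 27.0
print(avg_turnaround(sorted([24, 3, 3])))  # SJF:  P2 P3 P1 -> 13.0
```

Running the short jobs first cuts the average turnaround from 27 to 13,
which is why STCF wins that category.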