CS 162 Midterm 2 Review
lecture 10: scheduling
thread life cycle
performance metrics
- response time (latency) = user perceived time to do some task
- throughput = the rate at which tasks are completed
- scheduling overhead = the time to switch from one task to another
- predictability = variance in response times for repeated requests
- fairness = equality in the performance perceived by different tasks
- starvation = lack of progress for one task, due to resources being allocated to different tasks
other useful metrics
- waiting time for P = time before P is scheduled
- average waiting time = average of all processes’ wait time
- completion time = waiting time + run time
- average completion time = average of all processes’ completion time
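- worked example (made-up jobs, just to pin down the definitions): A, B, C all arrive at time 0 with run times 4, 2, 1 and run in that order under FCFS → waiting times are 0, 4, 6 (average ≈ 3.3) and completion times are 4, 6, 7 (average ≈ 5.7)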
assumptions
- threads are independent and only one user
- only look at a work-conserving scheduler (never leave processor idle)
- workload = set of tasks and how long each takes to run
- compute bound = tasks that primarily perform compute
- fully utilize CPU
- IO bound = waits for IO, limited compute
- often in the Blocked state
scheduling policies
- first come, first serve (FCFS)
- runs tasks in order of arrival until completion (or until they block on IO); no preemption
- DMV model
- aka FIFO
- convoy effect = short processes get stuck behind long processes; lots of small tasks build up because FIFO is non-preemptive (tasks cannot overtake each other)
- shortest job first
- minimize average completion time by scheduling shorter jobs over longer jobs
- can lead to starvation
- is subject to the convoy effect since it doesn’t preempt the running job
- requires knowing the duration of the jobs
- shortest time to completion first (STCF)
- introduces **preemption** = the currently running task can be de-scheduled
- schedule task with least amount of time left
- requires a ton of switching
- can lead to starvation!
- no convoy effect tho :)
- round robin scheduling
- runs each job for a time slice called **quantum**
- **time-slicing** = switch to the next job in the ready queue when time is up
- switching takes a lot of time
- small quanta have lots of overhead
- RR can’t lead to starvation and has no convoy effect!
- no process waits more than (n-1)q time
- completion time can be high because it stretches out longer jobs
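tiny RR simulation sketch (made-up jobs A=5, B=3, C=1 with quantum 2, not lecture code) showing both the time-slicing and how the longest job gets stretched out:

```c
/* round-robin sketch: made-up jobs, quantum = 2 time units */
#include <stdio.h>

#define N 3
#define QUANTUM 2

int main(void) {
    const char *name[N] = {"A", "B", "C"};   /* hypothetical jobs */
    int remaining[N]    = {5, 3, 1};         /* remaining run time per job */
    int time = 0, done = 0;

    while (done < N) {
        for (int i = 0; i < N; i++) {        /* cycle through the ready queue */
            if (remaining[i] == 0) continue;
            int slice = remaining[i] < QUANTUM ? remaining[i] : QUANTUM;
            time += slice;                   /* run job i for at most one quantum */
            remaining[i] -= slice;
            if (remaining[i] == 0) {
                done++;
                printf("%s completes at t=%d\n", name[i], time);
            }
        }
    }
    return 0;
}
```

the short job C finishes at t=5, but A only finishes at t=9 even though it needs just 5 units of CPU — that’s the “completion time can be high” point above.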
lecture 11: scheduling
scheduling policy goals
- minimize average waiting time for IO/interactive tasks (tasks with short CPU bursts)
- minimize average completion time
- maximize throughput (including minimizing context switching)
- remain fair/starvation-free
nice
int nice(int inc)
- adds inc to the nice value for the current thread
- higher nice value = lower priority
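minimal usage sketch (the +5 increment is arbitrary):

```c
/* lowering our own priority with nice(); the +5 increment is arbitrary */
#include <unistd.h>
#include <errno.h>
#include <stdio.h>

int main(void) {
    errno = 0;
    int new_nice = nice(5);             /* add 5 to this task's nice value */
    if (new_nice == -1 && errno != 0)   /* -1 can be a legal return value, so check errno */
        perror("nice");
    else
        printf("new nice value = %d (higher = lower priority)\n", new_nice);
    return 0;
}
```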
strict priority scheduling
- split jobs by priority into n different queues and process the highest queue
- priority inversion = a high priority thread can become starved waiting on a low priority thread to release a resource it needs
- a medium priority task gets scheduled instead because the low priority one is blocking the high priority one
multi-level feedback queue
- create distinct queues for ready jobs, each assigned a diff priority level
rules:
- if priority(a) > priority(b), a runs
- if priority(a) == priority(b), a and b run RR using that queue’s quantum
- a new job always starts in the topmost (highest priority) queue
- once the job uses up the time allotment, it moves down a priority (regardless of how many times it gave up the CPU)
- prevents gaming the system by inserting IO request just before time quanta to stay on that priority level/queue
- after some time, move all the jobs back to the topmost queue (prevents starvation)
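one way those rules could be encoded (made-up struct and numbers, not lecture code):

```c
/* MLFQ bookkeeping sketch — queue count, allotment, and field names are made up */
#include <stdio.h>

#define NUM_QUEUES 3

struct job {
    int priority;      /* 0 = topmost queue, NUM_QUEUES-1 = lowest */
    int used;          /* CPU time used at this priority level */
    int allotment;     /* time allowed at this level before demotion */
};

/* new jobs always start in the topmost queue */
void mlfq_admit(struct job *j, int allotment) {
    j->priority = 0;
    j->used = 0;
    j->allotment = allotment;
}

/* once a job uses its full allotment, demote it one level —
 * regardless of how many times it gave up the CPU along the way */
void mlfq_charge(struct job *j, int ran_for) {
    j->used += ran_for;
    if (j->used >= j->allotment && j->priority < NUM_QUEUES - 1) {
        j->priority++;
        j->used = 0;
    }
}

/* periodic boost: move every job back to the topmost queue (prevents starvation) */
void mlfq_boost(struct job *jobs, int n) {
    for (int i = 0; i < n; i++) {
        jobs[i].priority = 0;
        jobs[i].used = 0;
    }
}

int main(void) {
    struct job j;
    mlfq_admit(&j, 10);
    mlfq_charge(&j, 4);   /* gives up the CPU early... */
    mlfq_charge(&j, 6);   /* ...but the allotment is used up anyway -> demoted */
    printf("priority after using allotment: %d\n", j.priority);
    mlfq_boost(&j, 1);
    printf("priority after boost: %d\n", j.priority);
    return 0;
}
```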
lecture 12: scheduling - core concepts and classic policies
Linux O(n)
- at every context switch = scan the list of processes, compute priorities, select the next process to run
- issues = context switch costs increase with num processes, single queue in multi-core systems
Linux O(1)
- run in constant time with priority based scheduler
- 140 diff priorities = 0-99 real time or kernel tasks, user tasks 100 to 139 (100 is the highest user priority; lower number = higher priority)
- 2 ready queues
- active queue for tasks that haven’t used up the quanta
- expired queue for processes that have
- timeslices are recomputed when a job finishes its timeslice (length depends on priority)
- priority adjustments
- user-task priority adjust +-5 based on sleep_avg (= sleep_time - run_time)
- higher sleep_avg = more IO-bound the task, more priority boost (and vice versa)
- “interactive credit” can be earned if the task sleeps or runs for a long time; used to avoid changing the interactivity classification on temporary changes in behavior
- maintains interactivity because interactive tasks keep getting placed back into the active queue
- real-time tasks
- SCHED_FIFO = preempts other tasks, no time slice
- SCHED_RR = preempts normal tasks, RR scheduling amongst tasks with the same priority
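rough sketch of the two-array trick behind the O(1) claim (struct layout made up, not the actual kernel code):

```c
/* O(1) active/expired sketch — struct layout is made up, not the real kernel code */
#define NUM_PRIOS 140

struct task;                          /* stand-in for a real task struct */

struct prio_array {
    struct task *queue[NUM_PRIOS];    /* one ready list per priority, 0 = highest */
};

struct runqueue {
    struct prio_array arrays[2];
    struct prio_array *active;        /* tasks that still have timeslice left */
    struct prio_array *expired;       /* tasks that used up their timeslice */
};

/* when every task in the active array has used its quantum,
 * swapping the two pointers makes the expired array active — O(1) */
void swap_arrays(struct runqueue *rq) {
    struct prio_array *tmp = rq->active;
    rq->active = rq->expired;
    rq->expired = tmp;
}

int main(void) {
    struct runqueue rq = {0};
    rq.active  = &rq.arrays[0];
    rq.expired = &rq.arrays[1];
    swap_arrays(&rq);
    return 0;
}
```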
Real Time Scheduling
- goal = predictability of performance
- CFS = completely fair scheduler
- share CPU proportionally according to priority
- no starvation cuz every job makes SOME progress, low priority just runs less
- ex: lottery scheduling
- each job gets at least one ticket but some get more
- like hunger games!
- no starvation cuz each job at least has some chance to get chosen
- no priority inversion - priority WILL be honored with randomness
- temporary unfairness
- we have less control over who gets to run
- more unfair to shorter jobs cuz the randomness only averages out over many allocations, and short jobs may finish before it does
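sketch of a single lottery draw (ticket counts are made up):

```c
/* single lottery draw sketch — ticket counts are made up */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 3

int main(void) {
    int tickets[N] = {1, 5, 10};        /* every job gets at least one ticket */
    int total = 0;
    for (int i = 0; i < N; i++)
        total += tickets[i];

    srand((unsigned)time(NULL));
    int winner = rand() % total;        /* draw a winning ticket number */
    for (int i = 0; i < N; i++) {       /* walk the jobs until we pass the winner */
        if (winner < tickets[i]) {
            printf("schedule job %d\n", i);
            break;
        }
        winner -= tickets[i];
    }
    return 0;
}
```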
stride scheduling
- deterministic proportional fair sharing
- stride of job $i$ = $W / N_i$, where $W$ is some big constant
- larger your share of tickets N_i, the smaller your stride is
- each job has a pass counter; the scheduler picks the lowest-pass job, runs it, and adds its stride to its pass
- low-stride jobs (lots of tickets/high priority) run more often
- jobs with more tickets have smaller strides, so their pass counters grow more slowly and they end up getting scheduled more often
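sketch of the pass/stride loop (W = 10000 and the ticket counts are made up):

```c
/* stride scheduling sketch — W = 10000 and the ticket counts are made up */
#include <stdio.h>

#define N 3
#define BIG_W 10000                       /* the big constant W in stride = W / N_i */

int main(void) {
    int tickets[N] = {1, 2, 5};
    int stride[N], pass[N];
    for (int i = 0; i < N; i++) {
        stride[i] = BIG_W / tickets[i];   /* more tickets -> smaller stride */
        pass[i] = 0;
    }

    for (int t = 0; t < 8; t++) {         /* a few scheduling decisions */
        int min = 0;
        for (int i = 1; i < N; i++)       /* pick the job with the lowest pass */
            if (pass[i] < pass[min]) min = i;
        printf("run job %d (pass=%d)\n", min, pass[min]);
        pass[min] += stride[min];         /* advance its pass by its stride */
    }
    return 0;
}
```

over 8 decisions the 5-ticket job runs 5 times, the 2-ticket job twice, and the 1-ticket job once — deterministic proportional sharing.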
completely fair scheduler (cfs)
- cfs models an ideal multi-tasking CPU
- each process gets an equal share of the CPU
- N threads simultaneously execute on 1/N of the CPU
- optimize GLOBAL PARAMETER not individual decision
- HOW IT WORKS
- track the cpu time per thread and then “repair” the threads to fairness by choosing to run the thread with the minimum cpu time
- stored in a Red-Black tree with scheduling cost O(log n)
- captures interactivity cuz sleeping threads don’t advance their CPU time, so auto get a boost when they wake up
- interactivity = when you can interact with the OS when there are 1+ programs already running
- responsiveness
- AKA low response time and starvation freedom
- constraint of target latency
- period of time over which every process gets service
Quanta = target latency / n
- throughput
- AKA avoiding excessive overhead
- constraint of minimum granularity
- minimum length of any time slice
- target latency 20 ms, min granularity 1 ms, 200 processes
- SO each process gets a 1 ms time slice (20 ms / 200 = 0.1 ms, below the min granularity, so the 1 ms floor wins)
- BUT we want to incorporate priorities still
CFS: proportional fair sharing
- nice values range from -20 to 19; negative values are evil (i.e., higher priority)
- allow diff threads to have diff RATES of execution by using weights to compute the switching quanta
- basic equal share: $Q_i = \text{target latency} \cdot \frac{1}{N}$
- weighted share: $Q_i = \frac{w_i}{\sum_p w_p} \cdot \text{target latency}$
- reuses the nice priority
- CFS uses nice values to scale weights exponentially: weight $= 1024/(1.25)^{\text{nice}}$
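rough sketch of that weighted quantum computation (the three nice values are made up; the 20 ms / 1 ms numbers are just the example from above):

```c
/* CFS weighted quantum sketch — nice values are made up; the 20 ms / 1 ms
 * numbers are the example target latency and min granularity from above */
#include <stdio.h>
#include <math.h>

#define N 3
#define TARGET_LATENCY_MS 20.0
#define MIN_GRANULARITY_MS 1.0

int main(void) {
    int nice[N] = {-5, 0, 5};
    double weight[N], total = 0.0;

    for (int i = 0; i < N; i++) {
        weight[i] = 1024.0 / pow(1.25, nice[i]);           /* lower nice -> bigger weight */
        total += weight[i];
    }
    for (int i = 0; i < N; i++) {
        double q = TARGET_LATENCY_MS * weight[i] / total;  /* Q_i = (w_i / sum w) * latency */
        if (q < MIN_GRANULARITY_MS)
            q = MIN_GRANULARITY_MS;                        /* never below the min granularity */
        printf("thread %d: nice=%3d  quantum=%.2f ms\n", i, nice[i], q);
    }
    return 0;
}
```

compile with -lm; the nice -5 thread ends up with roughly 3x the slice of the nice 0 thread (since 1.25^5 ≈ 3.05).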
lecture 13: deadlock
definition
- deadlock = cyclic waiting for resources
- starvation waits indefinitely but CAN end
- deadlock needs external intervention to be solved
- a lock pattern can exhibit non-deterministic deadlock (whether it deadlocks depends on the interleaving)
- a system is subject to deadlock if it has at least one execution pattern that can lead to deadlock
- threads block waiting for resources
- locks, terminals, printers, cd drives, memory
- threads often block waiting for other threads
- pipes, sockets
deadlock example
- deadlock implies starvation!
- starvation does not mean deadlock
- deadlock requires external intervention to fix, while starvation may eventually resolve on its own
four requirements
- mutual exclusion and bounded resources
- only one thread at a time can use a resource
- hold and wait
- thread holding at least one resource is waiting to acquire additional resources held by other threads
- no preemption
- resources are released only voluntarily by the thread holding the resource, after thread is finished with it
- circular wait
- there exists a set $\{T_1, \dots, T_n\}$ of waiting threads, where $T_1$ waits on $T_2$, $T_2$ waits on $T_3$, …, $T_n$ waits on $T_1$
resource allocation graph
set of threads: $T_1, T_2, …, T_n$
resource types: $R_1, R_2, …, R_m$ with $W_i$ instances each
each thread uses:
Request(), Use(), or Release()
deadlock detection algorithm
- let [X] represent an m-ary vector of non-negative integers (quantities of resources of each type)
- [freeResources] = current free resources each type
- [request_x] = current requests from thread X
- [alloc_x] = current resources held by thread X
- see if tasks can eventually terminate on their own
- threads left in UNFINISHED = deadlocked
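sketch of that detection loop in C — the example numbers are made up and set up T0 and T1 waiting on each other, with T2 fine:

```c
/* deadlock detection sketch using the [freeResources]/[request_x]/[alloc_x] idea —
 * the example numbers set up T0 and T1 waiting on each other, with T2 fine */
#include <stdbool.h>
#include <stdio.h>

#define T 3   /* threads */
#define M 2   /* resource types */

/* can thread x's current request be satisfied with what's free right now? */
static bool can_run(int request[T][M], int freeRes[M], int x) {
    for (int r = 0; r < M; r++)
        if (request[x][r] > freeRes[r]) return false;
    return true;
}

int main(void) {
    int freeRes[M]    = {0, 0};
    int request[T][M] = {{0, 1}, {1, 0}, {0, 0}};
    int alloc[T][M]   = {{1, 0}, {0, 1}, {0, 0}};
    bool finished[T]  = {false, false, false};

    bool progress = true;
    while (progress) {                       /* keep "running" any thread that can finish */
        progress = false;
        for (int x = 0; x < T; x++) {
            if (finished[x] || !can_run(request, freeRes, x)) continue;
            for (int r = 0; r < M; r++)
                freeRes[r] += alloc[x][r];   /* pretend x finishes and releases everything */
            finished[x] = true;
            progress = true;
        }
    }
    for (int x = 0; x < T; x++)              /* anyone still UNFINISHED is deadlocked */
        if (!finished[x]) printf("thread %d is deadlocked\n", x);
    return 0;
}
```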
deadlock solutions
- deadlock prevention = write code in a way that isn’t prone to deadlock
- best way to go!
- deadlock recovery = let deadlock happen and then figure out how to recover from it
- deadlock avoidance = dynamically delay resource requests so deadlock doesn’t happen
- expensive cuz you have to track all your resources
- deadlock denial = ignore the possibility of deadlock
- means deadlock prevention has to come from somewhere else (e.g., the programmer), since the OS won’t help
deadlock prevention
conditions for success!
- provide sufficient resources (mutual exclusion and bounded resources)
- virtually infinite memory
- abort requests or acquire requests atomically (hold and wait)
- request resources atomically
- preempt threads (no preemption)
- force threads to give up resources
- use database aborts (all of the actions are undone and the transaction is retried) or just retry at a new time
- order resources and always acquire resources in the same way (circular wait)
- force threads to always request resources in the same order, like x.Acquire() THEN y.Acquire()
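tiny pthread example of that ordering rule (the locks x and y are placeholders for any two resources; compile with -pthread):

```c
/* circular-wait prevention: both threads take the locks in the same fixed order
 * (x THEN y), so neither can end up holding one lock while waiting for the other */
#include <pthread.h>

pthread_mutex_t x = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t y = PTHREAD_MUTEX_INITIALIZER;

void *thread_a(void *arg) {
    (void)arg;
    pthread_mutex_lock(&x);     /* x first... */
    pthread_mutex_lock(&y);     /* ...then y */
    /* use both resources */
    pthread_mutex_unlock(&y);
    pthread_mutex_unlock(&x);
    return NULL;
}

void *thread_b(void *arg) {
    (void)arg;
    pthread_mutex_lock(&x);     /* same order here — never y then x */
    pthread_mutex_lock(&y);
    /* use both resources */
    pthread_mutex_unlock(&y);
    pthread_mutex_unlock(&x);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, thread_a, NULL);
    pthread_create(&b, NULL, thread_b, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}
```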
deadlock avoidance
three states
- safe state = system can delay resource acquisition to prevent deadlock
- unsafe state = no deadlock yet but it’s possible to request resources in a way that will unavoidably lead to deadlock
- deadlocked state = there exists a current deadlock
banker’s algorithm
- ensures the system never enters an unsafe state by evaluating each request and granting it only if some ordering of threads is still safe afterwards
- HOW IT WORKS
- pretend each request is granted, then run the deadlock detection algo assuming each thread asks for its maximum (worst-case) needs; grant only if every thread can still finish
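sketch of that safety check — it reuses the detection loop idea, but on worst-case (maximum) remaining needs; the max_need name and the numbers are made up:

```c
/* banker's check sketch: would some ordering of threads still be able to finish,
 * assuming each asks for its worst-case (maximum) needs? names/numbers are made up */
#include <stdbool.h>
#include <stdio.h>

#define T 3   /* threads */
#define M 2   /* resource types */

bool is_safe(int freeRes[M], int max_need[T][M], int alloc[T][M]) {
    bool finished[T] = {false};
    bool progress = true;
    while (progress) {
        progress = false;
        for (int x = 0; x < T; x++) {
            if (finished[x]) continue;
            bool ok = true;
            for (int r = 0; r < M; r++)           /* can x's remaining max needs be met? */
                if (max_need[x][r] - alloc[x][r] > freeRes[r]) ok = false;
            if (!ok) continue;
            for (int r = 0; r < M; r++)
                freeRes[r] += alloc[x][r];        /* pretend x runs to completion */
            finished[x] = true;
            progress = true;
        }
    }
    for (int x = 0; x < T; x++)
        if (!finished[x]) return false;           /* someone can't finish -> unsafe */
    return true;
}

int main(void) {
    int freeRes[M]     = {1, 1};
    int max_need[T][M] = {{2, 1}, {1, 2}, {1, 1}};
    int alloc[T][M]    = {{1, 0}, {0, 1}, {0, 0}};
    printf(is_safe(freeRes, max_need, alloc) ? "safe\n" : "unsafe\n");
    return 0;
}
```

to evaluate a request: tentatively move the requested resources from freeRes into that thread’s alloc, re-run is_safe(), and grant the request only if it still returns true (otherwise delay it).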