[OSC] Day 2: Process

Process


Source: Tanenbaum, Sections 2.1, 2.2, 2.4
Because there is a lot of content on processes, it is divided into four parts.

Process 1


definition

process: a process is a running instance of a program, including the current values of the program counter, registers, and variables.
Process Control Block (PCB): contains all information necessary to administer the process and is essential for context switching in multiprogrammed systems.
A process’s memory image contains:
- the program code (which could be shared between multiple processes running the same code)
- a data segment, stack, and heap
The data segment grows upward and the stack grows downward; between them is a gap of unused address space (see the sketch below).
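A quick way to see this layout (a minimal sketch, not from the lecture; exact addresses are platform-dependent and randomised by ASLR, so treat the output as illustrative):

```c
#include <stdio.h>
#include <stdlib.h>

int global_counter = 42;                     /* data segment */

int main(void) {
    int local = 0;                           /* stack */
    int *dynamic = malloc(sizeof *dynamic);  /* heap */

    printf("code  (main):   %p\n", (void *)main);
    printf("data  (global): %p\n", (void *)&global_counter);
    printf("heap  (malloc): %p\n", (void *)dynamic);
    printf("stack (local):  %p\n", (void *)&local);

    free(dynamic);
    return 0;
}
```

On a typical Linux system the printed addresses increase from code to data to heap, with the stack far above them, matching the picture described here.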

Process states and transitions

How can we describe each state?
A new process has just been created (it has a PCB) and is waiting to be admitted (it may not yet be in memory)
A ready process is waiting for the CPU to become available (e.g., it was just unblocked, a timer interrupt preempted it, or another process currently holds the CPU)
A running process owns the CPU
A blocked process cannot continue, e.g., it is waiting for I/O
A terminated process is no longer executable (its PCB may be temporarily preserved)
How can we describe each transition? (A code sketch of the legal transitions follows the list.)
new -> ready: admit the process and commit to executing it
ready -> running: the process is selected by the process scheduler
running -> blocked: e.g., the process waits for input or makes a blocking system call
blocked -> ready: the awaited event happens, e.g., an I/O operation finishes
running -> ready: the process is preempted, e.g., by a timer interrupt or because its time slice expired
running -> terminated: the process has finished, e.g., the program ended or an exception was encountered
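A hedged sketch of the five-state model as a transition predicate (the state names follow the diagram; the function name and encoding are my own):

```c
#include <stdbool.h>
#include <stdio.h>

typedef enum { NEW, READY, RUNNING, BLOCKED, TERMINATED } proc_state;

/* Returns true if (from -> to) is one of the arcs listed above. */
bool legal_transition(proc_state from, proc_state to) {
    switch (from) {
    case NEW:     return to == READY;                    /* admit */
    case READY:   return to == RUNNING;                  /* dispatch */
    case RUNNING: return to == READY || to == BLOCKED    /* preempt / wait */
                      || to == TERMINATED;               /* exit */
    case BLOCKED: return to == READY;                    /* event occurs */
    default:      return false;
    }
}

int main(void) {
    printf("%d\n", legal_transition(RUNNING, BLOCKED)); /* 1: blocking call */
    printf("%d\n", legal_transition(BLOCKED, RUNNING)); /* 0: must go via ready */
    return 0;
}
```

Note the second check: a blocked process never goes straight to running; it must pass through the ready queue first.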
The operating system maintains queues of PCBs, e.g., a ready queue and a blocked queue per event or device.

Context Switching (multiprogramming)

When a context switch takes place, the system saves the state of the old process and loads the state of the new process (this creates overhead)
When a process is saved, its PCB is updated; when it is restarted, its PCB is read
A trade-off exists between the length of the time slice and the context switch time.
For example, assume 99 waiting processes and a context switch time of 1 ms:
Short time slices result in good response times but low effective “utilisation”: with a 1 ms slice, the last process waits 99 * (1 + 1) = 198 ms, but half of each cycle is switching overhead
Long time slices result in better effective “utilisation” but poor response times: with a 100 ms slice, the last process waits 99 * (100 + 1) = 9999 ms
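A small program reproducing that arithmetic, assuming the worst-case wait for the last of 99 queued processes is 99 * (slice + switch) and effective utilisation is slice / (slice + switch):

```c
#include <stdio.h>

int main(void) {
    const int waiting = 99, switch_ms = 1;  /* the lecture's numbers */
    int slices[] = { 1, 100 };              /* short vs long slice, ms */

    for (int i = 0; i < 2; i++) {
        int slice = slices[i];
        int response = waiting * (slice + switch_ms);
        double utilisation = (double)slice / (slice + switch_ms);
        printf("slice %3d ms: response %5d ms, utilisation %.0f%%\n",
               slice, response, utilisation * 100.0);
    }
    return 0;
}
```

This prints 198 ms at 50% utilisation for the short slice and 9999 ms at 99% for the long one, which is exactly the trade-off above.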

PCB

A process control block (PCB) contains three types of attributes:
  • Process identification (PID, UID, parent PID)
  • Process control information (process state, scheduling information, etc.)
  • Process state information (user registers, program counter, stack pointer, program status word, memory-management information, files, etc.)
PCBs are kernel data structures, i.e., they are protected and only accessible in kernel mode!
  • Allowing user applications to access them directly could compromise their integrity
  • The operating system manages them on the user’s behalf through system calls (e.g., to set process priority, as sketched below)
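As an illustration of “through system calls”, a minimal sketch using the POSIX nice-value interface getpriority()/setpriority(); these calls are real, but the +5 adjustment is arbitrary:

```c
#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    int before = getpriority(PRIO_PROCESS, 0);  /* 0 = calling process */

    /* Raise the nice value, i.e. lower our priority; only privileged
     * processes may lower the nice value (raise priority). */
    if (setpriority(PRIO_PROCESS, 0, before + 5) != 0)
        perror("setpriority");

    printf("nice value: %d -> %d\n", before, getpriority(PRIO_PROCESS, 0));
    return 0;
}
```

The kernel updates the scheduling fields of the PCB on our behalf; the PCB itself is never exposed to user space.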

Switching processes


  1. Save process state (program counter, registers)
  2. Update the PCB (running -> ready/blocked)
  3. Move the PCB to the appropriate queue (ready/blocked)
  4. Run the scheduler and select a new process
  5. Update the new process’s PCB to the running state
  6. Update the memory management unit (MMU)
  7. Restore the new process’s state
A context switch between processes invalidates the MMU’s cached translations (the TLB must be flushed)!
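A toy, single-file sketch of steps 1-7 (all names here are illustrative, not from any real kernel; a real switch saves and restores registers in assembly and touches the MMU in privileged mode):

```c
#include <stdio.h>

typedef enum { READY, RUNNING, BLOCKED } state_t;

typedef struct pcb {
    int pid;
    unsigned long pc, sp;   /* stand-ins for the saved register state */
    state_t state;
} pcb_t;

static pcb_t procs[2] = { { 1, 0, 0, RUNNING }, { 2, 0, 0, READY } };

static void context_switch(pcb_t *old, pcb_t *next, state_t old_new_state) {
    /* 1. save process state (pretend the registers were saved here)     */
    /* 2. update the old PCB                                             */
    old->state = old_new_state;
    /* 3. move the old PCB to the ready/blocked queue (omitted)          */
    /* 4. the scheduler has already selected `next`                      */
    /* 5. mark the new PCB as running                                    */
    next->state = RUNNING;
    /* 6. update the MMU / flush the TLB (omitted: privileged operation) */
    /* 7. restore the new process's registers and jump to its PC (omitted) */
    printf("switched from pid %d to pid %d\n", old->pid, next->pid);
}

int main(void) {
    context_switch(&procs[0], &procs[1], READY);  /* e.g. timer preemption */
    return 0;
}
```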

Process creation in Linux

After fork(), a return value of 0 indicates the child process; the parent receives the child’s PID (see the example below).
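A minimal example matching this note:

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();            /* duplicate the calling process */

    if (pid < 0) {
        perror("fork");            /* creation failed */
    } else if (pid == 0) {
        printf("child:  pid %d\n", getpid());
        _exit(0);                  /* child exits */
    } else {
        waitpid(pid, NULL, 0);     /* parent waits for the child */
        printf("parent: child was %d\n", pid);
    }
    return 0;
}
```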

Process 2


Introduction to process scheduling

The process scheduler decides which process to run next, using a scheduling algorithm to do so.
Classification by Time Horizon:
- Long term: applies to new processes and controls the degree of multiprogramming by deciding which processes to admit to the system
- Medium term: controls swapping and the degree of multiprogramming
- Short term: decides which process to run next (usually invoked in response to clock interrupts, I/O interrupts, or blocking system calls)
Classification by Approach:
Non-preemptive: processes are only interrupted voluntarily
Preemptive: processes can be interrupted forcefully as well as voluntarily

performance assessment criteria

User-oriented criteria:
  • Response time: minimise the time between creating a job and its first execution
  • Turnaround time: minimise the time between creating a job and finishing it
System-oriented criteria:
  • Throughput: maximise the number of jobs processed per hour
  • Fairness: are processing power and waiting time equally distributed? Does starvation occur?
Evaluation criteria can conflict: e.g., reducing response time requires more context switches, which may worsen throughput and turnaround time.

Scheduling Algorithms

First come first served

concept: a non-preemptive algorithm that operates as a strict queueing mechanism and schedules processes in the same order in which they were added to the queue.
Advantages: positional fairness and easy to implement
Disadvantages: 1. favours long processes over short ones 2. could compromise resource utilisation

Shortest job first

concept: a non-preemptive algorithm that starts processes in order of ascending processing time, using a provided/known estimate of each process’s running time.
Advantage: always results in the optimal turnaround time (illustrated below)
Disadvantages: 1. starvation might occur 2. fairness and predictability are compromised 3. processing times have to be known beforehand
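A small illustration of the turnaround claim, with made-up burst times (24, 3 and 3 ms, all jobs arriving at time 0): FCFS in that order gives an average turnaround of (24 + 27 + 30) / 3 = 27, while shortest-first gives (3 + 6 + 30) / 3 = 13.

```c
#include <stdio.h>

/* Jobs all arrive at time 0, so turnaround = finish time. */
static double avg_turnaround(const int bursts[], int n) {
    int finish = 0, total = 0;
    for (int i = 0; i < n; i++) {
        finish += bursts[i];   /* job i finishes when its burst ends */
        total  += finish;
    }
    return (double)total / n;
}

int main(void) {
    int fcfs[] = { 24, 3, 3 };  /* arrival order */
    int sjf[]  = { 3, 3, 24 };  /* shortest first */
    printf("FCFS avg turnaround: %.1f\n", avg_turnaround(fcfs, 3)); /* 27.0 */
    printf("SJF  avg turnaround: %.1f\n", avg_turnaround(sjf, 3));  /* 13.0 */
    return 0;
}
```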

Round Robin

concept: a preemptive version of FCFS that forces a context switch at the end of each time slice
Advantages: 1. improved response times 2. effective for time-sharing systems
Disadvantages: 1. increased context switching and thus overhead 2. favours CPU-bound processes over I/O-bound ones 3. can reduce to FCFS (if the time slice is longer than typical CPU bursts)
A CPU-bound process spends most of its time computing; an I/O-bound process spends most of its time waiting for I/O.

Priority queues

concept: a preemptive algorithm that schedules processes by priority (high -> low)
The process priority is stored in the PCB
Advantages: can prioritise I/O-bound jobs
Disadvantages: low-priority processes may suffer from starvation (with static priorities)

Process 3 (Threads)


Threads vs. Processes

definition

A process consists of two fundamental units:

  • Resources: all related resources are grouped together (a logical address space containing the process image, plus files, I/O devices, ...)
  • Execution trace (thread): an entity that gets executed
A process can share its resources between multiple interleaved execution traces, i.e., multiple threads running in the same resource environment

Threads from an OS perspective


Every thread has its own execution context (e.g., program counter, stack, registers)
All threads have access to the process’s shared resources
  • e.g., files: if one thread opens a file, all threads of the same process can access it
  • global variables, memory, etc.
Some CPUs have direct hardware support for multi-threading.
Similar to processes, threads have:
  • states and transitions (new, ready, running, blocked, terminated)
  • a thread control block (TCB)
(Figure: resources held per process vs. per thread)

Inter-thread communication is easier/faster than inter-process communication (threads share memory by default)
No protection boundaries are required within the address space (threads cooperate, belong to the same user, and have a common goal)
Synchronisation has to be considered carefully (see the sketch below)!
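A classic sketch of why synchronisation matters: two POSIX threads increment a shared global counter, which is exactly the kind of shared resource described above. Without the mutex, updates are lost; with it, the result is always 2000000. Compile with `gcc demo.c -pthread`.

```c
#include <pthread.h>
#include <stdio.h>

static long counter = 0;                 /* shared between all threads */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&lock);       /* without this, updates get lost */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);  /* 2000000 with the mutex */
    return 0;
}
```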

OS implementation of Threads

  • User threads
  • Kernel threads
  • Hybrid implementation

User threads (many-to-one)

Thread management (creating, destroying, scheduling, thread control block manipulation) is carried out in user space with the help of a user-level library
The process maintains a thread table, managed by the runtime system without the kernel’s knowledge
User threads can therefore be implemented on an OS that does not support multi-threading
Advantages:
  • Threads are managed in user space, so no mode switches are required
  • Full control over the thread scheduler
  • OS independent (threads can run on an OS that does not support them)
  • The runtime system can switch threads that block locally (e.g., on another thread) in user space
Disadvantages:
  • Blocking system calls suspend the entire process (all user threads are mapped onto a single kernel-level process)
  • Page faults result in blocking the whole process
  • No true parallelism (the process is scheduled on a single CPU)
  • Clock interrupts do not exist inside the process (i.e., user threads are non-preemptive)

Kernel threads (one-to-one), which Windows and Linux use


The kernel manages the threads; user applications access threading facilities through an API of system calls
  • The thread table is kept in the kernel and contains thread control blocks (a subset of the PCB)
  • If a thread blocks, the kernel can choose a thread from the same or a different process (<-> user threads)
Advantages:
  • True parallelism can be achieved
  • No non-blocking system calls are needed
  • No run-time system is needed
Disadvantage: frequent mode switches take place, resulting in lower performance
Comparison:
User threads: the runtime system always picks the next thread from within its own process
Kernel threads: the kernel can pick the next thread from the same or another process, but creating and destroying kernel threads is comparatively expensive (it requires system calls)

Hybrid Implementations (many-to-many)

User threads are multiplexed onto kernel threads
The user application sees user threads and creates/schedules them (an “unrestricted” number)


Process 4


multi-level feedback queues

moving beyond priority queues

Priority queues are usually implemented using multiple queues. Different scheduling algorithms can be used for the individual queues (e.g., round robin, SJF, FCFS).
Feedback queues allow priorities to change dynamically, i.e., jobs can move between queues:

  • Move to a lower-priority queue if too much CPU time is used (prioritises I/O-bound and interactive processes)
  • Move to a higher-priority queue to prevent starvation and avoid priority inversion
What is “priority inversion”?

For example, low-priority process A is admitted first and acquires resource X.
Then process B, which has a higher priority, arrives and also requests X; because A holds X, B blocks.
Then process C arrives, with a priority above A’s but needing no resource. C keeps running, so A never gets the CPU to release X. B is blocked indefinitely even though it has the highest priority: the resource it needs is held by the low-priority A.
If A is temporarily given a higher priority, the inversion is resolved.

Windows 7 (multi-level feedback queues)

An interactive system using a preemptive scheduler with dynamic priority levels
Two priority classes, each with 16 priority levels, exist:

  • “Real time” processes/threads have a fixed priority level (these may be system/kernel processes)
  • “Variable” processes/threads can have their priorities boosted temporarily (interactive, I/O-bound processes, e.g. those waiting on the keyboard, receive a larger boost)
A round robin algorithm is used within each queue

A process has a base priority level (e.g., 4). The threads belonging to the process get a base priority within ±2 of the process’s base level. A thread’s dynamic priority (the temporary boost) can rise to the top of the variable class (15), but never drops below the thread’s base priority.

Scheduling in Linux

Linux distinguishes between two types of tasks for scheduling:
  • Real-time tasks (to be POSIX compliant), divided into:
    1. Real-time FIFO tasks
    2. Real-time round robin tasks
  • Time-sharing tasks using a preemptive approach (similar to “variable” in Windows)
Real-time tasks (often kernel tasks) have higher priority than time-sharing tasks; their priorities range from 0 to 99 (0 is highest, 99 is lowest). Time-sharing tasks use the range 100 to 140.

real time tasks

Real-time FIFO tasks have the highest priority and are scheduled FCFS, with preemption if a higher-priority job shows up.
Real-time round robin tasks are preemptible by clock interrupts and have a time slice associated with them.

Time sharing tasks

The Completely Fair Scheduler (CFS) divides the CPU time between all processes.
If all N processes have the same priority:
They are allocated a “time slice” equal to 1/N of the available CPU time
I.e., if N equals 5, every process receives 20% of the processor’s time (isn’t that completely fair?)
The length of the time slice and the “available CPU time” are based on the targeted latency (⇒ every process should run at least once during this interval)
If N is very large, the context switch time becomes dominant; hence a lower bound on the “time slice” is imposed by the minimum granularity
A process’s time slice can be no less than the minimum granularity (response time deteriorates as N grows)
A weighting scheme is used to take different priorities into account.
If processes have different priorities:
  • Every process i is allocated a weight w_i that reflects its priority
  • The “time slice” allocated to process i is then proportional to w_i / Σ_j w_j (see the sketch below)
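A sketch of that formula with made-up numbers (targeted latency 20 ms, minimum granularity 1 ms, weights 1, 2 and 5; the real CFS derives weights from nice values via a lookup table, omitted here):

```c
#include <stdio.h>

int main(void) {
    const double latency_ms = 20.0, min_granularity_ms = 1.0;
    double w[] = { 1.0, 2.0, 5.0 };   /* weights reflecting priority */
    int n = 3;
    double sum = 0.0;

    for (int i = 0; i < n; i++) sum += w[i];
    for (int i = 0; i < n; i++) {
        /* slice_i = targeted latency * w_i / sum_j w_j */
        double slice = latency_ms * w[i] / sum;
        if (slice < min_granularity_ms)  /* lower bound on the slice */
            slice = min_granularity_ms;
        printf("process %d: slice %.1f ms\n", i, slice);
    }
    return 0;
}
```

With these weights the slices come out as 2.5 ms, 5 ms and 12.5 ms; they always sum to the targeted latency unless the minimum granularity kicks in.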

Multi-processor scheduling

On a single-processor machine, scheduling is one-dimensional: we only ask which process (thread) to run next.
On a multi-processor/multi-core machine, we must decide:
  • which process (thread) to run where (i.e., on which CPU?)
  • which process (thread) to run when?
There are two kinds of scheduling queues:
1. Shared queues
2. Private queues

Shared Queues

A single queue, or one multi-level queue, shared between all CPUs
Advantages: automatic load balancing
Disadvantages:

  • Contention for the queue (synchronisation/locking is needed)
  • “All CPUs are equal, but some are more equal than others”: it does not account for processor affinity
What is “processor affinity”?
A thread should preferably run on the CPU it ran on previously, because that CPU’s cache and TLB may still hold the thread’s data (cache hits, TLB hits).
Without affinity, the cached state becomes useless when the thread moves to a different CPU: cache contents and TLB entries must be rebuilt. Linux exposes affinity through a system call, sketched below.
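On Linux, affinity can be set explicitly with the sched_setaffinity() system call (a real, Linux-specific API requiring _GNU_SOURCE); this sketch pins the calling process to CPU 0:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);                   /* allow CPU 0 only */

    /* pid 0 means "the calling process" */
    if (sched_setaffinity(0, sizeof mask, &mask) != 0)
        perror("sched_setaffinity");
    else
        printf("pinned to CPU 0\n");
    return 0;
}
```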

Private Queues

Each process/thread is assigned to a queue private to an individual CPU
Advantages:
  • CPU affinity is automatically satisfied
  • Contention for shared queue is minimised
Disadvantages: less load balancing
Push and pull migration of threads between CPUs is still possible for load balancing (at the cost of losing cache and TLB affinity)

Thread types

Related: multiple threads that communicate with one another and ideally run together (e.g., a parallel search algorithm)
Unrelated: threads that are independent, e.g., started by different users running different programs
A big problem arises when scheduling related threads:

Process A has cooperating threads A0 and A1; process B has cooperating threads B0 and B1.
If A0 and B1 happen to be scheduled in the same slot on two CPUs, a request sent by A0 cannot be answered by A1 until A1’s next time slice, e.g., 100 ms later: pure latency.
So how do we get cooperating threads to run at the same time across multiple CPUs, as much as possible?

Space sharing

N threads are allocated to N dedicated CPUs
N threads are kept waiting until N CPUs are available
Non-preemptive, i.e. blocking calls result in idle CPUs (less context switching overhead but results in CPU idle time)
The number N can be dynamically adjusted to match processor capacity; however, some CPU idle time remains

Gang scheduling

Time slices are synchronised across CPUs, and the scheduler groups threads together so that, as much as possible, all threads of a process run simultaneously
