Overview
The goal of this lab is to explore process scheduling in Linux and to examine the system calls that allow users to modify scheduling.
Activities
- Work your way through the following exercises, demonstrating your knowledge of the material by answering the numbered questions.
- Submit a detailed lab report including the answers to the numbered questions. If you modify the Sample Programs to answer a question, include the updated code in the writeup.
- Several questions ask about timing. Don't forget that the shell has a built-in time command.
Overview of Scheduling
Study the sched(7) man page (man 7 sched). Focus on the sections titled API summary, Scheduling policies, SCHED_FIFO, SCHED_RR, SCHED_OTHER (you do not need to read the part about nice values), and SCHED_IDLE.
(6 pts) How many different scheduling policies are available? List all of them.
(1 pt) How many different scheduling priorities are available?
(2 pts) Explain the difference between a policy and a priority. Which is more important for determining when a process will run?
(1 pt) In what ways is this setup similar to a multilevel queue as discussed in class?
(2 pts) Explain the main difference between SCHED_FIFO and SCHED_RR. (Hint: they are not very different.) How do these algorithms compare to FIFO and Round-robin as discussed in class?
Note: When people discuss "the Linux scheduler", they are often referring to the CFS (Completely Fair Scheduler) that controls scheduling in SCHED_OTHER. The CFS is not too complicated and is worth reading about if you are curious, but we will not go into the details in this lab.
System Calls Affecting Scheduling
Next, we are going to experiment with some of the system calls listed in sched(7).
TIP
The man pages of the system calls used in the following section are specific to Linux, and the predefined constants that you find there are NOT available by default, even when you include sched.h. To make them available, you must add the following line before your #include statements:
#define _GNU_SOURCE
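For example (a minimal sketch, not part of the required lab code), this tiny program compiles on Linux with glibc only because of its first line; without it, constants such as SCHED_IDLE and SCHED_BATCH are not defined:

#define _GNU_SOURCE   /* must come before any #include */
#include <sched.h>
#include <stdio.h>

int main(void)
{
    /* These constants are hidden unless _GNU_SOURCE is defined. */
    printf("SCHED_IDLE = %d, SCHED_BATCH = %d\n", SCHED_IDLE, SCHED_BATCH);
    return 0;
}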
WARNING
To demonstrate what can/can't be done, some of the system calls below may return an error. Recall the use of the function perror() and the global variable errno from a previous lab exercise.
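As a reminder, the pattern is to check the system call's return value and, on failure, call perror() to print a message based on errno. A minimal sketch (sched_getparam() is used purely as an illustration):

#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    struct sched_param sp;

    /* On failure, the sched_* calls return -1 and set errno. */
    if (sched_getparam(0, &sp) == -1) {    /* pid 0 means "this process" */
        perror("sched_getparam");          /* prints a message describing errno */
        exit(EXIT_FAILURE);
    }
    printf("current scheduling priority: %d\n", sp.sched_priority);
    return 0;
}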
(1 pt) Write a simple program that checks its own scheduling policy. Based on this, what is the default scheduling policy?
The scheduling policy you find will be a number – I am interested in the actual name corresponding to the number. Include code showing how you determined the number and explain where you looked to determine the corresponding name.
(1 pt) Write a program that tries to change its own scheduler to round-robin without changing the priority. What happens? Why? (Hint: Using perror() throughout your code may be helpful here.)
(2 pts) Write a program that tries to change its own scheduler to round-robin with a higher priority. Which system call(s) could you use to determine the priority necessary for round-robin scheduling? What happens? Why? (Hint: Using perror() throughout your code may be helpful here.)
Processor Affinity
When multiple CPUs are available on your system, the OS may dispatch your process(es) to any available CPUs. But, the operating system may also give users the option of requesting CPU affinity — specifying a specific processor or subset of processors that a thread should run on. One reason this may be helpful is to make better use of the cache. However, a process will not necessarily run faster simply because it has affinity for a particular CPU.
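As a quick illustration (a sketch only; this reads, rather than sets, the affinity mask), a process can ask the kernel which CPUs it is currently allowed to run on:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    cpu_set_t mask;

    /* pid 0 refers to the calling process. */
    if (sched_getaffinity(0, sizeof(mask), &mask) == -1) {
        perror("sched_getaffinity");
        exit(1);
    }

    printf("this process may run on %d CPU(s):", CPU_COUNT(&mask));
    for (int i = 0; i < CPU_SETSIZE; i++) {
        if (CPU_ISSET(i, &mask)) {
            printf(" %d", i);
        }
    }
    printf("\n");
    return 0;
}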
We can use system calls to observe CPU affinity in action. Sample Program 1 is designed to accept a single Boolean argument (0 or 1) determining whether to set CPU affinity. The program then forks a number of children and finally performs some useless work to keep the processor(s) busy long enough for us to observe interesting trends.
- Determine the system call that allows you to set the affinity of a process to a specific CPU. Modify the sample program below to incorporate the correct call. Be sure to include the line you used in your writeup.
Prior to running the sample program, check how many CPUs are available on your system by running one of the following commands:
lscpu                    (Linux, e.g., on EOS)
sysctl hw.ncpu           (macOS)
sysctl hw.physicalcpu    (macOS)
Sample Program 1
WARNING
The following sample code is not a good example to follow in production code: the parent never calls wait(). The call is intentionally omitted for brevity.
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define NLOOPS 500000000
#define NUM_PROCS 8
int main(int argc, char *argv[])
{
    cpu_set_t set;
    int joinedCPU = 0;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s {0|1}\n", argv[0]);
        exit(1);
    }

    /* Build a CPU set containing only CPU joinedCPU (CPU 0). */
    CPU_ZERO(&set);
    CPU_SET(joinedCPU, &set);

    if (atoi(argv[1]) != 0) {
        // ADD LINE HERE TO SET AFFINITY
    }

    /* Fork NUM_PROCS - 1 children; each child breaks out of the loop
       immediately, so only the parent keeps forking. */
    pid_t pid;
    for (int i = 0; i < NUM_PROCS - 1; i++) {
        if ((pid = fork()) < 0) {
            perror("Problem forking");
            exit(1);
        } else if (!pid) {
            break;
        }
    }

    long sum = 0;
    for (long j = 0; j < NLOOPS; j++) {
        sum += j; // meaningless work to keep the CPU busy
    }

    return 0;
}
(2 pts) Depending on whether or not affinity is set, is the program faster, slower, or about the same? Explain why you observed the performance that you did.
(1 pt) Based on your observations, is processor affinity inherited by forked children? How can you tell by observing the program run? You can verify in the man pages if you are not sure.
Yielding
Determine the purpose of the system call sched_yield — you can review sched(7), sched_yield(2), or any other source you prefer. Then, consider Sample Program 2 below. Once again, the primary purpose of the program is to perform meaningless work that lets us observe scheduling behavior.
Sample Program 2
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sched.h>
#define BIG_NUM 50000000
int main(int argc, char** argv) {
    if (argc < 2) {
        fprintf(stderr, "Program requires an integer argument.\n");
        exit(1);
    }

    long sum = 0;
    int argint = atoi(argv[1]);

    /* Meaningless work; yield the CPU on every iteration if requested. */
    for (long i = 0; i < BIG_NUM; i++) {
        if (argint > 0) {
            sched_yield();
        }
        sum += i;
    }

    return 0;
}
- Run the program with an argument that does not cause the yield to occur. Then, run it again with a different argument that causes the yield to occur. Try to ensure that the system is otherwise mostly quiet, i.e., that the CPU is not overloaded before you run the program.
- (1 pt) You should observe a fairly drastic change in behavior. What is going on, and why might this behavior occur even on a quiet system?
Experimenting with Policies
In a previous part of the lab, we determined that not all scheduling policies are available to us as non-root users on EOS. For instance, SCHED_RR and SCHED_FIFO can be used only from an executable with root permission. However, three policies can be used by non-root users: SCHED_OTHER, SCHED_IDLE, and SCHED_BATCH. We will concern ourselves with just two of these policies for this lab: SCHED_OTHER and SCHED_IDLE.
One of these two policies should be more deferential than the other. That is, if many processes with these scheduling policies are competing for the CPU, processes using one policy will receive more system attention than processes using the other.
- (5 pts) Design and run an experiment to demonstrate which of the two policies (SCHED_OTHER and SCHED_IDLE) takes precedence. It is not enough simply to look up the answer in the man pages — you must write code that confirms the expected behavior.
Your experiment can be heavily based on code in the previous parts of the lab. Overall, the experiment should focus on:
- keeping all the CPUs on your machine busy,
- running multiple processes with different scheduling policies, and
- timing the program runs (or using other methods to confirm your hypothesis).
You already have code that sets the scheduling policy, and you have an example of code that runs for a long time to allow us to observe the scheduling in action.
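For reference, the policy-setting step might look something like the sketch below, which assumes you want the calling process to use SCHED_IDLE (the non-realtime policies, including SCHED_OTHER and SCHED_IDLE, must be requested with a priority of 0):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Non-realtime policies must be requested with priority 0. */
    struct sched_param sp = { .sched_priority = 0 };

    /* Switch the calling process (pid 0) to SCHED_IDLE. */
    if (sched_setscheduler(0, SCHED_IDLE, &sp) == -1) {
        perror("sched_setscheduler");
        exit(1);
    }

    printf("policy is now %d\n", sched_getscheduler(0));

    /* ... from here, fork children and run the timed busy-work loop ... */
    return 0;
}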
IMPORTANT
Run your experiment on the EOS machine (not on your Mac or in Windows/WSL2).
There are two aspects to consider when measuring the time of a task:
- the time of completion (when did it finish? e.g., at 9 pm), or
- the elapsed time needed to complete (how long did it take? e.g., 9 hours).
Be sure your experiment is designed to measure the right one.
The above experiment works only if processes with both scheduling policies are running at the same time. This calls for a parent process that spawns many children. Otherwise, the deferential processes will all share well with one another and you will not see the CPU-hogging behavior from the other processes.
Be aware that you may be running your program on a system with multiple CPU cores.
Deliverables
Your writeup should include your source code, a description of how you ran the program(s), and any relevant screenshots. Your description should be clear enough that I can repeat your experiment if I choose to do so.
Hints:
- A scheduling policy will not come into play unless you have enough work to keep all of your processing cores busy.
- If your program runs for long enough, you can use top to observe CPU occupancy/activity. A process using the more deferential policy should complete much later than a process using the less deferential policy, as long as both processes are running at the same time.