Overview
The goal of this lab is to explore process scheduling in Linux and to examine the system calls that allow users to modify scheduling.
Activities
- Work your way through the following exercises, demonstrating your knowledge of the material by answering the numbered questions.
- Submit a detailed lab report including the answers to the numbered questions. If you modify the Sample Programs to answer a question, include the updated code in the writeup.
- Several questions ask about timing. Don't forget that the shell has a built-in time command.
Overview of Scheduling
Study the sched(7) man page (man 7 sched). Focus on the sections titled API summary, Scheduling policies, SCHED_FIFO, SCHED_RR, SCHED_OTHER (you do not need to read the part about nice values), and SCHED_IDLE.
(6 pts) How many different scheduling policies are available? List all of them.
(1 pt) How many different scheduling priorities are available?
(2 pts) Explain the difference between a policy and a priority. Which is more important for determining when a process will run?
(1 pt) In what ways is this setup similar to a multilevel queue as discussed in class?
(2 pts) Explain the main difference between SCHED_FIFO and SCHED_RR. (Hint: they are not very different.) How do these algorithms compare to FIFO and Round-robin as discussed in class?
Note: When people discuss "the Linux scheduler", they are often referring to the CFS (Completely Fair Scheduler) that controls scheduling in SCHED_OTHER. The CFS is not too complicated and is worth reading about if you are curious, but we will not go into the details in this lab.
System Calls Affecting Scheduling
Next, we are going to experiment with some of the system calls listed in sched(7).
TIP
The man pages of the system calls used in the following section are specific to Linux, and the predefined constants that you find there are NOT available by default, even when you include sched.h. To make them available, you must add the following line before your #include statements:
#define _GNU_SOURCE
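For example (a minimal sketch, not part of the required lab code), this tiny program compiles on Linux with glibc only because of its first line; without it, constants such as SCHED_IDLE and SCHED_BATCH are not defined:

#define _GNU_SOURCE   /* must come before any #include */
#include <sched.h>
#include <stdio.h>

int main(void)
{
    /* These constants are hidden unless _GNU_SOURCE is defined. */
    printf("SCHED_IDLE = %d, SCHED_BATCH = %d\n", SCHED_IDLE, SCHED_BATCH);
    return 0;
}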
WARNING
To demonstrate what can/can't be done, some of the system calls below may return an error. Recall the use of the function perror() and the global variable errno from a previous lab exercise.
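As a reminder, the pattern is to check the system call's return value and, on failure, call perror() to print a message based on errno. A minimal sketch (sched_getparam() is used purely as an illustration):

#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    struct sched_param sp;

    /* On failure, the sched_* calls return -1 and set errno. */
    if (sched_getparam(0, &sp) == -1) {    /* pid 0 means "this process" */
        perror("sched_getparam");          /* prints a message describing errno */
        exit(EXIT_FAILURE);
    }
    printf("current scheduling priority: %d\n", sp.sched_priority);
    return 0;
}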
(1 pt) Write a simple program that checks its own scheduling policy. Based on this, what is the default scheduling policy?
The scheduling policy you find will be a number – I am interested in the actual name corresponding to the number. Include code showing how you determined the number and explain where you looked to determine the corresponding name.
(1 pt) Write a program that tries to change its own scheduler to round-robin without changing the priority. What happens? Why? (Hint: Using perror() throughout your code may be helpful here.)
(2 pts) Write a program that tries to change its own scheduler to round-robin with a higher priority. Which system call(s) could you use to determine the priority necessary for round-robin scheduling? What happens? Why? (Hint: Using perror() throughout your code may be helpful here.)
Processor Affinity
When multiple CPUs are available on your system, the OS may dispatch your process(es) to any available CPUs. But, the operating system may also give users the option of requesting CPU affinity — specifying a specific processor or subset of processors that a thread should run on. One reason this may be helpful is to make better use of the cache. However, a process will not necessarily run faster simply because it has affinity for a particular CPU.
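As a quick illustration (a sketch only; this reads, rather than sets, the affinity mask), a process can ask the kernel which CPUs it is currently allowed to run on:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    cpu_set_t mask;

    /* pid 0 refers to the calling process. */
    if (sched_getaffinity(0, sizeof(mask), &mask) == -1) {
        perror("sched_getaffinity");
        exit(1);
    }

    printf("this process may run on %d CPU(s):", CPU_COUNT(&mask));
    for (int i = 0; i < CPU_SETSIZE; i++) {
        if (CPU_ISSET(i, &mask)) {
            printf(" %d", i);
        }
    }
    printf("\n");
    return 0;
}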
We can use system calls to observe CPU affinity in action. Sample Program 1 is designed to accept a single Boolean argument (0 or 1) determining whether to set CPU affinity. The program then forks a number of children and finally performs some useless work to keep the processor(s) busy long enough for us to observe interesting trends.
- Determine the system call that allows you to set the affinity of a process to a specific CPU. Modify the sample program below to incorporate the correct call. Be sure to include the line you used in your writeup.
Prior to running the sample program, check how many CPUs are available on your system by running one of the following commands:
lscpu                    (Linux, e.g., on EOS)
sysctl hw.ncpu           (macOS)
sysctl hw.physicalcpu    (macOS)
Sample Program 1
WARNING
The following sample code is not a good example to follow in production code: the parent never calls wait(). The call is intentionally omitted for brevity.
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define NLOOPS 500000000
#define NUM_PROCS 8
int main(int argc, char *argv[])
{
    cpu_set_t set;
    int joinedCPU = 0;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s {0|1}\n", argv[0]);
        exit(1);
    }

    /* Build a CPU set containing only CPU joinedCPU (CPU 0). */
    CPU_ZERO(&set);
    CPU_SET(joinedCPU, &set);

    if (atoi(argv[1]) != 0) {
        // ADD LINE HERE TO SET AFFINITY
    }

    /* Fork NUM_PROCS - 1 children; each child breaks out of the loop
       immediately, so only the parent keeps forking. */
    pid_t pid;
    for (int i = 0; i < NUM_PROCS - 1; i++) {
        if ((pid = fork()) < 0) {
            perror("Problem forking");
            exit(1);
        } else if (!pid) {
            break;
        }
    }

    long sum = 0;
    for (long j = 0; j < NLOOPS; j++) {
        sum += j; // meaningless work to keep the CPU busy
    }

    return 0;
}
(2 pts) Depending on whether or not affinity is set, is the program faster, slower, or about the same? Explain why you observed the performance that you did.
(1 pt) Based on your observations, is processor affinity inherited by forked children? How can you tell by observing the program run? You can verify in the man pages if you are not sure.
Yielding
Determine the purpose of the system call sched_yield — you can review sched(7), sched_yield(2), or any other source you prefer. Then, consider Sample Program 2 below. Once again, the primary purpose of the program is to perform meaningless work that lets us observe scheduling behavior.
Sample Program 2
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sched.h>
#define BIG_NUM 50000000
int main(int argc, char** argv) {
    if (argc < 2) {
        fprintf(stderr, "Program requires an integer argument.\n");
        exit(1);
    }

    long sum = 0;
    int argint = atoi(argv[1]);

    /* Meaningless work; yield the CPU on every iteration if requested. */
    for (long i = 0; i < BIG_NUM; i++) {
        if (argint > 0) {
            sched_yield();
        }
        sum += i;
    }

    return 0;
}
- Run the program with an argument that does not cause the yield to occur. Then, run it again with a different argument that causes the yield to occur. Try to ensure that the system is otherwise mostly quiet, i.e., that the CPU is not overloaded before you run the program.
- (1 pt) You should observe a fairly drastic change in behavior. What is going on, and why might this behavior occur even on a quiet system?
Experimenting with Policies
In a previous part of the lab, we determined that not all scheduling policies are available to us as non-root users on EOS. For instance, SCHED_RR and SCHED_FIFO can be used only from an executable with root permission. However, three policies can be used by non-root users: SCHED_OTHER, SCHED_IDLE, and SCHED_BATCH. We will concern ourselves with just two of these policies for this lab: SCHED_OTHER and SCHED_IDLE.
One of these two policies should be more deferential than the other. That is, if many processes with these scheduling policies are competing for the CPU, processes using one policy will receive more system attention than processes using the other.
- (5 pts) Design and run an experiment to demonstrate which of the two policies (SCHED_OTHER and SCHED_IDLE) takes precedence. It is not enough simply to look up the answer in the man pages — you must write code that confirms the expected behavior.
Your experiment can be heavily based on code in the previous parts of the lab. Overall, the experiment should focus on:
- keeping all the CPUs on your machine busy,
- running multiple processes with different scheduling policies, and
- timing the program runs (or using other methods to confirm your hypothesis).
You already have code that sets the scheduling policy, and you have an example of code that runs for a long time to allow us to observe the scheduling in action.
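For reference, the policy-setting step might look something like the sketch below, which assumes you want the calling process to use SCHED_IDLE (the non-realtime policies, including SCHED_OTHER and SCHED_IDLE, must be requested with a priority of 0):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Non-realtime policies must be requested with priority 0. */
    struct sched_param sp = { .sched_priority = 0 };

    /* Switch the calling process (pid 0) to SCHED_IDLE. */
    if (sched_setscheduler(0, SCHED_IDLE, &sp) == -1) {
        perror("sched_setscheduler");
        exit(1);
    }

    printf("policy is now %d\n", sched_getscheduler(0));

    /* ... from here, fork children and run the timed busy-work loop ... */
    return 0;
}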
IMPORTANT
Run your experiment on the EOS machine (not on your Mac or in Windows/WSL2).
There are two aspects to consider when measuring the time of a task:
- the time of completion (when did it finish? e.g., at 9 pm), or
- the elapsed time needed to complete (how long did it take? e.g., 9 hours).
Be sure your experiment is designed to measure the right one.
The above experiment works only if processes with both scheduling policies are running at the same time. This calls for a parent process that spawns many children. Otherwise, the deferential processes will all share well with one another and you will not see the CPU-hogging behavior from the other processes.
Be aware that you may be running your program on a system with multiple CPU cores.
Deliverables
Your writeup should include your source code, a description of how you ran the program(s), and any relevant screenshots. Your description should be clear enough that I can repeat your experiment if I choose to do so.
Hints:
- A scheduling policy will not come into play unless you have enough work to keep all of your processing cores busy.
- If your program runs for long enough, you can use top to observe CPU occupancy/activity. A process using the more deferential policy should complete much later than a process using the less deferential policy, as long as both processes are running at the same time.