CPU Scheduler Covert Timing Channel

A pair of programs that send data between isolated processes by modulating and measuring CPU scheduler contention.

Language UsedC

FocusSystems Security, Operating Systems

StatusComplete

00 · The Problem

Processes Cannot Directly Share "Data" Without Authorization.

Communication between processes is generally straightforward when the participating processes have been granted permission to do so. Modern Operating System enforce multiple levels of privilege, with the Operating System kernel possessing the highest level of authority. Consequently, user-space processes must normally request Operating System assistance when performing privileged or potentially unsafe operations, including many forms of interaction with other processes.

This project investigates whether two processes can communicate without using an authorized, direct inter-process communication (IPC) mechanism. More specifically, it asks whether two processes can transmit information indirectly through their interaction with a shared system resource, while making that communication difficult to distinguish from ordinary system activity. The principal challenge is reliability. If indirect communication is possible, the receiver must be able to identify the transmitted information despite interference from unrelated system activity. Bandwidth, and therefore the practical capacity of the channel, presents an additional challenge.

After completing COMP 3430, I reasoned that if conventional IPC mechanisms were unavailable, the two processes would require access to some shared resource from which information could be inferred. My initial ideas involved the file system. For example, one process could create files of predetermined sizes, write recognizable sequences of values to a file, or encode information through filenames within a shared directory.

These approaches, however, would still rely on explicit Operating System-managed state. Both processes would require access to the same files, directories, or metadata, making the communication comparatively direct and observable. Such activity would also leave persistent evidence that could be inspected, logged, or traced back to the transmitting process. As a result, the file system did not appear to provide the type of indirect and difficult-to-observe communication mechanism that I wanted to learn about.

I next considered other resources shared by concurrently executing processes, particularly the processor and physical memory. I ultimately chose to investigate CPU utilization because every active process depends on processor time, and the Operating System scheduler must continually distribute that time among competing workloads. I had also recently implemented a basic multilevel feedback queue scheduler, which gave me an introductory understanding of how schedulers organize workloads and of some of the metrics used to evaluate scheduling behaviour.

The initial concept was therefore straightforward: execute a process with different workload intensities and determine whether a different process could detect the changes in processor availability caused by the other. This starting point immediately introduced two significant problems:

1) Modern systems contain multiple processor cores and logical CPUs. Unless both processes execute on the same logical CPU, the receiver may observe little or no measurable contention from the sender. For the initial implementation, this problem can be addressed using sched_setaffinity() to restrict both processes to the same logical CPU. Although this requires a system call and imposes an artificial experimental constraint, it provides a controlled environment in which the underlying communication mechanism can first be evaluated. A later project would be to determine whether this constraint can be relaxed.

2) A typical computer executes many processes and kernel activities concurrently. Context switches, interrupts, background services, and other scheduler decisions introduce variation into the amount of processor time available to the receiver. The sender must therefore create a sufficiently large and consistent change in processor contention for the receiver to distinguish the intended signal from ordinary system noise.

After formulating this initial design, I recognized two limitations in my own understanding:

1) COMP 3430 was my first substantial introduction to Operating System implementation. My understanding of scheduling, processor contention, and hardware–software interaction was still developing.

2) I did not yet know how to measure the processor time available to an individual process or how to separate sender-induced contention from Operating System overhead and unrelated background activity.

This raised a broader methodological question: how should I implement a system when I do not yet possess all of the knowledge required to design it? One possible approach would be to ask a large language model to generate the complete implementation. Although this might produce working code more quickly, it would remove much of the process through which the problem is defined, assumptions are tested, implementation difficulties are encountered, and technical understanding is developed. This would go against what this project was intended to accomplish.

Using large language models (LLMs) to obtain solutions can be compared to consulting the answer key at the back of a mathematics textbook. Although doing so can be useful, the educational value of the answer is often greatest only after the learner has first attempted to work through the problem independently. The process of struggling with a problem helps reveal gaps in understanding (by defining sub-problems) and provides the context necessary to evaluate the proposed solution critically (by solving those sub-problems and using that knowledge to analyze the initial problem).

A major difficulty in implementing a system is that my initial mental models of both the problem and its solution are often incomplete and partially inaccurate. In this project, for example, my understanding of CPU scheduling and of how two C programs might indirectly communicate through shared processor activity contains significant gaps. Unlike a mathematics textbook, an open-ended systems project does not provide a predetermined sequence of exercises designed to expose and correct these misconceptions. I must first determine which questions need to be asked before I can begin answering them. In doing so, I will hopefully understand the problem scope.

An incomplete mental model still provided me with an useful starting point. My development process therefore consists of the following stages:

Articulate my initial mental model as precisely as possible so that its assumptions and proposed mechanisms can be evaluated.
Use an LLM to identify inaccurate assumptions, missing concepts, and potential approaches that warrant further investigation.
Implement simplified versions of the proposed approaches while independently verifying the underlying mathematics and systems concepts through textbooks, documentation, experimentation, and other non-LLM sources.
Iteratively revise my understanding of CPU behaviour and the mechanisms required for reliable communication in response to experimental results.
Continue this process until the refined mental model is sufficiently accurate to support the development of a functional and reliable tool.

In this way, the LLM is not treated as a substitute for understanding, but as a tool for identifying gaps, generating questions, and guiding further study. The ultimate objective is to develop an understanding of why the resulting system works. After much back and forth with my LLM of choice (ChatGPT), I compiled a list of 4 "Major" problems, of which had roughly 58 sub problems (some of which have no bearing being uttered in a public setting. I suppose it is a requirement to play the part of the fool before you can play the master - or someone who understands the Scheduler haha).

Four Main Problems to Overcome

Problem 01

How can CPU contention be measured?

If processor contention is to serve as the medium through which one process infers information from another, the receiving process requires a reliable method of measuring changes in processor availability. Because a CPU executes machine instructions rather than source-code lines, an appropriate measurement can be based on the amount of computational work completed within a fixed interval. This problem had an immediately apparent solution, so I include it as follows:

The consumer can repeatedly execute a standardized workload and record how many iterations it completes during each sampling period. A reduction in completed work will (hopefully) indicate that another process is competing for time on the same processor.

Problem 02

How can two processes synchronize without communicating directly?

Conventional communication depends on the participants sharing enough context to determine when a message begins, what information is being transmitted, and which portion of the message is currently being interpreted. Direct communication protocols establish this context through explicit exchanges of state between the producer and consumer.

In this design, however, the two processes cannot directly exchange synchronization information. The consumer must instead identify an observable pattern in the producer’s activity that marks the beginning of a transmission and establishes the boundaries between individual symbols or bits. In practical terms, the consumer must be able to answer: “Which portion of the producer’s message am I currently listening to?”

Problem 03

How can the intended signal be distinguished from background noise?

An operating system is a large and highly concurrent software system. User processes, kernel operations, interrupts, context switches, and background services may all compete for processor time. These activities introduce variation into the consumer’s measurements, even when the producer is not transmitting information.

The consumer must therefore distinguish contention deliberately created by the producer from contention caused by unrelated system activity. This requires a way to characterize normal background variation and decide whether an observed change is strong and consistent enough to represent an intended signal.

Problem 04

How should the consumer manage its internal state?

The receiving process must maintain enough information to track its progress through the transmission while not having to worry about using too much space. It therefore needs data structures that represent both the message being reconstructed and the current stage of the decoding process.

A clearly defined state-management strategy separates the phases of a frame, records partially decoded information, and gives the consumer a predictable way to recover when a transmission is invalid.

01 · The Outcome

A Scheduler-Contention Covert Timing Channel

A covert channel is a communication channel that repurposes a system mechanism for something other than its intended function. Rather than transmitting information through a conventional interface such as a socket, pipe, or file, the communicating programs encode information into changes in otherwise legitimate system behaviour. Because this activity does not resemble ordinary communication, it may be difficult to detect and could potentially be used to bypass security policies.

In this project, two processes communicate by competing for access to a single logical processor. A processor is intended to execute machine instructions, not to carry messages. However, when two runnable processes are forced to share the same CPU, the activity of one process changes how much processor time remains available to the other. The producer deliberately creates or removes contention, while the consumer measures how much computational work it can complete during fixed-length time windows. A high work count represents one observable state, while a low work count represents another. These states are mapped to the binary symbols 0 and 1, allowing bytes and complete messages to be reconstructed without the processes directly exchanging data.

Essential Components

More formally, the channel is composed of the following elements:

A producer and consumer: producer.c creates the signal, while consumer.c observes and decodes it.
A shared resource: both processes are restricted to one logical CPU using sched_setaffinity(), ensuring that they compete for the same processor time.
A signalling mechanism: the producer transmits one state by remaining CPU-runnable and the other by sleeping.
An observable timing effect: the consumer measures how many batches of computational work it completes during each sampling interval.
A symbol-encoding scheme: high and low work-count states are mapped to the binary symbols 0 and 1.
A shared timing model: communication is divided into fixed-duration slots, with each slot representing one transmitted bit.
Synchronization and framing: a known preamble identifies the beginning of a frame, while calibration, length, payload, and CRC fields define its internal structure.
A decoding rule: the consumer derives a threshold from known busy and idle calibration slots and uses it to classify later measurements.
Noise tolerance and validation: subslot medians, normalized cross-correlation, calibration checks, and a CRC help distinguish intended signals from operating-system noise and reject corrupted frames.

Mechanisms Solving the Four Main Problems Discussed

In the previous section, four "Main Problems" were highlighted. Here are the solutions implemented to solve the related problems.

To transmit a 1, the producer remains CPU-runnable and repeatedly performs batches of integer and bitwise operations until the current slot reaches its absolute deadline. To transmit a 0, it sleeps until that same deadline. A busy producer therefore reduces the amount of work the consumer can complete, while a sleeping producer leaves more CPU time available.

The consumer measures this effect by running a standardized workload and counting the number of completed batches before each sampling deadline. Each bit slot is divided into smaller subslots. The measurements nearest the slot boundaries are discarded, and the median of the interior subslots becomes the representative value for that bit. This reduces the effect of scheduling jitter and isolated interruptions.

Shared CPU

producer.c and consumer.c are pinned to the same logical processor.

On-off keying

A 1 produces contention by remaining busy; a 0 yields the processor by sleeping.

Batched work counts

The consumer counts completed work batches instead of timing individual instructions.

Robust slot estimate

Boundary subslots are removed and the median of the remaining samples is used.

The producer assigns every transmitted bit to a fixed-duration slot measured from an absolute CLOCK_MONOTONIC epoch. The start time is rounded to a slot boundary, and each later deadline is calculated from that original epoch. This prevents small timing errors from accumulating as transmission progresses.

Every frame begins with a known 32-bit preamble. The consumer moves a candidate window across its measurements and calculates the normalized cross-correlation between the observed work-count pattern and a bipolar representation of the preamble. Once the correlation exceeds the configured minimum and the following calibration sequence is valid, the consumer records the frame's starting position and uses predefined offsets to locate the length, payload, and CRC fields.

Absolute deadlines

Each slot ends at an epoch-derived deadline rather than after a relative sleep.

32-bit preamble

A shared, recognizable busy-idle pattern marks the beginning of every frame.

Normalized cross-correlation

The consumer searches for the shape of the preamble independently of the absolute CPU speed.

Known frame offsets

After locking, the consumer can locate each field from the recorded frame start.

Within each slot, the consumer removes the first and last subslot measurements and takes the median of the interior values. During frame detection, it requires the complete 32-bit preamble pattern to produce a sufficiently strong normalized correlation. These steps reduce the influence of context switches, interrupts, and short-lived background activity.

After the preamble, the producer sends eight known busy bits and eight known idle bits. The consumer calculates the median of each group and places its decoding threshold halfway between them. It also checks that the idle and busy medians appear in the expected order and are separated by a configured minimum. Finally, the decoded length and payload must reproduce the received CRC-8 value before the frame is accepted.

Median filtering

Outlying subslot measurements have limited influence on the decoded bit.

Pattern validation

A candidate must resemble the full preamble rather than merely contain one low work count.

Adaptive threshold

Busy and idle calibration medians define a threshold for the current frame.

CRC-8

The final payload is accepted only when its reconstructed checksum matches.

Before synchronization, the consumer remains in a searching state and tests candidate preambles only after enough samples have been collected. A successful lock records the frame start, correlation, calibration medians, and decoding threshold. Once the length byte is available, its value is stored and used to calculate exactly how many additional slots are required for the payload and CRC.

The consumer does not decode a field until all measurements belonging to that field have arrived. It then reconstructs the payload byte by byte and performs the final CRC comparison. If validation fails, the failed-candidate counter is incremented, FrameState is cleared, and the consumer returns to searching for another preamble. A maximum sample-buffer size prevents indefinite listening when no valid frame is found.

FrameState metadata

The structure records lock status, frame position, calibration values, threshold, and payload length.

Phase checkpoints

Preamble, length, payload, and CRC processing occur only when each phase is complete.

Calculated frame boundary

The decoded length determines the exact sample at which the complete frame is available.

Failure recovery

An invalid CRC resets the state and returns the consumer to its synchronization search.

02 · Interactive explanation

How Can Two Programs Pass a Message without Talking to Each Other?

Imagine two undercover soldiers stationed in opposite regions of enemy territory. Their official assignment is to watch cat videos all day and confirm that the brain rot is, in fact, getting stronger. Their computers share one painfully crowded military network. One soldier always watches; the other watches to send a 1 and stops to send a 0. No message is sent directly—the first soldier only notices whether the cats arrive smoothly or the cats stops brainrotting.

Choose the soldiers’ secret message

Use up to 12 characters. The operation turns the message into a timed pattern of “watch” and “do not watch” decisions.

Step 1 of 8

The watcher

Put the first soldier on cat-video duty

Signaler

Not watching yet

The signaler has not opened the cat-video feed.

One shared military network

Signaler 0%Watcher 100%

With only one stream active, the cats arrive at full operational speed.

Watcher

Watching continuously

The watcher judges whether each video plays smoothly or becomes a slideshow.

Current decision

No secret signal yet

The watcher is learning what a quiet network feels like.

No cat-video decision is being sent during this stage.Advance to a signaling stage to inspect an individual 1 or 0.

Back to Projects

Return to projects