Sunday, December 19, 2010

An experiment with the Linux scheduler

I was curious to see how the Linux scheduler would manifest from a program's perspective, so today I did an experiment.

I wrote a single-threaded program running a simple loop. All the loop does is to compute the number of milliseconds since the last iteration, and store the result in a histogram. We are not so much interested in the performance of the loop (it does about a million iterations per second) but in the variations in the intervals between loop iterations. These variations are presumably caused by the Linux scheduler.

Here are the numbers I achieved, and the same numbers in a chart (with a logarithmic y-axis).

Interval (milliseconds)Frequency
0 450,080,302
1 909,044
2 4,642
3 1,696
4 853
5 561
6 557
7 335
8 1,098
9 152
10 86
11 52
12 98
13 17
14 13
15 6
16 21
17 5
19 3
20 1
21 2
22 2
23 0
24 2
25 0
26 0
27 0
28 1
29 0

The vast majority of iterations occur zero milliseconds after the previous iteration. No surprise there; Java's clock granularity, 1 millisecond, is coarse enough to execute over a million instructions.

If the thread was never interrupted, one would expect the loop to tick forward 1 millisecond 1,000 times per second, or about 500,000 times in all. It actually ticks 909,044 times, so interrupts are at work: about 400,000 of them.

Longer delays also occur: 2, 3, 4, up to 28 milliseconds, occurring with exponentially decreasing frequency. Only 8 delays of 20 milliseconds or longer occur in the 7.5 minute run. The chart shows the exponential decay clearly. The chart plots log-frequency, and the trend line is indeed flat from 2 milliseconds onwards, so it is accurate to characterize the line as exponential.

The one surprising thing: significant bumps at 8, 12 and 16 milliseconds. Although the trend of the line is pretty consistently down, each of those interval durations has more distinctly occurrences than the previous interval. Does anyone know anything about the Linux scheduler that might explain this?


Matt Casters said...

Hi Julian,

Are you sure it wasn't the garbage collector or some other JVM process that causes the anomalies?


Julian Hyde said...

I agree, it could be GC or something else in the JVM. But the effect would be very similar to another process in the operating system, wouldn't it? Remember, this is a dual core machine running with a load average of 6, so we expect that this thread is giving up 30-50% of its CPU time.

Rodrigo Haces said...

Have you tried with a C program?? that would support and/or deny Matt's comment!!