The 2.6 LinuxKernel includes selectable I/O schedulers. They control how the kernel commits reads and writes to disk. Providing several schedulers allows better optimisation for different classes of workload.

Without an I/O scheduler, the kernel would issue each request to disk in the order in which it was received. This could result in severe HardDisk thrashing: if one process were reading from one part of the disk while another was writing to a different part, the heads would have to seek back and forth across the disk for every operation. The scheduler’s main goal is to optimise disk access times.

An I/O scheduler can use the following techniques to improve performance:

  • Request merging: The scheduler merges adjacent requests to reduce disk seeking.
  • Elevator: The scheduler orders requests by their physical location on the block device and tries to seek in one direction as much as possible.
  • Prioritisation: The scheduler has complete control over how it prioritises requests, and can do so in a number of ways.

All I/O schedulers should also take into account resource starvation, to ensure requests eventually do get serviced!
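The merging and elevator techniques described above can be illustrated with a small sketch. This is not kernel code: the `(start_sector, sector_count)` request model and the function names `merge_adjacent` and `one_way_elevator` are hypothetical, and the sketch ignores details such as keeping reads and writes separate.

```python
from typing import List, Tuple

# Simplified, hypothetical request model: (start_sector, sector_count).
Request = Tuple[int, int]


def merge_adjacent(requests: List[Request]) -> List[Request]:
    """Sort requests by start sector and merge those that touch or overlap."""
    merged: List[Request] = []
    for start, count in sorted(requests):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_count = merged[-1]
            end = max(prev_start + prev_count, start + count)
            merged[-1] = (prev_start, end - prev_start)
        else:
            merged.append((start, count))
    return merged


def one_way_elevator(requests: List[Request], head: int) -> List[Request]:
    """Serve requests at or beyond the head position in ascending order,
    then wrap around and serve the remaining ones, again ascending."""
    ordered = sorted(requests)
    ahead = [r for r in ordered if r[0] >= head]
    behind = [r for r in ordered if r[0] < head]
    return ahead + behind


if __name__ == "__main__":
    queue = [(100, 8), (900, 8), (108, 8), (20, 4), (500, 16)]
    print(one_way_elevator(merge_adjacent(queue), head=200))
```

In this toy queue the requests at sectors 100 and 108 are contiguous, so they collapse into one larger request before the sweep order is computed.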

The Schedulers

There are currently 4 available:

  • No-op Scheduler
  • Anticipatory IO Scheduler (AS)
  • Deadline Scheduler
  • Complete Fair Queueing Scheduler (CFQ)

No-op Scheduler

This scheduler only implements request merging.

Anticipatory IO Scheduler

The anticipatory scheduler is the default scheduler in older 2.6 kernels: if you have not specified one, this is the one that will be loaded. It implements request merging, a one-way elevator, and read and write request batching, and it attempts some anticipatory reads by holding off briefly after a read batch if it thinks a user is going to ask for more data. It tries to optimise for physical disks by avoiding head movements where possible. One downside is that it will probably give highly erratic performance on database or storage systems.

Deadline Scheduler

The deadline scheduler implements request merging and a one-way elevator, and imposes a deadline on all operations to prevent resource starvation. Because writes return instantly within Linux, with the actual data being held in cache, the deadline scheduler will also prefer readers, as long as the deadline for a write request hasn't passed. The kernel docs suggest this is the preferred scheduler for database systems, especially if you have TCQ-aware disks, or any system with high disk performance.
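The interaction between the elevator and the deadline can be sketched roughly as follows. This is a toy model under stated assumptions, not the kernel's implementation: the class `DeadlineQueue`, its single `max_wait` value, and the use of bare sector numbers as requests are all hypothetical, and the preference for reads over writes is left out for brevity.

```python
import bisect
from collections import deque
from typing import Deque, List, Optional, Tuple


class DeadlineQueue:
    """Toy model: a sector-sorted list plus a FIFO of (expiry_time, sector)."""

    def __init__(self, max_wait: float) -> None:
        self.max_wait = max_wait                       # hypothetical single deadline
        self.by_sector: List[int] = []                 # elevator (sorted) order
        self.fifo: Deque[Tuple[float, int]] = deque()  # arrival order with expiry

    def add(self, sector: int, now: float) -> None:
        bisect.insort(self.by_sector, sector)
        self.fifo.append((now + self.max_wait, sector))

    def next_request(self, now: float, head: int) -> Optional[int]:
        if not self.fifo:
            return None
        expiry, oldest = self.fifo[0]
        if expiry <= now:
            # The oldest request has waited too long: serve it wherever it is.
            chosen = oldest
        else:
            # Otherwise continue the one-way sweep from the head position,
            # wrapping to the lowest sector when nothing lies ahead.
            i = bisect.bisect_left(self.by_sector, head)
            chosen = self.by_sector[i] if i < len(self.by_sector) else self.by_sector[0]
        self.by_sector.remove(chosen)
        entry = next(item for item in self.fifo if item[1] == chosen)
        self.fifo.remove(entry)
        return chosen
```

The point of keeping two views of the same requests is that the sorted list gives good seek behaviour in the common case, while the FIFO bounds how long any single request can be passed over.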

Complete Fair Queueing Scheduler (CFQ)

The complete fair queueing scheduler implements both request merging and the elevator, and attempts to give all users of a particular device the same number of I/O requests over a particular time interval. This should make it more efficient for multiuser systems. It seems that Novell SLES sets cfq as the scheduler by default, as does the latest Ubuntu release. As of the 2.6.18 kernel, this is the default scheduler in mainline kernel releases.
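The fairness idea can be sketched as one queue per user of the device, served round-robin. This is only an illustration: the class `FairQueue`, the `quantum` parameter, and the string owner labels are hypothetical, and merging and elevator ordering within each queue are omitted.

```python
from collections import deque
from typing import Deque, Dict, Iterator, Tuple


class FairQueue:
    """Toy model: per-owner queues, each given up to `quantum` requests per round."""

    def __init__(self, quantum: int = 2) -> None:
        self.quantum = quantum
        self.queues: Dict[str, Deque[int]] = {}

    def add(self, owner: str, sector: int) -> None:
        self.queues.setdefault(owner, deque()).append(sector)

    def dispatch(self) -> Iterator[Tuple[str, int]]:
        # Visit each owner in turn until every queue has been drained.
        while self.queues:
            for owner in list(self.queues):
                pending = self.queues[owner]
                for _ in range(min(self.quantum, len(pending))):
                    yield owner, pending.popleft()
                if not pending:
                    del self.queues[owner]
```

With this arrangement a process that queues many requests cannot crowd out one that queues only a few, because each gets the same share per round.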

Changing Schedulers

The most reliable way to change schedulers is to set the kernel option “elevator” at boot time. You can set it to one of “as”, “cfq”, “deadline” or “noop”, to set the appropriate scheduler.
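To check which scheduler was requested at boot, one option is to look for the “elevator” option on the kernel command line, which is exposed in /proc/cmdline. The helper below is a minimal sketch; the function name `elevator_from_cmdline` is hypothetical, and returning the last occurrence when the option appears more than once is an assumption.

```python
from typing import Optional


def elevator_from_cmdline(cmdline: str) -> Optional[str]:
    """Return the value of the last elevator= option, or None if it is absent."""
    value = None
    for token in cmdline.split():
        if token.startswith("elevator="):
            value = token[len("elevator="):]
    return value


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    with open("/proc/cmdline") as f:
        print(elevator_from_cmdline(f.read()))
```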

Under more recent 2.6 kernels (2.6.11, possibly earlier), you can also change the scheduler at runtime by echoing the name of the scheduler into /sys/block/$devicename/queue/scheduler, where the device name is the basename of the block device, e.g. “sda” for /dev/sda.
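The same sysfs file can be read and written from a script. The sketch below assumes the file lists the available scheduler names on one line with the active one in square brackets; the function names are hypothetical, and the sample line in the usage note is illustrative rather than taken from a real system. Writing to the file requires root.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def scheduler_path(device: str) -> Path:
    """Build the sysfs path for a block device basename such as "sda"."""
    return Path("/sys/block") / device / "queue" / "scheduler"


def parse_scheduler_line(text: str) -> Tuple[List[str], Optional[str]]:
    """Split the file contents into (available names, active name)."""
    names: List[str] = []
    active: Optional[str] = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        names.append(token)
    return names, active


def set_scheduler(device: str, name: str) -> None:
    """Equivalent of echoing the name into the sysfs file (needs root)."""
    scheduler_path(device).write_text(name + "\n")


if __name__ == "__main__":
    print(parse_scheduler_line(scheduler_path("sda").read_text()))
```

For example, `parse_scheduler_line("noop anticipatory deadline [cfq]")` yields the four names with `cfq` as the active one.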

Which one should I use?

I've not personally done any testing on this, so I can't speak from experience yet. However, the anticipatory scheduler was made a default for a reason: it is optimised for the common case. If you've only got single-disk systems (i.e. no RAID, hardware or software) then this scheduler is probably the right one for you. If it's a multiuser system, you will probably find that CFQ or deadline provides better performance, and the numbers seem to back deadline as giving the best performance for database systems.

The noop scheduler has minimal CPU overhead in managing the queues. It may be well suited to systems with low seek times, such as those using an SSD, or to systems using a hardware RAID controller, which often has its own I/O scheduler designed around the RAID semantics.

Tuning the I/O schedulers

The schedulers may have parameters that can be tuned at runtime. Read the LinuxKernel documentation on the schedulers, listed in the References section below.
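As a starting point for tuning, the current values can be dumped with a short script. This sketch assumes the active scheduler exposes its tunables as one file per parameter under /sys/block/$devicename/queue/iosched/, which is not stated above; the function name `read_tunables` is hypothetical, and the meaning of each parameter should be taken from the kernel documentation.

```python
from pathlib import Path
from typing import Dict


def read_tunables(iosched_dir: Path) -> Dict[str, str]:
    """Return {parameter name: current value} for every readable file in the directory."""
    tunables: Dict[str, str] = {}
    if not iosched_dir.is_dir():
        return tunables
    for entry in sorted(iosched_dir.iterdir()):
        if entry.is_file():
            try:
                tunables[entry.name] = entry.read_text().strip()
            except OSError:
                # Some entries may be write-only or otherwise unreadable.
                continue
    return tunables


if __name__ == "__main__":
    for name, value in read_tunables(Path("/sys/block/sda/queue/iosched")).items():
        print(f"{name} = {value}")
```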

More information

Read the documents mentioned in the References section below, especially the LinuxKernel documentation on the anticipatory and deadline schedulers.

