Lecture 9 -- The Disk Device

Disk Hardware

Each disk is flat and circular, like a record. This is also called a platter. It has 2 surfaces, each covered with magnetic material which is used to record information.

Disks spin at high speeds, often 3600 rpm.

There is a read/write head, which records a track on the surface. There are usually hundreds of tracks on each disk. The head is mounted on an arm, which moves or seeks from track to track.

Each track can hold a lot of information (10 to 100K), so tracks are usually broken into sectors, each holding a portion of the data. The tracks at the same arm position on all surfaces, stacked vertically, are known as a cylinder.

To access a track, the arm must seek to it. The average seek time on drives is 10-50 milliseconds. Then, the disk must rotate to bring the data to the head: the latency time. Finally, the data is read: the transfer time.

Generally, seek time > latency time > transfer time.
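To put rough numbers on the three components, here is a minimal sketch. The 3600 rpm figure and the 10-50 ms seek range come from the notes above; the 30 ms seek and the 32 sectors per track are assumptions chosen only for illustration.

```python
def rotation_time_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def access_time_ms(seek_ms, rpm, sectors_per_track, sectors_to_read=1):
    """Estimate seek + average latency + transfer time for one request."""
    rev = rotation_time_ms(rpm)
    latency = rev / 2.0  # on average the disk must turn half a revolution
    transfer = rev * sectors_to_read / sectors_per_track
    return seek_ms + latency + transfer


# Assumed drive: 30 ms seek, 3600 rpm, 32 sectors per track.
print(round(rotation_time_ms(3600), 2))        # 16.67
print(round(access_time_ms(30.0, 3600, 32), 2))  # 38.85
```

With these assumed figures the seek (30 ms) dominates, the average latency is about 8.3 ms, and the transfer of one sector is about 0.5 ms.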

washing machine: n. Old-style 14-inch hard disks in floor-standing cabinets. So called because of the size of the cabinet and the `top-loading' access to the media packs - and, of course, they were always set on `spin cycle'.

Disk Software

Disks usually use DMA. The software must tell the controller:

What operation (read, write). Which track. Which sector. Which head. Which DMA address.

The controller then performs the operation, and interrupts on success/error.

Often an OS has a queue of requests, as disk operations take a long time. If software can minimise the overall disk time, then disk I/O performance can be improved.
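The information passed to the controller, and the queue of pending requests, could be sketched as follows. The class name DiskRequest, its field names and the example values are all hypothetical; a real driver would use whatever layout its controller demands.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DiskRequest:
    """One pending disk operation (hypothetical layout)."""
    op: str            # "read" or "write"
    track: int
    sector: int
    head: int
    dma_address: int   # where the controller should transfer the data


# The OS keeps requests in a queue until the controller is free.
pending = deque()
pending.append(DiskRequest("read", track=12, sector=3, head=0, dma_address=0x8000))
pending.append(DiskRequest("write", track=40, sector=7, head=1, dma_address=0x9000))

next_request = pending.popleft()   # oldest request first
print(next_request.op, next_request.track)
```

The arm scheduling algorithms below differ only in which request they take from this queue next.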

Arm Scheduling -- FCFS

Service requests as they arrive. Usually a lot of head movement, but the method is fair.
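A minimal sketch of FCFS, measuring head movement in tracks. The starting track (53) and the request queue are made-up values for illustration, and the function names are assumptions.

```python
def total_movement(start, order):
    """Total number of tracks the head crosses servicing 'order'."""
    pos, moved = start, 0
    for track in order:
        moved += abs(track - pos)
        pos = track
    return moved


def fcfs_order(requests):
    """FCFS: service requests exactly in arrival order."""
    return list(requests)


queue = [98, 183, 37, 122, 14, 124, 65, 67]   # hypothetical track numbers
print(total_movement(53, fcfs_order(queue)))  # 640
```

The head swings back and forth across the disk, which is why the total movement is large.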

Shortest Seek Time First

Reorder requests so that the request with the smallest seek is next.

Thus, arm movement and seek time are kept small. This can lead to starvation of requests that need a long seek, especially if new requests are continually arriving.

This affects requests at the extreme edges of the disk, whereas the middle tracks are preferentially selected.
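SSTF can be sketched as a greedy choice of the nearest pending track. The queue and starting track are made-up values, the function names are assumptions, and the queue is static (no new arrivals), so starvation does not show up here.

```python
def total_movement(start, order):
    """Total number of tracks the head crosses servicing 'order'."""
    pos, moved = start, 0
    for track in order:
        moved += abs(track - pos)
        pos = track
    return moved


def sstf_order(start, requests):
    """Repeatedly pick the pending request closest to the current head position."""
    pending = list(requests)
    pos, order = start, []
    while pending:
        nearest = min(pending, key=lambda t: abs(t - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order


queue = [98, 183, 37, 122, 14, 124, 65, 67]   # hypothetical track numbers
order = sstf_order(53, queue)
print(order)                      # [65, 67, 37, 14, 98, 122, 124, 183]
print(total_movement(53, order))  # 236
```

Note how the request for track 183 is serviced last; with a steady stream of new requests near the middle it might never be reached.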

SCAN Algorithm

Also known as the elevator algorithm.

The head starts at one end of the disk, and moves towards the other end, servicing requests in that order, until there are no more requests in that direction.

The arm then reverses direction, and services requests the other way.

One nice property is that, for any collection of requests, the head motion needed to service them is bounded: it is at most twice the number of tracks.
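A minimal sketch of the elevator ordering as described above, for a static queue. The queue, the starting track and the 200-track disk are made-up values, and the function names are assumptions.

```python
def total_movement(start, order):
    """Total number of tracks the head crosses servicing 'order'."""
    pos, moved = start, 0
    for track in order:
        moved += abs(track - pos)
        pos = track
    return moved


def scan_order(start, requests, ascending=True):
    """Service everything in the current direction, then reverse."""
    here = [t for t in requests if t == start]
    above = sorted(t for t in requests if t > start)
    below = sorted((t for t in requests if t < start), reverse=True)
    return here + (above + below if ascending else below + above)


queue = [98, 183, 37, 122, 14, 124, 65, 67]   # hypothetical track numbers
order = scan_order(53, queue, ascending=False)
print(order)                      # [37, 14, 65, 67, 98, 122, 124, 183]
print(total_movement(53, order))  # 208
```

On the assumed 200-track disk the movement (208) stays well under the bound of twice the number of tracks (400), whichever direction the head starts in.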

C-SCAN Algorithm

This is a modified SCAN algorithm which lowers the average response time.

Unlike SCAN, C-SCAN always services requests in the same direction, for example, ascending.

When there are no more requests above the last request, the head is moved to the lowest request, and requests are again serviced in an ascending motion.
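The C-SCAN ordering can be sketched in the same way, again for a static queue with made-up track numbers. The function name is an assumption.

```python
def cscan_order(start, requests):
    """Service ascending from the head position, then jump to the lowest
    pending request and continue ascending."""
    ahead = sorted(t for t in requests if t >= start)
    behind = sorted(t for t in requests if t < start)
    return ahead + behind


queue = [98, 183, 37, 122, 14, 124, 65, 67]   # hypothetical track numbers
print(cscan_order(53, queue))   # [65, 67, 98, 122, 124, 183, 14, 37]
```

Every request is serviced during an ascending sweep; the only descending motion is the jump from the highest request back to the lowest.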

Obviously, if the queue is usually of size 1 or less, FCFS is adequate. Otherwise C-SCAN is the preferred algorithm.

walking drives: n. An occasional failure mode of magnetic-disk drives back in the days when they were huge, clunky washing machines. Those old dinosaur parts carried terrific angular momentum; the combination of a misaligned spindle or worn bearings and stick-slip interactions with the floor could cause them to `walk' across a room, lurching alternate corners forward a couple of millimeters at a time. There is a legend about a drive that walked over to the only door to the computer room and jammed it shut; the staff had to cut a hole in the wall in order to get at it! Walking could also be induced by certain patterns of drive access (a fast seek across the whole width of the disk, followed by a slow seek in the other direction). Some bands of old-time hackers figured out how to induce disk-accessing patterns that would do this to particular drive models and held disk-drive races.

Sector Queueing

If the hardware is clever enough to determine which sector is passing under the head, the OS can order requests on that cylinder to minimise latency.

For example, if we have requests for sectors 3, 8 and 11, and the head is passing over sector 5, we can schedule 8 and 11 first, leaving 3 until it comes around again.
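A minimal sketch of sector queueing: order the requests by how far the disk must still rotate to reach each one. It assumes 12 sectors per track and that sector numbers increase in the direction of rotation; neither is stated in the notes, and the function name is hypothetical.

```python
def sector_order(current, requests, sectors_per_track):
    """Sort requests by rotational distance from the sector under the head.

    A request for the sector currently under the head is assumed to be
    missed, so it costs a full revolution.
    """
    def distance(sector):
        return (sector - current - 1) % sectors_per_track + 1

    return sorted(requests, key=distance)


print(sector_order(5, [3, 8, 11], sectors_per_track=12))   # [8, 11, 3]
```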

Interleaving

In many cases the OS is slow: it must spend time processing a block read in from the disk before it can read in another one.

For example, the number of DMA buffers is limited, and the OS must move out a block to free up a DMA buffer for another transfer.

Thus, if the OS wants to read sectors 0, 1 and 2, it may miss sector 1 after reading 0, and wait an entire revolution before it can read it.

If the sectors are interleaved, then there will be a gap to give the OS time to process before its next read.

Note that this can be done either in hardware (i.e. within the disk controller logic) or in software. For example, on a disk with eight sectors per track, the OS can place logical sectors 0,1,2,3,4,5,6,7 on physical sectors 0,2,4,6,1,3,5,7; in other words, a logical to physical mapping in which consecutive logical sectors are never physically adjacent.
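One way such a mapping could be generated is sketched below. It reads the list 0,2,4,6,1,3,5,7 as giving the physical sector for each logical sector, which is an interpretation of the example above; the function name and the 'factor' parameter are assumptions.

```python
def interleave_map(sectors_per_track, factor):
    """Return a list where entry i is the physical sector for logical sector i.

    Logical sectors are laid down 'factor' physical sectors apart; if a
    slot is already taken, the next free slot is used instead.
    """
    physical_of = [None] * sectors_per_track
    used = [False] * sectors_per_track
    pos = 0
    for logical in range(sectors_per_track):
        while used[pos]:
            pos = (pos + 1) % sectors_per_track
        used[pos] = True
        physical_of[logical] = pos
        pos = (pos + factor) % sectors_per_track
    return physical_of


print(interleave_map(8, 2))   # [0, 2, 4, 6, 1, 3, 5, 7]
print(interleave_map(8, 1))   # [0, 1, 2, 3, 4, 5, 6, 7]
```

A factor of 1 means no interleaving; a larger factor gives the OS more time between consecutive logical sectors at the cost of a lower peak transfer rate.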

Error Handling

Disks are subject to a wide variety of errors:

a) programming error: e.g. a request for a non-existent sector. Hopefully the OS is written to ensure this does not happen. If it does, halt the system?

b) transient error: e.g. dust on the head. The best option is to retry the operation; if errors persist, tell the upper layers the sector is bad.

c) permanent error: e.g. a physically bad sector. This is a problem as some application programs read the entire disk (e.g. backup programs). Some intelligent drives keep a spare cylinder, and when permanent errors occur, internally map the bad sector to one on the spare cylinder. This can undermine the arm scheduling algorithms used, because the sector is no longer where the OS believes it to be.

d) seek error: e.g. the arm went to track 7, not 6. Some drives fix these errors automatically. Others just inform the OS. Here the OS must recalibrate the head by bringing it back to cylinder 0 and retrying the seek.

e) controller error: e.g. it refuses to accept commands. The OS can attempt to reset the controller. If the problem persists, give up.
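The retry policy for cases b) and d) could be sketched as below. The exception classes, the read_sector and recalibrate callables and the limit of three attempts are all hypothetical; they stand in for whatever the real driver and controller provide.

```python
class TransientError(Exception):
    """Hypothetical: the read failed but may succeed if retried."""


class SeekError(Exception):
    """Hypothetical: the arm ended up on the wrong track."""


def read_with_retry(read_sector, recalibrate, sector, max_attempts=3):
    """Retry transient errors; recalibrate and retry after seek errors.

    read_sector(sector) and recalibrate() are supplied by the caller.
    If every attempt fails, report the sector as bad to the upper layers.
    """
    for _ in range(max_attempts):
        try:
            return read_sector(sector)
        except TransientError:
            continue                 # e.g. dust on the head: just try again
        except SeekError:
            recalibrate()            # bring the arm back to cylinder 0 first
    raise IOError(f"sector {sector} is bad after {max_attempts} attempts")
```

Programming errors and controller errors are deliberately left out: the first should never happen, and the second needs a controller reset rather than a retry of the same request.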

disk crash: n. A sudden, usually drastic failure that involves the read/write heads dropping onto the surface of the disks and scraping off the oxide; may also be referred to as a `head crash'.

farming: [Adelaide University, Australia] n. What the heads of a disk drive are said to do when they plow little furrows in the magnetic media. Typically used as follows: ``Oh no, the machine has just crashed; I hope the hard drive hasn't gone farming again.''

Warren Toomey wkt@cserve.cs.adfa.oz.au