I/O in the prototype

Windows talks to disk drivers by means of sending I/O request packets. Diskache prototype puts them into a queue, and executes one by one in the order they arrived (FIFO).

It dequeues a packet, and, if the requested data is already cached, serves it directly from the fast disk. Otherwise, Diskache passes the request to the main storage. It might also decide to move this block to the fast disk, if the data is used often. In this case Diskache will wait for a block in cache to be ready (usually, there are plenty of them already prepared by a background thread), put the data there, and update information about cache contents.

Diskache then notifies Windows, that I/O request is completed, and picks up the next one. This is a long process with only one disk involved at the same time. The other storage unit is idle.

Asynchronous I/O

It is not necessary to wait for the last request to complete before picking up the next one. If a request is being served from cache, and the next request in the queue is known to require main storage access, it can be started immediately. While slower main storage is busy serving a request, Diskache can manage to complete many I/O packets for cached data.

I/O Scheduling

This can be extended further: even if no main storage requests have been picked up yet, Diskache could go ahead, and scan the entire queue for such a packet, and start executing it in advance.

In general, Diskache could reorder some requests to optimize load between the cache and the main storage. For example, when the packet queue is full, but no request is meant for the main storage, Diskache could pick up a request from the end of the queue, and forward it to the main storage hoping it will be completed by the time its turn would come normally.

The Challenge

While it might sound easy to do, this is actually quite hard to achieve. Diskache maintains an internal data structure, which contains information about blocks, stored in the cache. The main problem is ensuring integrity of that structure in case of power failure.

Each write operation, which allocates a new entry in the cache, requires an update to this structure, and an actual data write. Hardware reset might happen between them. If the original block address is changed for an entry, but the data is not updated, next read at this address will return data from a wrong block, which totally ruins integrity. An opposite situation has a similar problem.

In the prototype (with its sequential request execution), this situation can be easily caught by maintaining a linear operation log. After reset, it will be clear which operations succeeded, and which did not. An appropriate action can be taken to correct the situation. If Diskache schedules I/O, linear log will no longer be enough, and rolling back or retrying multiple operations, interrupted on different phases of execution, will be required.