encrypted file system issues (was Re: PGP "master keys")

Tue May 2 17:07:02 EDT 2006

[A bit off topic but I thought I'd let it through anyway. Those
uninterested in OS design should skip the rest of this message. --Perry]

On 5/1/06, perry at piermont.com (Perry E. Metzger) wrote:

>Disk encryption systems like CGD work
>on the block level, and do not propagate CBC operations across blocks,
>so if the atomic disk block write assumption is correct (and almost
>all modern file systems operate on that assumption), you have no more
>real risk of corruption than you would in any other application.

I haven't seen the failure specs on modern disk systems, but the KeyKOS
developers ran into an interesting (and documented) failure mode on IBM
disks about 20 years ago.  Those IBM systems connected disks to a
"controller" which was connected to a "channel" which was a specialized
processor with DMA access to the main storage of the system.  Note that
these systems were designed in the days when memory was expensive, so
there was an absolute minimum of buffering in the channel, controller,
and disk.

There are many possible failure modes, including power failure on the
individual components, hardware failure/microprogram failure in the
components, etc.  The failure we experienced was a microcode hang in the
channel (probably caused by a transient hardware failure), which also
stopped the CPU.  The failure occurred while the controller and disk were
writing a block, and the channel ceased providing data.  The controller's
specification said that if the channel failed to provide data, the
controller filled the rest of the block with the last byte received from
the channel.  If the channel and CPU had been running, the overrun would
have been reported back to the OS with an interrupt.  As it was, all we
had was a partially clobbered disk block.
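
To make the fill behavior concrete, here is a minimal Python sketch of
what ends up on disk when the channel stops partway through a block.  It
is only a model of the behavior described above, not controller
microcode; the function name short_write and its parameters are
hypothetical, and it assumes at least one byte was delivered before the
channel stopped.

```python
def short_write(data: bytes, delivered: int) -> bytes:
    """Model the block left on disk when the channel stops early.

    `data` is the block the OS meant to write; `delivered` is how many
    bytes the channel supplied before it hung.  The controller pads the
    remainder of the block with the last byte it received.
    """
    if not 1 <= delivered <= len(data):
        raise ValueError("delivered must be between 1 and len(data)")
    fill = data[delivered - 1:delivered]  # last byte received, as bytes
    return data[:delivered] + fill * (len(data) - delivered)
```

For example, an 8-byte block "ABCDEFGH" cut off after 3 bytes comes back
as "ABCCCCCC": the length is right, but the tail is a run of one
repeated byte.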

Since KeyKOS was supposed to be a high reliability OS, we needed to code
for this situation.  Because of the design of the disk I/O system, there
were only two disk blocks (copies of each other) where this kind of
failure could cause a problem.  We defined the format of these blocks so
the last two bytes were 0xFF00.  By checking for this pattern, we could
determine whether the block had been partially clobbered.  We then had to
ensure that we checked for correct write on one of the blocks before
starting to write the other.
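
The scheme above can be sketched in a few lines of Python.  The check
works because a clobbered block ends in a run of one repeated fill byte,
so its last two bytes are always equal and can never be 0xFF followed by
0x00.  This is an illustration, not the KeyKOS code: BLOCK_SIZE, the
function names, and the in-memory list standing in for the two disk
blocks are all assumptions made for the example.

```python
SENTINEL = b"\xff\x00"   # last two bytes of every valid block
BLOCK_SIZE = 16          # hypothetical size, kept small for the example


def make_block(payload: bytes) -> bytes:
    """Append the sentinel to a payload that fills the rest of the block."""
    if len(payload) != BLOCK_SIZE - len(SENTINEL):
        raise ValueError("payload must be BLOCK_SIZE - 2 bytes long")
    return payload + SENTINEL


def is_intact(block: bytes) -> bool:
    """A block padded with a repeated byte cannot end in 0xFF 0x00."""
    return len(block) == BLOCK_SIZE and block.endswith(SENTINEL)


def plain_write(disk: list, index: int, block: bytes) -> None:
    """Stand-in for a disk write that completes normally."""
    disk[index] = block


def update_copies(disk: list, block: bytes, write=plain_write) -> None:
    """Write copy 0, read it back and check it, and only then touch copy 1."""
    write(disk, 0, block)
    if not is_intact(disk[0]):
        raise OSError("copy 0 is damaged; copy 1 left untouched")
    write(disk, 1, block)


def recover(disk: list) -> bytes:
    """After a restart, return the first copy that passes the check."""
    for copy in disk:
        if is_intact(copy):
            return copy
    raise OSError("both copies are damaged")
```

Because the second write never starts until the first copy has been
verified, at most one of the two copies can be damaged at any moment, and
recover() always finds either the new contents or the previous ones.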

Does anyone have any idea how modern disks and computers handle similar
situations?

Cheers - Bill

-----------------------------------------------------------------------
Bill Frantz        | gets() remains as a monument | Periwinkle 
(408)356-8506      | to C's continuing support of | 16345 Englewood Ave
www.pwpconsult.com | buffer overruns.             | Los Gatos, CA 95032

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com