[Cryptography] Disk encryption

Wed Mar 29 17:45:16 EDT 2023

> On Mar 27, 2023, at 10:23, Dave Horsfall <dave at horsfall.org> wrote:
> 
> I've never used disk encryption before, so I have some concerns.
> 
> My understanding is that each encrypted block depends upon the previous 
> block (if not the entire chain), so what happens should an intermediate 
> block become corrupted?
> 
> I ask because I am now using an SSD drive (which I don't really trust), 
> but I was brought up on spinning rust for decades (no encryption).

Here's my explanation, based on what the container disk did for PGP Disk, what PGP's FDE did, and also what FileVault 2 did/does. I did the crypto design for FV2 back in the day, along with Richard Murphy.

You have to do every block independently, for the reasons you intuit. The whole point of a "disk" (scare quotes because of SSDs) is that it's random-access, so you have to be able to seek to any block and do the encryption in situ, with no other context.

What you really want is a large-block, tweakable block cipher. (If I were to go back and do more raw cryptography, I'd take Threefish -- which the team of us did for the Skein hash function -- and make it at least 4K bits wide, because 4K bits is 512 bytes. There's Threefish1024 already, and as it turns out, 1K bits or 128 bytes is probably mostly good enough.)

A tweak is the generalization of an IV/nonce/etc. It makes the cryptographic output unique and does not weaken encryption even when under attacker control. In the disk encryption case, the obvious thing to put in the tweak is the Logical Block Number or File Extent Number of the underlying storage. Obviously in some cases -- like when you're using a cipher with a shorter block size than your storage block size -- the tweak has to include the position within the storage block, too. As I remember for FV2, there were 64 bits of position and a truncation of the volume's UUID as salt, so that the same plaintext in the same position in two different volumes wouldn't have the same ciphertext.
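To make the tweak idea concrete, here is a minimal Python sketch of packing a position and a volume salt into a 16-byte tweak. The byte order, the field order, the choice of the first 8 UUID bytes, and the name make_tweak are all assumptions for illustration; this is not FV2's actual on-disk layout.

```python
import struct
import uuid

def make_tweak(block_number: int, volume_uuid: uuid.UUID) -> bytes:
    """Build a 16-byte tweak: 64 bits of position plus 64 bits of UUID salt.

    Hypothetical layout for illustration only, not any product's real format.
    """
    position = struct.pack("<Q", block_number)  # 64-bit little-endian position
    salt = volume_uuid.bytes[:8]                # truncated volume UUID as salt
    return position + salt

# Two made-up volume UUIDs that differ in their leading bytes.
vol_a = uuid.UUID("11111111-1111-1111-1111-111111111111")
vol_b = uuid.UUID("22222222-2222-2222-2222-222222222222")

# Same position on two volumes, and two positions on one volume,
# all yield distinct tweaks.
tweaks = {make_tweak(5, vol_a), make_tweak(5, vol_b), make_tweak(6, vol_a)}
```

Because the salt differs per volume, identical plaintext at the same position on two volumes is encrypted under different tweaks.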

I highly recommend reading this paper, <https://web.cs.ucdavis.edu/~rogaway/aez/rae.pdf>, because it's a fantastic paper. Hoang, Krovetz, and Rogaway built the AEZ construction precisely to make a true tweakable cipher out of an AES round function core, and their explanation of *why* they did what they did is one of the best discussions of block cipher construction ever, even if AEZ itself didn't really pan out.

The things we tend to use are XEX/XTS or the latest EME. (XTS is XEX with ciphertext stealing to handle any length; if you use it on things that are a multiple of 16 bytes, like a disk block, you're really doing XEX even though we'll all say XTS.) XTS is a tweakable construction, but not an all-or-nothing one, so a change of one bit will only affect the 16-byte cipher block that contains it. EME is fantastic, but it takes two encryption passes (Encrypt-Mix-Encrypt as opposed to XOR-Encrypt-XOR), so you get all-at-onceness at the cost of speed. It's possible you saw a discussion of how you'd combine an underlying AES to make a true large-block cipher and thought it was applied to the whole disk.
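To show the "one bit only affects that cipher block" property, here is a rough Python sketch of the XOR-Encrypt-XOR structure over one 512-byte sector. The block cipher is a toy hash-based Feistel network standing in for AES (an assumption made so the sketch runs on the standard library alone; it is NOT secure), and the names toy_encrypt, toy_decrypt, and xex_sector are made up for this example. The mask doubling follows the usual XTS convention of multiplying by x in GF(2^128).

```python
import hashlib

BLOCK = 16  # bytes per cipher block, same as AES

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key: bytes, i: int, half: bytes) -> bytes:
    # Feistel round function built from SHA-256 (toy stand-in, not AES).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    l, r = block[:8], block[8:]
    for i in range(4):
        l, r = r, xor(l, _f(key, i, r))
    return l + r

def toy_decrypt(key: bytes, block: bytes) -> bytes:
    l, r = block[:8], block[8:]
    for i in reversed(range(4)):
        l, r = xor(r, _f(key, i, l)), l
    return l + r

def double(mask: bytes) -> bytes:
    # Multiply by x in GF(2^128), little-endian, reduction constant 0x87.
    n = int.from_bytes(mask, "little") << 1
    if n >> 128:
        n = (n & ((1 << 128) - 1)) ^ 0x87
    return n.to_bytes(BLOCK, "little")

def xex_sector(key1, key2, tweak, data, decrypt=False):
    """XOR-Encrypt-XOR over a sector whose length is a multiple of 16."""
    assert len(tweak) == BLOCK and len(data) % BLOCK == 0
    cipher = toy_decrypt if decrypt else toy_encrypt
    mask = toy_encrypt(key2, tweak)       # per-sector starting mask
    out = bytearray()
    for off in range(0, len(data), BLOCK):
        chunk = data[off:off + BLOCK]
        out += xor(cipher(key1, xor(chunk, mask)), mask)
        mask = double(mask)               # next block gets a new mask
    return bytes(out)

key1, key2 = b"k1" * 8, b"k2" * 8
tweak = (7).to_bytes(BLOCK, "little")     # e.g. sector number 7
plain = bytes(range(256)) * 2             # one 512-byte sector
ct = xex_sector(key1, key2, tweak, plain)

damaged = bytearray(ct)
damaged[3 * BLOCK] ^= 1                   # flip one bit in cipher block 3
recovered = xex_sector(key1, key2, tweak, bytes(damaged), decrypt=True)

# Indices of the 16-byte blocks that no longer match the plaintext.
bad = [i for i in range(len(plain) // BLOCK)
       if recovered[i * BLOCK:(i + 1) * BLOCK] != plain[i * BLOCK:(i + 1) * BLOCK]]
```

After the bit flip, bad contains only index 3: the other 31 blocks of the sector decrypt cleanly, and no other sector is involved at all.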

Nonetheless, the underlying construction you want for disk encryption is a tweakable one because you need to seek to any block.

PGP's FDE was a disk driver that virtualized the physical storage and presented 512-byte blocks. You unlock and mount the disk, and then the upper levels just see it as a volume. Originally it used an interesting fast hash to get an IV, and then used CFB mode, because that's how we rolled in those days (there were no tweakable constructions). Later, we built it with EME+ but never shipped it because that was when SSDs were coming out and software AES couldn't keep up with even those SSD speeds.

Core Storage for FV2 was a file system, and connected to the raw disk driver underneath. It had a basic allocation of 4K bytes, with XTS underneath, and was built around the AES-NI instructions in Core 2 CPUs and beyond. (The ARM64 instruction set also has AES round function instructions like AES-NI.) AES-NI of the time had the interesting feature that it ran even faster if you interleaved multiple AES encryptions, with the optimal speed being at around 8-10 of them. Well, guess what! The underlying Core Storage file extent was 4K bytes, and there are eight 512-byte blocks in 4K and -- yah, baybee! It all fell together to get maximal speed.

Does this make sense? Happy to clarify anything or blither more.

	Jon