Truncating SHA2 hashes vs shortening a MAC for ZFS Crypto

Tue Nov 3 13:21:08 EST 2009

On Tue, Nov 03, 2009 at 10:12:06AM -0700, Zooko Wilcox-O'Hearn wrote:
> following-up to my own post to clarify something important and add  
> some further ideas
> 
> On Tuesday, 2009-11-03, at 9:32, Zooko Wilcox-O'Hearn wrote:
> 
> >don't allocate a lot of bits to the MAC tag which is mostly  
> >redundant.  Maybe just allocate 32 bits to it, and think of it as a  
> >double-check that you have the right key and that your AES  
> >implementation is working right.
> 
> Important note: GCM does *not* have the security properties that you  
> expect from a truncated MAC tag: [1, 2].  If you're relying on the  
> MAC tag for integrity (i.e., if the SHA256 tag is truncated to be  
> short or if the user is allowed to run with an insecure checksum),  
> then you must use a sufficiently large MAC tag.

Exactly.  I proposed to Darren that he MAC only the Merkle tree roots,
and he rejected that as too big a change at this point.  That leaves him
with the MAC/hash size trade-off.  My recommendation, then, is to
truncate only the hash.  Yes, that means that you'll want to enable
dedup block match verification, since a truncated hash on its own is a
weaker assurance that two blocks are identical.
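
To illustrate why verification goes with a truncated hash, here is a
minimal sketch of a toy dedup table.  The names (DedupTable,
truncated_sha256) and the 224-bit width are made up for illustration;
this is not how ZFS implements dedup, just the general idea of
comparing block contents whenever the truncated hashes match.

```python
import hashlib

TRUNCATED_HASH_BYTES = 28  # 224 bits; an assumed width for illustration


def truncated_sha256(block: bytes, nbytes: int = TRUNCATED_HASH_BYTES) -> bytes:
    """Return the leading nbytes of SHA-256(block)."""
    return hashlib.sha256(block).digest()[:nbytes]


class DedupTable:
    """Toy dedup table keyed by a truncated hash (hypothetical, not ZFS code)."""

    def __init__(self, nbytes: int = TRUNCATED_HASH_BYTES, verify: bool = True):
        self.nbytes = nbytes
        self.verify = verify
        self.blocks = {}  # truncated hash -> first block stored under it

    def is_duplicate(self, block: bytes) -> bool:
        key = truncated_sha256(block, self.nbytes)
        stored = self.blocks.get(key)
        if stored is None:
            # First time this truncated hash is seen: remember the block.
            self.blocks[key] = block
            return False
        if self.verify:
            # Hash match: only call it a duplicate if the bytes also match.
            # (A colliding, different block is simply reported as new here.)
            return stored == block
        # Without verification a hash match is trusted blindly.
        return True
```

With verify=False, two different blocks whose truncated hashes collide
would be treated as the same block, which is exactly what block match
verification is there to prevent.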

> It seems like the IV field could be mostly or completely optimized  
> out by generating the IV at runtime from other data which is  
> guaranteed to be unique for this version of this block.  Note that  
> you really should use a unique IV on *every write* of the block --  
> i.e. for every unique block's worth of plaintext -- and not re-use  
> the same IV for successive contents of the same block.  Do you  
> already do that?

Note that blocks can be relocated when dataset keys are not available,
which means the IV cannot be constructed from block addresses, for
example.

> Looking at [3] I don't see anything that obviously fits the bill.   
> The Birth Transaction ID uniquely identifies this block as far as I  
> understand, but nothing uniquely identifies this particular version  
> of this block.  So maybe you could make the IV be the (64-bit) Birth  
> Transaction ID plus a  64-bit counter which gets incremented on every  
> write and is stored in the place where you are currently storing an  
> IV.  That counter could roll-over, in the hopes that someone who  
> steals your ciphertext and wants to learn something about your  
> plaintext doesn't have a copy of your ciphertext from 2^64 versions  
> ago.  Of course, a larger counter would be better, if you can fit it in.

Interesting.  If ZFS could make sure no blocks exist in a pool from more
than 2^64-1 transactions ago[*], then the txg + a 32-bit per-transaction
block write counter would suffice.  Darren would then have to store just
32 bits of the IV, leaving him 352 bits to work with: enough for a
128-bit authentication tag and a 224-bit hash.
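
A rough sketch of the IV construction I have in mind, assuming a 96-bit
IV laid out as the 64-bit txg followed by the 32-bit per-transaction
write counter.  The function names and the big-endian layout are my own
choices for illustration, not anything from the ZFS code.

```python
import struct

TXG_BITS = 64       # transaction group number, already known for each block
COUNTER_BITS = 32   # per-transaction block write counter


def make_iv(txg: int, write_counter: int) -> bytes:
    """Build a 96-bit IV as big-endian txg (64 bits) || counter (32 bits)."""
    if not 0 <= txg < 2**TXG_BITS:
        raise ValueError("txg out of range")
    if not 0 <= write_counter < 2**COUNTER_BITS:
        # Running out of counter values within one txg must be an error:
        # wrapping here would reuse an IV under the same key.
        raise ValueError("per-transaction write counter out of range")
    return struct.pack(">QI", txg, write_counter)


def stored_iv_part(write_counter: int) -> bytes:
    """Only the 32-bit counter needs to be stored alongside the block."""
    return struct.pack(">I", write_counter)
```

The IV is unique as long as no (txg, counter) pair is ever used twice
under the same key, and it does not depend on the block's address.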

And if later Darren is able to switch to MACing the Merkle roots then
he'd have 352 bits for a hash.
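
The bit budget can be checked mechanically.  In this sketch the total
of 384 bits is only inferred from the numbers in this message (352
available plus 32 stored IV bits); I have not checked it against the
on-disk format, and hash_bits is a made-up helper.

```python
TOTAL_BITS = 352 + 32  # inferred from the figures above, not from a spec


def hash_bits(stored_iv_bits: int, tag_bits: int,
              total_bits: int = TOTAL_BITS) -> int:
    """Bits left for the hash after the stored IV part and the MAC tag."""
    left = total_bits - stored_iv_bits - tag_bits
    if left < 0:
        raise ValueError("fields do not fit in the budget")
    return left


print(hash_bits(32, 128))  # per-block 128-bit tag: prints 224
print(hash_bits(32, 0))    # MAC only on the Merkle roots: prints 352
```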

[*] Transactions happen at a fairly low rate of about one every few
    seconds.  At that rate 2^64 transactions means over a trillion years
    before the txg wraps (half a trillion if the rate is 1/sec).
    Therefore ZFS does not need a cleaner service to re-write really old
    blocks.

    If 32 bits for per-transaction block write counters is too low, then
    the transaction rate could increase (and arguably would have to
    anyway); even with the fastest flash 2^32 IOPS seems a long way
    away, and there should be enough CPU to jack up the transaction rate
    by then to compensate.  Let's suppose that we end up with one txg
    per microsecond: then we get down to a still comfy (though starting
    to push it) 584,542 years before we wrap.
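
The wrap-time figures in the footnote can be reproduced with a few
lines of arithmetic.  This sketch assumes a 365.25-day year, and "one
txg every 5 s" is just a stand-in for "one every few seconds".

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # Julian year, an assumption


def years_until_wrap(bits: int, txgs_per_second: float) -> float:
    """Years until a counter of the given width wraps at a steady rate."""
    return 2**bits / txgs_per_second / SECONDS_PER_YEAR


for label, rate in [
    ("one txg every 5 s", 0.2),
    ("one txg per second", 1.0),
    ("one txg per microsecond", 1e6),
]:
    print(f"{label}: {years_until_wrap(64, rate):,.0f} years")
```

The last case gives the 584,542 years quoted above; the one-per-second
case comes out at roughly 585 billion years.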

Nico
-- 

---------------------------------------------------------------------
The Cryptography Mailing List