Truncating SHA2 hashes vs shortening a MAC for ZFS Crypto
Zooko Wilcox-O'Hearn
zooko at zooko.com
Mon Nov 2 00:33:34 EST 2009
Dear Darren J Moffat:
I don't understand why you need a MAC when you already have the hash
of the ciphertext. Does it have something to do with the fact that
the checksum is non-cryptographic by default
(http://docs.sun.com/app/docs/doc/819-5461/ftyue?a=view), and is that
still true? Your
original design document [1] said you needed a way to force the
checksum to be SHA-256 if encryption was turned on. But back then
you were planning to support non-authenticating modes like CBC. I
guess once you dropped non-authenticating modes then you could relax
that requirement to force the checksum to be secure.
Too bad, though! Not only are you now tight on space in part because
you have two integrity values where one ought to do, but also a
secure hash of the ciphertext is actually stronger than a MAC! A
secure hash of the ciphertext tells whether the ciphertext is right
(assuming the hash function is secure and implemented correctly).
Given that the ciphertext is right, then the plaintext is right
(given that the encryption is implemented correctly and you use the
right decryption key). A MAC on the plaintext tells you only that
the plaintext was chosen by someone who knew the key. See what I
mean? A MAC can't be used to give someone the ability to read some
data while withholding from them the ability to alter that data. A
secure hash can.
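
A minimal sketch of that difference, in Python. It uses HMAC-SHA256 as a
stand-in for the MAC (the actual ZFS Crypto tag comes from an authenticated
cipher mode, not HMAC), and it assumes the trusted hash is stored somewhere
the attacker cannot rewrite. All names and values are hypothetical.

```python
import hashlib
import hmac


def mac_tag(key: bytes, data: bytes) -> bytes:
    """MAC over the data; anyone holding the key can compute it."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_mac(key: bytes, data: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(mac_tag(key, data), tag)


def verify_hash(data: bytes, expected: bytes) -> bool:
    """Unkeyed check against a hash stored out of the attacker's reach."""
    return hmac.compare_digest(hashlib.sha256(data).digest(), expected)


# Hypothetical setup: the owner shares the key with every reader.
read_key = b"key shared with every reader"
original = b"ciphertext written by the owner"
trusted_hash = hashlib.sha256(original).digest()

# A reader with the key replaces the block and recomputes the MAC tag.
forged = b"ciphertext chosen by a reader"
forged_tag = mac_tag(read_key, forged)

mac_accepts_forgery = verify_mac(read_key, forged, forged_tag)
hash_accepts_forgery = verify_hash(forged, trusted_hash)
hash_accepts_original = verify_hash(original, trusted_hash)
```

The MAC check passes for the forged block because the forger knows the key;
the hash check fails because no key is involved and the stored hash still
describes the original block.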
One of the founding ideas of the whole design of ZFS was end-to-end
integrity checking. It does that successfully now, for the case of
accidents, using large checksums. If the checksum is secure then it
also does it for the case of malice. In contrast a MAC doesn't do
"end-to-end" integrity checking. For example, if you've previously
allowed someone to read a filesystem (i.e., you've given them access
to the key) without ever giving them permission to write to it, and
they are able to exploit the issues that you mention at the beginning
of [1] such as "Untrusted path to SAN", then the MAC can't stop them
from altering the file, nor can the non-secure checksum, but a secure
hash can (provided that they can't overwrite all the way up the
Merkle Tree of the whole pool and any copies of the Merkle Tree root
hash).
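
To illustrate the parenthetical about the Merkle Tree, here is a toy sketch
in Python. It is a simplified binary tree over a list of blocks, not the
actual ZFS block-pointer layout, and the prefix bytes and function names are
assumptions made for the example. It shows that a change to any block changes
the root, so a verifier who holds a trusted copy of the root notices.

```python
import hashlib


def leaf_hash(block: bytes) -> bytes:
    # The 0x00 prefix separates leaf hashes from interior hashes.
    return hashlib.sha256(b"\x00" + block).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # The 0x01 prefix marks an interior node.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(blocks: list) -> bytes:
    """Root hash of a toy binary Merkle tree over the given blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [leaf_hash(b) for b in blocks]
    while len(level) > 1:
        # Hash adjacent pairs; an unpaired last node moves up unchanged.
        nxt = [node_hash(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


pool = [b"block 0", b"block 1", b"block 2"]
trusted_root = merkle_root(pool)  # assumed to be kept where it cannot be altered

tampered = list(pool)
tampered[1] = b"block X"
tamper_detected = merkle_root(tampered) != trusted_root
```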
Likewise, a secure hash can be relied on as a dedupe tag *even* if
someone with malicious intent may have slipped data into the pool.
An insecure hash or a MAC tag can't -- a malicious actor could submit
data which would cause a collision in an insecure hash or a MAC tag,
causing tag-based dedupe to mistakenly unify two different blocks.
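
A toy illustration of that failure, in Python. The additive checksum below is
deliberately weak and is not the checksum ZFS uses; the DedupeTable class is
hypothetical and only models "first block stored under a tag wins".

```python
import hashlib


def weak_checksum(data: bytes) -> bytes:
    # Toy additive checksum: any reordering of the same bytes collides.
    return (sum(data) % 2**32).to_bytes(4, "big")


def secure_checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class DedupeTable:
    """Hypothetical tag-based dedupe: one stored block per tag."""

    def __init__(self, tag_fn):
        self.tag_fn = tag_fn
        self.blocks = {}

    def write(self, data: bytes) -> bytes:
        tag = self.tag_fn(data)
        # If the tag is already present, the new block is assumed identical.
        self.blocks.setdefault(tag, data)
        return tag

    def read(self, tag: bytes) -> bytes:
        return self.blocks[tag]


attacker_block = b"evil"
victim_block = b"live"  # same bytes in a different order

weak = DedupeTable(weak_checksum)
weak.write(attacker_block)  # the attacker slips a block in first
weak_readback = weak.read(weak.write(victim_block))

strong = DedupeTable(secure_checksum)
strong.write(attacker_block)
strong_readback = strong.read(strong.write(victim_block))
```

With the weak tag the victim reads back the attacker's block; with SHA-256
as the tag the two blocks stay distinct.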
So, since you're tight on space, it would be really nice if you could
tell your users to use a secure hash for the checksum and then
allocate more space to the secure hash value and less space to the
now-unnecessary MAC tag. :-)
Anyway, if this is the checksum which is used for dedupe then
remember the so-called birthday paradox -- some people may be
uncomfortable with the prospect of not being able to safely dedupe
their 2^64-block storage pool if the hash is only 128 bits, for
example. :-) Maybe you could include the MAC tag in the dedupe
comparison.
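
For a rough sense of the numbers, here is a sketch in Python using the
standard birthday approximation p ~ 1 - exp(-n^2 / 2^(b+1)) for n blocks and
a b-bit tag. It assumes the tag behaves like a uniformly random function of
the block; the function name is made up for the example.

```python
import math


def collision_probability(log2_blocks: int, tag_bits: int) -> float:
    """Approximate chance of at least one tag collision among 2^log2_blocks blocks."""
    # n^2 / 2^(b+1), computed in the exponent to avoid huge integers.
    expected_pairs = 2.0 ** (2 * log2_blocks - (tag_bits + 1))
    # 1 - exp(-x), via expm1 so tiny probabilities do not round to zero.
    return -math.expm1(-expected_pairs)


p_128 = collision_probability(64, 128)  # 2^64 blocks, 128-bit tag
p_256 = collision_probability(64, 256)  # 2^64 blocks, 256-bit tag
```

Under that approximation a full 2^64-block pool with a 128-bit tag has
roughly a 39% chance of containing some accidental collision, while with a
256-bit tag the chance is negligible.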
Also, the IVs for GCM don't need to be random, they need only to be
unique. Can you use a block number and birth number or other such
guaranteed-unique data instead of storing an IV? (Apropos recent
discussion on the cryptography list [2].)
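
A minimal sketch of such a derived IV, in Python. The field widths (48 bits
each, giving the 96-bit IV length that GCM handles most simply) and the
function name are assumptions for illustration, not the ZFS on-disk format.
It is only safe if a given (block number, birth number) pair is never reused
under the same key.

```python
def derive_gcm_iv(block_number: int, birth_number: int) -> bytes:
    """Build a deterministic 96-bit IV from two counters (hypothetical layout)."""
    for value in (block_number, birth_number):
        if not 0 <= value < 2**48:
            # Refuse to truncate: truncation could make two IVs collide.
            raise ValueError("value does not fit in 48 bits")
    return block_number.to_bytes(6, "big") + birth_number.to_bytes(6, "big")


iv_first_write = derive_gcm_iv(block_number=7, birth_number=100)
iv_rewrite = derive_gcm_iv(block_number=7, birth_number=101)
```

Rewriting the same block at a later birth number yields a different IV, and
nothing has to be stored because the reader can recompute the IV from
metadata it already has.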
Regards,
Zooko
[1] http://hub.opensolaris.org/bin/download/Project+zfs%2Dcrypto/files/zfs%2Dcrypto%2Ddesign.pdf
[2] http://www.mail-archive.com/cryptography@metzdowd.com/msg11020.html
---
Your cloud storage provider does not need access to your data.
Tahoe-LAFS -- http://allmydata.org