[Cryptography] Uniform Data Fingerprint

ianG iang at iang.org
Fri May 29 14:36:24 EDT 2015


Some comments - on the whole this is a good start!


On 27/05/2015 22:28 pm, Phillip Hallam-Baker wrote:
> We use message digests as data fingerprints in lots of places. OpenPGP
> being the most visible of course but fingerprints are also used in
> BitCoin, for software distribution and even in S/MIME
>
> The OpenPGP group was discussing approaches to a new fingerprint format
> based on Base32 so that we can squeeze more bits out of the data on a
> business card. So generalizing a bit, I came up with this:
>
> https://tools.ietf.org/html/draft-hallambaker-udf-00


2.1.  The last paragraph seems to conflate two issues: age/replacement
and weakness/substitution.  Either way we arrive at the same conclusion:
the fingerprint mechanism should carry some signal that indicates which
algorithm is in use.

I think I'd write it somewhat differently, words to the effect of:

Fingerprint formats have had several problems in the past.  There has
been a proliferation of formats, which has led to potential confusion
over which algorithm is to be used with a particular format.  In
particular, where an algorithm has also become weak, such as MD5, a
substitution attack becomes possible.

Therefore, representations MUST reserve the first 5 bits as an algorithm
identifier (Section 3.1.1).





> The basic function is
>
>     Fingerprint = Version-ID + H ( Content-ID  + ':' + H(Data))
>
>     Where
>
>     H(x) is the cryptographic digest function
>     Version-ID is the fingerprint version and algorithm identifier.
>     Content-ID is the MIME Content-Type of the data.
>     Data is the binary data
>
>
> Putting the MIME content type in the scope of the digest means that if
> the same data string has meaning in two different contexts, an attacker
> can't perform a substitution attack. It also means that whoever is
> interpreting the hash has to know the context in which the data is being
> used.
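
To check my reading of the basic function, here is a minimal sketch in
Python.  It assumes SHA-2-512 with the Version-ID 96 from the quoted
text, a single-byte Version-ID prefix, and a UTF-8 encoded Content-ID;
the function name and those encoding choices are my assumptions, not
something the draft text above confirms.

```python
import hashlib

# 96 is the SHA-2-512 identifier given in the quoted text.
SHA2_512_VERSION_ID = 96


def fingerprint_bytes(content_id: str, data: bytes) -> bytes:
    """Version-ID + H(Content-ID + ':' + H(Data)), as raw bytes.

    Assumptions: one-byte Version-ID, UTF-8 Content-ID.
    """
    inner = hashlib.sha512(data).digest()
    outer = hashlib.sha512(content_id.encode("utf-8") + b":" + inner).digest()
    return bytes([SHA2_512_VERSION_ID]) + outer
```

The same data under two different Content-IDs gives two unrelated
digests, which is the substitution protection described above.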


I'm a bit disturbed by the MIME content type but I can't quite put my
finger on what the difficulty is.  One thing that might help is to
define a default type in the text that means "no information/context
is implied."

E.g.,

     An empty Content-ID can be used if no MIME content is to
     be delivered, but the colon ':' must always be present.

It would also be helpful (to me?) to specify some basic MIME types. 
E.g., things like:

     The following MIME types are reserved:
        text/plain
        openpgp-v5-key
        SMIME-v1
        openpgp-v4-cleartext-signed

etc (just making it up as I go).

> The fingerprint is base32 encoded and set in chunks of 5 characters for
> easier reading/verification. The precision is always a multiple of 25
> bits using simple truncation:
>
> 100 bits - MB2GK-6DUF5-YGYYL-JNY5E
>
> 150 bits - MB2GK-6DUF5-YGYYL-JNY5E-RWSHZ-SV75J
>
>
> The version/algorithm identifier also defines the algorithm used. The
> predefined identifiers are 96 for SHA-2-512 and 144 for SHA-3-512. These
> produce mnemonics for 'Merkle' and 'Spongeworthy'
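
The presentation step can be sketched as follows.  This assumes the
standard RFC 4648 Base32 alphabet and that truncation simply keeps the
leading bits; `format_fingerprint` is a hypothetical helper name.

```python
import base64


def format_fingerprint(raw: bytes, bits: int = 100) -> str:
    """Base32-encode raw, truncate to `bits`, group in chunks of 5 chars."""
    if bits <= 0 or bits % 25 != 0:
        raise ValueError("precision must be a positive multiple of 25 bits")
    if bits > 8 * len(raw):
        raise ValueError("not enough input bits for the requested precision")
    text = base64.b32encode(raw).decode("ascii").rstrip("=")
    chars = bits // 5  # each Base32 character carries 5 bits
    return "-".join(text[i:i + 5] for i in range(0, chars, 5))
```

With a leading byte of 96 the first five bits are 01100, i.e. 12, which
is 'M' in the Base32 alphabet; with 144 they are 10010, i.e. 18, which
is 'S'.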



Can I suggest that M and S be output as m and s?  In this way we signal 
to the eye more easily what is going on:

     mB2GK-6DUF5

(with a caveat that this is a typographical convention only: the format
is case-insensitive, and implementations must accept a leading
upper-case letter).
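
A small sketch of that convention, with hypothetical helper names:
output lower-cases only the leading algorithm character, and input is
case-folded so that either form is accepted.

```python
def display_form(fingerprint: str) -> str:
    """Lower-case the leading algorithm character, upper-case the rest."""
    return fingerprint[:1].lower() + fingerprint[1:].upper()


def canonical_form(fingerprint: str) -> str:
    """Fold case on input so 'mB2GK-...' and 'MB2GK-...' compare equal."""
    return fingerprint.strip().upper()
```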


> This might seem a little overdone, but the payoff is that say you have
> trust list as follows:
>
> MB2GK-6DUF5-YGYYL-JNY5E-RWSHZ-SV75J
>
> MV75J-C4OZQ-5GIN2-GQ7FQ-EEHFI-W3RGH
>
> ...
>
> You can have a trust list embedded in a device and it can stand for
> anything you need to be trusted. Could be the operating system
> executable, could be a PKIX root cert, could be a PKIX CTL, could be a
> PGP key. We can now direct all queries of the form 'is this anchor
> trustworthy' to this one list regardless of context.
>
> That is something simple enough that we can think about silicon
> implementation someday.
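
As I read it, the lookup against such a list is nothing more than set
membership.  A minimal sketch, using the two fingerprints from the
quoted list; `TRUST_LIST` and `is_trusted` are hypothetical names, and
the case folding on input is my assumption.

```python
# The two example fingerprints from the quoted trust list.
TRUST_LIST = frozenset({
    "MB2GK-6DUF5-YGYYL-JNY5E-RWSHZ-SV75J",
    "MV75J-C4OZQ-5GIN2-GQ7FQ-EEHFI-W3RGH",
})


def is_trusted(fingerprint: str, trust_list: frozenset = TRUST_LIST) -> bool:
    """True if the fingerprint is on the list, whatever object it names."""
    return fingerprint.strip().upper() in trust_list
```

The caller does not need to say whether the anchor is an executable, a
root cert or a PGP key, because the Content-ID is already inside the
digest.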



I'm unsure about section 4.  What's the point in just talking about it?
E.g., if we want a word list, why not introduce it: just copy the PGP
word list into an appendix and provide some text on how it works?




iang

