[Cryptography] canonicalizing unicode strings.

jamesd at echeque.com
Mon Jan 15 04:34:56 EST 2018


On 1/15/2018 4:32 PM, John Levine wrote:
> In article <b9a92033-1d13-6780-4f4d-472e1e111343 at echeque.com> you write:
>> Is there somewhere a list of near duplicate unicode symbols, or existing
>> canonicalization code?
> 
> Ooh, you've cracked open a large economy size can of worms.
> 
> Unicode has four defined normalization forms, all of which are broken
> in some way:
> 
> https://www.unicode.org/reports/tr15/
> 
> I'd guess you want to use form KC but without knowing more about your
> application, it's just a guess.

I am going to have to use normalization form NFKC for the key, and
normalization form NFC for the display of the key.

Which once in a blue moon will drive someone crazy.  "It's broken," he
will say.
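
A minimal sketch of the split, using Python's standard unicodedata
module. The helper names lookup_key and display_form are made up for
illustration, and the sketch does no case folding or confusable
detection, which a real application would probably also want. It shows
the once-in-a-blue-moon case: two strings that display differently
under NFC but collide on the same NFKC key.

```python
import unicodedata

def lookup_key(name: str) -> str:
    # Hypothetical helper: NFKC folds compatibility variants
    # (ligatures, fullwidth forms, etc.) into one form for the key.
    return unicodedata.normalize("NFKC", name)

def display_form(name: str) -> str:
    # Hypothetical helper: NFC only composes canonical equivalents,
    # so compatibility characters survive for display.
    return unicodedata.normalize("NFC", name)

ligature = "\ufb01le"   # starts with U+FB01 LATIN SMALL LIGATURE FI
plain = "file"

# Same key, different display strings: the case that looks "broken".
same_key = lookup_key(ligature) == lookup_key(plain)
same_display = display_form(ligature) == display_form(plain)
print(same_key, same_display)
```

Here same_key is True and same_display is False, so a user who
registered the ligature spelling sees a name on screen that is not
character-for-character what the key lookup actually matched.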


