[Cryptography] canonicalizing unicode strings.
John Levine
johnl at iecc.com
Mon Jan 15 01:32:51 EST 2018
In article <b9a92033-1d13-6780-4f4d-472e1e111343 at echeque.com> you write:
>Is there somewhere a list of near duplicate unicode symbols, or existing
>canonicalization code?
Ooh, you've cracked open a large economy size can of worms.
Unicode defines four normalization forms (NFC, NFD, NFKC and NFKD),
all of which are broken in some way:
https://www.unicode.org/reports/tr15/
I'd guess you want form KC (NFKC: compatibility decomposition followed
by canonical composition), but without knowing more about your
application, that's only a guess.
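
For anyone who wants to see what the four forms actually do, here is a
rough sketch using the unicodedata module from Python's standard
library. The helper names normalize_all and canonical_key are made up
for this example, and picking NFKC as the comparison key is the same
guess as above, not a recommendation for every application. Note that
none of the four forms merges look-alike characters from different
scripts, so this only covers part of the "near duplicate" problem.

```python
import unicodedata

# The four normalization forms defined in UAX #15 (tr15).
FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize_all(s):
    """Return a dict mapping each form name to s normalized in that form."""
    return {form: unicodedata.normalize(form, s) for form in FORMS}


def canonical_key(s):
    """Hypothetical helper: NFKC-normalize s for use as a comparison key.

    Assumption: NFKC suits the application. It folds compatibility
    variants (ligatures, fullwidth letters) into their plain equivalents,
    which loses information some applications need to keep.
    """
    return unicodedata.normalize("NFKC", s)


if __name__ == "__main__":
    # U+FB01 LATIN SMALL LIGATURE FI is a compatibility character:
    # the K forms expand it, the canonical forms leave it alone.
    for form, value in normalize_all("\ufb01le").items():
        print(form, [hex(ord(c)) for c in value])

    # Precomposed e-acute and e + combining acute compare equal after NFC.
    print(unicodedata.normalize("NFC", "e\u0301") == "\u00e9")

    # Cyrillic small a (U+0430) is NOT folded into Latin "a" by any form.
    print(canonical_key("\u0430") == "a")
```

Whichever form you pick, apply it to both strings before comparing or
hashing them; mixing normalized and unnormalized input defeats the
purpose.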
R's,
John