[Cryptography] canonicalizing unicode strings.
jamesd at echeque.com
Sun Jan 14 06:19:18 EST 2018
I would like strings that look similar to humans to map to the same
item. Obviously trailing and leading whitespace needs to go, and
each run of whitespace needs to map to a single space.
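The whitespace part is simple. A minimal Python sketch, where the
function name squeeze_whitespace is made up for illustration and
"whitespace" is assumed to mean whatever Python's str.split() treats
as whitespace:

```python
def squeeze_whitespace(s: str) -> str:
    # str.split() with no arguments splits on runs of Unicode whitespace
    # and discards leading and trailing runs, so re-joining the pieces
    # with one space both trims and collapses.
    return " ".join(s.split())


if __name__ == "__main__":
    print(repr(squeeze_whitespace("  hello \t\n world  ")))
```

Note that this only covers characters Python classifies as whitespace;
invisible characters such as the zero-width space (U+200B) are not
whitespace by that definition and would need separate handling.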
The hard part, however, is that Unicode has an enormous number of
near-duplicate symbols.
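To illustrate the problem, here is a small sketch using the NFKC
normalization form from Python's standard unicodedata module. This is
only a partial approach and an assumption on my part about where to
start: it folds compatibility variants such as fullwidth letters and
ligatures, but leaves look-alikes from different scripts distinct. The
function name fold_compat is made up for illustration.

```python
import unicodedata


def fold_compat(s: str) -> str:
    # NFKC replaces compatibility characters (fullwidth forms,
    # ligatures, and so on) with their ordinary equivalents.
    return unicodedata.normalize("NFKC", s)


if __name__ == "__main__":
    # Fullwidth Latin capital A (U+FF21) folds to plain "A".
    print(fold_compat("\uff21") == "A")
    # The "fi" ligature (U+FB01) folds to the two letters "fi".
    print(fold_compat("\ufb01") == "fi")
    # Cyrillic small a (U+0430) looks like Latin "a" but is not
    # unified with it by NFKC.
    print(fold_compat("\u0430") == fold_compat("a"))
```

So normalization alone does not make visually similar strings map to
the same item.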
Is there a list somewhere of near-duplicate Unicode symbols, or existing
canonicalization code?