Randomisation - IBM's answer to Web privacy

bear bear at sonic.net
Tue Jun 18 13:12:31 EDT 2002



On Tue, 4 Jun 2002, R. A. Hettinga wrote:

>http://www.theregister.co.uk/content/23/25551.html
>
>Randomisation - IBM's answer to Web privacy
>
>That's when IBM's special sauce kicks in. Before allowing the data to be
>input to a data mining application, IBM's software "corrects" the
>randomized data to provide a "close approximation of the true
>distribution". How, exactly, IBM was not ready to disclose, but it involves
>knowing what the range of randomization was in the first place.


I hope these guys didn't get a patent on this.  This is an
old statistics trick: when your data carries a measurement
error of known size, you "correct" it in aggregate by
analyzing the distribution curve you actually got and
comparing it to the family of "natural" distribution curves
you suspect it must conform to -- normal or "bell-curve"
distribution when dealing with population demographics,
in most cases.  Independent random noise changes the observed
curve in a predictable way (for one thing, the variances add),
and with elementary calculus over the distribution equations
you can find the solution that restores a "true" bell curve --
which is going to be within epsilon of your true distribution
in most cases.
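
A minimal sketch of the trick, assuming the noise is uniform with a
known half-width and the underlying data really is bell-shaped.  The
names randomize and recover_normal and the age figures are made up
for illustration; this is not necessarily what IBM's software does.

```python
import random
import statistics

def randomize(values, spread, rng):
    """Perturb each value with independent uniform noise in [-spread, spread]."""
    return [v + rng.uniform(-spread, spread) for v in values]

def recover_normal(noisy, spread):
    """Estimate mean and std of the underlying bell curve from noisy data.

    Uniform noise on [-spread, spread] has mean 0 and variance spread**2 / 3.
    Variances of independent variables add, so subtract the noise variance
    from the observed variance to get back the original one.
    """
    mean = statistics.fmean(noisy)
    noise_var = spread ** 2 / 3
    true_var = max(statistics.pvariance(noisy) - noise_var, 0.0)
    return mean, true_var ** 0.5

rng = random.Random(2002)
# Hypothetical "true" data: ages drawn from a normal curve (assumption).
ages = [rng.gauss(40.0, 10.0) for _ in range(50_000)]
noisy_ages = randomize(ages, 30.0, rng)   # what the collector actually sees

est_mean, est_std = recover_normal(noisy_ages, 30.0)
print(round(est_mean, 1), round(est_std, 1))
```

Any single noisy value can be off by up to 30 years, yet the
aggregate estimate should land close to the original curve, and it
only works because the range of the randomization is known.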

Nice application to privacy, though.

				Bear



---------------------------------------------------------------------
The Cryptography Mailing List