Peter Wayner's New Book: Translucent Databases

R. A. Hettinga rah at shipwright.com
Thu May 23 11:36:51 EDT 2002


http://www.wayner.org/books/td/

Translucent Databases


	Do you have personal information in your database?

Do you keep files on your customers, your employees, or anyone else?

Do you need to worry about European laws restricting the information you keep?

Do you keep copies of credit card numbers, social security numbers, or
other information that might be useful to identity thieves or insurance
fraudsters?

Do you deal with medical records or personal secrets?

Most database administrators have some of these worries. Some have all of
them. That's why database security is so important.

This new book, Translucent Databases, describes a different attitude toward
protecting the information. Most databases provide elaborate control
mechanisms for letting the right people in to see the right records. These
tools are well-designed and thoroughly tested, but they can only provide so
much support. If someone breaks into the operating system itself, all of
the data on the hard disk is unveiled. If a clerk, a supervisor, or a
system administrator decides to turn traitor, there's nothing anyone can do.

Translucent databases provide better, deeper protection by scrambling the
data with encryption algorithms. The solutions use the minimal amount of
encryption to ensure that the database is still functional. In the best
applications, the personal and sensitive information is protected but the
database still delivers the information.


Order directly from the publisher and get a free copy of Free for All .
	"I would like to recommend this book to everyone who is storing
sensitive information in their database. Credit card numbers or other
private information from customer statistics data can fall into the wrong
hands and give someone else too valuable insights in specific customers
behavior."

-- Michael Widenius, MySQL
Now order from Amazon.com
-----------------------------------

	    Translucent Databases contains several dozen examples written in
basic SQL and Java. The code is written to be easy-to-follow and portable.
All of the code can be extended and modified to fit a number of different
applications. Here are some of the examples:

*	A database that hides the position of Navy ships from enemies while
simultaneously providing accurate information to those with proper
authorization.  
*	An anti-rape database that identifies trends without containing any
personal information.  
*	A babysitter scheduling service that matches parents with available
sitters while protecting the sitters' identities and locations'.  
*	A department store database that guards the modesty of customers.  
*	A private accounting system that detects fraud without revealing
information.  
*	A poker game for the Internet that prevents cheating.  
*	A pharmacy database for preventing dangerous drug interactions while
keeping medical records secure.  
*	A tool for travel agents to protect their clients from stalkers and
kidnappers.  
*	A stock exchange transaction mechanism designed to stop insider-trading. 
*	A website logfile tool that provides accurate counts of visitors while
protecting their identities.
*	A credit-card database for defending crucial e-commerce transactions.
*	A patent search tool that doesn't reveal the nature and focus of the
search.
*	A conference bulletin board that routes messages without helping stalkers.
*	A tool for studying the radon concentration in homes without
maintaining personal information.
*	An anti-money laundering database.


   
  Anyone who purchases the book receives an unlimited license to use the
source code from the examples on up to ten CPUs. If you have greater needs,
other licenses are available. Or just buy another copy of the book.

-----------------


A Supplementary Syllabus

If you're a professor teaching a database course, you may want to use
Translucent Databases  as an additional textbook. You are welcome to
consider this  one week module presents some of the most important concepts
from Translucent Databases.  It consists ofthree parts that roughly
correspond to the three hours spent in a classroom in a typical week.


Part I -- One-Way Functions

*	One-way functions are easy to compute but hard to reverse.
*	Some of the common ones are MD5, SHA, and raising a number to a power
modulo a prime number. This section will just use generic one-way functions
and call them h(x). There is no reason to do more with advanced mathematics.
*	Most common one-way functions are not truly impossible to reverse--
they're just practically impossible. Describe how hash functions like MD5
produce their answer. How long does it take to search for a collision? How
long does it take to do brute force attack?
*	Show how to protect passwords using this approach. Anyone can look at
the file and anyone can test a password presented as real. But no one can
take the password database and work backwards to determine the password
*	Show how to protect credit cards. (Some systems leave the last four
digits in the clear. Mention that this is a hint for how information is
treated in Part III.)
*	Show how multiple people can use h(x) to look up information instead
of just x. This can be used to synchronize schedules or protect personal
information.
*	Show how to design a store database that stores h(name) instead of name.
*	Emphasize that the regular SQL database features still work with the
fields of the database that aren't scrambled by h.





Part II -- Determining Reality


*	Digital signatures can use one-way functions. This section won't use
the more sophisticated, traditional versions like RSA or Diffie-Hellman,
although it could. It will only use simpler versions that are often called
Message Authentication Codes. Describe how this is a weaker restriction. 
*	Someone can create a signature or MAC by computing
h(password,document). Only someone with the right password can check the
signature and see if it was generated by the document.
*	Show how fake entries in the database can disguise the real ones.
*	Only someone with the password can distinguish between the real and
the fake.




Part III -- Blurring Reality with Quantization


*	Quantization is the act of taking a number from a big set and
assigning it the closest value from a smaller subset.
*	Rounding off values is one form of quantization.
*	More sophisticated algorithms don't distribute the small set of
surrogates evenly over the larger set.
*	Some basic algorithms block some fields if it makes it too easy to
identify the human behind the record.
*	Other algorithms add random amounts to the data to disguise the true
value.
*	Some encrypt this random amount so some users can get the real values.
*	Show how this can be applied to medical records used for research.
*	Show how this can help hide the position of ships.



Sample Homework Questions:


*	Write a program to try random values of x until MD5(x) ends with the
sixteen bit value FF. How many random values should it take? Run your
program. Do you come close? Repeat this 1000 times and report the average
number of samples that must be tested before one is found. Now, extrapolate
how long it will take for your computer to completely find an answer that
matches a complete 160-bit result from MD-5.
*	Create a tool for protecting medical records in a trial. Determine
which fields to scramble and which fields to leave in the clear. 
*	Describe some possible attacks against the scheduling algorithms
described in Chapter 4.
*	Describe three ideal databases where one-way functions can prevent
abuse. Describe several examples where the technique will fail.
*	Describe three ideal databases where false entries can distract
attackers. Describe several cases where the fake entries will corrupt the
database. Can this problem be avoided?
*	Describe three examples where blurring data with quantization can add
enough confusion to block attackers. Can you think of examples where too
much confusion also confounds the regular users? Are there examples where
there's no middle ground?


--------
Table of Contents



1--Translucency--
1.1--Some Examples--
  1.2--Limits--
  1.3--How to Use the Book--
  1.4--Some Examples--

2--One Way Functions--
  2.1--Pure One-Way Functions--
        2.1.1--Discrete Log--
        2.1.2--The Secure Hash Algorithm or SHA--
        2.1.3--MD-5--
  2.2--Public Key or Trapdoor Function--
  2.3--Secret Key Functions--
        2.3.1--Turning a secret key function into a pure one-way function.--
        2.3.2--Turning One-Way Functions Into Secret-Key Encryption Functions--
  2.4--Implementations--
        2.4.1--MySQL--
        2.4.2--PostgreSQL--
        2.4.3--Oracle--
        2.4.4--Client-side Applications--
  2.5--Conclusions--
        2.5.1--Lessons--

3--One Way Tables--
  3.1--An Example from a Department Store--
        3.1.1--Adding Security--
  3.2--Cleaning Up One-Way Input--
        3.2.1--Some Java Code--
  3.3--Security Trade Offs--
        3.3.1--Slowing the One-Way Functions--
        3.3.2--Salt--
  3.4--Adding Redundancy--
  3.5--An Example with Encryption for Security--
        3.5.1--Some Java Code--
  3.6--Hashing Instead of Encryption--
  3.7--Serial Queries--
  3.8--Keeping Some Information In the Clear--
        3.8.1--Inserting a Credit Card Number--
        3.8.2--Using the Information--
  3.9--Conclusions--
        3.9.1--Lessons--

4--Coordinating Users--
  4.1--A Bulletin Board Example--
        4.1.1--Adding a Shared Password--
  4.2--Special One-Way Functions--
        4.2.1--Creating A Public Key--
        4.2.2--Using the Public Key--
        4.2.3--Recovering Messages--
        4.2.4--Using Public-Key One-Way Functions--
  4.3--Conclusion--
        4.3.1--Lessons--

5--Synchronization--
        5.0.2--The BabySitter's Table--
        5.0.3--Adding More Names--
        5.0.4--Multiple Tables--
        5.0.5--Adding Extra Information--
        5.0.6--Security--
  5.1--Conclusions--
        5.1.1--Lessons--

6--Evolving Data--
  6.1--An Auction Example--
        6.1.1--The First Bid--
        6.1.2--Adding New Bids--
        6.1.3--Creating Bids--
        6.1.4--The Value of Counter--
        6.1.5--Better Hash Functions--
  6.2--Working With Encryption--
  6.3--Conclusions--
        6.3.1--Lessons--

7--Sharing--
  7.1--The Algorithms--
        7.1.1--More Precise Algorithms--
        7.1.2--More Efficient Algorithms--
        7.1.3--Adding Sophistication--
  7.2--Nuclear Launch Codes--
        7.2.1--Adding Launch Codes--
        7.2.2--Recovering the Code--
        7.2.3--Adding More Security--
  7.3--A Public-Key Example--
        7.3.1--Adding a Message--
        7.3.2--Retrieving the Message--
  7.4--Conclusions--
        7.4.1--Lessons--

8--Revelation--
  8.1--A Masquerade--
  8.2--Lottery--
        8.2.1--Paying for the Ticket--
        8.2.2--Placing Bets--
        8.2.3--Testing Winners--
  8.3--Sports Poker and Multiple Columns--
        8.3.1--Inserting Predictions--
        8.3.2--Testing and Verifying--
  8.4--Identity Cards and Selective Revelations--
        8.4.1--The Basic Mathematics--
        8.4.2--A Rental Car Example--
        8.4.3--The License--
        8.4.4--Proving Information--
        8.4.5--The Rental Car Company--
  8.5--Conclusions--
        8.5.1--Lessons--

9--Quantization--
  9.1--Algorithms--
        9.1.1--Adaptive Quantization--
        9.1.2--Projection--
  9.2--Using Quantization In Databases--
        9.2.1--Adding Random Noise--
        9.2.2--Adding Encryption--
  9.3--Quantized One-Way Functions--
        9.3.1--One-Way Functions and Noise--
  9.4--Conclusions--
        9.4.1--Lessons--

10--Authentication--
  10.1--Digital Signature Taxonomy--
        10.1.1--One-Way Functions and Signatures--
        10.1.2--Modular Exponentiation and Signatures--
  10.2--Adding Digital Signatures To SQL Databases--
        10.2.1--A Hash-based Signature--
        10.2.2--Signatures Using Exponentiation--
  10.3--Fake Information--
        10.3.1--An Appointment System--
        10.3.2--Adding Entries With Signatures--
        10.3.3--Adding Fake Entries--
        10.3.4--Finding the Results--
        10.3.5--Modifications--
  10.4--Conclusions--
        10.4.1--Lessons--

11--Accounting--
  11.1--Sales Force Accounting--
        11.1.1--Adding Values--
        11.1.2--Checking Things Out--
  11.2--Conclusions--
        11.2.1--Lessons--

12--Tokens--
  12.1--Prescription Records--
        12.1.1--Inserting Records--
        12.1.2--A Relatively Fast Mechanism for Retrieval--
        12.1.3--A More Secure Mechanism--
        12.1.4--At the client--
        12.1.5--At the database--
        12.1.6--Using transparency--
        12.1.7--Dealing with the Challenge--
  12.2--Conclusions--
        12.2.1--Lessons--

13--Private Retrieval--
  13.1--Stock Prices From Multiple Sources--
  13.2--A Single Server Example--
        13.2.1--Using More Decoys--
  13.3--A Patent Example--
  13.4--Conclusions--
        13.4.1--Lessons--

A--Further Reading--

----------------------


-- 
-----------------
R. A. Hettinga <mailto: rah at ibuc.com>
The Internet Bearer Underwriting Corporation <http://www.ibuc.com/>
44 Farquhar Street, Boston, MA 02131 USA
"... however it may deserve respect for its usefulness and antiquity,
[predicting the end of the world] has not been found agreeable to
experience." -- Edward Gibbon, 'Decline and Fall of the Roman Empire'

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at wasabisystems.com



More information about the cryptography mailing list