[Cryptography] What's a Plausible Attack On Random Number Generation?

Thu Oct 31 17:59:09 EDT 2013

On 10/31/2013 08:09 AM, Jerry Leichter wrote:
> It's become very difficult to know where sound engineering practice
> ends and tin-hat paranoia begins.  I'd like to propose one way to get
> at what reasonable boundaries are:  I'll lay out a particular, I hope
> realistic, proposal, and then ask the group to play attacker.  After
> all, the best way to understand what's needed for defense is to try
> to attack.  

Thanks, this is exactly the sort of clarity and specificity we need.

> I am the owner of a data center.  I have some hundreds of physical
> machines, with old ones being decommissioned and new ones being added
> periodically.  These run thousands of VM's that are created and
> destroyed much more frequently.  The machines are interconnected by a
> number of high-speed (say 10GigE or better) networks, further
> partitioned for security purposes into multiple VLAN's.  None of
> these networks are visible outside the boundaries of my data center.
> Connection to the Internet is via bastion routers.  There's a DMZ,
> all the usual stuff.  I have excellent physical security, and I
> regularly "patrol the perimeter", as well as the internal physical
> plant, so I assume that any attempt to run additional network links
> into my data center, plant significant hardware, etc., will be
> noticed fairly quickly.  I'm actually paranoid enough that I've
> spec'ed conductive mesh throughout the outer structure, so the data
> center is effectively inside a Faraday cage (i.e., no magic r adio
> connections to the outside), and of course I check this periodically
> as well.  As a result of all this, I choose not to consider physical
> threats; nor, because ultimately there's nothing I can do about them,
> on-going sophisticated insider attacks.

OK, that's clear.  That's sound engineering.

> I use a very simple random number generator: 

Just to state the obvious, we are talking about a PRNG unless
otherwise specified.

It should also be noted that one can imagine a TRNG without
a PRNG but not vice versa.  That is, you have to *seed* the
PRNG from somewhere.  You could perhaps seed PRNG "A" from
PRNG "B" ... but that just creates a Cat-In-The-Hat-Comes-Back
problem.  At some point you have to seed something with real
honest-to-goodness entropy.

It should go without saying that "randomness" is not synonymous
with "entropy".  The output of a PRNG may be /random enough/
in the sense that breaking it is computationally infeasible.
In contrast, the output of a proper TRNG is unbreakable in
principle.

It should also go without saying that a PRNG is good enough
for some purposes but not others.  YMMV but I would normally
be OK with using a cryptologically strong PRNG for ephemeral 
session keys ... but I would insist on genuine a TRNG with
100% entropy density for cutting a long-term RSA key.

Last but not least, it should go without saying that there 
is a huge difference between
 a) a /calibrated/ source of entropy, and
 b) an uncalibrated source of alleged entropy.
The former is a lot harder to do ... and incomparably more
useful.  There is a category I call "squish" i.e. bits that 
are neither reliably predictable nor reliably unpredictable.
See the diagram at
  http://www.av8n.com/turbid/paper/turbid.htm#sec-def-random

I get tired of people providing long lists of sources that
"might" be unknown to an attacker.  It is incomparably
better from an engineering point of view to have a short
list of sources that are absolutely, positively unknown
to the attacker.

>  There's a "random pool"
> of a few thousand bits.  At first boot, it's initially populated
> using whatever I can get my hands on principally the MAC address, any
> serial numbers available, etc.  

I wouldn't do it that way.  The smart thing to do is to
"populate" (i.e. provision) the "pool" (aka seed) with good-
quality randomness.  It is always good and very often
necessary for this to come from outside the machine, before
it is booted for the first time. Provisioning such a thing 
is not a big deal.  Whether it is a handheld device or a VM 
or whatever, it needs to be provisioned with a MAC address,
IMEI number, hostname, /etc/passwd, et cetera ... so there
is /already/ a provisioning process.  We just need to add
the seed-file to the list of things that MUST be provisioned
if you want the machine to be secure.

> The first
> boot logic will wait for a significant period of time to gather
> entropy as I'll describe in a moment before allowing any processes
> that use random numbers to proceed.  (Reboot may also wait a bit, but
> not nearly as long.)

That's not smart, unless the machine has an unusually
good /calibrated/ entropy source.  The smarter thing to 
do is to provision a proper seed, so that no blocking 
is necessary.  The seed needs to be big enough and 
random enough and not known to the attacker.

To guard against any possibility of replay, /stir/ the
seed using the RTC.  The RTC value is assumed known to
the attacker, but the seed is assumed unknown, and
hashing the RTC with the seed prevents replays and
sends the attacker back to square one.

  It must be emphasized that the RTC provides zero
  entropy, and by itself it would be useless, but
  it serves as a source of non-repeating values, which
  is plenty good enough to /stir/ the seed.  Other 
  things like MAC addresses and serial numbers are
  not worth fooling with, because they are the same
  every time you look at them, and because they are
  known to the attacker, or too-easily guessable.

In the case of a bootable DVD or CD-rom, download the 
.iso image, unpack it, provision a proper seed, pack it
back up again, and then burn the properly-provisioned
image onto the rom.

> At shutdown, the state is stored locally; 

Actually, you shouldn't wait for shutdown.  Recompute a
new seed file at the /first/ opportunity, not the last.
Do not assume the machine will undergo an orderly shutdown.
I've seen machines that essentially never undergo an orderly
shutdown.

> at restart, it's restored to the previous state.  

> There is *exactly two* sources of entropy.  

Not necessarily.

> The primary one is
> watching the network:  ....  The secondary source
> is other machines in the datacenter....

If you really are running a datacenter, you can well afford
to run turbid on one or more of the machines.  This will
provide copious amounts of hard-core industrial-strength
entropy.  This can be distributed to guest machines in
the usual variety of ways.

If I were doing it, I would be tempted to run turbid on
every host, and distribute the output to the guest VMs
via the virtio-rng interface.  This guarantees the guests
have plenty of TRNG entropy to use for seeding their PRNGs,
for cutting long-term keys, et cetera.

> OK, this is the classic setup in which all my apparently-random data
> comes from a network that *could* be observed.  The question is:  Can
> you mount a *plausible* attack?  

Well, there are lots of things that could go wrong.  Murphy
was an optimist.

> Note that attacks which start off assuming you have root on all the
> machines are out of bounds: 

Well, yes and no.

It is possible to have an in-bounds discussion of privilege
/escalation/ attacks, where somebody starts with temporarily 
"too much" privilege ... but rather less than full permanent 
root access.

For starters, imagine somebody getting read-only privileges
for only a millisecond.  He swipes a copy of the random-seed
file and then forces a reboot.  This would be a Bad Thing.
One action item here is to make sure the file-access modes
are set properly on the seed file(s).

Other categories of attack are more cryptanalytic in nature.
For instance, the PRNG might be built using SHA-1 or AES.
The security of such a thing is very sensitive to the "mode"
plumbing, not just to the primitive hash or cipher itself.

Since the PRNG per se is deterministic, anybody who captures
the state of the PRNG can predict all future outputs, up to 
the next reseeding event.  In contrast, with a well-designed
PRNG, capturing the state does not allow prediction of past
traffic.  That is to say, proper "mode" plumbing provides
forward secrecy.  This does not prevent an attack, strictly
speaking, but it makes the attack less damaging.

[1] A reasonable place to start is
  John Kelsey, David Wagner, Bruce Schneier, and Chris Hall
  “Cryptanalytic Attacks on Pseudorandom Number Generators”
  https://www.schneier.com/paper-prngs.pdf

[2] An amusing exercise is to read
  Elaine Barker and John Kelsey,
  NIST Special Publication: “Recommendation for Random Number 
      Generation Using Deterministic Random Bit Generators”
  http://csrc.nist.gov/publications/nistpubs/800-90A/SP800-90A.pdf

For example, look at the plumbing on page 51.  There are some
pieces there that may seem non-obvious at first, but are
there to block specific attacks mentioned in reference [1].

Also look at the six references starting here:
  http://www.av8n.com/turbid/paper/turbid.htm#bib-SSL

These are more than plausible, these are attacks that have
succeeded in the past.

> Hopefully Linux 3.13 will initialize some of these secrets (socket
> hashes, syncookie key, tcpfastopen key) as late as possible, e.g.
> when the secret is needed for the first time (next step is to have no
> long-lived secrets at all because it may still be too early).

That is a major issue.

The other way of dealing with it is to seed the kernel PRNG
as early as possible.  I have toyed with the idea of having
a random seed on the initrd (initial ramdisk) and making sure 
it gets loaded very very early.  It could be a plain file or
perhaps a random-seed.ko module.  Refreshing such a thing would 
require unpacking and repacking the initrd image, which is
work, but not unbearably much work.

Similarly one could imagine linking such a thing right into
the kernel image.  Refreshing would require unpacking and
repacking the bzImage, which is work, but not unbearably 
much work.