[Cryptography] Vulnerability of RSA vs. DLP to single-bit faults

Michael Kjörling michael at kjorling.se
Sun Nov 2 06:13:43 EST 2014


On 1 Nov 2014 15:40 -0700, from mitch at niftyegg.com (Tom Mitchell):
> The magnitude of these commonly undetected errors is easy to underestimate
> and it does make sense for code that uses the key to keep the bits in a
> data structure with multiple bit error detection both in memory and on disk.

A recent study [1] found that memory errors are far more common than
previously thought. PDF page 11, section 7, conclusion 1:

> About a third of machines and over 8% of DIMMs in our fleet saw at
> least one correctable error per year. Our per-DIMM rates of
> correctable errors translate to an average of 25,000–75,000 FIT
> (failures in time per billion hours of operation) per Mbit and a
> median FIT range of 778 – 25,000 per Mbit (median for DIMMs with
> errors), while previous studies report 200-5,000 FIT per Mbit. The
> number of correctable errors per DIMM is highly variable, with some
> DIMMs experiencing a huge number of errors, compared to others. The
> annual incidence of uncorrectable errors was 1.3% per machine and
> 0.22% per DIMM.
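
To put those FIT figures in perspective, here is a back-of-the-envelope
calculation of my own (assuming a 1 GB DIMM, i.e. 8,192 Mbit): at the
average rate of 25,000 FIT per Mbit, that single DIMM accumulates
25,000 × 8,192 ≈ 2.05×10^8 failures per 10^9 hours, or roughly 0.2
correctable errors per hour, on the order of 1,800 per year. The
distribution is highly skewed, though; at the quoted median of 778 FIT
per Mbit the same DIMM would see closer to 55 correctable errors per
year.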

Also note that memory modules which exhibit detected, uncorrectable
errors are normally replaced, so any given DIMM will typically
experience only a single uncorrectable error during its service life.
That likely lowers the number of uncorrectable errors reported,
compared to what would have been seen had the module been kept in use.

With RAM that does not employ error detection, there is no way to
detect any of those errors. The user would likely brush the resulting
misbehavior off as a software bug causing a crash, perhaps reboot, and
keep working; once the crashes pile up, they might simply buy a new
computer or, if they are particularly computer savvy, reinstall the
system from scratch, neither of which solves the underlying problem,
of which they are likely to be unaware.

I personally find these figures, while not cause for panic,
significant enough to warrant attention for critical data, and have
gone with ECC RAM (which has yet to report any problems) myself.

The problem with software-level memory error detection is that even if
you are "lucky" in the sense that memory corruption, when it occurs,
hits precisely the data you are protecting (which is far from certain;
you are probably more likely to hit code or cache than a few thousand
bits of key material), something like a stuck bit in RAM means you
will be computing the error-detection data over already-incorrect
data. Such schemes can detect errors that arise after the data has
been stored in RAM, but they cannot detect errors that exist from the
beginning.

As a somewhat extreme example: if I do something trivial like
memset(buf, 0, 1024) to set my 1024-byte block of memory to all
zeroes, but _a single bit is *stuck* at 0_, I can compute any checksum
over that block I want and it'll tell me all is well. If I then fill
that block by, for example, reading key data from disk, I have a bad
situation with roughly 50% probability: the stuck bit goes unnoticed
exactly when the correct value at that position happens to be 0, and
for key material each bit is 1 with probability about one half, so RAM
now holds something other than what came from storage about half the
time. _Reliably_ detecting the problem without hardware support is a
non-trivial problem; _at a minimum_, all memory-writing operations
would need to double-check the results in a way that is immune to
caching, as sketched below.
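
To make that last point concrete, here is a minimal C sketch of what
such double-checking might look like; checked_memset is a hypothetical
name of my own, not any established API. Reading back through a
volatile pointer prevents the compiler from assuming the write
succeeded and optimizing the verification away:

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical sketch: write, then verify the write actually
     * landed. The volatile qualifier stops the *compiler* from caching
     * the value; it does nothing about the CPU's caches. */
    static int checked_memset(void *dst, int value, size_t len)
    {
        memset(dst, value, len);

        volatile const unsigned char *p = dst;
        for (size_t i = 0; i < len; i++) {
            if (p[i] != (unsigned char)value)
                return -1;  /* mismatch: possible stuck or flipped bit */
        }
        return 0;
    }

Even this is necessary rather than sufficient: the read-back may well
be served from the CPU cache without ever touching the failing DRAM
cell, and genuinely forcing the access out to memory requires cache
flushes or uncached mappings that portable C cannot express. Which is
exactly why I consider hardware support (ECC) the sane answer.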


[1]: Schroeder, Bianca; Pinheiro, Eduardo; Weber, Wolf-Dietrich
 (2009). "DRAM Errors in the Wild: A Large-Scale Field Study" (PDF).
 SIGMETRICS/Performance (ACM). ISBN 978-1-60558-511-6.
 http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf

-- 
Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se
OpenPGP B501AC6429EF4514 https://michael.kjorling.se/public-keys/pgp
                 “People who think they know everything really annoy
                 those of us who know we don’t.” (Bjarne Stroustrup)

