[Cryptography] 0xFFFFFFFF is the loneliest number that you'll ever do

Florian Weimer fw at deneb.enyo.de
Fri Nov 1 06:24:16 EDT 2019


* Jerry Leichter:

> Or the randomest number.  At least according to the AMD RDRAND
> instruction on some chips.
>
> Microcode bug: CPUID on AMD Ryzen 3000 chips reports that RDRAND is
> implemented, and invoking it reports that the value returned is valid
> - but the value returned is always 0xFFFFFFFF.
>
> The bug was fixed in an AMD microcode patch in July, but board makers
> have been ... random about issuing BIOS updates.  At one point Asus
> had BIOS updates dated well after the microcode patch - which didn't
> include it.
>
> All sorts of merriment ensued.  Some Linux distributions fail to boot
> because systemd checks the RDRAND results and hangs if they fail.
> (There's a workaround patch.)  WireGuard loops on a second connection.
> Other system features and programs may have workarounds using
> different randomness sources or may proceed blindly with unfortunate
> results.
>
> Quite the mess.  Interesting writeup at
> https://arstechnica.com/gadgets/2019/10/how-a-months-old-amd-microcode-bug-destroyed-my-weekend/

This still doesn't say how the bug was fixed.  Does the instruction
now work correctly?  Or is the CPUID bit for RDRAND just masked to
hide the instruction?

Recent Linux versions do the latter:

commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24
Author: Tom Lendacky <thomas.lendacky at amd.com>
Date:   Mon Aug 19 15:52:35 2019 +0000

    x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h
    
    There have been reports of RDRAND issues after resuming from suspend on
    some AMD family 15h and family 16h systems. This issue stems from a BIOS
    not performing the proper steps during resume to ensure RDRAND continues
    to function properly.
    
    RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be
    reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND
    support using CPUID, including the kernel, will believe that RDRAND is
    not supported.
    
    Update the CPU initialization to clear the RDRAND CPUID bit for any family
    15h and 16h processor that supports RDRAND. If it is known that the family
    15h or family 16h system does not have an RDRAND resume issue or that the
    system will not be placed in suspend, the "rdrand=force" kernel parameter
    can be used to stop the clearing of the RDRAND CPUID bit.
    
    Additionally, update the suspend and resume path to save and restore the
    MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in
    place after resuming from suspend.
    
    Note, that clearing the RDRAND CPUID bit does not prevent a processor
    that normally supports the RDRAND instruction from executing it. So any
    code that determined the support based on family and model won't #UD.

<https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c49a0a80137c7ca7d6ced4c812c9e07a949f6f24>
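For concreteness, the bit positions named in the commit message can be
written out as plain arithmetic.  This is only a sketch: the helper
names are made up for illustration, it operates on values passed in
rather than executing CPUID or touching the MSR, and it is not the
kernel's code.  The last helper is a hypothetical sanity check for the
all-ones symptom described in the quoted message; whether a given
program applies such a check is not something the commit tells us.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CPUID Fn00000001_ECX[30]: RDRAND support flag (per the commit message). */
#define CPUID_ECX_RDRAND_BIT     30
/* MSR C001_1004[62]: clearing it resets the flag (per the commit message). */
#define MSR_C0011004_RDRAND_BIT  62

/* Hypothetical helper: does a raw CPUID leaf 1 ECX value advertise RDRAND? */
static inline bool ecx_advertises_rdrand(uint32_t ecx)
{
    return ((ecx >> CPUID_ECX_RDRAND_BIT) & 1u) != 0;
}

/* Hypothetical helper: the MSR value with bit 62 cleared, all else kept. */
static inline uint64_t msr_with_rdrand_hidden(uint64_t msr)
{
    return msr & ~(UINT64_C(1) << MSR_C0011004_RDRAND_BIT);
}

/*
 * Hypothetical sanity check: true if every sample is 0xFFFFFFFF.
 * A caller would collect a few RDRAND outputs and distrust the source
 * if this returns true.  An empty sample set is not treated as stuck.
 */
static inline bool samples_look_stuck(const uint32_t *samples, size_t n)
{
    if (n == 0)
        return false;
    for (size_t i = 0; i < n; i++) {
        if (samples[i] != UINT32_MAX)
            return false;
    }
    return true;
}

/* Small self-check exercising the helpers on made-up input values. */
static inline void self_check(void)
{
    const uint32_t stuck[] = { UINT32_MAX, UINT32_MAX, UINT32_MAX };
    const uint32_t mixed[] = { UINT32_MAX, 0x12345678u, UINT32_MAX };

    assert(ecx_advertises_rdrand(UINT32_C(1) << CPUID_ECX_RDRAND_BIT));
    assert(!ecx_advertises_rdrand(0));
    assert(samples_look_stuck(stuck, 3));
    assert(!samples_look_stuck(mixed, 3));
}
```

Note that, as the commit message itself says, hiding the CPUID bit
does not stop the instruction from executing, so a check on the
returned values and a check on the CPUID bit answer different
questions.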

