[Cryptography] Other obvious issues being ignored?

Wed Oct 21 17:48:44 EDT 2015

On 10/21/15 at 2:13 PM, jmg at funkthat.com (John-Mark Gurney) wrote:

>Bill Frantz wrote this message on Wed, Oct 21, 2015 at 07:26 -0700:
>>On 10/20/15 at 8:40 PM, leichter at lrw.com (Jerry Leichter) wrote:
>>
>>>I wonder how the NSA writes its security-related code?
>>
>>Assembler is your friend.
>
>I really don't want to do register allocation by hand, and you can't
>use inline assembly in C, because clang is known to look into inline
>assembly and optimize it...

Then don't use inline assembly. Use a separate compile, and if 
necessary dynamic loading so the compiler isn't even around to 
mess with your code.

Sorry about having to manually perform register allocation, but 
it isn't really that hard. I have written assembly code for the 
IBM 650 -- ACC and MQ registers only, the IBM 1620 -- no 
registers, all memory to memory, the IBM 370 -- 16 general 
registers, and Sun SPARC -- register windows, among other 
processors. The register allocation process was basically: start 
with register 0 (or 1) and work up, paying attention to the 
registers with conventional uses. If you run out, spill them to 
the stack. If necessary, look at what the compiler does when it 
compiles the algorithm as an example.

There is the question of how you divide up a function (like 
hashing or encrypt/decrypt) between a compiler that might 
optimize your memory clearing away and separately compiled 
assembler. There are a lot of engineering tradeoffs.

If your problem is zeroing memory, then the assembler solution 
is easy and separate from the more complex functions. Call zero 
memory at the end of the function. If necessary, add its return 
value to your return value, and have the assembly code always 
return zero to ensure the call isn't optimized away.
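
A rough sketch of the calling pattern, in C. The names zero_memory 
and checksum_key are made up for illustration, and zero_memory is 
written in C with volatile stores only so the example stands alone. 
In the scheme described above it would be a separately assembled 
routine that the compiler never sees.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in (assumption) for the separately assembled routine.
 * The stores go through a volatile pointer so the compiler must
 * emit them. The routine always returns 0. */
int zero_memory(void *p, size_t n)
{
    volatile unsigned char *vp = p;

    while (n--)
        *vp++ = 0;
    return 0;
}

/* Hypothetical caller: copies a key into scratch space, does some
 * toy work on it, then clears the scratch space on the way out. */
int checksum_key(const unsigned char *key, size_t len,
                 unsigned char *scratch)
{
    int sum = 0;
    size_t i;

    memcpy(scratch, key, len);
    for (i = 0; i < len; i++)
        sum += scratch[i];

    /* Adding the (always zero) return value makes the result
     * depend on the call, so the call cannot be dropped. */
    return sum + zero_memory(scratch, len);
}
```

The caller's result is unchanged because the added value is always 
zero, but the compiler cannot know that when the routine lives in a 
separately assembled object file.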

If your problem is keeping sensitive data, like keys, out of 
registers so it isn't saved to kernel task switch data 
structures, then the problem is harder, and I think unsolvable 
on RISC machines (like ARM) and very unlikely to be solvable on 
X86. Probably the best approach is to be able to disable 
interrupts while the data is in a register and clear the 
register before re-enabling.

>>With fewer hardware architectures now than in the past, it is 
>>actually practical to write separate assembler routines for 
>>each architecture to perform simple tasks like clearing 
>>sensitive data.
>
>That's one minor aspect, and assumes that the C function doesn't
>do odd things w/ the data...

What C function? The one calling the assembler subroutine? What 
kind of odd things are you concerned about?

>>With RISC architectures it is probably impossible to write 
>>code which keeps sensitive data out of registers and therefore 
>>out of kernel memory on task switch. Is it possible on the X86 
>>architecture? In any case, assembler will offer higher 
>>assurance of what the code actually does than any compiled language.
>>
>>The key here is to keep the assembler code simple enough that 
>>you can get reasonable assurance of correctness. Saying
>
>Having looked at the SHA-256 (SSE4) or AES-GCM (AES-NI) implementations
>in assembly, they are not simple at all...  nasm/yasm helps a bit
>with this, but still, not nearly that easy...

Macro assemblers that let you write macros to implement IF - 
THEN - ELSE - FI and looping structures go a long way toward 
making assembler code easier to read. The IBM 370 Assemblers 
were wonderful tools for just that reason. Just avoid optimizing 
assemblers. :-)

Cheers - Bill

------------------------------------------------------------------------
Bill Frantz        | "Insofar as the propositions of mathematics refer to
408-356-8506       | reality, they are not certain; and insofar they are
www.pwpconsult.com | certain, they do not refer to reality." -- Einstein