Rijndael in Assembler for x86?

Helger Lipmaa helger at tcs.hut.fi
Sat Sep 15 15:55:10 EDT 2001


First, I asked because Perry(?) did not originally specify *why* he needs
assembly code; and second, because the 186 assembly code that was referred
to might be slower than the best C code for the Pentium. On the other hand,
the best (commercial) assembly implementation of Rijndael for the P3 is
>50% faster (~230 cycles per block versus ~360 cycles per block) than
Brian Gladman's (free) C implementation. Brian's implementation seems to be
almost optimal for C code. The reasons why assembly code achieves such a
speedup are explained to some extent in

* Kazumaro Aoki, Helger Lipmaa, "Fast Implementations of AES Candidates",
  AES 3 conference, 2000.

Both this paper and a compendium of AES implementations are available
from http://www.tcs.hut.fi/~helger/aes (if you have anything to add there,
feel free to email me!). I am *not* aware of any free Rijndael assembly
implementations that are faster than 300 cycles per block on the P3. I do
know of some non-free implementations (including mine) that are faster,
though.
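
To make the cycle counts above easier to compare, here is a minimal C
sketch of the arithmetic. The two cycle counts are the ones quoted in this
message; the function names and the clock-rate parameter are hypothetical
and only for illustration (the 16-byte block size is that of AES).

```c
#include <assert.h>

/* Size of one Rijndael/AES block in bytes. */
#define BLOCK_BYTES 16.0

/* Throughput ratio of a fast implementation over a slow one,
   given their cycles-per-block figures. */
double speedup(double slow_cycles, double fast_cycles)
{
    assert(fast_cycles > 0.0);
    return slow_cycles / fast_cycles;
}

/* Convert cycles per block into cycles per byte. */
double cycles_per_byte(double cycles_per_block)
{
    return cycles_per_block / BLOCK_BYTES;
}

/* Bytes encrypted per second at a given clock rate.
   clock_hz is an assumption supplied by the caller, not a measured value. */
double bytes_per_second(double cycles_per_block, double clock_hz)
{
    assert(cycles_per_block > 0.0);
    return clock_hz / cycles_per_block * BLOCK_BYTES;
}
```

With the figures above, speedup(360.0, 230.0) is roughly 1.57, which is
where the ">50% faster" comes from, and cycles_per_byte() gives about 14.4
versus 22.5 cycles per byte.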

Helger

On 14 Sep 2001, Ian Goldberg wrote:

> >> > Does anyone have an open source implementation of Rijndael in
> >> > assembler for the Pentium?
> >> 
> >> Why not just use C code?
> >
> >Because it is typically slower by many times than hand tuned assembler.
> 
> Are you sure?  For general code, that certainly hasn't been true in a
> long time; optimizing compilers nowadays can often do *better* than
> hand-coded assembler.  However, for encryption code in particular,
> I can imagine the C primitives (which usually lack rotate, etc.
> instructions) may be suboptimal.
> 
> That being said, back when I wrote the 40-bit RC5 breaker for the RSA
> challenge, I thought the same thing.  I figured I would first write a C
> version, and then tune the resulting assembler.  When I looked at what
> gcc had output, it had already done all the tricks I had in mind.
> 
> I would severely doubt a slowdown of "many times".  I'm more likely to
> believe a few percent, and would not be surprised if the compiler's
> optimizer is smarter than most people's.
> 
>    - Ian
> 
> [Moderator's note: The best DES implementations for i386s in assembler
> are several times faster than the best in C. I'm not sure about AES
> but I'd prefer to try and see. Perhaps it's a feature of DES's odd bit
> manipulation patterns, perhaps not. I have yet to see GCC produce code
> for almost anything that was just as fast as hand tuned assembler,
> though. --Perry]
> 
> ---------------------------------------------------------------------
> The Cryptography Mailing List
> Unsubscribe by sending "unsubscribe cryptography" to majordomo at wasabisystems.com
> 
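
On the point in the quoted message that C lacks a rotate primitive: the
usual workaround is a shift-and-or idiom, sketched below. The names rotl32
and rotr32 are hypothetical helpers, not from any code discussed in this
thread. Compilers such as GCC commonly recognize this pattern and emit a
single rotate instruction, but that depends on the compiler and the
optimization level, so it is worth checking the generated assembly.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits; n is reduced modulo 32.
   The masks keep every shift count in 0..31, so no shift is undefined. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Rotate a 32-bit word right by n bits; n is reduced modulo 32. */
static inline uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

For example, rotl32(0x12345678, 8) yields 0x34567812.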



