[Cryptography] XOR linked list & crypto

Jerry Leichter leichter at lrw.com
Thu Feb 18 17:21:09 EST 2016


> There are a lot of reasons why we do lists using the naive implementation, where the "next" and "previous" slots in the object hold the absolute addresses of the neighboring elements in the list. For one thing, you can just go look at the memory and see where it is.
> 
> But there are plenty of places where lists were done differently, as well.
> 
> Assume a naive doubly-linked list where a data structure by convention has as its first two slots the next and previous pointers. Note that a solitary element really ought to have its next and previous pointers be the address of the element itself. We can define operations INSQUE and REMQUE that insert and remove an element from the queue. Very simple, and very powerful.
> 
> But let's suppose you want to do this in shared memory -- where two processors share some but not all memory. Modern multi-CPU systems share all memory, and they number it the same. In our system, let's suppose we want to have a co-processor that does crypto, and the start of this shared section may not have the same base address.
> 
> You can easily modify the previous system to use the offset rather than the pointer.... In my shared memory, I also want to have an interlock so that I don't get race conditions where both processors munge things at the same time. I also need separate ways to handle inserting and removing at the head and tail of the queue. So I end up having INSQHI/REMQHI and INSQTI/REMQTI for handling these self-relative queues. And note that conveniently, the lone element has zeros in its default slots.
I'm sure you know this, but this isn't some wild theory:  These instructions - with exactly these names - were available on the VAX many years back.  They were very widely used with VMS (DEC's operating system for the VAX), so when DEC produced the Alpha as a successor to the VAX, the VMS version of the PAL code (essentially loadable microcode that could be tailored for a particular OS) included these instructions.
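
For anyone who never ran into them: here's a minimal sketch in C of the self-relative representation the quoted text describes.  The naming is mine, and there's no interlocking - which is exactly the hard part INSQHI and friends handled atomically - so this is an illustration of the offset idea, not the instruction semantics.

    #include <stdint.h>

    /* Each queue element begins with two self-relative links: the signed
     * byte offset from this entry to its successor and predecessor.  A
     * lone element (or an empty queue header) stores 0 in both slots. */
    typedef struct qent {
        intptr_t flink;   /* offset to next entry     */
        intptr_t blink;   /* offset to previous entry */
    } qent;

    #define QNEXT(e) ((qent *)((char *)(e) + (e)->flink))
    #define QPREV(e) ((qent *)((char *)(e) + (e)->blink))

    /* Insert 'e' immediately after 'pred' (no interlock here). */
    static void insq(qent *e, qent *pred)
    {
        qent *succ = QNEXT(pred);
        e->flink    = (char *)succ - (char *)e;
        e->blink    = (char *)pred - (char *)e;
        pred->flink = (char *)e    - (char *)pred;
        succ->blink = (char *)e    - (char *)succ;
    }

    /* Remove 'e' from whatever queue it is on. */
    static void remq(qent *e)
    {
        qent *succ = QNEXT(e);
        qent *pred = QPREV(e);
        pred->flink = (char *)succ - (char *)pred;
        succ->blink = (char *)pred - (char *)succ;
        e->flink = e->blink = 0;   /* back to the lone-element state */
    }

Because only offsets are stored, the two processors can map the shared region at different base addresses and the links stay valid.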

> But -- there's no reason I have to use subtraction. I could use XOR. I can store &foo ^ &next for the next pointer and &foo ^ &prev for the previous pointer. I can even observe that if I store only one thing -- &next ^ &prev -- then as long as I know &foo, I can collapse the two pointers into one.
Well ... yes.  But even if you just replace subtraction with XOR, your queue is no longer location-blind - you can no longer map it to an arbitrary location in memory unchanged, as you could with differences.  (BTW, if you want to think about this in a broader way:  Using the difference essentially replaces a coordinate with a vector, which makes the choice of origin irrelevant.  We don't normally think of "one-dimensional vectors" this way, though it's a central (indeed motivating) reason to use them in higher dimensions.)
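
Concretely: shift every address by the same d and a difference is untouched -- (a+d) - (b+d) = a - b -- while an XOR generally isn't: with a=1, b=2, d=1 you get a^b = 3 but (a+d)^(b+d) = 2^3 = 1.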

If you change things to just the XOR, you further lose the ability to use a single address to define an entry point to the queue.  You always need a pair.
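
For anyone who hasn't seen it spelled out, here's a minimal sketch in C (mine, not from the original post) of the collapsed representation: each node stores prev XOR next, and the "cursor" you carry while walking is necessarily a pair of addresses.

    #include <stdint.h>
    #include <stdio.h>

    typedef struct xnode {
        int value;
        uintptr_t link;   /* (uintptr_t)prev ^ (uintptr_t)next */
    } xnode;

    /* Given where we came from and where we are, recover the next node. */
    static xnode *xstep(const xnode *prev, const xnode *cur)
    {
        return (xnode *)(cur->link ^ (uintptr_t)prev);
    }

    /* Walk the list; the entry point must be a *pair* of addresses
     * (NULL, head), not a single head pointer. */
    static void xwalk(xnode *head)
    {
        xnode *prev = NULL, *cur = head;
        while (cur != NULL) {
            printf("%d\n", cur->value);
            xnode *next = xstep(prev, cur);
            prev = cur;
            cur  = next;
        }
    }

Building the list is the mirror image: node a's link is 0 ^ (uintptr_t)&b, b's is (uintptr_t)&a ^ (uintptr_t)&c, and so on, with 0 standing in for the missing neighbor at either end.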

Side note:  We're all so used to singly- and doubly-linked lists (they are among the first data structures covered in data structures courses) that we tend to use them even in situations where a closer analysis shows they are suboptimal.

A number of years back, I built a set of C++ data structure classes, including things like hash tables and queues.  (This was more or less contemporaneous with the development of the STL.)  There were *no* linked lists in the collection!  In their place, I had a data structure I called a sequence.  A sequence is a resizable array with a pair of pointers to the beginning and end.  You can add or remove at either end by incrementing or decrementing one or the other pointer.  Note that either pointer can wrap around the array - the actual elements start at the "beginning" pointer and continue at increasing positions (mod the array size) until you get to the "end" pointer.  If the array fills, you resize and copy it - typical multiplicative growth.
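
Roughly - a from-memory sketch in C rather than the original C++, fixed element type, no error handling - the core of such a sequence looks like this:

    #include <stdlib.h>

    /* A "sequence": a resizable circular array addressed by begin/count. */
    typedef struct {
        int    *buf;
        size_t  cap;    /* allocated slots          */
        size_t  begin;  /* index of first element   */
        size_t  count;  /* number of elements held  */
    } seq;

    static void seq_grow(seq *s)
    {
        size_t newcap = s->cap ? s->cap * 2 : 8;   /* multiplicative growth */
        int *nb = malloc(newcap * sizeof *nb);
        /* Copy the (possibly wrapped) contents to the start of the new buffer. */
        for (size_t i = 0; i < s->count; i++)
            nb[i] = s->buf[(s->begin + i) % s->cap];
        free(s->buf);
        s->buf = nb; s->cap = newcap; s->begin = 0;
    }

    static void seq_push_back(seq *s, int v)
    {
        if (s->count == s->cap) seq_grow(s);
        s->buf[(s->begin + s->count) % s->cap] = v;
        s->count++;
    }

    static void seq_push_front(seq *s, int v)
    {
        if (s->count == s->cap) seq_grow(s);
        s->begin = (s->begin + s->cap - 1) % s->cap;   /* wrap backwards */
        s->buf[s->begin] = v;
        s->count++;
    }

    static int seq_pop_front(seq *s)
    {
        int v = s->buf[s->begin];
        s->begin = (s->begin + 1) % s->cap;
        s->count--;
        return v;
    }

    static int seq_pop_back(seq *s)
    {
        s->count--;
        return s->buf[(s->begin + s->count) % s->cap];
    }

Start from a zero-initialized seq (e.g. seq s = {0};) and it grows on the first push; used only at one end it's a stack, used at both it's a queue or deque.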

What's not immediately obvious is that *if order doesn't matter*, you can delete from anywhere in a sequence in constant time:  Swap the element to be deleted with the first (or last) and then remove.  This is enough for many of the data structures you often use lists for - e.g., stacks, queues, buckets in a hash table.
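
Continuing the sketch above, with the same caveats, the unordered delete is just:

    /* Delete the element at logical position i, not preserving order:
     * overwrite it with the last element, then shrink by one.  O(1). */
    static void seq_delete_unordered(seq *s, size_t i)
    {
        size_t last = (s->begin + s->count - 1) % s->cap;
        s->buf[(s->begin + i) % s->cap] = s->buf[last];
        s->count--;
    }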

But ... doesn't it waste memory?  If you do a detailed analysis, unless your elements are much larger than a pointer, you typically end up using *less* memory than you would with a linked list.  And traversing it on modern architectures - where the next or previous element in the array is usually already in the cache, while whatever a pointer goes to probably isn't - tends to be much faster.
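
To put rough numbers on it (assumed, not measured): on a typical 64-bit system, a doubly-linked node wrapping an 8-byte element carries two 8-byte links plus allocator overhead - call it 32 bytes or more per element - while the sequence stores the 8-byte elements contiguously with at most a factor-of-two slack from the growth policy, i.e. at most 16 bytes per element and usually much less.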

And the code is simpler to boot!
                                                        -- Jerry


