[Cryptography] Two quick questions about IPsec AH

Mon Jan 10 06:50:54 EST 2022

Phillip Hallam-Baker writes:
> IPSEC as specified in the RFCs was simply unusable because it didn't work
> through NAT. The IPSEC authentication included the source address and that
> caused connections to fail through NAT boxes.

(Btw, the correct spelling is IPsec, not IPSEC.)

And you are mixing things here. AH (Authentication Header) tries to
authenticate the headers, thus it does include the source and
destination addresses in the authentication (they are part of the
headers).

I.e., the fact that it detects someone modifying the headers is a
feature, not a bug, and as NATs modify those headers, it will detect
those modifications as attacks.

ESP, on the other hand, does NOT include the source or destination
address in the integrity check value calculation, thus it does not
care if the source or destination address changes. So ESP works fine
through NATs, but the problem is that NATs do other things than just
change the source and destination addresses.
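
A minimal sketch of that difference, assuming HMAC-SHA-256 as the
integrity algorithm and a heavily simplified packet layout (real AH
also covers other immutable header fields; the key, the addresses and
the function names here are made up for illustration):

```python
import hashlib
import hmac
import ipaddress
import struct

KEY = b"illustrative-shared-key"  # hypothetical key, for this sketch only


def ah_style_icv(src: str, dst: str, payload: bytes) -> bytes:
    """ICV that covers the IP addresses, as AH does (simplified)."""
    covered = (ipaddress.IPv4Address(src).packed
               + ipaddress.IPv4Address(dst).packed
               + payload)
    return hmac.new(KEY, covered, hashlib.sha256).digest()


def esp_style_icv(spi: int, seq: int, ciphertext: bytes) -> bytes:
    """ICV that covers only SPI, sequence number and payload (simplified)."""
    covered = struct.pack("!II", spi, seq) + ciphertext
    return hmac.new(KEY, covered, hashlib.sha256).digest()


def ah_survives_rewrite(src: str, seen_src: str, dst: str, payload: bytes) -> bool:
    """True if the AH-style ICV still verifies with the address the receiver sees."""
    sent_icv = ah_style_icv(src, dst, payload)
    return hmac.compare_digest(sent_icv, ah_style_icv(seen_src, dst, payload))
```

The ESP-style function never takes the addresses as input, so a NAT
rewriting them cannot affect that ICV, while
ah_survives_rewrite("10.0.0.5", "203.0.113.7", ...) fails by design.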

One of the major things they do is that they also fix the TCP/IP
checksum when they modify the source and destination addresses (the
checksum covers those addresses through the pseudo-header). When
using ESP the NAT box can't change the TCP/IP checksum as the checksum
is inside the encrypted part, thus the NAT can't see it. Thus we need
NAT-Traversal (RFC3947) mode for ESP, which will fix those TCP/IP
checksums inside IPsec after decrypting the packet and before
forwarding it to the network. NATs also do more complicated things,
like looking inside FTP control streams and modifying the IP
addresses sent there so that FTP works, but those protocols are
mostly obsolete now.
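
To illustrate why the checksum has to be fixed at all, here is a
small sketch of the IPv4 TCP checksum, which is computed over a
pseudo-header containing the source and destination addresses. The
addresses, ports and helper names are made up for the example:

```python
import ipaddress
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src: str, dst: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus the TCP segment."""
    pseudo = (ipaddress.IPv4Address(src).packed
              + ipaddress.IPv4Address(dst).packed
              + struct.pack("!BBH", 0, 6, len(segment)))  # zero, proto 6, length
    return internet_checksum(pseudo + segment)


def build_segment(src: str, dst: str, payload: bytes) -> bytes:
    """Build a minimal TCP segment (SYN flag set) with a valid checksum."""
    header = struct.pack("!HHIIBBHHH", 40000, 80, 1, 0, 5 << 4, 0x02, 65535, 0, 0)
    segment = bytearray(header + payload)
    segment[16:18] = struct.pack("!H", tcp_checksum(src, dst, bytes(segment)))
    return bytes(segment)


def checksum_ok(src: str, dst: str, segment: bytes) -> bool:
    """A segment with a correct checksum sums to zero on verification."""
    return tcp_checksum(src, dst, segment) == 0
```

A segment built for source 10.0.0.5 verifies with that address, but
not once the source is rewritten to 203.0.113.7, unless somebody who
can see the checksum field updates it.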

Another issue with NATs is that they do not know to which device the
ESP packets coming back from the network should be passed. Normally
NATs multiplex data based on the port numbers, and ESP packets do not
have port numbers, thus the NAT does not know to which internal node
an incoming ESP packet needs to be sent. The ESP packets do include
an SPI value which could be used to multiplex data back to the
original host, but the problem is that the NAT does not know which
SPI is used by which internal host.

To get past this some NATs used all kinds of heuristics trying to
associate outgoing SPI values with incoming SPI values (there is a
separate SPI for each direction). I.e., they assumed that if there
was an outgoing ESP packet with SPI x, and then an incoming ESP packet
with SPI y, they belong to the same flow, so the incoming ESP packet
is forwarded to the host that sent the outgoing ESP packet, and the
NAT then creates a mapping for those SPI values. This mostly works,
especially if there are very few devices using ESP behind the NAT
box. On the other hand, trying to run this on a carrier-grade NAT box
with hundreds of devices behind it will not work reliably.
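
A toy model of such a heuristic, to show where it breaks. This is not
any particular vendor's algorithm; the class, its method names and
the "oldest waiting host wins" rule are assumptions made for the
sketch:

```python
class HeuristicEspNat:
    """Guess which internal host an inbound ESP packet belongs to."""

    def __init__(self) -> None:
        self.waiting = {}   # remote peer -> internal hosts awaiting a reply
        self.inbound = {}   # (remote peer, inbound SPI) -> internal host

    def outbound(self, internal_host: str, remote_peer: str, spi: int) -> None:
        # The outbound SPI was chosen by the remote peer and says nothing
        # about the SPI the peer will use towards us, so only the fact
        # that this host talked to this peer is remembered.
        queue = self.waiting.setdefault(remote_peer, [])
        if internal_host not in queue:
            queue.append(internal_host)

    def inbound_packet(self, remote_peer: str, spi: int):
        key = (remote_peer, spi)
        if key not in self.inbound:
            queue = self.waiting.get(remote_peer)
            if not queue:
                return None  # nobody is known to be talking to this peer
            # Heuristic: an unknown inbound SPI must belong to the host
            # that has been waiting longest for a reply from this peer.
            self.inbound[key] = queue.pop(0)
        return self.inbound[key]
```

With one internal host the guess is always right. With two hosts
talking to the same gateway, a reply meant for the second host that
arrives first is bound to the first host, and stays bound to it.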

To fix this, UDP encapsulation (RFC3948) was defined for ESP, i.e.,
all ESP packets are wrapped inside UDP packets, to allow NATs to
reliably map outgoing and incoming ESP packets to the same flow. This
of course caused other issues, as some NATs use very short timers for
UDP flows, thus UDP encapsulation needs to send keepalives to keep
the UDP mapping alive in the NAT box.
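
Roughly, the encapsulation and the keepalive look like this. It is a
sketch only: the function names are made up, the 20 second interval
is an assumed, locally configurable value, and the ports are taken as
parameters rather than fixed:

```python
import struct

NAT_KEEPALIVE = b"\xff"      # keepalive payload: a single 0xFF octet
KEEPALIVE_INTERVAL = 20.0    # seconds; assumed value, locally configurable


def udp_encapsulate(esp_packet: bytes, sport: int = 4500, dport: int = 4500) -> bytes:
    """Prefix an ESP packet with an 8-byte UDP header.

    The UDP checksum is left at zero here, which IPv4 permits.
    """
    length = 8 + len(esp_packet)
    return struct.pack("!HHHH", sport, dport, length, 0) + esp_packet


def keepalive_due(last_sent: float, now: float,
                  interval: float = KEEPALIVE_INTERVAL) -> bool:
    """True if nothing was sent on the flow for at least `interval` seconds."""
    return now - last_sent >= interval
```

Because the NAT now sees ordinary UDP port numbers, it can map the
return traffic the same way it does for any other UDP flow, as long
as the mapping has not timed out.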

Also some NAT vendors read the original IKE drafts in such a way that
both source and destination ports MUST be 500, and because of that
explicitly disabled UDP port mapping for port 500, i.e., when handling
IKE they do not modify the source port but instead use the IKE cookies
to keep track of which returning IKE packet belongs to which internal
host. This might also have been caused by NAT vendors testing against
a buggy IKE implementation which made this incorrect assumption.

This change messed up UDP encapsulation, as this method only worked
for IKE packets, not for UDP encapsulated ESP packets, so the solution
was to move both IKE and UDP encapsulated ESP from port 500 to
another port, i.e., 4500, to get rid of this NAT behavior.
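
Since IKE and UDP encapsulated ESP then share port 4500, the receiver
has to tell them apart. RFC3948 does this by prefixing IKE messages
with four zero octets (the non-ESP marker), which can never be a
valid ESP SPI. A sketch of that demultiplexing, with a made-up
function name and return values:

```python
NON_ESP_MARKER = b"\x00\x00\x00\x00"  # four zero octets in front of IKE


def classify_port_4500_payload(payload: bytes) -> str:
    """Classify the UDP payload of a packet received on port 4500."""
    if payload == b"\xff":
        return "nat-keepalive"   # single 0xFF octet, silently ignored
    if len(payload) < 4:
        return "invalid"         # too short to hold a marker or an SPI
    if payload[:4] == NON_ESP_MARKER:
        return "ike"             # IKE message follows the marker
    return "esp"                 # first four octets are a non-zero SPI
```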

And yes, there were lots of people at that time who considered NATs
evil, and because of that there was not really any discussion between
the IETF and the NAT vendors. So for protocol standardization work the
NAT vendors seemed to make everything more difficult, because NAT
vendors implemented all kinds of kludges and hacks to get their stuff
working, and those things then immediately broke down if the protocol
was modified in any way.

It would have been much easier for both if NAT had been standardized
in the IETF, and then we could have made modifications to the
protocols together, instead of trying to guess what the other party
was planning to do next, and trying to cope with it...

And note that NAT functionality does not provide any kind of
security, it only provides address translation, but to do the address
translation NATs usually also include stateful flow detection to save
resources, and when you have stateful flow detection, making a
stateful firewall is trivial. The stateful firewall included in almost
all NAT boxes does provide the security features people want, but
those two functions are separate.

> Looking back on the situation and knowing a lot more about what the
> NSA was up to back then (some very senior ex-NSA people have
> apologized to me personally not least because of the 2016 debacle),
> I am pretty sure that this peculiar ideological fixation was
> actually being promoted by a small clique as a way to make sure
> IPSEC was as useless as possible. A single person pouring poison
> into the ears of other people can be surprisingly effective.

I do not think those people who thought NATs were evil considered
IPsec at all. Most of them considered the end-to-end principle sacred,
meaning all packets originate from one end and are delivered through
the network unmodified to the final destination.

Thus in a way IPsec was also evil, as in its most common usage IPsec
did break that end-to-end principle too. I.e., with site-to-site
VPNs or similar the end-to-end principle was broken, but it was
not as evil as NATs, as the damage done to the packets was undone at
the other end...

Those end-to-end people did say that we need to run IPsec on every
single device, i.e., we want to use host-to-host IPsec instead of
site-to-site VPNs...

> Windows-NT had IPSEC built into the IP stack from very early on. But you
> couldn't use the platform implementation for remote access because it didn't
> work through NAT. So you had to install a third party client from your IPSEC
> firewall vendor with their own particular kludge to work around the IPSEC
> anti-NAT inanity. And many of those kludges were patented and...

If I remember correctly there were some corporate division issues with
the Microsoft Windows IPsec team, i.e., IPsec was NOT done by the
people implementing network devices, but by the people implementing
dialup etc. stuff, thus it was only usable when using such
configurations or something like that.

Anyways there were lots of IPsec clients for Windows (we had one or
two ourselves at that time) providing different levels of features...
Most commonly you wanted to use the one implemented by the VPN vendor
you needed to connect to, as that way you had only one vendor to
complain to when something did not work :-)

> So I rather doubt EH is used because I doubt any of the kludges were
> implemented for anything besides ESP. And besides which, there is a null
> cipher for testing and for environments where you want authentication but DO
> NOT want encryption (this is very common in SCADA deployments and for
> excellent reasons)

ENCR_NULL is not only for testing, it is also for situations where you
do not want to do encryption (for example if the traffic is already
encrypted, so there is no point in encrypting it a second time), but do
want to do integrity and authentication checking.

For this reason ENCR_NULL is still a MUST for ESP in RFC8221.
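
As an illustration of what ESP with ENCR_NULL gives you, here is a
simplified sketch: the payload stays readable on the wire, but any
modification is detected. It assumes HMAC-SHA-256 truncated to 128
bits as the integrity algorithm, leaves out the outer IP header and
extended sequence numbers, and uses made-up function names:

```python
import hashlib
import hmac
import struct

ICV_LEN = 16  # HMAC-SHA-256 output truncated to 128 bits


def esp_null_protect(key: bytes, spi: int, seq: int, payload: bytes,
                     next_header: int = 6) -> bytes:
    """Build SPI | seq | payload | padding | pad len | next header | ICV."""
    pad_len = -(len(payload) + 2) % 4          # align the trailer to 4 octets
    trailer = bytes(range(1, pad_len + 1)) + bytes([pad_len, next_header])
    body = struct.pack("!II", spi, seq) + payload + trailer
    icv = hmac.new(key, body, hashlib.sha256).digest()[:ICV_LEN]
    return body + icv


def esp_null_verify(key: bytes, packet: bytes):
    """Return the payload if the ICV verifies, otherwise None."""
    body, icv = packet[:-ICV_LEN], packet[-ICV_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()[:ICV_LEN]
    if not hmac.compare_digest(icv, expected):
        return None
    pad_len = body[-2]
    return body[8:len(body) - 2 - pad_len]
```

A monitoring device in the middle can still read the payload, which
is the point in the SCADA type of deployment mentioned above, while
the receiver rejects anything that was changed in transit.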
-- 
kivinen at iki.fi