NAT Traversal Constraints
June 15, 2010
Whilst NAT traversal as implemented by DC++ has proven able to let many formerly mutually inaccessible pairs of passive users connect directly to each other, it has limitations arising from the TCP gimmickry used to implement it. This post explicates why certain people will likely see failures if they test their NAT traversal compatibility.
Many TCP NAT traversal methods exist, but DC++ uses P2PNAT. The limitations and specific methods discussed here have alternatives, but largely less convenient ones. According to the 2005 paper Characterization and Measurement of TCP Traversal through NATs and Firewalls, P2PNAT is one of the only options that requires neither superuser privileges nor raw sockets and does not risk timing races; as a result, even with the faults and limitations characterized here, it remains the preferred algorithm.
The most obvious such P2PNAT limitation, and one relatively easily ameliorable, involves NAT traversal users attempting to download from each other. As presently implemented, those antiparallel connections cannot be formed because they share the same 4-tuple of (source IP, source port, destination IP, destination port), so the operating system's TCP/IP implementation cannot distinguish between them. Ordinarily this is not a problem because when establishing a connection to a remote machine, the initiator binds to any local, ephemeral port that might be available; thus, even if the destination IP and destination port are shared between connections or the source IP and source port are shared between connections, other members of the tuple provide the necessary uniqueness. However, the NAT traversal method DC++ implements depends on controlling all four members of this tuple, the source and destination IPs and ports, on both sides of the connection. Therefore, it deterministically attempts to set up the same four-tuple for each user twice when two users simultaneously attempt to connect to each other. Because TCP/IP implementations demand that this tuple be unique, they reject the second connection attempt, so only one DC user in a NAT-T’d pair can transfer from or to the other at any given time.
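The uniqueness requirement can be seen outside DC++ with a few loopback sockets. The following is only a minimal sketch: the helper names `connect_from` and `demo_duplicate_tuple` are made up for illustration, and it models just the "same 4-tuple twice" situation, not P2PNAT itself. It opens one connection, then tries to open a second from the same local port to the same destination, and finally opens one from an ephemeral port.

```python
import socket


def connect_from(local_port, remote_addr):
    """Connect to remote_addr from a chosen local port (0 lets the OS pick one)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEADDR may let bind() share the port; depending on the OS, the
    # duplicate 4-tuple is then refused at bind() or at connect().
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        sock.bind(("127.0.0.1", local_port))
        sock.connect(remote_addr)
    except OSError:
        sock.close()
        raise
    return sock


def demo_duplicate_tuple():
    """Return (duplicate_rejected, ephemeral_ok) for a loopback experiment."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    remote = listener.getsockname()
    opened = [listener]
    try:
        # First connection: the OS picks the source port, as clients usually do.
        first = connect_from(0, remote)
        opened.append(first)
        fixed_port = first.getsockname()[1]

        # Second connection forced onto the same source port and destination,
        # i.e. the same (src IP, src port, dst IP, dst port) as the first.
        try:
            opened.append(connect_from(fixed_port, remote))
            duplicate_rejected = False
        except OSError:
            duplicate_rejected = True

        # Third connection from a fresh ephemeral port: a different 4-tuple.
        other = connect_from(0, remote)
        opened.append(other)
        ephemeral_ok = other.getsockname()[1] != fixed_port
        return duplicate_rejected, ephemeral_ok
    finally:
        for s in opened:
            s.close()


if __name__ == "__main__":
    print(demo_duplicate_tuple())
```

On typical stacks the forced duplicate should be refused while the ephemeral-port connection goes through, which is the asymmetry the paragraph above describes. The exact error code differs between operating systems, so the sketch only checks for an `OSError`.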
This effect lingers, interfering with even non-simultaneous connections due to a byproduct of the TCP state machine. Whilst the C-C connection exists, both ends are in the green “ESTABLISHED” state. However, one side then actively closes the connection, at which point it goes into the FIN_WAIT_1/FIN_WAIT_2/CLOSING/TIME_WAIT state region, whereas the other client sees its connection get passively closed and enters the CLOSE_WAIT/LAST_ACK state region. The latter has no problem: the connection is gone and such tools as netstat will cease listing it. However, the former, actively closing machine gets stuck in TIME_WAIT for twice what’s called the maximum segment lifetime, which can come to as much as 4 minutes. One can reduce the TIME_WAIT interval both in Windows and Linux, but it keeps a stray, resent packet from a previous connection from breaking established TCP connections, so it’s unwise to reduce it excessively. Further, it’s a system-wide setting in both instances, so DC++ should not change it unilaterally.
During these long TIME_WAIT-delayed minutes, any new attempt to re-establish a connection between two NAT-T DC clients which had just closed a connection will fail, because one of them will associate that unique (source IP, source port, destination IP, destination port) tuple with TIME_WAIT and thus prevent the new connection from forming. Fortunately, the same ameliorative steps which reduce the incidence of this issue for simultaneous connections render this unavoidable TIME_WAIT less problematic.
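The asymmetry between the two closing sides can be traced with a small table-driven model. This is a simplified sketch of only the close-side part of the RFC 793 state diagram (it omits, for example, a FIN and ACK arriving together), and the event names, the `run` helper, and the 120-second MSL are assumptions chosen for illustration; real stacks use their own timer values.

```python
# Close-side subset of the TCP state machine: (state, event) -> next state.
TRANSITIONS = {
    # Active closer: the side that calls close() first.
    ("ESTABLISHED", "close"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
    # Passive closer: the side that sees the peer's FIN first.
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
}

# Assumed maximum segment lifetime (the RFC 793 figure of two minutes).
MSL_SECONDS = 120


def run(events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


active = run(["close", "recv_ack", "recv_fin"])
passive = run(["recv_fin", "close", "recv_ack"])

print("active closer: ", " -> ".join(active))
print("passive closer:", " -> ".join(passive))
print("TIME_WAIT lasts", 2 * MSL_SECONDS, "seconds under this assumption")
```

In this model the passive closer reaches CLOSED as soon as its FIN is acknowledged, while the active closer stays in TIME_WAIT until the 2×MSL timer fires, so its 4-tuple remains occupied for that whole interval.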
This simultaneous connection and near-successive connection limitation is fixable by creating more local ports to which the NAT traversal code can bind, so limiting as it may presently be, it’s not an intrinsic problem with NAT traversal. Other, more inherent problems do exist and can be caused by operating systems and routers. The NAT traversal compatibility survey aims to discover the prevalence of such problems.
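To make the "more local ports" idea concrete, here is one way such bookkeeping might look. It is a hypothetical sketch, not DC++ code: the `LocalPortPool` class, its method names, the port numbers, and the 240-second cooldown are all assumptions. It also leaves out the real work of binding sockets and telling the peer which port was chosen.

```python
import time


class LocalPortPool:
    """Hypothetical helper: pick a local port whose 4-tuple with a peer is free."""

    def __init__(self, ports, time_wait_seconds=240.0, clock=time.monotonic):
        self.ports = list(ports)
        self.time_wait_seconds = time_wait_seconds
        self.clock = clock
        self.in_use = set()   # (local_port, peer) pairs with a live connection
        self.cooling = {}     # (local_port, peer) -> time when TIME_WAIT should end

    def acquire(self, peer):
        """Return a usable local port for this peer, or None if all are taken."""
        now = self.clock()
        for port in self.ports:
            key = (port, peer)
            if key in self.in_use:
                continue  # a connection with this exact 4-tuple already exists
            if self.cooling.get(key, float("-inf")) > now:
                continue  # this 4-tuple may still be sitting in TIME_WAIT
            self.in_use.add(key)
            return port
        return None

    def release(self, port, peer):
        """Mark the connection closed; assume the worst case of an active close."""
        key = (port, peer)
        self.in_use.discard(key)
        self.cooling[key] = self.clock() + self.time_wait_seconds


if __name__ == "__main__":
    pool = LocalPortPool([5000, 5001])
    peer = ("203.0.113.7", 6000)  # documentation-range address, purely illustrative
    print(pool.acquire(peer), pool.acquire(peer), pool.acquire(peer))
```

With two local ports, two antiparallel connections to the same peer can coexist, and a port released a moment ago is skipped until its assumed TIME_WAIT interval has passed. The pool size bounds how many such connections per peer are possible.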