Author Topic: TCP/IP, TIME_WAIT, and assassination on ADSL routers  (Read 1607 times)

iddu

  • Are we there yet?
TCP/IP, TIME_WAIT, and assassination on ADSL routers
« on: 15 March, 2012, 07:55:03 am »
When a server, acting as a responder for a webservice SOAP request, receives a message from a remote user, the underlying TCP/IP conversation should establish a record of the source IP:(ephemeral port #) and server IP:80 as being the endpoints.

Once a conversation on this pairing has finished, another conversation is not supposed to start on the same pairing until there is little or no possibility of late-delivered packets or malicious substitution taking place.

This is known as the TIME_WAIT state: the server notes that the pairing has been placed in TIME_WAIT (nominally, say, 4 minutes for W2k3).

If remote users are getting a high fail rate, then there are some things that can be tweaked.

You can:

* Decrease the time the server waits before recycling a pairing for reuse
* Allow the server to accept immediate reuse of a pairing from the same source (given certainty that message sequence numbers are ascending)
* Increase the range of ephemeral ports available to remote user, so they have a wider range of source ports to use, taking longer to exhaust
 
However, this assumes a direct 1:1 connection relationship between endpoints.

When you place N users behind a NAT device (e.g. an ADSL router), this effectively collapses the ephemeral port ranges for N workstations into one set, constrained to the router's public IP:ephemeral port range.

Under heavy loading of webservice (not random browsing) requests from, say, 10 users running the application on their workstations, it would seem to me (unless I'm missing something) that the remote endpoint device (the ADSL router) is ALWAYS likely to exhaust its ephemeral port range quicker than expected. Users will then experience communication issues, as the server notes and rejects reuse of (NAT'd) pairings still in TIME_WAIT state.

I don't know of (ADSL) routers that state what ephemeral port ranges they use, or allow tweaking of same/recycle period/assassinations for NAT'd conversations they have established with a remote server.

Supposition correct? Best way around, if there is one?
I'd offer you some moral support - but I have questionable morals.

Re: TCP/IP, TIME_WAIT, and assassination on ADSL routers
« Reply #1 on: 15 March, 2012, 09:04:35 am »
It's unlikely. Most home / SOHO routers / firewalls run on a Linux kernel, which uses TCP ports 32768 to 61000 as ephemeral ports, so you have 28,232 to play with. With 10 users that would give about 2,823 ports each!

I can't see each user keeping that many ports open at once. Remember, most sessions don't last very long: you ask for a web page, you get it, then you close the TCP connection and free up the ports. With HTTP 1.0 you would use a separate TCP session for every object on the page, but with 1.1 you can use a single session for all the objects on the page (supposing they are all located on the same web server).

I assume that here you are talking about SOAP persistent connections, where SOAP keeps an HTTP TCP session open. Again, that permits 28,232 sessions per remote ADSL router IP address, which is a hell of a lot.

In reality we have a lot more than 28,232 ports to play with, as most stateful firewalls track not only source and destination port but source and destination IP address as well, so the limitation only applies when all the hosts behind a many to one NAT are talking to exactly the same server.

For really large scale NAT implementations you use a range of IP Addresses to NAT the clients through so you have 28,232*N ephemeral ports where N is the number of public IPs you have.
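
A rough sketch of that arithmetic in Python, for anyone who wants to plug in their own numbers. The pool size is the figure quoted above; the function names and the example of 4 public IPs are purely illustrative assumptions.

```python
# Pool size quoted in this thread for the Linux ephemeral range 32768 to 61000.
LINUX_EPHEMERAL_POOL = 28232


def ports_per_user(pool_size, users):
    """Even share of the public port pool per user (whole ports only)."""
    return pool_size // users


def total_ports(pool_size, public_ips):
    """Ports available when clients are NAT'd through several public IPs."""
    return pool_size * public_ips


if __name__ == "__main__":
    print(ports_per_user(LINUX_EPHEMERAL_POOL, 10))  # share for 10 users
    print(total_ports(LINUX_EPHEMERAL_POOL, 4))      # hypothetical 4 public IPs
```

These are upper bounds for traffic to a single server IP:port; ports sitting in TIME_WAIT reduce what is actually usable at any instant.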

You also need to realize that the TIME_WAIT timer will only be used if the session is idle. Normally a client would finish doing what it needs to, then send a TCP FIN message, which would close the TCP session right through the path. The server would immediately free up the connection and not have to implement the TIME_WAIT timer, as after a FIN there is no possibility of more packets arriving. The firewalls in the path would see the FIN too and clear the NAT translation entry.
The TIME_WAIT is basically just for those occasions where the SOAP session is sat idle and no data is coming down the TCP session to keep it alive, because the person using the client is answering the phone or staring into space, or has just switched off their PC without killing the session (e.g. by closing the application).
I think you'll find it's a bit more complicated than that.

Re: TCP/IP, TIME_WAIT, and assassination on ADSL routers
« Reply #2 on: 15 March, 2012, 09:39:45 am »
Oh, and the 28,232 only applies if the implementation in question sticks to the normal client ephemeral port range. Cisco IOS, for example, uses a range of 1-65535 for PAT (many to one NAT).

iddu

  • Are we there yet?
Re: TCP/IP, TIME_WAIT, and assassination on ADSL routers
« Reply #3 on: 15 March, 2012, 09:53:17 am »
Ta.

>...when all the hosts behind a many to one NAT are talking to exactly the same server.
Yup - exactly the scenario

>The server would immediately free up the connection and not have to implement the TIME_WAIT timer as after a FIN there is no possibility of more packets arriving.
My understanding was the server notes the source pair presented to it and refuses to accept a subsequent SYN until TIME_WAIT expires (unless applying the allowed exception on perpetually ascending sequence numbers) to prevent the possibility of late/malicious packets.

>...where the SOAP session is sat idle....
There is never an idle state. In consequence of a user action, the workstation front-end app HTTP POSTs a SOAP message containing a service request to the webservice, which accepts it and responds with data and status. This obviously decomposes into SOAP->HTTP->TCP/IP transports, but each service request is a complete and distinct action in its own right - we're never waiting on the user to provide a further transitory action mid-request.

>It's unlikely. Most home / SOHO routers / firewalls run on a Linux kernel, which uses TCP ports 32768 to 61000 as ephemeral ports, so you have 28,232 to play with. With 10 users that would give about 2,823 ports each!

That's good; as long as they adhere to this 'standard' then I agree it shouldn't run out[1].

If it is restricted though, I can see issue:

* Although you can increase the ephemeral port range on the user's workstation, the intermediate NAT'ing device must keep state on what is being used, so
* It either
   (a) NAT's with 1:1 port mapping, in which case it can't accept same port from 2nd internal IP whilst conversation is in progress
        (otherwise how would it know which way to chuck returned packets - iIP1, or iIP2?), and must therefore reject (internally)
        2nd iIP request, or
   (b) it both NAT's and PAT's for the widened range of workstation ports, increasing use of (restricted set) ephemeral  ports seen
         from Public IP perspective, leading to the receiving server holding more records in TIME_WAIT state

[1] Doesn't explain why I'm seeing rejections on attempt to recycle source pair use in < 1 s period though, unless NAT'ing device code is trying to be smart and hold onto prior use, without increasing sequence numbers :facepalm:

Re: TCP/IP, TIME_WAIT, and assassination on ADSL routers
« Reply #4 on: 15 March, 2012, 10:24:30 am »
>[1] Doesn't explain why I'm seeing rejections on attempt to recycle source pair use in < 1 s period though, unless NAT'ing device code is trying to be smart and hold onto prior use, without increasing sequence numbers :facepalm:

I don't know the state machine for Linux based NAT but for Cisco stuff it works like this:

Once a PAT entry is set up it will exist for 24 hours by default. So if no more traffic is seen to the same destination IP address and TCP port, the translation will remain available for reuse by the client for 24 hours.
If, however, the firewall/router sees a TCP FIN or RESET, the timer is reduced to 60 seconds.

So it would be entirely possible for you to see the same IP / TCP source port trying to connect to your server within a 1 minute period, never mind 1 second. I suspect other firewalls behave in a similar manner, but probably with differing timers.
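
As a toy model of that timer behaviour (a sketch only: the two timeout values are the ones stated in this post, and the function and constant names are made up for illustration rather than taken from any Cisco configuration):

```python
# Timeouts as described in this post, in seconds.
DEFAULT_TRANSLATION_TIMEOUT = 24 * 60 * 60  # idle PAT entry kept for 24 hours
CLOSED_TRANSLATION_TIMEOUT = 60             # after a FIN or RESET has been seen


def translation_expiry(last_packet_time, saw_fin_or_reset):
    """Return the time (in seconds) at which a PAT entry would be dropped."""
    if saw_fin_or_reset:
        timeout = CLOSED_TRANSLATION_TIMEOUT
    else:
        timeout = DEFAULT_TRANSLATION_TIMEOUT
    return last_packet_time + timeout
```

The point is that a closed translation can still be handed back to the same client for up to a minute, which is well inside the server's TIME_WAIT period.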


Re: TCP/IP, TIME_WAIT, and assassination on ADSL routers
« Reply #5 on: 15 March, 2012, 11:04:56 am »
>The server would immediately free up the connection and not have to implement the TIME_WAIT timer as after a FIN there is no possibility of more packets arriving.
My understanding was the server notes the source pair presented to it and refuses to accept a subsequent SYN until TIME_WAIT expires (unless applying the allowed exception on perpetually ascending sequence numbers) to prevent the possibility of late/malicious packets.

You're right, I was wrong (the server side of this is not my normal domain :)). It will wait before allowing reuse of the same IP address / port pairing unless the sequence number is higher. Your problem here is that firewalls doing NAT usually randomize the sequence numbers to improve security. On a Cisco PIX or ASA you can switch this off for particular connections, but I don't think you can on Cisco IOS, and I have no idea about other firewalls.
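
The "higher sequence number" exception can be pictured as a wrapping 32-bit comparison. The following is a simplified sketch with made-up function names; real TCP stacks apply further conditions before accepting a SYN for a pairing in TIME_WAIT.

```python
SEQ_MASK = 0xFFFFFFFF   # TCP sequence numbers are 32 bits wide
HALF_SPACE = 0x80000000


def seq_greater(a, b):
    """True if sequence number a is 'after' b, allowing for 32-bit wraparound."""
    return a != b and ((a - b) & SEQ_MASK) < HALF_SPACE


def may_reuse_in_time_wait(syn_seq, last_seq_seen):
    """Simplified rule: accept the new SYN only if its sequence number is higher
    than the last one seen on the old connection for this pairing."""
    return seq_greater(syn_seq, last_seq_seen)
```

With a randomized initial sequence number, the new SYN lands in the "lower" half of the sequence space roughly half the time, so the exception cannot be relied on behind a device that rewrites sequence numbers.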

Quote
>...where the SOAP session is sat idle....
There is never an idle state. In consequence of a user action, the workstation front-end app HTTP POSTs a SOAP message containing a service request to the webservice, which accepts it and responds with data and status. This obviously decomposes into SOAP->HTTP->TCP/IP transports, but each service request is a complete and distinct action in its own right - we're never waiting on the user to provide a further transitory action mid-request.

>It's unlikely. Most home / SOHO routers / firewalls run on a Linux kernel, which uses TCP ports 32768 to 61000 as ephemeral ports, so you have 28,232 to play with. With 10 users that would give about 2,823 ports each!

That's good; as long as they adhere to this 'standard' then I agree it shouldn't run out[1].

If it is restricted though, I can see issue:

It's only Windows, so far as I know, that defaults to such a small ephemeral range (ports up to 5000).

Quote
* Although you can increase the ephemeral port range on the user's workstation, the intermediate NAT'ing device must keep state on what is being used, so
* It either
   (a) NAT's with 1:1 port mapping, in which case it can't accept same port from 2nd internal IP whilst conversation is in progress
        (otherwise how would it know which way to chuck returned packets - iIP1, or iIP2?), and must therefore reject (internally)
        2nd iIP request, or

That would be a rubbish NAT implementation. Maybe some things out there do it, but very few I would think.

Quote

(b) it both NAT's and PAT's for the widened range of workstation ports, increasing use of (restricted set) ephemeral  ports seen
         from Public IP perspective, leading to the receiving server holding more records in TIME_WAIT state

Not sure what you mean by this last one.

iddu

  • Are we there yet?
Re: TCP/IP, TIME_WAIT, and assassination on ADSL routers
« Reply #6 on: 15 March, 2012, 11:53:56 am »
Not sure what you mean by this last one.

Given internal LAN of router @ 192.168.254.1 (<-> public a.b.c.d.) & workstations 192.168.254.2 & .3

If .2 communicates externally, and (randomly) chooses port 5000, then .1 (if NAT without PAT) must hold state that .2:5000 <-> a.b.c.d:5000 <-> server:80

There's the possibility that .3 could choose port 5000 while .2 is communicating.

If .1 NAT's without PAT, then it can't handle .3:5000 <-> a.b.c.d:5000 <-> server:80 concurrently; how can it decide where server:80 -> a.b.c.d:5000 -> w.x.y.z:5000 gets back to? It'd have to keep state on (or transit) the .2/.3 MAC addresses across the conversation.

If .1 NAT's with PAT, then [ .2:5000 <-> a.b.c.d:40000, and .3:5000 <-> a.b.c.d:40001 ] <-> server:80, but this means the server holds a.b.c.d:40000 and :40001 in TIME_WAIT.

Get enough workstations POST'ing from behind NAT device and N*5000 :port workstation sets may quickly exhaust a.b.c.d:[restricted set], as server holds onto a.b.c.d:PAT'd pairs longer than time taken for sender(s) to cycle around NAT'd device public port set

Re: TCP/IP, TIME_WAIT, and assassination on ADSL routers
« Reply #7 on: 15 March, 2012, 12:48:04 pm »
>Not sure what you mean by this last one.

>Given internal LAN of router @ 192.168.254.1 (<-> public a.b.c.d.) & workstations 192.168.254.2 & .3

>If .2 communicates externally, and (randomly) chooses port 5000, then .1 (if NAT without PAT) must hold state that .2:5000 <-> a.b.c.d:5000 <-> server:80

>There's the possibility that .3 could choose port 5000 while .2 is communicating.

>If .1 NAT's without PAT, then it can't handle .3:5000 <-> a.b.c.d:5000 <-> server:80 concurrently; how can it decide where server:80 -> a.b.c.d:5000 -> w.x.y.z:5000 gets back to? It'd have to keep state on (or transit) the .2/.3 MAC addresses across the conversation.


Exactly, which is why you don't get NAT behind a single public IP without PAT: it just doesn't work. The NATing device will always do PAT: it keeps the source IP address and port in its state table and translates the source port.
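
A minimal sketch of such a state table, using the addresses from the example above. Everything here is an illustrative assumption: the class name, the tiny port pool starting at 40000, and 203.0.113.1 standing in for a.b.c.d. Real devices allocate ports and age out entries in their own ways.

```python
class PatTable:
    """Toy many-to-one NAT: rewrites (internal IP, port) to a public port."""

    def __init__(self, public_ip, first_port=40000, last_port=40009):
        self.public_ip = public_ip
        self.free_ports = list(range(first_port, last_port + 1))
        self.outbound = {}  # (internal_ip, internal_port, server) -> public_port
        self.inbound = {}   # (public_port, server) -> (internal_ip, internal_port)

    def translate(self, internal_ip, internal_port, server):
        """Return the public (ip, port) used for this internal conversation."""
        key = (internal_ip, internal_port, server)
        if key not in self.outbound:
            if not self.free_ports:
                raise RuntimeError("public port pool exhausted")
            public_port = self.free_ports.pop(0)
            self.outbound[key] = public_port
            self.inbound[(public_port, server)] = (internal_ip, internal_port)
        return self.public_ip, self.outbound[key]

    def reverse(self, public_port, server):
        """Work out which workstation a returning packet belongs to."""
        return self.inbound[(public_port, server)]


if __name__ == "__main__":
    nat = PatTable("203.0.113.1")  # stand-in for a.b.c.d
    server = ("server", 80)
    print(nat.translate("192.168.254.2", 5000, server))
    print(nat.translate("192.168.254.3", 5000, server))
    print(nat.reverse(40001, server))
```

Both workstations can use source port 5000 at once because the return path is keyed on the translated public port, not on the original one.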

Quote
If .1 NAT's with PAT, then [ .2:5000 <-> a.b.c.d:40000, and .3:5000 <-> a.b.c.d:40001 ] <-> server:80, but this means the server holds a.b.c.d:40000 and :40001 in TIME_WAIT.
Get enough workstations POST'ing from behind NAT device and N*5000 :port workstation sets may quickly exhaust a.b.c.d:[restricted set], as server holds onto a.b.c.d:PAT'd pairs longer than time taken for sender(s) to cycle around NAT'd device public port set

Yes, this can happen. It's a known issue. The suggestion is to move the TIME_WAIT to the client end, not the server. It's the device that sends the first FIN that goes into TIME_WAIT, so programmers should design their software so that under normal operation the client, not the server, sends the first FIN.
It's a big issue in large-scale deployments, where it's not NAT/PAT on remote sites that causes the problem but a load balancer in the data centre. The clients talk to a virtual IP address on the load balancer, which then balances the connections across a server farm. To a server, all connections appear to come from a single IP address, usually the server-facing interface of the load balancer; cue lots of TIME_WAIT-locked sockets. This is one of the things that determines how many servers you need in a load-balanced farm: you need to know the number of connections per second you expect, the number of connections per second each server can process (a CPU issue), and how many connections per second will lead to TIME_WAIT-based socket starvation (not CPU-based but timer-based).
This is one of the reasons NAT is bad :)
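
A back-of-envelope version of the timer-based limit, assuming every connection comes from one source IP to one server IP:port, each pairing is held for the full TIME_WAIT period, and no reuse exception applies. The pool size and the 4 minute figure are the ones mentioned earlier in this thread; the function name is made up.

```python
def max_new_connections_per_second(port_pool_size, time_wait_seconds):
    """Sustained rate at which the port pool is used up exactly as fast as
    TIME_WAIT releases pairings; anything faster starves eventually."""
    return port_pool_size / time_wait_seconds


if __name__ == "__main__":
    # 28,232 source ports, 240 s TIME_WAIT (the W2k3 figure quoted earlier)
    rate = max_new_connections_per_second(28232, 240)
    print(f"{rate:.1f} new connections per second")
```

That works out to a little under 118 new connections per second for the whole site behind the NAT, and a restricted port pool lowers it in proportion.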

There isn't much we can do about this from a network point of view (i.e. routers, firewalls etc.); you need to handle it in your application. As I already said, you can move the TIME_WAIT to the client end, plus there are several ways of clearing or preventing TIME_WAIT socket locking, but they are OS / application dependent.
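
One way to picture "client closes first" is a server that answers and then waits for the client's FIN before closing its own side. This is only a sketch over localhost with a made-up newline-terminated protocol, not SOAP/HTTP; with HTTP the equivalent relies on the response being length-delimited so the client knows when it is complete.

```python
import socket
import threading


def serve_once(listener):
    """Handle one connection, letting the client be the first to close."""
    conn, _ = listener.accept()
    with conn:
        with conn.makefile("rb") as reader:
            request = reader.readline()       # one newline-terminated request
        conn.sendall(b"echo:" + request)      # newline-terminated reply
        # recv() returns b"" once the client's FIN arrives; only then do we
        # close, so the client is the active closer and holds the TIME_WAIT.
        while conn.recv(1024):
            pass


def call(address, payload):
    """Send one request, read one reply, then close before the server does."""
    with socket.create_connection(address) as sock:
        sock.sendall(payload + b"\n")
        with sock.makefile("rb") as reader:
            reply = reader.readline()
    return reply.rstrip(b"\n")


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))           # any free local port
    listener.listen(1)
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    reply = call(listener.getsockname(), b"ping")
    worker.join()
    listener.close()
    return reply
```

After `demo()` returns, a tool such as netstat would be expected to show the TIME_WAIT entry against the client's port rather than the server's.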


 

iddu

  • Are we there yet?
Re: TCP/IP, TIME_WAIT, and assassination on ADSL routers
« Reply #8 on: 15 March, 2012, 01:01:02 pm »
Ta

Off to ponder / do some percussive retification ;D

Re: TCP/IP, TIME_WAIT, and assassination on ADSL routers
« Reply #9 on: 15 March, 2012, 01:08:35 pm »
It's cool, I learned something, as I was under the impression that the FIN from the client would reset the TIME_WAIT to a low value, like it resets the session timer in a firewall's NAT table. Good job I don't do server scaling for a living and just do the load balancing part  :)