NAT stands for Network Address Translation. It's the technology that allows most people to have more than one computer in their home and still use a single IP address. Most of the time, a router with NAT support takes data packets from the internal network (which carry internal IP addresses) and sends them to the Internet, changing the internal IP address of each packet to the external one.
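The address rewriting described above can be sketched as a small lookup table. This is only a toy model, not how any particular router is implemented: the class name `NatTable`, its method names, the starting port 40000, and the example addresses are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class NatTable:
    """Toy port-translating NAT: one external IP shared by many internal hosts."""
    external_ip: str
    next_port: int = 40000  # hypothetical start of the external port range
    outbound: Dict[Endpoint, int] = field(default_factory=dict)
    inbound: Dict[int, Endpoint] = field(default_factory=dict)

    def translate_outbound(self, src: Endpoint) -> Endpoint:
        """Rewrite an internal source endpoint to the shared external one."""
        if src not in self.outbound:
            # First packet from this internal endpoint: allocate a mapping.
            self.outbound[src] = self.next_port
            self.inbound[self.next_port] = src
            self.next_port += 1
        return (self.external_ip, self.outbound[src])

    def translate_inbound(self, external_port: int) -> Optional[Endpoint]:
        """Find the internal endpoint for a reply, or None if no mapping exists."""
        return self.inbound.get(external_port)


nat = NatTable("203.0.113.5")
print(nat.translate_outbound(("192.168.1.10", 5060)))
print(nat.translate_outbound(("192.168.1.11", 5060)))
print(nat.translate_inbound(40001))
```

Note that the two internal hosts use the same internal port but end up with different external ports. That difference between the internal and the external port is what matters later for RTP.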
What's RTP?
RTP stands for Real-Time Transport Protocol. Its purpose is to carry voice data between the caller and the callee. The problem is that when you call someone using RTP, you need to know their IP address and port. This makes RTP quite inconvenient when used alone, since the two parties have no way to find one another. This is why people invented SIP.
What's SIP?
SIP (Session Initiation Protocol) has a syntax that looks like HTTP: human-readable text. Its purpose is to help the caller find the IP address and port of the callee. It also helps negotiate the media types and formats. For example, when you have a PC at home and you want to call your friend from Romania using Free World Dialup (which uses the SIP protocol):
SIP + NAT, an unsolvable problem?
The problem with SIP and NAT is not actually a SIP problem but an RTP problem. SIP announces the RTP address and port, but if the client is behind NAT, it announces the client's own RTP port, which can be different from the port the NAT allocates externally.
The first trick is to keep the hole in the NAT open from the SIP client to the server. This is normally done by having every SIP client send a two-byte packet more often than once every 30 seconds. Some routers remove apparently unused NAT mappings after 30 seconds; GNU/Linux typically does this after three minutes.
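A minimal sketch of such a keepalive sender follows. The text does not say which two bytes are sent, so the `b"\r\n"` payload is an assumption, as are the 20-second interval (picked only because it is under 30 seconds), the function name `send_keepalives`, and the server address in the usage comment.

```python
import socket
import time

KEEPALIVE_PAYLOAD = b"\r\n"  # assumption: any two-byte payload would do
KEEPALIVE_INTERVAL = 20.0    # seconds; assumption, chosen to stay under 30


def send_keepalives(sock, server, count,
                    interval=KEEPALIVE_INTERVAL,
                    payload=KEEPALIVE_PAYLOAD):
    """Send `count` keepalive datagrams to `server`, pausing `interval` seconds between them."""
    for i in range(count):
        # Each datagram refreshes the NAT mapping for this socket's source port.
        sock.sendto(payload, server)
        if i < count - 1:
            time.sleep(interval)


# Usage (hypothetical server address, so left commented out):
# sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# send_keepalives(sock, ("198.51.100.7", 5060), count=3)
```

The keepalives have to leave from the same local socket the client uses for SIP. A packet sent from a different source port would create a separate NAT mapping and leave the original one to expire.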
The second trick is one we've used in our project YATE: figure out the client's RTP IP and port from the first packet that arrives on the server's local RTP IP and port, and use those instead of the RTP IP and port declared in SDP. This trick solves the NAT traversal problem no matter how many NATs the client is traversing. The main disadvantage is that, in some cases, the client will not receive early media (since at that point it has not yet sent any voice packets), so it will not hear the ringing.
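The idea of learning the address from the first packet can be sketched as follows. This is an illustration only, not YATE's actual code: the class `RtpSession` and its methods are hypothetical, and falling back to the SDP address before any packet has arrived is an assumption made to show why early media gets lost.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class RtpSession:
    sdp_declared: Endpoint              # what the client announced in SDP
    learned: Optional[Endpoint] = None  # source of the first RTP packet received

    def on_packet(self, source: Endpoint) -> None:
        """Remember where the first incoming RTP packet really came from."""
        if self.learned is None:
            self.learned = source

    def send_target(self) -> Endpoint:
        """Prefer the learned endpoint; use the SDP one until a packet arrives."""
        return self.learned if self.learned is not None else self.sdp_declared


session = RtpSession(sdp_declared=("192.168.1.10", 10000))
print(session.send_target())  # still the private address from SDP
session.on_packet(("203.0.113.5", 40002))
print(session.send_target())  # now the address the NAT exposed
```

Until `on_packet` has been called, the only known target is the private address from SDP, which the server cannot reach. This is the early media limitation mentioned above.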
If you are not a carrier and you are trying to make a peer-to-peer call where both sides are behind NAT, you must use an external SIP proxy or gateway to pass the SIP messages between the two points, and hope that the NATs will open the proper ports to one another for the RTP connection. There is no ultimate solution for this. Two proposed solutions are STUN and ICE, but every solution that currently exists can get in your way sometimes. Skype has found a very simple and nice solution for this problem: they use the Skype clients that are not behind NAT to proxy all the data for clients that are behind NAT.
My personal hope is that in the near future, most SIP implementations will use the two tricks used by YATE. Skype will probably be around for a long time for home users, but enterprises seem to be moving slowly to VoIP providers. With a lot of work and a little bit of luck, those providers will become at least as reliable as PSTN providers, since the technology is better.