PROTOCOLS EXPLAINED
The Transmission Control Protocol, or TCP, is the basis for most Internet traffic. It is a connection-oriented protocol that provides a reliable way to transfer data across a network. Because it is connection-oriented, all TCP sockets follow a similar procedure for use.
To establish a connection between two computers (to be able to send data back and forth), one computer must be set up to listen on a specific port. The other computer (called the client) then attempts to connect by specifying the network address (or IP address) of the remote machine and the port to attempt the connection on.
This means that in order to send and receive data with a remote machine, both machines must have some indication that this connection will be established. That happens by either picking a well-defined port for the listener (or server) to listen on, or by some prior arrangement (e.g. you are the author of both the server and the client program).
When a server receives a connection attempt for the port it is listening on, it accepts the incoming connection, and sends an acknowledgement back to the remote machine. Once both machines have reached an agreement (or are “Connected”), then you can begin sending and receiving data. When you close your connection with the remote machine, there is a similar handshake process that goes on, so both computers know that the connection is being terminated.
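The listen, connect, exchange, and close sequence described above can be sketched with plain Python sockets. This is only an illustration under stated assumptions, not the REALbasic TCPSocket API: the function names `start_echo_server` and `echo_once` are hypothetical, and the server binds to port 0 (letting the operating system pick a free port) purely so the demo cannot collide with another program. A real protocol would use a port chosen at design time.

```python
import socket
import threading


def start_echo_server(host="127.0.0.1", port=0):
    """Listen on (host, port) and echo back whatever one client sends.

    Port 0 is a demo-only assumption: the OS picks a free port.
    Returns the port number the server is actually listening on.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    actual_port = listener.getsockname()[1]

    def serve_one():
        # accept() returns once the connection handshake has completed.
        conn, _peer = listener.accept()
        with conn:
            while True:
                chunk = conn.recv(4096)
                if not chunk:  # empty read: the client closed its sending side
                    break
                conn.sendall(chunk)
        listener.close()

    threading.Thread(target=serve_one, daemon=True).start()
    return actual_port


def echo_once(host, port, message):
    """Connect as a client, send message, and return what comes back."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)  # orderly close of our sending side
        received = b""
        while True:
            chunk = client.recv(4096)
            if not chunk:  # the server closed the connection
                break
            received += chunk
    return received
```

Calling `echo_once("127.0.0.1", start_echo_server(), b"hello")` runs one complete connect, send, receive, and disconnect cycle on the local machine.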
Picking a port to listen on is not always well defined. If you are implementing a well-known protocol, such as writing an FTP program, then you know you will need to support listening on port 21. But if you are writing your own protocol for your application, then how do you know what port to use? I suggest picking a port at random (at design time, NOT at runtime), and then checking to see whether that port is registered by another application. You can check this at
http://www.iana.org/assignments/port-numbers
If the number you have chosen is registered by another application, you should choose a different number.
Because of its error checking and handshaking, TCP is very reliable. When you send data out, it is delivered to the remote machine intact and in order, as long as the connection stays up (that is, you have not been disconnected, whether abortively or in an orderly fashion).
But this feature comes at the cost of higher overhead. A typical TCP packet sent over the network carries around 40 bytes of header (a 20-byte IP header plus a 20-byte TCP header, before any options). Parts of this header are checked and changed by the various machines en route to its destination. This overhead makes TCP a slower protocol; it gives up speed to gain reliability. If it is speed you are looking for (e.g. to write a networked game), then you should look into UDP.
The User Datagram Protocol, or UDP, is the basis for most high-speed, highly distributed network traffic. It is a connectionless protocol that has very low overhead, but it is not as reliable as TCP. Since there is no connection, you do not need to take nearly as many steps to prepare a UDP socket for use.
A UDP socket must be bound to a specific port on your machine. Once the bind has occurred, the UDP socket is ready for use. It will immediately begin accepting any data that it sees on the port it is bound to.
It also allows you to send data out, as well as to set UDP socket options (which will be described later). To tell which machine is sending you what data, a UDP socket receives a data structure known as a Datagram. A Datagram consists of two parts: the IP address of the remote machine that sent you the data, and the ‘payload’, the actual data itself. When you attempt to send data out, you must also specify information in the form of a Datagram. This information is the remote address of the machine you want to receive your packet (this is not entirely true; please read further), the port it should be sent to, and the data you want to send the remote machine.
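As a rough illustration of the bind, send, and receive cycle, here is a minimal Python sketch. It is not the REALbasic UDPSocket or Datagram class: the helper names `open_udp_socket`, `send_datagram`, and `receive_datagram` are hypothetical, and port 0 (a free port chosen by the operating system) is used only to keep the demo from colliding with other programs.

```python
import socket


def open_udp_socket(host="127.0.0.1", port=0):
    """Create a UDP socket and bind it; after bind() it is ready to use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(5)  # avoid blocking forever if nothing arrives
    return sock


def send_datagram(sock, address, port, data):
    """Send one datagram: destination address, destination port, payload."""
    sock.sendto(data, (address, port))


def receive_datagram(sock):
    """Receive one datagram and report who sent it along with the payload."""
    data, (address, port) = sock.recvfrom(65535)
    return address, port, data
```

No connection step is involved: the sender simply names a destination for each datagram, and the receiver learns the sender's address from the datagram itself.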
UDP sockets can operate in various modes, which are all very similar, but have vastly different uses. The mode that most resembles a TCP communication is called ‘unicasting’. This occurs when the IP address you specify when you write data out is that of a single machine. An example would be sending data to “www.google.com”, or to some network address. It is a Datagram that has one intended receiver.
The second mode of operation is called ‘broadcasting’. As the name implies, this is akin to yelling into a megaphone. Everyone gets the message, whether they want to or not. If the machine happens to be listening on the specific port you specified, then it will receive the data.
As you can imagine, broadcasting can generate a huge amount of network traffic. The good news is that when you broadcast data out, it does not leave your subnet. Basically, a broadcast send will not leave your network to travel out into the world. When you want to broadcast data, instead of sending the data to the IP address of a remote machine, you specify the broadcast address for your machine. This address changes from machine to machine, so RB provides a property of the UDPSocket class that tells you the correct broadcast address.
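To make the idea of a per-machine broadcast address concrete, the sketch below derives it from an IP address and netmask and prepares a socket that is allowed to broadcast. This is Python rather than REALbasic, the function names are hypothetical, and the example address 192.168.1.37 with netmask 255.255.255.0 is an assumption chosen only for illustration.

```python
import ipaddress
import socket


def subnet_broadcast_address(ip, netmask):
    """Return the broadcast address of the subnet that ip/netmask belongs to."""
    interface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return str(interface.network.broadcast_address)


def make_broadcast_socket():
    """Create a UDP socket with permission to send broadcast datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Most systems refuse broadcast sends unless this option is enabled.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def broadcast(sock, broadcast_address, port, data):
    """Send data to every machine on the subnet listening on port."""
    return sock.sendto(data, (broadcast_address, port))
```

For the assumed address above, `subnet_broadcast_address("192.168.1.37", "255.255.255.0")` yields `192.168.1.255`, which is what you would pass to `broadcast` instead of a single machine's address.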
This brings us to the third mode of operation for UDP sockets: ‘multicasting’. It is a combination of unicasting and broadcasting that proves to be very powerful and practical to use. Multicasting is a lot like a chat room: you enter the chat room, and are able to hold conversations with everyone else in the chat room. When you want to enter the chat room, you call JoinMulticastGroup, and you specify the group you want to join. The group parameter is a special kind of IP address, called a ‘Class D IP’. It can range from 224.0.0.0 to 239.255.255.255.
Think of the IP address as the name of the chat room. If you want to start chatting with two other people, all three of you need to call JoinMulticastGroup with the same Class D IP address specified as the group. When you want to leave the chat room, you just need to call LeaveMulticastGroup, and again, specify the group you want to leave. You can join as many multicast groups as you like; you are not limited to just one at a time. When you want to send data to the multicast group, you just need to specify the multicast group’s IP address. Everyone that has joined the same group as you will receive the message.
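Joining and leaving a group can be sketched with the standard socket options that underlie calls like JoinMulticastGroup and LeaveMulticastGroup. This is a Python illustration, not the REALbasic implementation; the helper names are hypothetical, the interface `0.0.0.0` (let the system choose) is an assumption, and a group such as `239.1.2.3` would be an arbitrary example address.

```python
import ipaddress
import socket
import struct


def is_class_d(address):
    """True if address lies in the Class D range 224.0.0.0 to 239.255.255.255."""
    return ipaddress.IPv4Address(address).is_multicast


def membership_request(group, interface="0.0.0.0"):
    """Build the 8-byte structure the OS expects: group address + local interface."""
    if not is_class_d(group):
        raise ValueError(f"{group} is not a Class D (multicast) address")
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def join_multicast_group(sock, group):
    """Enter the 'chat room': start receiving datagrams sent to group."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))


def leave_multicast_group(sock, group):
    """Leave the 'chat room': stop receiving datagrams sent to group."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    membership_request(group))
```

A bound UDP socket can call `join_multicast_group` several times with different addresses, which corresponds to being in more than one group at once. Sending to the group needs no membership: you send an ordinary datagram to the group address.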
Multicasting has some extra features that make it an even more powerful utility for network applications. If you want to receive the multicast data you sent (known as “loopback”), set the SendToSelf property on the socket. If it is true, then when you do a send (to a multicast group) you will get that data back.
You can also set the number of router hops a multicast datagram will take (known as the “Time to Live”, or TTL). When your datagram gets sent out, it runs through a series of routers on the way to its destinations. Every time the datagram passes through a router, its RouterHops value is decremented. When that number reaches zero, the datagram is destroyed. This means you can control who gets your datagrams with a lot more precision. There are some “best guesses” as to what the value of RouterHops should be:
RouterHops    Scope
0             same host
1             same subnet
32            same site
64            same region
128           same continent
255           unrestricted
Note that if your datagram runs through a router that does not support multicasting, it is killed immediately. Most routers do not support multicast packet forwarding, and so, as a general rule, a multicast will never escape your local network. Therefore, multicasting can be great in a large internal network (that spans many routers and switches), but it probably will not work for you if you are trying to write Internet applications.
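The loopback and Time to Live settings map onto two standard socket options. The sketch below shows one way to set them in Python; it is not the REALbasic API, and `configure_multicast_sender` and the `SCOPE_HOPS` table (which simply restates the best-guess values listed above) are hypothetical names.

```python
import socket

# Best-guess hop counts from the list above, keyed by intended scope.
SCOPE_HOPS = {
    "same host": 0,
    "same subnet": 1,
    "same site": 32,
    "same region": 64,
    "same continent": 128,
    "unrestricted": 255,
}


def configure_multicast_sender(sock, router_hops=1, send_to_self=False):
    """Set how far multicast datagrams travel and whether we hear our own."""
    if not 0 <= router_hops <= 255:
        raise ValueError("router_hops must be between 0 and 255")
    # Time to Live: each router that forwards the datagram decrements it.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, router_hops)
    # Loopback: deliver our own multicast sends back to this machine.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP,
                    1 if send_to_self else 0)
```

For example, `configure_multicast_sender(sock, SCOPE_HOPS["same site"], send_to_self=True)` asks for datagrams that may cross up to 32 routers and are also delivered back to the sender.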
Because UDP is connectionless, it makes no guarantee that your data will reach its destination. You can work around this by creating your own protocol, on top of UDP, in which the receiver acknowledges each datagram it receives.
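One simple way to layer acknowledgements over UDP is "stop and wait": number each datagram, resend it until the receiver echoes the number back, and give up after a few tries. The Python sketch below is only one possible design. The 4-byte sequence-number header, the function names, and the retry and timeout values are all assumptions made for illustration.

```python
import socket
import struct

# Hypothetical wire format: a 4-byte big-endian sequence number, then the payload.
HEADER = struct.Struct("!I")


def send_reliably(sock, destination, sequence, payload, retries=5, timeout=0.5):
    """Send payload and wait for a matching acknowledgement, resending on timeout.

    Returns True if the receiver acknowledged, False if every attempt timed out.
    """
    packet = HEADER.pack(sequence) + payload
    sock.settimeout(timeout)
    for _attempt in range(retries):
        sock.sendto(packet, destination)
        try:
            reply, _sender = sock.recvfrom(65535)
        except socket.timeout:
            continue  # no acknowledgement in time: send again
        if len(reply) == HEADER.size and HEADER.unpack(reply)[0] == sequence:
            return True
    return False


def receive_and_acknowledge(sock):
    """Receive one datagram, acknowledge it, and return (sequence, payload)."""
    packet, sender = sock.recvfrom(65535)
    sequence = HEADER.unpack_from(packet)[0]
    sock.sendto(HEADER.pack(sequence), sender)  # the acknowledgement
    return sequence, packet[HEADER.size:]
```

If an acknowledgement is lost, the sender will resend a datagram the receiver has already seen, so a real receiver would also need to remember recent sequence numbers and discard duplicates.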
I want to go into a little more detail about Class D IP addresses, since they seem to confuse many users. A Class D IP address is a specially reserved IP that no “real” machine can have. So you do not have to worry that your local machine’s IP address is not Class D, and similarly, you do not have to worry about IP collisions on your network. These IPs are used by the network transport layer to determine how to send a packet efficiently. When broadcasting, the transport will simply blast the packet out to every computer on the network. But when multicasting, the transport will determine which machines are connected to the group, and it will only send the packets to those machines. This will cut down on network traffic for “chatty” protocols to a large degree. As of this writing, there is no database of Class D IPs and which applications use them (like there is for registered ports), and so picking an IP can be somewhat hard to do. A general rule of thumb is to pick a random-looking IP in the Class D range. Collisions are fairly rare, and if you do run into one, it is usually trivial to change the address that the socket multicasts to.
The Point-to-Point Protocol, or PPP, is the way to gain a connection to the Internet with a dial-up modem. It is system-wide functionality that you can use to get the modem to dial out to an ISP, and upon successful connection, your TCPSocket and UDPSocket code will function. Due to the system-wide nature of PPP, the calls that were attached to Socket have been moved to the System class.
Note that if you say System.PPPDisconnect, it will terminate the connection to the Internet. This means that you will kill other applications’ connections as well as your own, so be sure to ask the user if they want the connection terminated before happily killing all connections to the Internet!