Learn Socket Programming with Python
As a web developer, I haven't had the need to do much socket programming because a web application framework takes care of those details for me. However, there has been a time or two when knowing socket programming was useful in terms of providing secure data transfer to other in-house servers from over the web.
It's hard to avoid socket programming altogether because virtually every type of server needs it to host different types of server applications. If you're a Windows programmer, you probably know socket programming quite well.
Python makes socket programming easy. Other languages, such as Java or C#, can make it more complicated because you have to work with numerous stream and buffer classes. These can get confusing because you often end up chaining several method calls to do something conceptually simple.
With Python, you do not have to worry because these details are taken care of for you. As you will see, Python socket programming is rather elegant.
Socket Programming in General
Sockets use protocols to determine the type of connection used for port-to-port communication between client and server machines. There are protocols for IP addressing, the Domain Name System (DNS), email, FTP, and so on. Each of these services uses its own protocol definition to transfer data. There are two types of socket connections: stateful (connection-oriented) and stateless (connectionless).
With stateful connections, the protocol requires an acknowledgement from the target machine that the data arrived and that all of it is intact. The Transmission Control Protocol (TCP) is a stateful protocol because the receiver acknowledges the data it gets and the sender retransmits anything that goes unacknowledged.
Services that require that the data be sent and acknowledged often use TCP as the basis for their specific protocol. For example, the Simple Mail Transfer Protocol (SMTP) for sending email is a TCP-based protocol. It is important to know that an email got to where it was meant to go.
Stateless connections do not require data transfer acknowledgement like TCP does. A commonly used stateless protocol is the User Datagram Protocol (UDP). DNS uses UDP as the transport for most of its queries. UDP is often used for short, self-contained messages where low overhead matters more than guaranteed delivery. It adds little beyond port numbers and a checksum to the underlying IP packet, so any reliability has to come from the application.
Python Socket Programming
Python has a socket module that allows you to set up virtually any type of socket. To set up a TCP socket, use the following code:

    import socket

    tcpSock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

The first parameter is the socket address family, and in this case it is an Internet (IPv4) socket. There are other socket families as well (Unix, IPv6, etc.). The second parameter specifies that this is a streaming socket. Streaming sockets are stateful socket connections.
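To make the two parameters concrete, here is a small sketch that creates a socket and reads back its family and type. The helper name describe is made up for this illustration. The second call relies on the documented defaults of socket.socket(), which are AF_INET and SOCK_STREAM.

```python
import socket


def describe(sock):
    """Return the address family and socket type names of a socket.

    Hypothetical helper for illustration only.
    """
    return sock.family.name, sock.type.name


if __name__ == "__main__":
    # Explicit parameters, as in the example above.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as explicit_sock:
        print(describe(explicit_sock))

    # No parameters: the defaults are AF_INET and SOCK_STREAM.
    with socket.socket() as default_sock:
        print(describe(default_sock))
```

Both calls report the same family and type, so spelling out the parameters is mostly a matter of readability.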
To set up a UDP socket, do the following:

    udpSock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

In this example, the socket family is still Internet, but the socket type has changed to datagram, which makes this a UDP socket.
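As a rough illustration of what connectionless means in practice, the sketch below sends a single datagram between two UDP sockets on the local machine. There is no listen, accept, or connect step. The function name udp_round_trip, the loopback address, and the two-second timeout are assumptions made for this example. Binding to port 0 asks the operating system to pick any free port.

```python
import socket


def udp_round_trip(message):
    """Send one datagram to a local receiver and return what arrived.

    Hypothetical helper for illustration; message must be bytes.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
        receiver.settimeout(2.0)          # UDP gives no delivery guarantee
        sender.sendto(message, receiver.getsockname())
        data, _sender_addr = receiver.recvfrom(1024)
        return data


if __name__ == "__main__":
    print(udp_round_trip(b"hello"))
```

The timeout is there because nothing in UDP itself reports a lost datagram. Without it, recvfrom would simply wait forever.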
After a socket is defined, there are several methods that you can use to manage connections. To demonstrate, code for a simple TCP timestamp server appears here:
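
    from socket import socket, AF_INET, SOCK_STREAM
    from time import ctime

    HOST = 'localhost'
    PORT = 28812
    BUFSIZE = 1024
    ADDR = (HOST, PORT)

    tcpTimeSrvrSock = socket(AF_INET, SOCK_STREAM)
    tcpTimeSrvrSock.bind(ADDR)
    tcpTimeSrvrSock.listen(50)  # backlog of pending connections

    while True:
        print('waiting for connection...')
        tcpTimeClientSock, addr = tcpTimeSrvrSock.accept()
        print('...connected from:', addr)

        while True:
            data = tcpTimeClientSock.recv(BUFSIZE)  # up to BUFSIZE bytes
            if not data:  # empty result: the client closed the connection
                break
            # Python 3 sockets carry bytes, so decode, format, and re-encode.
            reply = '[%s] %s' % (ctime(), data.decode('utf-8'))
            tcpTimeClientSock.sendall(reply.encode('utf-8'))

        tcpTimeClientSock.close()

    tcpTimeSrvrSock.close()  # not reached: the loop above runs until interrupted
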
In this example, the local machine is used to set up the connection. After the address is defined, containing the host and port, the address is then bound to the socket. The call to listen(50) lets up to 50 incoming connections wait in the queue before further ones are refused. It is not a cap on the total number of clients. The outer while loop keeps the server running and blocks in accept() until a client connects. There are no threads here, so the server handles one client at a time. Each call to recv reads up to 1024 bytes from the incoming data stream.
Reading chunks of data at a time is often referred to as chunking. Data chunking allows the server to process incoming data more effectively. For large streams, processing a stream all at once can result in very poor server performance. In the last part of this example, the server sends back the client's message with a timestamp, built with Python's % string formatting.
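The chunked read can be pulled out into a small helper. The sketch below is not part of the timestamp server. It uses socket.socketpair() to get two already-connected sockets so that it runs without a separate client, and the names recv_until_closed and demo are invented for this example. It shows that a 5000-byte payload arrives over several recv calls and has to be reassembled.

```python
import socket


def recv_until_closed(sock, bufsize=1024):
    """Read from sock in chunks of at most bufsize bytes until the peer closes."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:          # b'' means the other side finished sending
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo(payload=b"x" * 5000):
    """Send payload over a local socket pair and read it back in chunks."""
    left, right = socket.socketpair()
    with left, right:
        left.sendall(payload)
        left.shutdown(socket.SHUT_WR)   # signal end of data to the reader
        return recv_until_closed(right)


if __name__ == "__main__":
    print(len(demo()))
```

Because TCP and other stream sockets deliver a stream of bytes rather than discrete messages, a single recv may return less than what the sender wrote in one call. A loop like this one, or an explicit message delimiter, is the usual way to deal with that.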
To use the server, there has to be a client application that connects to it. You can easily build socket clients that send data to the server.
An example client socket program follows:

    from socket import socket, AF_INET, SOCK_STREAM

    HOST = 'localhost'
    PORT = 28812
    BUFSIZE = 1024
    ADDR = (HOST, PORT)

    tcpTimeClientSock = socket(AF_INET, SOCK_STREAM)
    tcpTimeClientSock.connect(ADDR)

    while True:
        data = input('> ')
        if not data:  # empty input ends the session
            break
        tcpTimeClientSock.sendall(data.encode('utf-8'))
        data = tcpTimeClientSock.recv(BUFSIZE)
        if not data:  # the server closed the connection
            break
        print(data.decode('utf-8'))

    tcpTimeClientSock.close()

Notice a lot of this code is the same as the server code, but this time we are creating an active input message loop instead of setting up a listener. Once prompted for the input message, the user types in the message, and the server sends the message back with the timestamp prepended to the message.
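If you want to see the whole exchange without opening two terminals, the following sketch runs a one-shot version of the server in a background thread and connects to it from the same process. It is a simplified variant written for this illustration, not the programs above: the names serve_one_client and timestamp_round_trip are made up, the server handles a single client and then exits, and it binds to port 0 so the operating system picks a free port instead of 28812.

```python
import socket
import threading
from time import ctime


def serve_one_client(server_sock):
    """Accept one client and echo each message back with a timestamp."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:       # client closed the connection
                break
            reply = "[%s] %s" % (ctime(), data.decode("utf-8"))
            conn.sendall(reply.encode("utf-8"))


def timestamp_round_trip(message):
    """Start a throwaway server, send it one message, return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))   # port 0: any free port
        server_sock.listen(1)
        worker = threading.Thread(target=serve_one_client, args=(server_sock,))
        worker.start()
        with socket.create_connection(server_sock.getsockname(), timeout=5) as client:
            client.sendall(message.encode("utf-8"))
            reply = client.recv(1024).decode("utf-8")
        worker.join()   # the server thread ends once the client has closed
    return reply


if __name__ == "__main__":
    print(timestamp_round_trip("hello"))
```

The reply has the form of a bracketed timestamp followed by the original text, which is the same behavior the interactive client shows at its prompt.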
The examples thus far use only the general socket methods. If you were going to do protocol-specific implementations (e.g., SMTP), the standard library provides dedicated modules, such as smtplib, that come with their own additional methods.
Conclusion
In this article you learned about Python's basic socket programming functionality. You can easily set up a TCP server and client using the general socket methods of the Python standard library.
Sending messages back and forth over a basic protocol is quite simple. Python also has a higher-level module called socketserver. With socketserver, you can set up network servers without writing the accept loop yourself.
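As a brief preview, here is a minimal sketch of a timestamp server built on socketserver. The class name TimestampHandler and the function ask_once are invented for this example, and the line-based framing (one message per line) is an assumption of this sketch rather than something the earlier server does. The server is started in a background thread only so that the example can query itself and then shut down.

```python
import socket
import socketserver
import threading
from time import ctime


class TimestampHandler(socketserver.StreamRequestHandler):
    """Reply to each line the client sends with a timestamped copy."""

    def handle(self):
        # self.rfile and self.wfile are file-like wrappers around the socket.
        for line in self.rfile:
            text = line.decode("utf-8").rstrip("\r\n")
            reply = "[%s] %s\n" % (ctime(), text)
            self.wfile.write(reply.encode("utf-8"))


def ask_once(message):
    """Start a temporary server, send one line, and return the reply line."""
    with socketserver.TCPServer(("127.0.0.1", 0), TimestampHandler) as server:
        thread = threading.Thread(target=server.serve_forever)
        thread.start()
        try:
            with socket.create_connection(server.server_address, timeout=5) as client, \
                    client.makefile("r", encoding="utf-8") as reader:
                client.sendall((message + "\n").encode("utf-8"))
                reply = reader.readline()
        finally:
            server.shutdown()   # stop serve_forever, then wait for the thread
            thread.join()
    return reply.rstrip("\n")


if __name__ == "__main__":
    print(ask_once("hello"))
```

Compared with the hand-written server, the bind, listen, and accept steps are handled by TCPServer, and the handler only describes what to do with one connection.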
In the next article, I will cover this module in more depth while giving examples of client and server to demonstrate the most common methods, as was done here.