WebSockets in Windows Store Apps
The WebSocket protocol was standardized in 2011 as RFC 6455. It was designed specifically to be implemented in web browsers and web servers. The standard is TCP-based and by default runs on the same ports as HTTP and HTTPS (80 and 443) for standard and secured implementations, respectively. A common myth is that it uses HTTP for communications. In fact, HTTP is used only for the initial handshake, which establishes communication and upgrades the request to the WebSocket protocol. Once the connection is established, it provides a continuous socket between the client and the server for real-time communications, supporting scenarios ranging from online games to collaboration and messaging apps.
The WebSocket protocol is fully supported by Windows Store apps through the Windows Runtime (WinRT). It gives a Windows Store app a way to maintain a single connection with a server and send data both ways while keeping that connection open. It is less likely to be blocked by firewalls because it uses the same ports that HTTP does. The implementation for WebSockets exists in the Windows.Networking.Sockets namespace.
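To make the handshake a little more concrete: under RFC 6455 the client sends a random Sec-WebSocket-Key header with its HTTP upgrade request, and the server confirms the upgrade by returning a Sec-WebSocket-Accept header derived from that key. The sketch below illustrates that derivation only. It is plain .NET console code, not WinRT code, and the names HandshakeSketch and ComputeAccept are hypothetical. A WinRT client never has to do this itself because the runtime performs the handshake.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HandshakeSketch
{
    // Fixed GUID that RFC 6455 defines for the opening handshake.
    private const string HandshakeGuid = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    // Derives the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key:
    // Base64(SHA-1(key + GUID)).
    public static string ComputeAccept(string secWebSocketKey)
    {
        using (var sha1 = SHA1.Create())
        {
            var input = Encoding.ASCII.GetBytes(secWebSocketKey + HandshakeGuid);
            return Convert.ToBase64String(sha1.ComputeHash(input));
        }
    }

    public static void Main()
    {
        // Sample key taken from the example in RFC 6455.
        Console.WriteLine(ComputeAccept("dGhlIHNhbXBsZSBub25jZQ=="));
        // prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
    }
}
```

After this exchange the connection stops carrying HTTP messages and carries WebSocket frames instead.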
A Sockets Primer
Sockets facilitate inter-process communication (IPC) between processes running on a network. Although it is possible to use IPC on the same machine, the most common case is communication between physically separate machines. The most common protocol suite for this kind of communication is Transmission Control Protocol/Internet Protocol (TCP/IP), which is often referred to as “the language of the internet.”
TCP/IP provides two layers. One layer, TCP, handles the actual messaging component. It can take large messages, or streams, break them into packets to transmit over the network, then verify that there was no data loss and reassemble the packets at the target site. The second layer, IP, handles routing to ensure the messages make it to the right destination.
A typical session involves two endpoints. An endpoint is the combination of an IP address and a port. A port is simply a number that identifies the type of communication expected at that address. By convention, popular protocols use specific ports, although this can be overridden. For example, HTTP uses TCP port 80 by default, while the secured HTTPS protocol uses port 443. The client endpoint will “call” the server endpoint using an address and port. If the server responds, the client typically provides a dynamic port from a range of reusable ports to establish the client endpoint. This is shared with the server and two-way communications can begin.
Protocols like WebSockets and HTTP use TCP. In the case of HTTP, well-defined messages are exchanged in the form of requests and responses. When you navigate to a web page, a request that the server understands is sent over TCP. For example, if you type in the URL to my personal website, http://www.jeremylikness.com/, something like this will be sent to the server:
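As a small illustration of endpoints and default ports, the following sketch uses plain .NET console types (System.Uri and System.Net.IPEndPoint) rather than WinRT APIs. The class name, the helper DescribeSession, and both IP addresses are assumptions made for the example. The addresses come from ranges reserved for documentation and do not refer to real hosts.

```csharp
using System;
using System.Net;

public static class EndpointSketch
{
    // Formats a client/server endpoint pair as "client -> server".
    public static string DescribeSession(IPEndPoint client, IPEndPoint server)
    {
        return string.Format("{0} -> {1}", client, server);
    }

    public static void Main()
    {
        // The default port is implied by the URI scheme.
        Console.WriteLine(new Uri("http://www.jeremylikness.com/").Port);  // prints 80
        Console.WriteLine(new Uri("https://www.jeremylikness.com/").Port); // prints 443

        // Hypothetical server endpoint listening on the HTTP port.
        var server = new IPEndPoint(IPAddress.Parse("192.0.2.10"), 80);

        // 49152 stands in for a dynamic port picked on the client side.
        var client = new IPEndPoint(IPAddress.Parse("198.51.100.7"), 49152);

        Console.WriteLine(DescribeSession(client, server));
        // prints "198.51.100.7:49152 -> 192.0.2.10:80"
    }
}
```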
    GET http://www.jeremylikness.com/ HTTP/1.1
    Host: www.jeremylikness.com
    Connection: keep-alive
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36
    DNT: 1
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: en-US,en;q=0.8
The message indicates how the page should be fetched, what the page is, and even the types of responses the browser is capable of handling. The user agent portion of the request describes the client and is useful both for analytics tracking by the server and for serving a page that the client supports. Notice the request can also include language and encoding options, such as the ability to decompress data if the server chooses to compress it before sending.
The WebSocket protocol uses the same ports but enables real-time, two-way messaging. Like HTTP, it uses TCP for the underlying connection. Unlike HTTP, it allows information to be streamed both ways simultaneously. This allows the client to provide continuous data to the server (such as actions that are taking place in an online game) while the server can provide its own updates, whether those describe what is happening in the virtual game universe or contain the latest numbers for a stock ticker display.
The WinRT Implementation of WebSockets
The Windows Runtime provides implementations for creating a WebSockets client or server. A server is less practical because clients typically roam and do not have well-known IP addresses, so this article will focus primarily on the client implementation. There are two types of WebSocket client APIs you can use.
The MessageWebSocket is a simpler implementation that is suitable for smaller messages. It queues an entire “message” and provides notification when those messages are received. A message can be UTF-8 or binary formatted. The StreamWebSocket is suitable for larger messages or continuous, real-time streams. You control how much of a message you read at any given moment, and because the messages are implemented as streams, they support only binary messages.
WebSockets in Action
My latest book, Programming the Windows Runtime: A Comprehensive Guide to WinRT with Examples in C# and XAML, provides extensive coverage of networking APIs in Chapter 10. The sample code for the book is open source. You can download the sample code from http://winrtexamples.codeplex.com/ and open the WebSocketsExamples project from the Chapter10 solution folder to follow along with this example.
The example project demonstrates both APIs you can use from WinRT to take advantage of the WebSocket protocol. It uses a server provided by the WebSocket.org website that exposes an “echo service.” This service will echo back any data sent to it by a connected client. WebSockets are accessed using a standard URI, as declared in MainPage.xaml.cs:
private readonly Uri echoService = new Uri("ws://echo.websocket.org", UriKind.Absolute);
The MessageWebSocket class is an abstraction of the protocol that focuses on sending simple messages. A message is either read or written in a single operation, as opposed to being streamed continuously. It is also the class you must use to support UTF-8 messages, because the stream-based API only supports binary (although you can encode and decode the binary to and from UTF-8 yourself, the MessageWebSocket class provides native support for this). In order to use any socket type within a Windows Store app, you must enable either the Home or Work Networking or the Internet (Client & Server) capability.
The ButtonBase_OnClick method in the MainPage.xaml.cs file demonstrates how to use the MessageWebSocket class. After creating an instance of the class, set the type of the message (either binary or UTF-8):
this.socket.Control.MessageType = SocketMessageType.Utf8;
You can also register for events that will fire whenever a message is received and when the socket is closed. The socket uses underlying unmanaged resources and should be disposed when you are done using it. The easiest way to do this is to call Dispose in the Closed event handler.
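The event registration and cleanup described above might look roughly like the following. This is a sketch rather than the code from the sample project. The class name MessageSocketSketch and the handler names are hypothetical, and the handlers only write to the debug output.

```csharp
using System;
using System.Diagnostics;
using Windows.Networking.Sockets;

public sealed class MessageSocketSketch
{
    private MessageWebSocket socket;

    // Creates the socket and wires up both events before connecting.
    public MessageWebSocket CreateSocket()
    {
        this.socket = new MessageWebSocket();
        this.socket.Control.MessageType = SocketMessageType.Utf8;
        this.socket.MessageReceived += this.OnMessageReceived;
        this.socket.Closed += this.OnClosed;
        return this.socket;
    }

    // Raised for each complete message that arrives from the server.
    private void OnMessageReceived(
        MessageWebSocket sender,
        MessageWebSocketMessageReceivedEventArgs args)
    {
        Debug.WriteLine(string.Format("Received a {0} message.", args.MessageType));
    }

    // Raised when the connection is closed; release the unmanaged resources here.
    private void OnClosed(IWebSocket sender, WebSocketClosedEventArgs args)
    {
        Debug.WriteLine(string.Format("Closed: {0} {1}", args.Code, args.Reason));
        sender.Dispose();
        this.socket = null;
    }
}
```

Registering the handlers before connecting avoids missing a message or a close notification that arrives immediately after the connection is established.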
Initiate the connection by calling and waiting for ConnectAsync to complete:
await this.socket.ConnectAsync(echoService);
The example app will accept any message you type and send it to the echo service. The message must be sent using the IOutputStream exposed by the socket. The easiest way to do this is to create an instance of a DataWriter to send the message. The DataWriter allows you to write various data types that it buffers until you call StoreAsync. This will flush the buffer to the underlying stream.
    var writer = new DataWriter(this.socket.OutputStream);
    writer.WriteString(this.Text.Text);
    await writer.StoreAsync();
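If you send more than one message over the same socket, it can help to wrap the write in a small helper. The sketch below is one possible shape for it. SendSketch and SendTextAsync are hypothetical names that do not come from the sample project. It calls DetachStream before the writer is disposed, on the assumption that you want the socket's output stream to stay open for later sends.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Networking.Sockets;
using Windows.Storage.Streams;

public static class SendSketch
{
    // Writes one text message to an already connected MessageWebSocket.
    public static async Task SendTextAsync(MessageWebSocket socket, string text)
    {
        using (var writer = new DataWriter(socket.OutputStream))
        {
            // Buffer the string, then flush the buffer to the socket.
            writer.WriteString(text);
            await writer.StoreAsync();

            // Detach so that disposing the writer does not close the
            // socket's output stream.
            writer.DetachStream();
        }
    }
}
```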
Not all error messages for the socket are mapped to .NET Exception class instances. Instead, you must inspect the HResult of the underlying exception to determine what went wrong. Fortunately, the WebSocketError class provides a static method, GetStatus, that translates the result to the corresponding WebErrorStatus enumeration value. The example's ToErrorMessage helper method returns a string with the original message and the enumeration value.
    private static string ToErrorMessage(Exception ex)
    {
        var status = WebSocketError.GetStatus(
            ex.GetBaseException().HResult);
        return string.Format("{0} ({1})", ex.Message, status);
    }
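A typical place to apply this translation is around the connection attempt. The following sketch assumes a hypothetical helper named TryConnectAsync that returns null on success and an error description on failure. It is not part of the sample project, and it assumes the caller will create a new socket before retrying.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Networking.Sockets;
using Windows.Web;

public static class ConnectSketch
{
    // Returns null on success, or a readable error description on failure.
    public static async Task<string> TryConnectAsync(MessageWebSocket socket, Uri server)
    {
        try
        {
            await socket.ConnectAsync(server);
            return null;
        }
        catch (Exception ex)
        {
            // Translate the HRESULT into a WebErrorStatus value.
            WebErrorStatus status =
                WebSocketError.GetStatus(ex.GetBaseException().HResult);

            // Release the failed socket; the caller creates a new one to retry.
            socket.Dispose();

            return status == WebErrorStatus.Unknown
                ? ex.Message
                : string.Format("{0} ({1})", ex.Message, status);
        }
    }
}
```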
The MessageReceived event is raised whenever a message is sent from the server to the client through the socket. In the example app, this should happen any time data is sent because the server will echo the data back. The event handler receives the socket that the information arrived on, along with event arguments that provide access to the message. You can inspect the message type (binary or UTF-8) and open a reader or stream to access the message. In this example, the SocketMessageReceived event handler sets the reader to use UTF-8 encoding, then reads the message and displays it.
    using (var reader = args.GetDataReader())
    {
        reader.UnicodeEncoding = UnicodeEncoding.Utf8;
        var text = reader.ReadString(reader.UnconsumedBufferLength);
        this.Response.Text = text;
    }
This is the simplest method for dealing with sockets that are designed to share messages. In cases where you are using the socket to stream real-time information and don’t necessarily have simple messages, you may want to use the StreamWebSocket implementation instead. This provides a continuous two-way stream for sending and receiving information. The example app uses the same echo service to stream prime numbers and echo them back to the display when you click the Start button.
You create and connect to a StreamWebSocket the same way as a MessageWebSocket. You can also register for the Closed event. Instead of sending and receiving messages, however, the stream version expects you to interface directly with the input and output streams provided by the socket. The example app starts a long-running Task encapsulated in the ComputePrimes method. It is passed the OutputStream of the socket. It iterates through positive integers and writes out any that are computed to be primes, then delays for 1 second:
    if (IsPrime(x))
    {
        var array = Encoding.UTF8.GetBytes(string.Format(" {0} ", x));
        await outputStream.WriteAsync(array.AsBuffer());
        await Task.Delay(TimeSpan.FromSeconds(1));
    }
If the integer is not a prime, it delays for a millisecond just to prevent hogging the CPU. Another long-running task is used to receive the echo. It allocates a buffer, waits for data to arrive in the stream, then reads and decodes the data.
    var bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length);
    if (bytesRead > 0)
    {
        var text = Encoding.UTF8.GetString(buffer, 0, bytesRead);
        this.DispatchTextToPrimes(text);
    }
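The fragment above is the body of a receive loop. A minimal, self-contained sketch of such a loop follows. It is written against System.IO.Stream so that it can run in a plain .NET console app with a MemoryStream standing in for the socket. In a Windows Store app you would presumably pass something like the socket's InputStream converted with AsStreamForRead. The names EchoReaderSketch and ReadEchoAsync and the 1000-byte buffer size are assumptions for the example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class EchoReaderSketch
{
    // Reads until the stream ends, decoding each chunk as UTF-8 and
    // handing it to the callback. Decoding chunk by chunk is safe here
    // because digits and spaces are single-byte characters in UTF-8.
    public static async Task ReadEchoAsync(Stream stream, Action<string> onText)
    {
        var buffer = new byte[1000];
        while (true)
        {
            var bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length);
            if (bytesRead == 0)
            {
                break; // end of stream: nothing more to read
            }

            onText(Encoding.UTF8.GetString(buffer, 0, bytesRead));
        }
    }

    public static void Main()
    {
        // A MemoryStream stands in for the echoed data from the server.
        var fake = new MemoryStream(Encoding.UTF8.GetBytes(" 2  3  5 "));
        var received = new StringBuilder();

        ReadEchoAsync(fake, text => received.Append(text)).Wait();

        Console.WriteLine("[" + received + "]"); // prints "[ 2  3  5 ]"
    }
}
```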
This example also demonstrates the fact that you can have multiple sockets open to the same destination at once. You can run the example, click the button to start generating primes, then use the message-based version to send and receive messages without interrupting the stream of prime numbers. Both socket types reduce the amount of code you have to write because you don't have to worry about the details of the underlying transport (TCP). When you need to manage a raw TCP connection, you can use the traditional sockets components.
Advanced WebSockets
The example app demonstrates the basic use of WebSockets in Windows Store apps. The components for WebSocket communication support advanced options as well. The MessageWebSocket component provides the following settings:
- MaxMessageSize – Used to configure the maximum size of a single message sent or received.
- MessageType – Controls the type of message (UTF-8 or binary).
- OutboundBufferSizeInBytes – Controls the size of the send buffer.
- ProxyCredential – Use this to configure a credential for the proxy server (will be sent as part of the HTTP header authentication).
- ServerCredential – Used to configure a credential to be read by the WebSocket server.
- SupportedProtocols – Provides a list of sub-protocols supported by the web socket (these are protocols that use the WebSocket but provide additional functionality through their own protocol implementation).
The StreamWebSocket component provides these options:
- NoDelay – Turn the Nagle algorithm on or off.
- OutboundBufferSizeInBytes – As with the MessageWebSocket, controls the size of the send buffer.
- ProxyCredential – Credentials to authenticate to the proxy server.
- ServerCredential – Credentials to authenticate to the WebSocket server.
- SupportedProtocols – Provides a list of sub-protocols.
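These settings are exposed through the Control property of each socket and are normally set before connecting. The sketch below shows how a few of them might be configured. The class name, the sizes, the "chat" sub-protocol, and the credential resource name are placeholder values chosen for illustration, not recommendations.

```csharp
using Windows.Networking.Sockets;
using Windows.Security.Credentials;

public static class SocketOptionsSketch
{
    // Configures a message-based socket before it is connected.
    public static MessageWebSocket CreateMessageSocket()
    {
        var socket = new MessageWebSocket();
        socket.Control.MessageType = SocketMessageType.Utf8;
        socket.Control.MaxMessageSize = 64 * 1024;            // placeholder limit
        socket.Control.OutboundBufferSizeInBytes = 16 * 1024; // placeholder size
        socket.Control.SupportedProtocols.Add("chat");        // hypothetical sub-protocol
        return socket;
    }

    // Configures a stream-based socket before it is connected.
    public static StreamWebSocket CreateStreamSocket(string userName, string password)
    {
        var socket = new StreamWebSocket();

        // true asks for small writes to be sent immediately
        // instead of being batched by the Nagle algorithm.
        socket.Control.NoDelay = true;

        // "example-resource" is a placeholder resource name.
        socket.Control.ServerCredential =
            new PasswordCredential("example-resource", userName, password);
        return socket;
    }
}
```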
You can also secure the WebSocket communications. The default scheme ws:// uses the standard, unsecured protocol. If the server supports encrypted connections, use the wss:// scheme instead. This will secure the connection by encrypting the streams. It works by tunneling the stream through Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL).
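From the client's point of view, the only change is the URI scheme. As a small illustration, the hypothetical helper below (ToSecure is not a WinRT API) rewrites a ws:// URI to its wss:// equivalent using plain .NET types. It assumes the server accepts secured connections, and when the original URI names an explicit non-default port it assumes the server accepts them on that same port.

```csharp
using System;

public static class SecureUriSketch
{
    // Returns the wss:// form of a WebSocket URI.
    public static Uri ToSecure(Uri endpoint)
    {
        if (endpoint.Scheme == "wss")
        {
            return endpoint; // already secured
        }

        if (endpoint.Scheme != "ws")
        {
            throw new ArgumentException("Expected a ws:// or wss:// URI.", "endpoint");
        }

        var builder = new UriBuilder(endpoint)
        {
            Scheme = "wss",
            // -1 omits the port so the default for the new scheme applies;
            // an explicit non-default port is kept as written.
            Port = endpoint.IsDefaultPort ? -1 : endpoint.Port
        };
        return builder.Uri;
    }

    public static void Main()
    {
        Console.WriteLine(ToSecure(new Uri("ws://echo.websocket.org")));
    }
}
```

The resulting URI can be passed to ConnectAsync in the same way as the unsecured one.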
The WebSockets protocol is a new but popular protocol that makes it easy to build real-time communications into your Windows Store apps. WinRT supports a variety of network protocols that are all covered in Chapter 10 of Programming the Windows Runtime: A Comprehensive Guide to WinRT with Examples in C# and XAML. The book contains 20 chapters plus a comprehensive glossary and more than 80 open-source projects providing specific examples that use WinRT APIs.