WebSockets and sub-protocols
Friday, May 7th, 2010
The WebSockets protocol needs the concept of a sub-protocol to make sure the client and server are sending messages they both understand. A Quake client, for example, can only talk to another Quake client, not a chat client, and a quake/3.1 client might not be able to talk to a quake/5.3 client. To make sure clients and servers are talking the same protocol, WebSockets introduces sub-protocol validation.
Although “sub-protocol” might sound somewhat complicated, it just recognizes that applications will define simple protocols on top of WebSockets, much as they define formats and schemas in XML syntax, or define JSON objects to pass back and forth. Like XML and JSON, WebSockets is a layer that applications build on.
Some examples that WebSockets applications will create are JSON packets over WebSockets, XML over WebSockets, XMPP over WebSockets, and Hessian packets over WebSockets, as well as custom protocols like Quake or a tic-tac-toe game.
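As a rough illustration of the JSON-over-WebSockets case, the sketch below encodes a tic-tac-toe move as a JSON text message and decodes it again. The message fields (`type`, `row`, `col`) and both function names are invented for this example; a real sub-protocol would define its own message shapes.

```python
import json


def encode_move(row, col):
    """Serialize a hypothetical tic-tac-toe move as a JSON text message."""
    # sort_keys gives a stable text form, which makes messages easy to compare.
    return json.dumps({"type": "move", "row": row, "col": col}, sort_keys=True)


def decode_message(text):
    """Parse a text message and check that it has the shape this sketch expects."""
    message = json.loads(text)
    if not isinstance(message, dict) or message.get("type") != "move":
        raise ValueError("not a move message")
    return message["row"], message["col"]
```

The string returned by `encode_move` is what the application would hand to the WebSocket as a text message; the receiving side passes the text it gets to `decode_message`.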
The client and server validate the sub-protocol to make sure a Quake/2.0 client won’t get confused talking to a Quake/1.0 server. At the start of a WebSockets connection, the client’s HTTP handshake sends a Sec-WebSocket-Protocol header with the sub-protocol name, like quake.idsoftware.com/1.0. If the server understands that version, it responds with a Sec-WebSocket-Protocol of quake.idsoftware.com/1.0. If not, it closes the connection.
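The validation step can be sketched in a few lines. This is a simplified model, not a real handshake: headers are plain dictionaries, every other handshake header is omitted, header names are matched case-sensitively, and the function names and the `SUPPORTED` set are assumptions made for illustration.

```python
# Sub-protocols this hypothetical server understands.
SUPPORTED = {"quake.idsoftware.com/1.0"}


def client_handshake_headers(subprotocol):
    """Header the client adds to its HTTP handshake (all others omitted here)."""
    return {"Sec-WebSocket-Protocol": subprotocol}


def server_negotiate(request_headers, supported=SUPPORTED):
    """Return the response headers if the sub-protocol is understood.

    Returns None when it is not, meaning the server closes the connection.
    """
    requested = request_headers.get("Sec-WebSocket-Protocol")
    if requested in supported:
        # Echo the agreed sub-protocol back to the client.
        return {"Sec-WebSocket-Protocol": requested}
    return None
```

With this model, a client asking for quake.idsoftware.com/1.0 gets the same value echoed back, while a client asking for quake.idsoftware.com/2.0 gets `None` and would be disconnected.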
Although the protocol string is arbitrary, it’s a good idea to use unique names like “quake.idsoftware.com” with a version “/1.0”.
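Because the string is arbitrary, the name/version split is purely an application convention. A server that follows it might separate the two parts as in the sketch below; the helper name `split_subprotocol` is hypothetical, and nothing in WebSockets itself requires this format.

```python
def split_subprotocol(value):
    """Split 'name/version' at the last slash; reject strings without both parts."""
    name, sep, version = value.rpartition("/")
    if not sep or not name or not version:
        raise ValueError("expected 'name/version', got %r" % value)
    return name, version
```

Splitting at the last slash keeps a domain-style name such as quake.idsoftware.com intact and leaves the version free for the application to compare however it likes.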
Sub-protocols used from JavaScript in an HTML5 browser will always send and receive Unicode text, not binary. That text is always encoded in UTF-8, a convention necessary for sanity: allowing multiple character encodings would bring only small benefits and would make implementation far more difficult.
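On the server side, the single-encoding rule means every text message can be converted with one fixed codec. A minimal sketch, assuming the payload bytes have already been extracted from the connection (the function names are illustrative only):

```python
def encode_text(message):
    """Turn a Unicode string into the UTF-8 bytes that go on the wire."""
    return message.encode("utf-8")


def decode_text(payload):
    """Turn received bytes back into text; invalid UTF-8 raises UnicodeDecodeError."""
    return payload.decode("utf-8", errors="strict")
```

Strict decoding is a deliberate choice here: a payload that is not valid UTF-8 is treated as an error rather than silently repaired, since no other encoding is permitted.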

