Protocol Mechanics
As we have talked about before, within the context of XML messaging, Jabber's open XML protocol contains only three top-level XML elements:
<message/> (carrier for 'ordinary' conversational elements)
<presence/> (carrier for presence and availability messages)
<iq/> (carrier for informational and query messages)
With these elements, a client can create messages that accomplish all the design goals of an instant messaging system and more.
NOTE
As you'll see in later chapters, the Jabber server uses other XML elements for system management, but these three are the ones clients use most.
You can think of these message types as the basic structure that carries concepts between users, between applications, and between Jabber message switches. They are the equivalent of the rules of spoken language, whereby words and other elements of sentence structure are combined to form grammatical sentences.
You may not have realized it (because we haven't discussed it yet) but all these asynchronously generated and arriving XML stanzas with presence, conversation, and queries don't simply appear at the switch completely out of context, as it were.
As shown in Figure 3.1, from a client standpoint, a session with the Jabber switch occurs in the context of an XML stream. This stream may be viewed as similar to a telephone conversation where a "hello" and "goodbye" are the signals for the beginning and end of a conversation. Everything in between is the body of a conversation; whether it's about the impending snow storm or a stock transfer is immaterial from the standpoint of the switch. It just wants well-formed "sentences" in the stream. The tagging of each individual message will aid the switch in routing your conversation.
Figure 3.1 A conversational stream between a single client and the switch.
Of course, jabberd is handling many simultaneous streams from potentially many clients. As depicted in Figure 3.2, some of your messages are destined for other clients, and the switch has to be able to route those on your behalf.
Figure 3.2 Serving multiple clients.
The switch can also, on its own initiative, send messages as a proxy for you. For example, as Figure 3.2 suggests, when you go offline, a presence message is sent to the other clients that have you on their rosters and want to alert their users that you are no longer available.
Thus, within the possible conversations between a client and a server, some messages are relevant to setting up and maintaining a relationship with the server, and some are concerned with maintaining the conversational context between clients. All of them must be expressed in the syntax covered in this chapter, and all are contained within an XML stream that begins with a <stream> tag and officially ends when the server gets the matching </stream> tag. This makes sense when you remember that an XML document can have only a single root tag. Everything sent to or via the server must therefore be contained within the confines of an outermost tag: the <stream> </stream> pair.
To prove that this is so and that we are not just spewing nonsense, let's run a little Telnet session that does the equivalent of a "Hello world" programming example.
In an editor buffer, type this in:
<stream:stream to='localhost' xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams'>
If your development Jabber server is somewhere other than localhost, then replace that bit with the correct information.
Also, type this in on the next line:
</stream:stream>
Next, start a shell to run a Telnet session as shown in Figure 3.3.
Figure 3.3 Telnet "Hello World" example.
Make certain that you turn on local echo as shown in Figure 3.3 so that you can see what you send alongside the output returned by the jabberd message switch.
Next, cut the first line from your editor buffer and paste it into the Telnet window, as shown in Figure 3.4.
NOTE
We didn't recommend typing in the initial XML because we can't type that many characters without an error; if you've a steady hand, however, you could have done so. Also note that there is a time limit imposed by the server for receiving data and logging in. If you fail to meet that time limit, it boots you off. That's another reason not to hand type it.
Figure 3.4 Completing "Hello World."
As shown, the server should return the string:
<?xml version='1.0'?><stream:stream xmlns:stream='http://etherx.jabber.org/streams' id='3E453A18' xmlns='jabber:client' from='localhost'>
To which you reply (either by typing or pasting),
</stream:stream>
Essentially, this is the equivalent of ringing the phone, waiting for an answer, then ringing off.
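The same "ring, answer, ring off" exchange can be sketched in a few lines of Python instead of Telnet. This is only an illustration under stated assumptions: the function names `stream_header` and `hello_world` are made up for this example, the server is assumed to listen on port 5222 of localhost, and a single `recv` call is assumed to be enough to capture the server's opening tag.

```python
import socket

STREAM_NS = "http://etherx.jabber.org/streams"
STREAM_FOOTER = "</stream:stream>"


def stream_header(host: str) -> str:
    """Return the opening <stream:stream> tag addressed to the given server."""
    return (
        f"<stream:stream to='{host}' xmlns='jabber:client' "
        f"xmlns:stream='{STREAM_NS}'>"
    )


def hello_world(host: str = "localhost", port: int = 5222,
                timeout: float = 5.0) -> str:
    """Open a stream, read whatever the server answers, then close the stream.

    Assumption: one recv() of up to 4096 bytes captures the server's reply.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(stream_header(host).encode("utf-8"))      # "hello"
        reply = sock.recv(4096).decode("utf-8", errors="replace")
        sock.sendall(STREAM_FOOTER.encode("utf-8"))            # "goodbye"
    return reply

# Usage, with a Jabber server running locally:
#     print(hello_world("localhost"))
```

Sending the header programmatically also sidesteps the server's time limit, since nothing has to be typed by hand.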
The answer you got from the server contained an id field with some hex digits in it. Notice the importance and varying uses of the id as you go through the following examples.
To expand the "Hello World" example, let's exercise one of the three acceptable message types. This time, after opening a session with the <stream> tag shown earlier, let's ask the server what it takes to register a new user by pasting in the following "sentence," which is an <iq> XML stanza whose type is 'get'; that is, we're asking for information.
<iq id='AnythingYouWantHere' type='get'> <query xmlns='jabber:iq:register'/> </iq>
The server returns an <iq> message with the same ID you used to make the query, whose type is 'result':
<iq id='AnythingYouWantHere' type='result'> <query xmlns='jabber:iq:register'> <password/> <instructions> Choose a username and password to register with this server. </instructions> <name/> <email/> <username/> </query> </iq>
This is a sentence in Jabber-ese explaining that if you're going to create a client in these here parts, you're going to have to supply a name and password, a nickname (username), and an email address. Now you can paste back the following into your Telnet window:
<iq id='WhateverYouWantHere' type='set'> <query xmlns='jabber:iq:register'> <username>Praline</username> <password>Cleese</password> <name>Mr_Praline</name> <email>Mr_Praline@python.com</email> </query> </iq>
The Jabber server responds back to you using the same ID, and tells you that you are good to go as a new user:
<iq id='WhateverYouWantHere' type='result'/>
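A client would normally build these registration stanzas rather than paste them. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `build_iq` and its parameters are hypothetical, and the output uses double-quoted attributes, which is equivalent XML to the single-quoted examples shown here.

```python
import xml.etree.ElementTree as ET


def build_iq(iq_id, iq_type, query_ns, fields=None):
    """Serialize an <iq> stanza holding one <query> in the given namespace.

    `fields` maps child element names (e.g. "username") to their text.
    """
    iq = ET.Element("iq", {"id": iq_id, "type": iq_type})
    # The namespace is written as a plain xmlns attribute on <query>.
    query = ET.SubElement(iq, "query", {"xmlns": query_ns})
    for name, value in (fields or {}).items():
        ET.SubElement(query, name).text = value
    return ET.tostring(iq, encoding="unicode")


# Ask what registration requires, then submit the answers.
ask = build_iq("AnythingYouWantHere", "get", "jabber:iq:register")
register = build_iq(
    "WhateverYouWantHere", "set", "jabber:iq:register",
    {"username": "Praline", "password": "Cleese",
     "name": "Mr_Praline", "email": "Mr_Praline@python.com"},
)
```

Building the stanza from a tree rather than by string concatenation means any special characters in the field values are escaped automatically.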
If you were writing a client in a programming language instead of shoving some XML in front of the server's face with Telnet, you can imagine that your register-user method would store the outgoing request in a dictionary keyed by id and then go into an idle loop. Whenever the server got around to responding, you would parse the XML out of the continuously arriving stream, match the response to the request by id, and see whether you were successful. In code, your client should interpret an iq of type 'result' as success. On the other hand, if you tried a second time to register, the server would slap your virtual wrist by responding as follows:
<iq id='WhateverYouWantHere' type='error'> <query xmlns='jabber:iq:register'> <username>Praline</username> <password>Cleese</password> <name>Mr_Praline</name> <email>Mr_Praline@python.com</email> </query> <error code='409'>Username Not Available</error> </iq>
This is what you would fervently hope (if you were Mr. Praline) that the Jabber server would do.
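The id-matching idea described above can be sketched as a small dictionary of pending requests. This is a minimal illustration, not how any particular Jabber client library works: the class name `PendingRequests` and its callback signature are invented here, and each incoming stanza is assumed to arrive as a standalone string that has already been cut out of the stream.

```python
import xml.etree.ElementTree as ET


class PendingRequests:
    """Track outgoing <iq> requests by id and match replies to them."""

    def __init__(self):
        self._callbacks = {}  # iq id -> callback(success, error_code)

    def register(self, iq_id, callback):
        """Remember that a request with this id is awaiting a reply."""
        self._callbacks[iq_id] = callback

    def handle(self, stanza_xml):
        """Dispatch one incoming <iq>; return False if its id is unknown."""
        iq = ET.fromstring(stanza_xml)
        callback = self._callbacks.pop(iq.get("id"), None)
        if callback is None:
            return False
        if iq.get("type") == "error":
            error = iq.find("error")
            code = error.get("code") if error is not None else None
            callback(False, code)
        else:
            # Anything else is treated as success in this sketch;
            # a type='result' reply is the expected case.
            callback(True, None)
        return True


outcomes = []
tracker = PendingRequests()
tracker.register("WhateverYouWantHere",
                 lambda ok, code: outcomes.append((ok, code)))
tracker.handle("<iq id='WhateverYouWantHere' type='result'/>")
# outcomes now holds one successful result: (True, None)
```

Because the entry is removed once its reply arrives, the same id can be reused for a later request, as the Telnet examples here do.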
Now assume you've logged in successfully as Mr. Praline by opening a Telnet stream, pasting in the opening <stream> tag, and pasting in a login request:
<iq id='WhateverYouWantHere' type='set'> <query xmlns='jabber:iq:auth'> <username>Praline</username> <password>Cleese</password> <resource>telnet</resource> </query> </iq>
You got back the hoped for
<iq id='WhateverYouWantHere' type='result'/>
from jabberd. Now say you have some buddies on your roster, which you could have created via a set of <iq>-based messages. We're not going to show that here, as we want to keep things simple for the sake of illustration. Let's inject a <presence/> into the stream and see the effects; simply paste in or type:
<presence/>
This forces the server to take a peek at your stored profile (in this case praline.xml) and send back the current presence of each buddy on your roster:
<presence from='jane@localhost/Home' to='Praline@localhost'> <status>available</status> <priority>0</priority> <x xmlns='jabber:x:delay' from='jane@localhost/Home' stamp='20030208T20:47:44'/> <x xmlns='jabber:x:delay' from='jane@localhost/Home' stamp='20030208T20:47:44'/> </presence> <presence from='dana@localhost/Home' to='Praline@localhost'> <status>available</status> <priority>0</priority> <x xmlns='jabber:x:delay' from='dana@localhost/Home' stamp='20030208T20:15:29'/> <x xmlns='jabber:x:delay' from='dana@localhost/Home' stamp='20030208T20:15:30'/> </presence>
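A client has to turn that burst of <presence> stanzas into something it can display. The following sketch assumes the stanzas have been collected into one string; because that string has no single root element, the hypothetical helper `parse_presences` wraps it in a throwaway `<wrapper>` element before parsing.

```python
import xml.etree.ElementTree as ET


def parse_presences(raw):
    """Return one dict per <presence> stanza found in `raw`."""
    # Several sibling stanzas are not a well-formed document on their own,
    # so give them a temporary root element.
    root = ET.fromstring(f"<wrapper>{raw}</wrapper>")
    buddies = []
    for presence in root.findall("presence"):
        buddies.append({
            "from": presence.get("from"),
            "status": presence.findtext("status"),
            "priority": int(presence.findtext("priority", default="0")),
        })
    return buddies


sample = (
    "<presence from='jane@localhost/Home' to='Praline@localhost'>"
    "<status>available</status><priority>0</priority>"
    "<x xmlns='jabber:x:delay' from='jane@localhost/Home' "
    "stamp='20030208T20:47:44'/></presence>"
    "<presence from='dana@localhost/Home' to='Praline@localhost'>"
    "<status>available</status><priority>0</priority></presence>"
)
buddies = parse_presences(sample)
```

The <x> extension elements are simply ignored here; a fuller client could read the stamp attribute to show when each presence was last updated.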
Finally, let's say you wanted to re-enact the infamous Monty Python "Dead Parrot" sketch online. Here, you would use the final XML stanza type, <message/>. Once again, given that you have opened a Telnet session to your jabberd on port 5222, have gotten the server's attention with a starting <stream> tag, and have logged in with the <iq> tag shown earlier, you can send a message:
<message from='Praline@localhost' to='dana@localhost' type='chat'> <body> I wish to complain about this parrot what I purchased not half an hour ago from this very boutique. </body> </message>
Assuming dana is using one of the several real clients and not just typing in XML stanzas as you have been doing, a chat window pops up on dana's client (see Figure 3.5), and away you go with the dialogue.
Figure 3.5 Using the "message" XML stanza.
What you actually see in the more primitive Telnet window looks like this, however, with a message ID and thread ID generated by dana's client (so that it can track subsequent messages on the same topic, at least as the concept of "topics" is understood by the communicants). Note too that the apostrophes are escaped, as you would expect to see in a proper XML stream: a client and the server must both deal with proper XML construction.
<message id='jcl_52' to='Praline@localhost' type='chat' from='dana@localhost/Home'> <thread>fd6d59abf7970b853d09580693569290739a80ae</thread> <body> Oh yes, the, uh, the Norwegian Blue...What&apos;s,uh...What&apos;s wrong with it? </body> <x xmlns='jabber:x:event'> <composing/> </x> </message>
If you wanted to respond, you might type into the Telnet stream:
<message id='' from='Praline@localhost' to='dana@localhost' type='chat'> <body> I'll tell you what's wrong with it, my lad. 'E's dead, that's what's wrong with it! </body> </message>
More about the id attribute and the <thread/> tag later.
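When a program rather than a person composes the <message>, escaping the body becomes the program's job. Here is a minimal sketch using the standard library's `xml.sax.saxutils`; the function name `build_message` is hypothetical, and escaping apostrophes as `&apos;` mirrors what dana's client appears to do rather than something XML strictly requires inside element text.

```python
from xml.sax.saxutils import escape, quoteattr


def build_message(sender, recipient, body, msg_type="chat"):
    """Return a <message> stanza with the body safely escaped."""
    # escape() always handles &, < and >; the extra mapping also
    # converts apostrophes and double quotes to entities.
    safe_body = escape(body, {"'": "&apos;", '"': "&quot;"})
    return (
        f"<message from={quoteattr(sender)} to={quoteattr(recipient)} "
        f"type={quoteattr(msg_type)}><body>{safe_body}</body></message>"
    )


reply = build_message(
    "Praline@localhost", "dana@localhost",
    "I'll tell you what's wrong with it, my lad. 'E's dead!",
)
```

`quoteattr` picks the quoting for attribute values itself, so addresses containing quote characters would not break the stanza either.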
After a bit more of this, you would end the conversation with the server by typing in the closing </stream:stream> tag. If you were actually typing in the example, you might have noticed that as soon as you typed the final > character, the Telnet session closed immediately. The Jabber server reads every character sent to it and uses an event-driven XML parser. As soon as the "hang up" event is fully detected, poof, your session is gone.
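The event-driven behavior can be imitated with Python's incremental `XMLPullParser`. This is only a sketch of the idea, not jabberd's actual parser: the function name `run_session` is invented, data is fed one tag-sized chunk at a time rather than one character at a time, and `flush()` is called only where the installed Python version provides it.

```python
import xml.etree.ElementTree as ET

STREAM_TAG = "{http://etherx.jabber.org/streams}stream"


def run_session(chunks):
    """Feed chunks to an incremental parser and log stream-level events."""
    parser = ET.XMLPullParser(events=("start", "end"))
    log = []
    depth = 0  # 1 = inside <stream:stream>, 2+ = inside a stanza
    for chunk in chunks:
        parser.feed(chunk)
        if hasattr(parser, "flush"):  # only present in newer Pythons
            parser.flush()
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
                if depth == 1 and elem.tag == STREAM_TAG:
                    log.append("stream opened")
            else:
                depth -= 1
                if depth == 1:
                    # A direct child of the stream just ended: one stanza.
                    log.append("stanza: " + elem.tag.rsplit("}", 1)[-1])
                elif depth == 0:
                    log.append("stream closed")  # the "hang up" event
                    return log
    return log


events = run_session([
    "<stream:stream to='localhost' xmlns='jabber:client' "
    "xmlns:stream='http://etherx.jabber.org/streams'>",
    "<presence/>",
    "</stream:stream>",
])
```

The parser never waits for a complete document; each stanza is reported as soon as its closing tag arrives, and the end of the root element is what signals that the session is over.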