Abstract Syntax Notation Using ASN.1
ASN.1 provides a means of defining and then encoding data elements. It is a machine-independent data description language with a defined set of keywords such as BOOLEAN, INTEGER, SEQUENCE, OCTET STRING, and so on. The Basic Encoding Rules (BER) are used to specify the actual representation of data sent from one network entity to another (often referred to as on-the-wire representation). Perhaps not surprisingly, ASN.1 is widely used in network management protocols, such as SNMP, in which data is routinely passed from machine to machine.
ASN.1 data objects are not dissimilar to programming-language data types such as int and char. The main difference is that ASN.1 has its own encoding scheme.
All ASN.1 values can be translated using the BER into a series of three simple items:
- Tag: What ASN.1 type is it?
- Length: How long is the data object?
- Value: What is the value of the data object?
Figure 1 illustrates the idea of tag, length, value, or TLV as it’s called.
Figure 1 A TLV (or tag length value) data entity
Figure 1 also illustrates two example encodings:
- One of an integer value of 266 (or hexadecimal 0x010A)
- A string value of "HELLO" (or hexadecimal 0x48454C4C4F)
Figure 1 shows that integers have a tag value of 2. The length is determined from the size of the integer (2 bytes in the case of 266), and the value is simply the integer value itself (0x010A in the case of 266).
Figure 1 also illustrates the encoding of a string value (technically called an OCTET STRING). The string is a friendly "HELLO", which is encoded in ASCII as hexadecimal: 0x48454C4C4F.
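The two encodings in Figure 1 can be reproduced with a few lines of C. This is only a minimal sketch: encode_tlv is a hypothetical helper (it is not part of the example program), and it assumes a single-byte tag and a value shorter than 128 bytes, so that the length fits in one byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes one TLV into out and returns the number
   of bytes written, or 0 if the value is too long for this sketch.
   Assumes a single-byte tag and a one-byte (short-form) length. */
size_t encode_tlv(unsigned char tag, const unsigned char *value,
                  size_t len, unsigned char *out)
{
    if (len > 127)
        return 0;                    /* longer values need a multi-byte length */

    out[0] = tag;                    /* Tag: which ASN.1 type is it? */
    out[1] = (unsigned char)len;     /* Length: how many value bytes follow? */
    memcpy(out + 2, value, len);     /* Value: the data itself */
    return 2 + len;
}
```

Called with tag 0x02 and the two bytes 0x01 0x0A, it writes 02 02 01 0A. Called with tag 0x04 and the five ASCII bytes of "HELLO", it writes 04 05 48 45 4C 4C 4F.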
Suppose that we want to put these two entities (integer and string) together and send them across the network as one item. This packaging of multiple ASN.1 entities can be done using another ASN.1 type called a SEQUENCE (tag value 0x30). This type also has a TLV structure, as illustrated in Figure 2.
Figure 2 Constructed TLV entity containing an integer and a string
Although Figure 2 looks complicated, we can break it down as shown in Table 1, where I use the word Contents in place of Value.
| Tag | Length | Contents | | |
| 0x30 | 0x0C | Tag | Length | Contents |
| | | 0x02 | 0x02 | 0x010A |
| | | Tag | Length | Contents |
| | | 0x04 | 0x05 | 0x48454C4C4F |
Table 1. Deciphering the Contents of the SEQUENCE TLV
After you realize that the SEQUENCE type is nested, you can see that its constituents are simply encoded in the value field.
So, I hope you agree that it’s pretty simple and straightforward! The individual SEQUENCE elements are just packed at the end of the SEQUENCE TLV, as shown in Figure 2.
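The packing step can be sketched in C as follows. The helper wrap_sequence is a hypothetical name introduced only for illustration; it assumes the child TLVs have already been encoded back to back and total fewer than 128 bytes, so that one length byte is enough.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: wraps already-encoded child TLVs in a SEQUENCE.
   Returns the number of bytes written, or 0 if the children are too
   long for the one-byte length assumed by this sketch. */
size_t wrap_sequence(const unsigned char *children, size_t children_len,
                     unsigned char *out)
{
    if (children_len > 127)
        return 0;

    out[0] = 0x30;                            /* SEQUENCE tag */
    out[1] = (unsigned char)children_len;     /* length = total size of the children */
    memcpy(out + 2, children, children_len);  /* children form the value field */
    return 2 + children_len;
}
```

For the integer and string above, the children total 4 + 7 = 11 bytes, so this sketch writes a length byte of 0x0B, one less than the 0x0C shown in Table 1.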
The nested nature of Figure 2 provides a clue about how a recursion mechanism can be used for decoding. In fact, you can probably begin to see how I’ll use recursion to decode these entities! Listing 1 gives a sneak preview of the ASN.1-encoded data that my example program will decode using recursion.
Listing 1 ASN.1 Data for Decoding
const char dataSet[] = {
    '\x30', '\x0C',                                          /* SEQUENCE tag and length */
    '\x02', '\x02', '\x01', '\x0A',                          /* INTEGER 266 */
    '\x04', '\x05', '\x48', '\x45', '\x4C', '\x4C', '\x4F',  /* OCTET STRING "HELLO" */
    '\0'                                                     /* terminating NUL */
};
If you look carefully at Listing 1, you’ll see that it matches the ASN.1 entities in Figure 2. I’m hard-coding the ASN.1 data here rather than reading it from a network interface. Note that the two nested TLVs occupy 4 + 7 = 11 bytes, so the SEQUENCE length of 0x0C (12) reaches one byte further, to the '\0' that ends the array. The normal case for ASN.1 processing is as illustrated in Figure 3.
Figure 3 ASN.1 Messages in a network
The network management system in Figure 3 sends and receives ASN.1-encoded messages to and from the network devices. Encoding and decoding of ASN.1 messages take place in all devices in Figure 3.
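On the receiving side, a device starts with nothing but a byte buffer and must recover the TLV structure from it. The following is a minimal sketch of such a walk, not the program developed in this article. The function name walk_tlv is hypothetical, and the sketch assumes single-byte tags, one-byte lengths, and that a zero tag byte marks padding rather than data. A tag with bit 0x20 set (such as 0x30) is treated as constructed, and its value field is walked recursively.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints every TLV in buf, indenting nested ones.
   Returns the number of TLVs visited, or -1 if the data is malformed
   or uses a feature this sketch does not handle. */
int walk_tlv(const unsigned char *buf, size_t len, int depth)
{
    size_t pos = 0;
    int count = 0;

    while (pos < len) {
        if (buf[pos] == 0x00)       /* assumption: a zero tag is padding */
            break;
        if (len - pos < 2)
            return -1;              /* not enough bytes for tag and length */

        unsigned char tag = buf[pos];
        size_t vlen = buf[pos + 1];
        if (vlen > 127 || vlen > len - pos - 2)
            return -1;              /* long-form length or value overruns buffer */

        printf("%*stag=0x%02X length=%u\n", depth * 2, "",
               (unsigned)tag, (unsigned)vlen);
        count++;

        if (tag & 0x20) {           /* constructed: the value holds more TLVs */
            int inner = walk_tlv(buf + pos + 2, vlen, depth + 1);
            if (inner < 0)
                return -1;
            count += inner;
        }
        pos += 2 + vlen;            /* step over this TLV to its sibling */
    }
    return count;
}
```

Given the 14 bytes of Listing 1 (cast to unsigned char), this visits three TLVs: the SEQUENCE and, one level down, the INTEGER and the OCTET STRING.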
That’s the theory over and done with! Now, we can take a look at some code. To start, we need some code that processes each of the individual TLV types of interest to us. Clearly, this is a much-reduced version of the ASN.1 message handling that would occur in an operational situation, but the principles remain the same.