TPM 1.2 Specification Device Interface
To reduce the myriad of TPM device interfaces and device drivers, TCG decided to standardize the device interface as part of its TPM 1.2 specification. This standard is called the TCG PC Client Specific TPM Interface Specification, or TIS for short.
As a result of this standard, firmware and operating system vendors need to implement only one device driver to support all the available TIS-compliant devices. It is exactly for this reason that Microsoft decided to support only TPM 1.2-compliant devices in its Windows Vista operating system, even though Vista currently uses only TPM 1.1b features.
The TIS defines two different kinds of device interfaces. The first is the legacy port I/O interface and the second is the memory mapped interface. Because the port I/O interface is similar to the one discussed in the previous section, we will only concentrate on the memory mapped interface in the next section.
Technical Details
A TPM 1.2-compliant device uses memory mapped I/O. It reserves a range of physical memory that the host operating system can map into its virtual address range. Instead of executing explicit I/O instructions to communicate with the device, the driver uses ordinary memory accesses.
The memory mapped interface for the TPM is shown in Figure 4.1. It consists of five 4KB pages that each present a nearly identical register set. Each page corresponds to a locality: commands that are sent to the TPM through the register set at address FED4.0000 are implicitly associated with locality 0. Similarly, commands sent using the register set at FED4.2000 use locality 2. The register sets are duplicated and aligned on page boundaries so that the host operating system can assign different localities to different levels of the system and regulate their access by controlling their virtual memory mappings. For example, a security kernel would typically only have access to the register set associated with locality 1, while the applications would be associated with the register set belonging to locality 3. Locality 0 is considered legacy and is used by systems that use the static root of trust measurements.
Figure 4.1 TPM memory map
Locality 4 is special in the sense that it is used by the processor to store the measurement of a secure loader that was started as part of the dynamic root of trust mechanism. The secure loader measurement is stored in PCR17 and can only be set by using special LPC bus cycles that cannot be generated by software. This ensures that the secure loader measurement originated from the processor itself.
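The page-per-locality layout reduces to simple address arithmetic. The following is a minimal sketch, assuming the physical base FED4.0000 and the five 4KB pages described above; the names TPM_BASE_PHYS and locality_phys_base are illustrative and do not come from the TIS.

```c
#include <stdint.h>

#define TPM_BASE_PHYS      0xFED40000u /* physical base from the text */
#define TPM_PAGE_SHIFT     12          /* one 4KB page per locality */
#define TPM_NUM_LOCALITIES 5           /* localities 0 through 4 */

/* Return the physical address of the register page for a locality,
 * or 0 if the locality number is out of range. */
uint32_t locality_phys_base(unsigned locality)
{
    if (locality >= TPM_NUM_LOCALITIES)
        return 0;
    return TPM_BASE_PHYS + ((uint32_t)locality << TPM_PAGE_SHIFT);
}
```

An operating system could map only the page returned for a given locality into the address space of the component that is allowed to use it.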
Unlike most devices that follow the TPM 1.1b specification, the TIS standard specifies that a TPM device should be able to generate interrupts. That is, whenever data becomes available or the device is ready for its next command, it generates an interrupt. However, sometimes interrupt-driven device drivers are inappropriate, especially during early bootstrap. In these cases, the TPM device can still be used in polled I/O mode.
An interesting aspect of the memory mapped interface is that multiple consumers can access the TPM device at the same time, leading to potential concurrency issues. For example, a trusted operating system and an application can concurrently issue commands to the TPM to update a particular PCR. To ensure the consistency of the internal TPM state, the TIS designers developed a locking protocol. Before a consumer at a certain locality can use the TPM, it needs to request access to the TPM by setting the RequestUse bit in the access register. If no locality is currently using the TPM, then access is granted. If the TPM is being used, then the access request remains pending until the owning locality relinquishes control. The highest-level locality is then given access first. If a locality has crashed or is malicious and does not relinquish control, a higher-level locality can take over the TPM by setting the Seize bit.
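The locking protocol can be illustrated with a small software model. This is only a sketch of the rules as stated in the paragraph above, not the TPM's actual arbitration logic; struct tpm_arbiter and the arbiter_* functions are hypothetical names, and callers are assumed to pass locality numbers in the range 0 to 4.

```c
#include <stdbool.h>

#define NUM_LOCALITIES 5
#define NO_LOCALITY    (-1)

struct tpm_arbiter {
    int  active;                  /* owning locality, or NO_LOCALITY */
    bool pending[NUM_LOCALITIES]; /* outstanding RequestUse bits */
};

void arbiter_init(struct tpm_arbiter *a)
{
    a->active = NO_LOCALITY;
    for (int i = 0; i < NUM_LOCALITIES; i++)
        a->pending[i] = false;
}

/* RequestUse: granted at once if the TPM is idle, otherwise left pending.
 * Returns true if locality l owns the TPM after the call. */
bool arbiter_request(struct tpm_arbiter *a, int l)
{
    if (a->active == NO_LOCALITY) {
        a->active = l;
        return true;
    }
    if (a->active != l)
        a->pending[l] = true;
    return a->active == l;
}

/* Relinquish: hand the TPM to the highest pending locality, if any. */
void arbiter_relinquish(struct tpm_arbiter *a, int l)
{
    if (a->active != l)
        return; /* only the owner can relinquish */
    a->active = NO_LOCALITY;
    for (int i = NUM_LOCALITIES - 1; i >= 0; i--) {
        if (a->pending[i]) {
            a->pending[i] = false;
            a->active = i;
            break;
        }
    }
}

/* Seize: a strictly higher locality takes the TPM from the current owner. */
bool arbiter_seize(struct tpm_arbiter *a, int l)
{
    if (a->active != NO_LOCALITY && l <= a->active)
        return false;
    a->pending[l] = false;
    a->active = l;
    return true;
}
```

In this model, if locality 0 owns the TPM while localities 1 and 3 are waiting, relinquishing locality 0 hands the device to locality 3, and locality 4 can still seize it afterward.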
In the next section, we will describe how to implement a function that is similar to the Tddli_TransmitData function. Because the TIS interface is a lot more complicated than the TPM 1.1b driver, we will focus on a simplified driver that uses the memory mapped interface and only uses the legacy locality 0. A driver with these restrictions uses only a core set of the TIS interface, which we will describe later in the chapter.
Among the set of registers the TPM provides are the access, status, and data FIFO registers. These registers reside at offsets 000, 018, and 024 (hexadecimal), respectively. The access register is used to gain access to a particular locality and is used by the locking protocol described previously. Numbering bits from 0 at the least significant end, writing bit 1 (mask 0x02) signals a request to use and writing bit 5 (mask 0x20) relinquishes control. When reading the access register, bit 5 signals that this locality is currently active.
The status register is an 8-bit register. When read, it indicates that the device is ready for a command (bit 6, mask 0x40), has a response available (bit 4, mask 0x10), or is expecting more data (bit 3, mask 0x08). The high-order bit of the status byte (bit 7, mask 0x80) indicates that the status is valid, and a driver that uses polled I/O can wait on this bit before testing for a particular status. When writing the status register, bit 5 (mask 0x20) indicates that the recently written command blob should be executed. The remaining bytes of the status word (bytes 1 and 2) form a burst count, which specifies the number of bytes that can be written or read consecutively and is essentially the size of the data FIFO. The TPM's data FIFO is available through the register at offset 024.
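To make the layout of the status word concrete, the following sketch decodes a 32-bit value into its flags and burst count. It assumes the word was obtained with a single little-endian 32-bit read, so the status byte is the low byte and the burst count occupies the next two bytes; the mask values match the status-bit definitions used by the driver code in this section, while struct tpm_status and decode_status are hypothetical names.

```c
#include <stdint.h>

struct tpm_status {
    int valid;          /* status byte is valid */
    int command_ready;  /* device is ready for a command */
    int data_avail;     /* a response is available */
    int data_expect;    /* device expects more command bytes */
    unsigned burst_count;
};

/* Split a 32-bit status word into its flags and burst count. */
struct tpm_status decode_status(uint32_t word)
{
    struct tpm_status s;
    uint8_t sts = (uint8_t)(word & 0xFF);

    s.valid         = (sts & 0x80) != 0;
    s.command_ready = (sts & 0x40) != 0;
    s.data_avail    = (sts & 0x10) != 0;
    s.data_expect   = (sts & 0x08) != 0;
    s.burst_count   = (word >> 8) & 0xFFFF; /* bytes 1 and 2 */
    return s;
}
```

For instance, the value 0x00004090 decodes to a valid status with data available and a burst count of 64 bytes.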
Some register sets, such as the set for locality 0, contain additional registers. An example of that is the device and vendor ID register at offset F00. This register is typically used to determine the presence of the TPM and, if one is present, the vendor. The assigned vendor IDs currently follow the PCI-Express vendor designations.
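The presence check based on this register can be sketched as below. The driver code in this section only inspects the low 16 bits as the vendor ID; treating the high 16 bits as the device ID is an assumption based on the register's name, and the function names are hypothetical. The value 0xFFFF is treated as "no device" on the assumption that reads from an unpopulated address return all ones.

```c
#include <stdint.h>
#include <stdbool.h>

/* A vendor field of all ones is taken to mean that nothing responded. */
bool tpm_present(uint32_t did_vid)
{
    return (did_vid & 0xFFFF) != 0xFFFF;
}

/* Vendor ID: low 16 bits of the register. */
uint16_t tpm_vendor_id(uint32_t did_vid)
{
    return (uint16_t)(did_vid & 0xFFFF);
}

/* Device ID: high 16 bits of the register (assumed layout). */
uint16_t tpm_device_id(uint32_t did_vid)
{
    return (uint16_t)(did_vid >> 16);
}
```

A driver would pass the 32-bit value read from offset F00 of locality 0 to these helpers.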
In the following code examples, we use the shorthand routines listed next to read and write memory locations. The address argument is relative to the base of the virtual memory range where physical address FED4.0000 is mapped, and the routines are only used to read and write the TPM register sets. They also include the necessary memory barrier operations to ensure coherency.
unsigned char read8(unsigned long addr);
void write8(unsigned char val, unsigned long addr);
unsigned long read32(unsigned long addr);
void write32(unsigned long val, unsigned long addr);
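One possible implementation of these routines is sketched below. It is not the original authors' code: tpm_base and tpm_set_base are hypothetical, the mapping of FED4.0000 into virtual memory is assumed to have been done elsewhere, and the C11 atomic_thread_fence call is only a portable stand-in for whatever I/O barrier the target platform requires.

```c
#include <stdint.h>
#include <stdatomic.h>

/* Virtual address at which physical FED4.0000 has been mapped. */
static volatile uint8_t *tpm_base;

/* Record the mapping; must be called before any register access. */
void tpm_set_base(volatile void *mapped)
{
    tpm_base = (volatile uint8_t *)mapped;
}

unsigned char read8(unsigned long addr)
{
    unsigned char val = tpm_base[addr];
    atomic_thread_fence(memory_order_seq_cst); /* order later accesses after the read */
    return val;
}

void write8(unsigned char val, unsigned long addr)
{
    atomic_thread_fence(memory_order_seq_cst); /* order earlier accesses before the write */
    tpm_base[addr] = val;
}

/* addr must be 4-byte aligned for the 32-bit accessors. */
unsigned long read32(unsigned long addr)
{
    uint32_t val = *(volatile uint32_t *)(tpm_base + addr);
    atomic_thread_fence(memory_order_seq_cst);
    return val;
}

void write32(unsigned long val, unsigned long addr)
{
    atomic_thread_fence(memory_order_seq_cst);
    *(volatile uint32_t *)(tpm_base + addr) = (uint32_t)val;
}
```

The volatile qualifier keeps the compiler from caching or eliding register accesses. For experimentation without hardware, tpm_set_base can be pointed at an ordinary aligned buffer.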
For convenience, we use the following definitions in the code to make it easier to follow:
/* macros to access registers at locality 'l' */
#define ACCESS(l)     (0x0000 | ((l) << 12))
#define STS(l)        (0x0018 | ((l) << 12))
#define DATA_FIFO(l)  (0x0024 | ((l) << 12))
#define DID_VID(l)    (0x0F00 | ((l) << 12))

/* access bits */
#define ACCESS_ACTIVE_LOCALITY      0x20 /* (R) */
#define ACCESS_RELINQUISH_LOCALITY  0x20 /* (W) */
#define ACCESS_REQUEST_USE          0x02 /* (W) */

/* status bits */
#define STS_VALID          0x80 /* (R) */
#define STS_COMMAND_READY  0x40 /* (R) */
#define STS_DATA_AVAIL     0x10 /* (R) */
#define STS_DATA_EXPECT    0x08 /* (R) */
#define STS_GO             0x20 /* (W) */
Device Programming Interface
Before we can use the TPM, we need to initialize the device. The following routine performs a trivial initialization and presence check:
int init(void)
{
    unsigned vendor;
    int i;

    /* force every locality to give up the TPM */
    for (i = 0; i < 5; i++)
        write8(ACCESS_RELINQUISH_LOCALITY, ACCESS(i));

    if (request_locality(0) < 0)
        return 0;

    /* a vendor ID of all ones means no TPM is present */
    vendor = read32(DID_VID(0));
    if ((vendor & 0xFFFF) == 0xFFFF)
        return 0;

    return 1;
}
The initialization routine is executed only once when the driver is instantiated. It first forces all localities to relinquish their access and then requests the use of the legacy locality 0. When access is granted, it checks for the presence of a valid vendor ID. When this check succeeds, the TPM is ready to accept commands. If you were writing a driver that uses interrupts, this would be the place where you would check for the supported interrupts and enable them.
int locality = 0; /* locality currently used by the driver */

int request_locality(int l)
{
    /* give up the current locality before asking for the new one */
    write8(ACCESS_RELINQUISH_LOCALITY, ACCESS(locality));
    write8(ACCESS_REQUEST_USE, ACCESS(l));

    /* wait for locality to be granted */
    if (read8(ACCESS(l)) & ACCESS_ACTIVE_LOCALITY)
        return locality = l;

    return -1;
}
To request a locality that is specified by parameter l, we relinquish the current locality (stored in the global variable locality) and request the use of the specified locality. For a TPM driver that only provides access to the legacy locality 0, this access should be granted immediately, with the locality active bit set in the access register. If this routine is used for the non-legacy localities, it would have to wait until the access is granted. This wait can be done either by polling the active bit or by waiting for the locality changed interrupt.
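The polling variant of the wait could look roughly like the following sketch. To keep it self-contained, the register read is passed in as a function pointer and exercised against a stand-in that reports the locality as active on the third read; wait_for_locality, read8_fn, and demo_read8 are hypothetical names, and the poll limit is an arbitrary choice rather than a value taken from the TIS.

```c
#define ACCESS_ACTIVE_LOCALITY 0x20

typedef unsigned char (*read8_fn)(unsigned long addr);

/* Poll the access register until the active bit is set.
 * Returns 0 on success, -1 if max_polls reads all came back inactive. */
int wait_for_locality(read8_fn rd, unsigned long access_addr, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (rd(access_addr) & ACCESS_ACTIVE_LOCALITY)
            return 0;
        /* a real driver would pause here between polls */
    }
    return -1;
}

/* Stand-in for hardware: the locality becomes active on the third read. */
static int demo_reads;

unsigned char demo_read8(unsigned long addr)
{
    (void)addr;
    return ++demo_reads >= 3 ? ACCESS_ACTIVE_LOCALITY : 0;
}
```

Bounding the number of polls keeps the driver from hanging forever if the owning locality never relinquishes the TPM.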
To send a command to the TPM, we use the following function:
int send(unsigned char *buf, int len)
{
    int status, burstcnt = 0;
    int count = 0;

    if (request_locality(locality) == -1)
        return -1;

    /* tell the TPM that a command is about to be sent */
    write8(STS_COMMAND_READY, STS(locality));

    while (count < len - 1) {
        burstcnt = read8(STS(locality) + 1);
        burstcnt += read8(STS(locality) + 2) << 8;

        if (burstcnt == 0) {
            delay(); /* wait for FIFO to drain */
        } else {
            for (; burstcnt > 0 && count < len - 1; burstcnt--) {
                write8(buf[count], DATA_FIFO(locality));
                count++;
            }

            /* check for overflow */
            for (status = 0; (status & STS_VALID) == 0; )
                status = read8(STS(locality));
            if ((status & STS_DATA_EXPECT) == 0)
                return -1;
        }
    }

    /* write last byte */
    write8(buf[count], DATA_FIFO(locality));

    /* make sure it stuck */
    for (status = 0; (status & STS_VALID) == 0; )
        status = read8(STS(locality));
    if ((status & STS_DATA_EXPECT) != 0)
        return -1;

    /* go and do it */
    write8(STS_GO, STS(locality));
    return len;
}
To send a command to the TPM, we first have to ensure that the appropriate locality has access to the device. This is especially important when the driver supports multiple localities with concurrent access. Once we get access to the requested locality, we tell the TPM that we are ready to send a command and then write the command into the data FIFO one burst-count-sized chunk at a time. After each chunk, we check the status register for an overflow condition to make sure the TPM kept up with us filling the FIFO. As an added safety measure, we write the last byte separately and check whether the TPM is no longer expecting more data bytes. If it is not, then the TPM is ready to execute the command blob we just transferred to it. This is done by setting the status to GO.
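As an illustration of what the buffer passed to send might contain, the sketch below fills in the fixed header of a command blob. This section only confirms a 2-byte tag followed by a 4-byte big-endian total size; the 4-byte ordinal that follows is an assumption taken from the general TPM 1.2 command format, and build_request_header and its helpers are hypothetical names. The tag and ordinal values themselves must come from the TPM 1.2 command specification.

```c
#include <stdint.h>
#include <stddef.h>

#define TPM_REQ_HEADER_LEN 10 /* tag (2) + paramsize (4) + ordinal (4) */

/* Store a 16-bit value in big-endian byte order. */
static void put_be16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)v;
}

/* Store a 32-bit value in big-endian byte order. */
static void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Write the command header into buf. body_len is the number of parameter
 * bytes that will follow the header. Returns the total command length,
 * or -1 if the command does not fit in cap bytes. */
int build_request_header(unsigned char *buf, size_t cap, uint16_t tag,
                         uint32_t ordinal, uint32_t body_len)
{
    uint32_t total = TPM_REQ_HEADER_LEN + body_len;

    if (total < body_len || total > cap || total > 0x7FFFFFFF)
        return -1; /* arithmetic overflow or buffer too small */

    put_be16(buf, tag);
    put_be32(buf + 2, total);
    put_be32(buf + 6, ordinal);
    return (int)total;
}
```

The returned length is the value a caller would pass as len to send, once the parameter bytes have been appended after the header.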
Receiving a response from the TPM is slightly more complicated, and its code is divided into two parts: a helper function to get the data from the TPM and a receiver function to reconstruct the response.
The helper function is listed next:
int recv_data(unsigned char *buf, int count)
{
    int size = 0, burstcnt = 0, status;

    status = read8(STS(locality));
    while ((status & (STS_DATA_AVAIL | STS_VALID))
               == (STS_DATA_AVAIL | STS_VALID)
           && size < count) {
        /* refresh the burst count once the previous burst is used up */
        if (burstcnt == 0) {
            burstcnt = read8(STS(locality) + 1);
            burstcnt += read8(STS(locality) + 2) << 8;
        }

        if (burstcnt == 0) {
            delay(); /* wait for the FIFO to fill */
        } else {
            for (; burstcnt > 0 && size < count; burstcnt--) {
                buf[size] = read8(DATA_FIFO(locality));
                size++;
            }
        }
        status = read8(STS(locality));
    }

    return size;
}
Receiving a response from the TPM follows the same principles as writing a command to it. As long as there is data available from the TPM and the buffer count has not yet been exhausted, we continue to read up to burst count bytes from the DATA FIFO.
It is up to the following receive function to reconstruct and validate the actual TPM response:
int recv(unsigned char *buf, int count)
{
    int expected, status;
    int size = 0;

    if (count < 6)
        return 0;

    /* ensure that there is data available */
    status = read8(STS(locality));
    if ((status & (STS_DATA_AVAIL | STS_VALID))
            != (STS_DATA_AVAIL | STS_VALID))
        return 0;

    /* read first 6 bytes, including tag and paramsize */
    if ((size = recv_data(buf, 6)) < 6)
        return -1;

    /* reject sizes smaller than the header or larger than the buffer */
    expected = be32_to_cpu(*(unsigned *) (buf + 2));
    if (expected < 6 || expected > count)
        return -1;

    /* read all data, except last byte */
    if ((size += recv_data(&buf[6], expected - 6 - 1)) < expected - 1)
        return -1;

    /* check for receive underflow */
    status = read8(STS(locality));
    if ((status & (STS_DATA_AVAIL | STS_VALID))
            != (STS_DATA_AVAIL | STS_VALID))
        return -1;

    /* read last byte */
    if ((size += recv_data(&buf[size], 1)) != expected)
        return -1;

    /* make sure we read everything */
    status = read8(STS(locality));
    if ((status & (STS_DATA_AVAIL | STS_VALID))
            == (STS_DATA_AVAIL | STS_VALID))
        return -1;

    write8(STS_COMMAND_READY, STS(locality));
    return expected;
}
The receive function is called either after a data available interrupt occurs or, when the driver uses polling, after the valid and data available status bits are both set. The first step for the receive function is to ensure that the TPM has a response that is ready to be read. It then reads the first 6 bytes of the response. This corresponds to the start of the TPM blob header and contains the length of the total response in big endian byte order. After converting the response size to native byte order and checking that it is plausible and fits in the caller's buffer, we read the rest of the response. As with the send function, we treat the last byte separately so that we can detect FIFO underflow conditions. Once the entire response is read, we make the TPM ready to accept its next command by setting the command ready bit in the status register.
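The size extraction can also be written without casting the buffer to an integer pointer, which avoids unaligned accesses and any dependence on a be32_to_cpu helper. The following sketch assumes only the layout described above (a 2-byte tag followed by a 4-byte big-endian total size); parse_response_size is a hypothetical name.

```c
#include <stdint.h>
#include <stddef.h>

#define TPM_HEADER_PREFIX 6 /* 2-byte tag + 4-byte paramsize */

/* Extract the total response size from the first bytes of a response.
 * have is the number of bytes already read, capacity the size of the
 * caller's buffer. Returns 0 and stores the size in *out on success,
 * or -1 if the header is incomplete or the size is implausible. */
int parse_response_size(const unsigned char *buf, size_t have,
                        size_t capacity, uint32_t *out)
{
    uint32_t size;

    if (have < TPM_HEADER_PREFIX)
        return -1;

    /* assemble the big-endian size byte by byte */
    size = ((uint32_t)buf[2] << 24) | ((uint32_t)buf[3] << 16) |
           ((uint32_t)buf[4] << 8)  |  (uint32_t)buf[5];

    if (size < TPM_HEADER_PREFIX || size > capacity)
        return -1;

    *out = size;
    return 0;
}
```

Given the six bytes 00 C4 00 00 00 0A and a 64-byte buffer, the function reports a total response size of 10 bytes.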