Scanning the Root Stream
Once you have located the pages that comprise the root stream, you can easily access the data stream directory. The root stream is composed of two main parts:
A variable-length header in the form of a PDB_ROOT structure, as outlined in Listing 3
An array of 16-bit zero-based page numbers
The wCount member of the PDB_ROOT structure states the number of stored data streams, and as such defines the number of PDB_STREAM items in the aStreams[] array. The PDB_ROOT__() macro at the end of Listing 3 is a convenient shorthand notation to calculate the overall size of the root stream header, given the number of data streams. The Windows 2000 PDB symbol files contain eight streams, so their header size is 4 + (8 * (4 + 4)) = 68 bytes.
Listing 3 PDB Root Stream Structure
typedef struct _PDB_ROOT
{
    WORD       wCount;      // < PDB_STREAM_MAX
    WORD       wReserved;   // 0
    PDB_STREAM aStreams []; // stream #0 reserved for stream table
}
PDB_ROOT, *PPDB_ROOT, **PPPDB_ROOT;

#define PDB_ROOT_ sizeof (PDB_ROOT)
#define PDB_ROOT__(_n) (PDB_ROOT_ + ((_n) * PDB_STREAM_))
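The header size arithmetic can be checked with a few lines of portable C. This is only a sketch: the names ROOT_FIXED_SIZE, STREAM_ENTRY_SIZE, root_header_size(), and root_stream_count() are hypothetical and not part of the listings. It assumes the on-disk layout implied by the calculation above, that is, a 4-byte fixed part followed by 8 bytes per stream, and little-endian storage of wCount.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed on-disk sizes, taken from the 4 + (n * (4 + 4)) calculation. */
enum { ROOT_FIXED_SIZE = 4, STREAM_ENTRY_SIZE = 8 };

/* Size of the root stream header for a given number of data streams.
   The page number array starts at this offset within the root stream. */
static size_t root_header_size(uint16_t stream_count)
{
    return ROOT_FIXED_SIZE + (size_t)stream_count * STREAM_ENTRY_SIZE;
}

/* Reads the stream count (wCount) from the first two bytes of the
   root stream, assuming little-endian byte order. */
static uint16_t root_stream_count(const uint8_t *root, size_t root_size)
{
    assert(root != NULL && root_size >= ROOT_FIXED_SIZE);
    return (uint16_t)(root[0] | (root[1] << 8));
}
```

For eight streams, root_header_size(8) yields 68, which is also the byte offset of the first page number in the root stream.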
The page number array follows immediately after the header. One problem with this design is that you cannot randomly access the data streams, because the root stream doesn't specify the data streams' start indices within the array. To locate the page numbers that are associated with the third stream, you have to compute the number of pages occupied by the first two streams from their stream sizes with respect to the current PDB page size. This is where the pwStreamPages member of the PDB_STREAM structure (see Listing 2) comes in handy: a PDB file processor needs to look up the page number subsets only once after loading the PDB data into virtual memory, and can then set up the pwStreamPages member of each stream to point to its first page number within the array. Note that this is just one of the possible uses for this structure member, and not necessarily the one that Microsoft designed it for.
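The one-time lookup described above amounts to a running sum of page counts. The following sketch illustrates the idea under stated assumptions: StreamPages, pages_for_size(), and locate_stream_pages() are hypothetical names, the structure is a simplified in-memory stand-in rather than the PDB_STREAM of Listing 2, and any special size values (such as a marker for unused streams) are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-stream bookkeeping; not the PDB_STREAM of Listing 2. */
typedef struct {
    uint32_t size;        /* stream size in bytes (input)              */
    size_t   first_index; /* index of the stream's first page number   */
    size_t   page_count;  /* number of page numbers used by the stream */
} StreamPages;

/* Number of pages needed to hold `size` bytes, rounded up. */
static size_t pages_for_size(uint32_t size, uint32_t page_size)
{
    assert(page_size != 0);
    return (size_t)((size + (uint64_t)page_size - 1) / page_size);
}

/* Walks the streams once, in order, and records where each stream's
   page numbers begin in the shared array. Returns the total number of
   page numbers consumed by all streams. */
static size_t locate_stream_pages(StreamPages *streams, size_t count,
                                  uint32_t page_size)
{
    size_t next = 0;
    for (size_t i = 0; i < count; i++) {
        streams[i].first_index = next;
        streams[i].page_count  = pages_for_size(streams[i].size, page_size);
        next += streams[i].page_count;
    }
    return next;
}
```

With a page size of 0x400 bytes and stream sizes of 0x500, 0x400, and 1 byte, the streams start at indices 0, 2, and 3 of the page number array, and four page numbers are used in total.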
The rest of the story is quickly told: read any data stream of interest just as you read the root stream before, that is, walk down the list of page numbers and locate each page by multiplying its page number by the page size. Isn't it strange? Although the PDB file format looks quite convoluted at first sight, it turns out on closer examination to be quite simple to process. All you need is a subroutine that takes an array of page numbers and concatenates all pages referenced by it. In the remaining sections, I will present a sample implementation of a PDB file processor that doesn't do much more than this.
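To make the idea concrete before turning to the full sample code, here is a minimal sketch of such a subroutine. It is not the implementation presented in the remaining sections: read_stream() and its parameters are hypothetical, and the sketch assumes that the whole PDB file has already been loaded into memory and that the last page of a stream may be only partially used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Concatenates the pages listed in `pages` into one contiguous buffer
   of `size` bytes. Returns a malloc'ed buffer that the caller must
   free, or NULL if the input is inconsistent or memory runs out. */
static void *read_stream(const uint8_t *file, size_t file_size,
                         uint32_t page_size,
                         const uint16_t *pages, size_t page_count,
                         uint32_t size)
{
    assert(file != NULL && page_size != 0);

    /* The page list must cover the whole stream. */
    if ((uint64_t)page_count * page_size < size)
        return NULL;

    uint8_t *out = malloc(size ? size : 1);
    if (out == NULL)
        return NULL;

    uint32_t done = 0;
    for (size_t i = 0; done < size; i++) {
        /* A page's file offset is its page number times the page size. */
        size_t   offset = (size_t)pages[i] * page_size;
        uint32_t left   = size - done;
        uint32_t chunk  = left < page_size ? left : page_size;

        /* Reject page numbers that point beyond the end of the file. */
        if (offset > file_size || chunk > file_size - offset) {
            free(out);
            return NULL;
        }
        memcpy(out + done, file + offset, chunk);
        done += chunk;
    }
    return out;
}
```

The same routine serves for the root stream and for every data stream; only the page number array and the stream size differ.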