- Introduction
- DOM Basics
- Using the MSXML DOM from C++
- A Simple XML Document and Its Schema
- Consuming XML
- Producing XML
- Conclusion
- For More Information
Using the MSXML DOM from C++
The first and perhaps most important thing you need to consider is that MSXML is implemented as a set of components built on Microsoft's Component Object Model (COM). Developers who have previously worked with COM from C++ will probably be fairly comfortable with how the DOM is implemented in MSXML. Those who haven't previously worked with COM may have a bit of a learning curve, but it shouldn't be too bad. There are really only a few basic COM elements to master.
There are a few ways to deal with COM objects, but for simplicity in coding I prefer to use the #import compiler directive and smart pointers. This is the style that's used in the two sample programs I present in this article. The DOM objects used in the programs are created as smart pointers within the scope of a try block in the main routine, so their destructors, which release the underlying COM interfaces, are called automatically when the try block is exited.
NOTE
It's a good idea to include a catch block to specifically catch COM exceptions, as DOM coding errors during development are likely to show up as COM exceptions.
When using the MSXML DOM implementation, we need only three general COM calls; everything else is specific to the MSXML DOM implementation. The COM library must be initialized with CoInitialize and released with CoUninitialize. In between, the smart pointer's CreateInstance method is used to create a DOM Document object, and that object drives nearly everything else related to the DOM.
Because MSXML is built on COM, it doesn't use C character arrays or the standard C++ string class for manipulating strings. It instead uses the COM BSTR binary string datatype, sometimes wrapped in a COM VARIANT. The two sample programs in this article show how strings in C character arrays are converted to and from these types.
Finally, the DOM specifies several set and get methods in addition to interface properties. Not all DOM implementations allow properties to be read or changed directly without calling a set or get method, but in many cases MSXML does. Where reading a property is more convenient than calling the get method, or where it is safe to assign to the property instead of calling the set method, you may prefer the direct property form.