Retrieving the Data
You have plenty of options to retrieve individual values from the XML-formatted data stored in your database:
- If you have very tight control over the actual XML markup, you could use simple string-matching functions.
- You can parse the XML string into a DOM object and use standard DOM functions or XPATH extensions to retrieve individual elements. This approach is most viable if you have to extract values from the database record and send them to an AJAX client in XML format.
- You can use XML extensions or XML support functions provided by your database. This option will be covered in an upcoming article.
Simple String Matching
As you know, the XML text values can never contain the less-than character (it has to be quoted); it’s perfectly safe to use string matching functions to extract values from simple XML markup (remember: no attributes, no multi-value fields, no hierarchical tags, no CDATA sections).
To extract the text enclosed in an XML tag, use these steps:
- Find the position of the start tag in the string. If there is no start tag, return NULL.
- Find the position of the end tag in the string. If there is no end tag, you were given invalid XML markup, report an error.
- Retrieve the text between the start tag and the end tag.
- De-quote the string.
A sample ASP function that extracts the text from an XML tag is shown in Listing 5. The de-quoting function is shown in Listing 6.
Listing 5 Extract the value from an XML-formatted string
Function ExtractTagFromXML(XML,Tag) Dim StartTag,EndTag,StartPos,EndPos If IsNull(XML) Then ExtractTagFromXML = NULL : Exit Function StartTag = "<" & Tag & ">" EndTag = "</" & Tag & ">" StartPos = InStr(XML,StartTag) If StartPos = 0 Then ExtractTagFromXML = NULL : Exit Function EndPos = InStr(XML,EndTag) If EndPos = 0 Then _ Err.Raise vbObjectError+1,"ExtractTagFromXML","Invalid XML Markup" ExtractTagFromXML = Dequote(Mid(XML,StartPos+Len(StartTag),_ EndPos-StartPos-Len(StartTag)) End Function
Listing 6 De-quote the XML text
Function Dequote(Txt) Txt = Replace(Txt,"<","<") Txt = Replace(Txt,">","<") Txt = Replace(Txt,""","""") Dequote = Txt End Function
Our code would be even simple if, following the advice given in the "Seven Steps to Improving Your VB Code" article by John Traenkenschuh, we would use regular expressions (see Listing 7).
Listing 7 Extract the tag value with a regular expression
Function Regexp_Extract(XML,Tag) Dim Reg,Match,Matches Set Reg = New RegExp Reg.Pattern = "<" & Tag & ">(.*)</" & Tag & ">" Set Matches = Reg.Execute(XML) For Each Match in Matches Regexp_Extract = Dequote(Match.Value) : Exit Function Next Regexp_Extract = Null End Function
Using DOM objects
If you have decided to use XML markup that is slightly more complex (attributes and multiple tags with the same name), you should parse the XML markup into a DOM tree and extract the values from the tree elements using DOM functions.
The simplistic approach to this problem is shown in Listing 8.
Listing 8 Simplistic text-extracting function
Function ExtractDOM(XML,Tag) Dim XDOM,XNode ExtractDOM = NULL Set XDom = Server.CreateObject("MSXML.DOMDocument") If Not XDom.LoadXML(XML) Then _ Err.Raise vbObjectError+1,"ExtractTagFromXML",XDom.parseError.reason Set XNode = XDom.selectSingleNode("//" & Tag) If Not XNode Is Nothing Then ExtractDOM = XNode.text End Function
The load that such a function would place on the server where it would execute is ghastly; every time a value is extracted from the XML string, a new DOM object would be created and the whole XML string parsed.
Obviously, the XML string has to be parsed only once—either you do it somewhere else and pass the DOM object to the extraction function or you deploy a simple caching scheme (see Listing 9).
The caching scheme could be implemented even better in a proper object-oriented language.
Listing 9 Optimized function with simple caching
Dim CachedDOM,CachedXML Function ExtractCachedDOM(XML,Tag) Dim XNode If Not IsObject(CachedDOM) Then _ Set CachedDOM = Server.CreateObject("MSXML.DOMDocument") If XML <> CachedXML Then If Not CachedDom.LoadXML(XML) Then _ Err.Raise vbObjectError+1,"ExtractTagFromXML",CachedDom.parseError.reason CachedXML = XML End If ExtractCachedDOM = NULL Set XNode = CachedDom.selectSingleNode("//" & Tag) If Not XNode Is Nothing Then ExtractCachedDOM = XNode.text End Function