Entity References
It is frequently convenient and often necessary to perform string substitution during the parsing of an XML file. The XML entity concept allows such substitution. We will encounter other entity types later, but the only references found in an XML file proper are general entity references. These are references that replace short symbols with legitimate XML character strings. They serve many purposes.
The simplest (and perhaps most common) case is the need to insert into a text stream characters that are meant literally as character data but that the parser would consider as markup. The two characters susceptible to such misinterpretation are the < character, which introduces a tag, and the & character, which introduces, in fact, an entity reference. These two characters can appear in an XML document only when they are serving their specific functions. XML defines entities that substitute for these two. In addition, it provides for a substitute for the > character to satisfy an obscure SGML requirement and for both single and double quote marks, for reasons of convenience that should be obvious.
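As a rough illustration of the five predefined entities, the following Python sketch escapes literal text so it can travel as character data, then lets a parser resolve the references again. Python and its standard xml modules are our choice for illustration only and have nothing to do with Flash; the sample string is invented.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Literal text containing characters the parser would treat as markup.
literal = 'if (x>min && x<max) print("it\'s ok");'

# escape() always replaces &, < and >; the extra mapping adds the two
# quote entities, which matter mainly inside attribute values.
escaped = escape(literal, {'"': "&quot;", "'": "&apos;"})

# Wrap the escaped text in an element and let the parser resolve it.
element = ET.fromstring("<code>" + escaped + "</code>")
resolved = element.text
```

The escaped form contains no bare < or & characters, and parsing it yields the original text again.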
Additionally, a similar escape sequence, the character reference, exists for any character. The character's code (its Unicode value, which coincides with ASCII for the basic Latin characters) can be expressed in decimal or hexadecimal form. This lets us embed characters that are difficult to produce on the keyboard or unreliable to display.
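To make the two numeric forms concrete, here is a minimal Python sketch showing that a decimal and a hexadecimal reference to the same code resolve to the same character. The helper char_ref is a hypothetical name introduced for this example, and the parser is Python's standard one, not Flash's.

```python
import xml.etree.ElementTree as ET

# 169 decimal and A9 hexadecimal are the same code: the copyright sign.
decimal_form = ET.fromstring("<c>&#169;2000</c>").text
hex_form = ET.fromstring("<c>&#xA9;2000</c>").text


def char_ref(ch, hexadecimal=False):
    """Build a numeric character reference for a single character."""
    code = ord(ch)
    return f"&#x{code:X};" if hexadecimal else f"&#{code};"
```

Calling char_ref on a character that is awkward to type produces a reference that can be pasted into the XML source.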
If the DTD (a preprocessor file we will encounter later) defines it, we can use our own notation, like &yen; to represent the ¥ symbol more clearly than &#165; does.
Of course, we are not limited to single characters. An entity can invoke a string of arbitrary length. An entity is a useful way to represent frequently used and lengthy text. It is even more valuable for representing volatile data. For example, an entity called &webmaster; might list the name and contact information for the person responsible for supporting an XML document. Data encoders enjoy the ability to represent this lengthy, frequently repeated data with a short, clear symbol. And if the webmaster position has high turnover, everyone will appreciate the entity's ability to manage volatility: the declaration changes in one place, and every reference follows.
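A short sketch of the &webmaster; idea follows. It assumes the entity is declared in an internal DTD subset at the top of the document (one of several places a declaration can live) and uses Python's standard parser, which expands such declarations. The contact details are invented.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE's internal subset declares the entity once;
# every &webmaster; reference in the body expands to that string.
document = """<!DOCTYPE page [
  <!ENTITY webmaster "Pat Example, pat@example.com">
]>
<page><footer>Contact: &webmaster;</footer></page>"""

root = ET.fromstring(document)
footer_text = root.find("footer").text
```

Changing the single ENTITY declaration updates every place the reference appears, which is the volatility benefit described above.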
Flash Context
Few of the abilities of the entity reference are actually realized in the version of Flash available at the time of this writing. The ActionScript parser is declared to be nonvalidating. A nonvalidating parser does not check the XML file against the declarations in the reference DTD file.
The complex capabilities of entity processing (the surface of which we have merely scratched) depend on the use of a DTD. So it is not surprising to find that user-defined entities such as &yen; or &webmaster; do not function as of the time of this writing.
The five predefined escape sequence entities work as expected, and so does the decimal encoding of single characters. The hexadecimal equivalent is not functional at the time of writing but might be fixed soon, as it seems to be the product of oversight, not architecture.
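One possible workaround, sketched here in Python as a preprocessing step outside Flash, is to rewrite hexadecimal references as their decimal equivalents before the file reaches the player. The function name hex_refs_to_decimal is hypothetical, and the sketch assumes the text contains no CDATA sections or comments in which such sequences should be left untouched.

```python
import re

# Matches a hexadecimal character reference such as &#xA9;
HEX_REF = re.compile(r"&#x([0-9A-Fa-f]+);")


def hex_refs_to_decimal(xml_text):
    """Rewrite each &#xhh; reference as the equivalent &#ddd; reference."""
    return HEX_REF.sub(lambda m: "&#%d;" % int(m.group(1), 16), xml_text)


converted = hex_refs_to_decimal("&#xA9;2000 Jacobson &amp; Jacobson")
```

Named references such as &amp; pass through unchanged, since only the #x form is matched.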
Syntax
The entity reference generally presumes a matching entity declaration in a referenced document: the DTD.
Alternatively, there are five predefined symbolic entities and a mechanism for specifying any character by its decimal or hexadecimal code. Each of these resolves to a single character:
XML
&#ddd;   decimal character code ddd
&#xhh;   hex character code hh
&amp;    &
&lt;     <
&gt;     >
&quot;   "
&apos;   '
Rules
Each entity reference must have
- an opening ampersand (&)
- a valid entity name or character code
- a closing semicolon (;)
It may have (between the ampersand and the semicolon)
- the # followed by a decimal number representing a character code
- the pair #x followed by a hexadecimal character code
- one of the five standard character entities: lt, gt, amp, quot, apos
- any token defined in the DTD
It may not have
- any other characters or sequences not valid in an XML name.
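The rules above can be approximated with a regular expression. The Python sketch below is deliberately simplified: it accepts only ASCII names, whereas real XML names allow a much larger character set, and it checks syntax only, not whether a named entity is actually declared. The names NAME, ENTITY_REF, and is_entity_reference are introduced for this example.

```python
import re

# Simplified, ASCII-only approximation of an XML name:
# a letter, underscore, or colon, then letters, digits, '.', '-', '_', ':'.
NAME = r"[A-Za-z_:][A-Za-z0-9_.:\-]*"

# Ampersand, then a decimal code, a hex code, or a name, then a semicolon.
ENTITY_REF = re.compile(r"&(?:#[0-9]+|#x[0-9A-Fa-f]+|" + NAME + r");")


def is_entity_reference(token):
    """Return True if token has the shape of a single entity reference."""
    return ENTITY_REF.fullmatch(token) is not None
```

A reference with a missing semicolon, an embedded space, or a name that starts with a digit fails the check.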
Examples of General Entity References
XML source | Resolves to |
if (x&gt;min &amp;&amp; x&lt;max) print(&quot;it&apos;s ok&quot;); | if (x>min && x<max) print("it's ok"); |
&#169;2000 Jacobson &amp; Jacobson | ©2000 Jacobson & Jacobson |
&copyrt;2000 Jacobson &amp; Jacobson | Same as above if the DTD defines copyrt as &#xA9; |
Bad Examples
|
|
&copyrt;2000 Jacobson &amp; Jacobson | If no definition of copyrt exists, this resolves to &copyrt;2000 Jacobson & Jacobson |
<ELEMENT attribute="value&quot;> | Entity references cannot be terminators. |
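For comparison, the sketch below feeds the two bad examples to Python's standard parser, which is stricter than the Flash behavior described above: it rejects an undefined entity outright instead of passing it through. The wrapper element names and the helper parses are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the text is accepted by the parser, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# An entity with no declaration anywhere in the document.
undefined_entity = parses("<note>&copyrt;2000 Jacobson &amp; Jacobson</note>")

# &quot; is data inside the attribute value, so the value is never closed.
entity_as_terminator = parses('<ELEMENT attribute="value&quot;></ELEMENT>')

# With a real closing quote, the same reference is perfectly legal.
well_formed = parses('<ELEMENT attribute="value&quot;"></ELEMENT>')
```

Only the third document is accepted; in it the attribute value ends with a literal quote character supplied by the reference.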