3.2 Headings
Brief highlighted text fragments that preface or summarize longer pieces of text are very common on web pages. A heading may apply to the entire page, a section within the page, or even a single sentence or link - but it must apply to something, for a heading only exists as a member of a "head and body" pair.
3.2.1 Element type names
Look up the number. HTML has long used the h1 to h6 element types for six levels of headings. You can borrow these names, or you can make them less cryptic by using head1, head2, and so on. In any case, this approach only works if you really need several levels of headings and if these levels are free of any additional semantics - that is, if you can more or less freely move a branch of your headings tree upward or downward in the hierarchy.
If this is not true - for example, if your third-level headings are reserved specifically for sidebars that cannot be promoted to second-level sections - then the number-based naming scheme is not a good idea at all. Imagine that one day you need to add sections inside a sidebar - this will look ugly if your sidebar headings are marked up, say, as h4, while sections are h2.
Ask my parent who I am. It is vastly more convenient to use some descriptive names, such as chapter-head, section-head, or sidebar-head. An even better approach is to take advantage of the "head and body" duality mentioned above. If you've defined different element types for the complete structural units (section, sidebar, etc.), then the single head element type can be used for headings at any level:
<section>
  <head>This is a section heading</head>
  ...
  <subsection>
    <head>And this is a subsection heading</head>
    ...
  </subsection>
</section>
This scheme is intuitive, easy to remember, and therefore easy to use. Even though there is only one heading element type, XSLT or Schematron will have no problem determining the role of each particular heading by checking its parent element. At the same time, implementing processing that is common to all headings is very straightforward with this approach.
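The parent check described above can be sketched outside of XSLT as well. The following is a minimal illustration in Python using only the standard library, not a prescribed implementation; the function name heading_roles and the idea of reporting a nesting depth alongside the role are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Sample source, following the section/subsection/head scheme from the text.
SOURCE = """
<section>
  <head>This is a section heading</head>
  <subsection>
    <head>And this is a subsection heading</head>
  </subsection>
</section>
"""

def heading_roles(root):
    """Return (role, depth, text) for every head element, in document order.

    The role of a heading is simply the tag name of its parent element;
    the depth counts how deeply that parent is nested (the root is 1).
    """
    results = []

    def walk(element, depth):
        for child in element:
            if child.tag == "head":
                results.append((element.tag, depth, (child.text or "").strip()))
            else:
                walk(child, depth + 1)

    walk(root, 1)
    return results

roles = heading_roles(ET.fromstring(SOURCE))
for role, depth, text in roles:
    print(f"{role} (level {depth}): {text}")
```

Because every heading is a head, processing shared by all headings needs only one test (child.tag == "head"), while role-specific processing can branch on the parent's name.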
XHTML 2.0 implements a similar scheme except that its element type for a heading, h, is always a child of a section (although sections can nest). This is understandable - XHTML cannot realistically cover all possible kinds of structural units that might require headings. On the other hand, this brings us back to an "anonymous" naming scheme that is only slightly better than the old h1...h6: Now you can easily move sections around with their headings, but still no useful semantics is attached to each heading. You can, however, use the class attribute to designate exactly what kind of a heading or section this is.
3.2.2 Attributes
The next question is, what is the auxiliary information to be stored with your headings? In most cases, the plain text of the heading itself is sufficient, but there are exceptions. For example, a heading usually has a unique (either within the page or, more usefully, within the entire site) id attribute used in cross-references or hyperlinks to this section from elsewhere.
In fact, a typical reference is supposed to refer to the section (or other structural unit) to which the heading belongs, not to the heading itself. Still, most authors prefer to use headings for linking, partially due to the HTML inertia (there are no sections in today's HTML) and partially because this allows them to more easily reuse the text of the heading in the textual part of the link.
For example, if your heading is marked up as
<head id="attrib">Attributes</head>
and you have a reference to it from somewhere, written as
...see <link to="attrib"/> for more on this.
this can be easily transformed into
...see 2.1, "Attributes" for more on this.
in plain text, or to
...see <a href="#attrib">2.1, Attributes</a> for more on this.
in HTML (here, "2.1" comes from an automatic count of preceding and ancestor sections). On the other hand, given that XSLT can easily traverse from a heading element to its parent, there's no real reason to use headings for linking in XSLT-based projects. The same link rendering could just as well be obtained from
<section id="attrib">
  <head>Attributes</head>
  ...
</section>
which looks less tautological and better reflects the fact that both the head and the id are properties of the section.
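As a rough sketch of how such a link could be resolved when the id lives on the section, the Python fragment below numbers sections by counting preceding siblings and ancestors, then renders the reference as plain text or as HTML. The page and p wrappers, the "intro" and "markup" sections, and the function names index_sections and render_link are assumptions introduced for illustration; in an actual project this logic would sit in the stylesheet.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical page: only the "attrib" section and the link element
# come from the text; everything else is invented for the example.
SOURCE = """
<page>
  <section id="intro">
    <head>Introduction</head>
    <p>...see <link to="attrib"/> for more on this.</p>
  </section>
  <section id="markup">
    <head>Markup</head>
    <section id="attrib">
      <head>Attributes</head>
    </section>
  </section>
</page>
"""

def index_sections(root):
    """Map each section id to (number, heading text).

    The number is built from the position of the section among its
    sibling sections, prefixed by the numbers of its ancestor sections.
    """
    index = {}

    def walk(parent, prefix):
        count = 0
        for child in parent:
            if child.tag != "section":
                continue
            count += 1
            number = f"{prefix}.{count}" if prefix else str(count)
            head = child.find("head")
            title = (head.text or "").strip() if head is not None else ""
            index[child.get("id")] = (number, title)
            walk(child, number)

    walk(root, "")
    return index

def render_link(target_id, index, as_html=False):
    """Render a reference to the section with the given id."""
    number, title = index[target_id]
    if as_html:
        return f'<a href="#{escape(target_id)}">{number}, {escape(title)}</a>'
    return f'{number}, "{title}"'

index = index_sections(ET.fromstring(SOURCE))
plain = render_link("attrib", index)
linked = render_link("attrib", index, as_html=True)
print(plain)
print(linked)
```

The head is looked up from its section rather than the other way around, which mirrors the point that both the heading text and the id are properties of the section.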
If your site's design requires it, you may store a reference to a graphic file for each heading (see 3.6 for a discussion of image references), but only if the correspondence between headings and images is not automatic. The image may be used, for example, as a background or an icon-like visual alongside the heading.
3.2.3 Children
The question of what children to allow within headings boils down to the question of how far beyond plain text you are willing to go. Would you need textual emphasis within headings? What about links? The laziest approach is to allow everything that is allowed within a paragraph of text - and it will work fine in most cases. Only if you think you may encounter problems with complex markup in headings and want to guard against them, might a different content model for headings be necessary.
Depending on your requirements, other children may be necessary for heading elements. For example, you may want to store the same heading in two or more languages, with the stylesheet selecting one of the languages for presentation depending on a global language parameter (see also 2.3.5).
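One possible shape for such a multilingual heading is sketched below in Python. The text child element, its lang attribute, the German wording, and the fallback behavior are all assumptions made for this example; the markup suggested in 2.3.5 may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one <text> child per language inside <head>.
SOURCE = """
<section id="attrib">
  <head>
    <text lang="en">Attributes</text>
    <text lang="de">Attribute</text>
  </head>
</section>
"""

def heading_text(head, lang, fallback="en"):
    """Pick the heading variant for lang, or the fallback language if missing."""
    variants = {
        variant.get("lang"): (variant.text or "").strip()
        for variant in head.findall("text")
    }
    if lang in variants:
        return variants[lang]
    return variants.get(fallback, "")

head = ET.fromstring(SOURCE).find("head")
german = heading_text(head, "de")   # language stored in the source
french = heading_text(head, "fr")   # not stored, so the fallback is used
print(german, french)
```

Here the lang argument plays the role of the global language parameter; falling back to a default language is a design choice of this sketch, and a stricter setup might report the missing translation instead.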
You may also want to keep both full and abridged versions of a heading. For example, newspapers often use a specific abbreviated English syntax for their headlines (as in U.S. Patriot Act attacked as threat to freedom), but for the purposes of automatic indexing and natural language processing, the fully grammatical version of the same heading might be required (The U.S. Patriot Act was attacked as a threat to freedom).
3.2.4 Web page title
A special kind of a heading specific to HTML documents is the title of a page, normally displayed in the title bar of a web browser window as well as in bookmarks or search results listing this page. Even though, as a general rule, your target vocabulary must not influence your semantic source vocabulary, you should plan ahead as to what source element(s) will be transformed into the web page title.
If each of your pages has a visible on-page heading that applies to the entire page, it is natural to duplicate it as a web page title. Otherwise, it is always a good idea to provide a heading for any sufficiently large information unit. Even if in your target rendition this heading will
Multistage titles. A web page title is often used for orientation within the site. A sequence of parent sections' headings, culminating in the name of the entire site, may be appended or prepended to the current page's title (e.g., "Foobar Corporation - Products - Foobar Plus"). Such a hierarchical title may be informative and useful, especially with deep site trees (even if the same information is duplicated on the page itself). Of course, it is the stylesheet that builds such a compound title, while the XML source of each page only provides that page's unique part of the title.
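A compound title of this kind could be assembled roughly as follows. This Python sketch assumes a single tree in which the site, its sections, and its pages each carry a head child; the site and page element names, the id values, and the function compound_title are hypothetical. In a real project the hierarchy would more likely come from a separate site-wide document, since each page source provides only its own part of the title.

```python
import xml.etree.ElementTree as ET

# Hypothetical site tree; the three headings are those of the example title.
SITE = """
<site>
  <head>Foobar Corporation</head>
  <section id="products">
    <head>Products</head>
    <page id="plus">
      <head>Foobar Plus</head>
    </page>
  </section>
</site>
"""

def compound_title(root, page_id, separator=" - ", page_first=False):
    """Join the headings of all ancestors of a page with the page's own heading."""

    def search(element, trail):
        head = element.find("head")
        if head is not None:
            trail = trail + [(head.text or "").strip()]
        if element.get("id") == page_id:
            return trail
        for child in element:
            found = search(child, trail)
            if found is not None:
                return found
        return None

    trail = search(root, [])
    if trail is None:
        raise KeyError(page_id)
    if page_first:
        trail = trail[::-1]
    return separator.join(trail)

site = ET.fromstring(SITE)
title = compound_title(site, "plus")
reversed_title = compound_title(site, "plus", page_first=True)
print(title)
print(reversed_title)
```

The page_first flag covers the choice between appending and prepending the parent headings; which order works better depends on how titles are truncated in bookmarks and search results.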