Subjects and Data: The Subject-Centric View
The notion of "shoe-ness" has already been mentioned as a notion that is eternal but ineffable, while any given shoe is ephemeral but concrete. As Plato might have pointed out, only our minds can sense shoe-ness, and only directly; we cannot sense shoe-ness with any of our five physical senses, even though we can certainly sense a given shoe in a variety of ways. We can be aware of shoe-ness, even the shoe-ness of a particular shoe, only with our minds. For Plato, shoe-ness exists in a plane of existence that is somehow more exalted, perhaps because it is more permanent than anything our five senses can sense. Plato's idea that there is a plane of existence that is accessible only by our minds is exploited by the topic maps paradigm in order to make data resources federable without endless layers of metadata upon metadata.
The topic maps paradigm recognizes that everything and anything can be a subject of conversation, and that every subject of conversation can be a hub around which data resources can orbit. Unlike the resource-centric view in which metadata orbits data resources, in the subject-centric view, data orbits subjects. If the subject itself happens to be a data resource, the orbiting data can, of course, be called metadata. But one of the essential lessons of the topic maps paradigm is that all data is data about subjects, but only some subjects are themselves data; most subjects are not information resources. When the problem of global knowledge interchange is approached with this subject-centric attitude, the solution becomes much simpler and easier. Indeed, for many people, and particularly for the people who have used it the most, the topic maps paradigm passes the most convincing test of all: the solution, once finally found, is obvious.
There is one problem: computers cannot access subjects unless those subjects happen to be information resources themselves. A computer cannot access the Statue of Liberty, for example, or love, or hot chocolate, or shoe-ness. There is no computer-processable pointer to any of these things. As a practical matter, there is no human-processable pointer to these things either: people can't wave their hands and produce these things out of thin air. However, people have another gift that makes it unnecessary to produce concrete things in order to discuss them: the ability to communicate symbolically, to understand each other on the basis of symbols. It's an everyday miracle that I can say to you the words, "Statue of Liberty," and you will immediately know I'm talking about a certain large greenish statue of a woman, designed by Frédéric Auguste Bartholdi around an internal framework engineered by Gustave Eiffel, that is situated on Liberty Island in New York Harbor, with a smaller replica located in Paris, France. There is very little chance that you will misunderstand me (although it's possible that I could be referring to a certain unconventional pattern of play in American football).
If you've followed this discussion so far, you're ready to understand some imagery that was pivotal in the development of the topic maps paradigm. Imagine a chasm with two high cliffs, one on the left side of the chasm and one on the right. There is no physical bridge across the chasm. On the left-hand cliff is the universe of symbols and expressions. All written, pictorial, and other symbolic expressions exist on the left-hand cliff. On the right-hand cliff is the world of subjects of conversation. (The conversations themselves, since they are in the universe of symbolic expressions, are found only on the left-hand cliff.) On the right-hand cliff we find love, the Statue of Liberty, shoe-ness, the smell of hot chocolate, Minnie Mouse's high-heeled shoes, and every other thing that is or can ever be symbolized by the expressions found on the left-hand cliff: every actual and possible topic of conversation, without exception.
The first thing to realize about this imagery is that, while there is no bridge across the chasm, crossing it is the everyday miracle that our brains accomplish whenever we successfully understand any symbolic expression. We sense certain symbols, and somehow we intuit the corresponding thing on the right-hand cliff. Human intuition (the human brain, if you like) is the only transportation facility that can cross the chasm. This means that it must be true that it's possible for symbols to represent reality or, at least, that we constantly assume that symbols represent reality. (As engineers, we are compelled to admit that the fact that everybody assumes that it's true is good enough to get the job done.) As in the case of monetary information, for example, the validity of that assumption is what the high priests at the Federal Reserve Bank are supposed to ensure. Actually, civilization itself rests entirely on the unprovable assumption that information has some bearing on reality, so maybe we can afford to take a chance on it.
The second thing to realize about this imagery is that all data and all metadata are entirely on the left-hand cliff. The left-hand cliff has some reality, too, because information (expressions) does indeed exist. Wondrous to say, there is no "missing bridge to reality" problem on the left-hand cliff. When a subject happens to be an information resource, even an inanimate computing device can take us where we want to go by understanding and executing the symbols (Web addresses, for example) that uniquely identify that information resource. Indeed, history seems to show that the ease of accessing such addressable subjects (information resources) has in fact seduced us into thinking that only resources (symbolic expressions that can be addressed by computers) can be the hubs around which data can be organized.
And here is where the topic maps paradigm performs a bit of chicanery. Computers can't directly address the Statue of Liberty, for example, but they can address information about the Statue of Liberty. More to the point, they can address an information resource that serves as a surrogate for the Statue of Liberty. Since we're stuck with the limitations of computers (and the underlying limitations of symbolic expressions), the key is to allow anyone and everyone to establish conventions for such surrogates, according to their own needs and convenience, whereby arbitrary subjects can be uniquely represented by specific addressable information resources. The topic maps paradigm accomplishes this trick by taking the position that a certain specific kind of reference to an information resource must be interpreted not as a reference to that resource but rather as a reference to whatever subject of conversation is indicated by that information resource, when that information resource is perceived and understood by a properly qualified human being. In some sense, then, the topic maps paradigm lets the computer take a virtual journey across the chasm by riding on human perception and intuition. The referenced resource becomes more than a resource: it becomes a symbolic surrogate, on the left-hand cliff, for something on the right-hand cliff, on the other side of the chasm, where only human intuition can reach.
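To make the surrogate idea a little more concrete, here is a minimal sketch in Python. It is not the syntax or data model of any topic maps standard; the names `Ref`, `Topic`, `federate`, and the `as_surrogate` flag, as well as every `example.org` address, are hypothetical and chosen only for illustration. The sketch assumes that the "certain specific kind of reference" can be modeled as a flag on the reference: the same address, read one way, means the resource itself, and read the other way, means whatever subject that resource indicates to a human reader.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ref:
    """A reference to an addressable information resource (left-hand cliff)."""
    address: str        # something a computer can resolve, e.g. a Web address
    as_surrogate: bool  # True: read as "the subject this resource indicates"
                        # False: read as "this resource itself"


@dataclass
class Topic:
    """A hub for one subject; data resources 'orbit' it as occurrences."""
    identity: Ref
    occurrences: set = field(default_factory=set)


def federate(*topic_maps):
    """Merge topics from several maps when they share the same identity.

    Two topics are treated as being about the same subject only if they use
    the same address AND the same kind of reference (an assumption of this
    sketch, not a rule quoted from the text).
    """
    merged = {}
    for topic_map in topic_maps:
        for topic in topic_map:
            hub = merged.setdefault(topic.identity, Topic(topic.identity))
            hub.occurrences |= topic.occurrences
    return merged


# Hypothetical page that a human reader would recognize as describing the statue.
LIBERTY_PAGE = "https://example.org/subjects/statue-of-liberty"

statue = Ref(LIBERTY_PAGE, as_surrogate=True)  # the statue, across the chasm
page = Ref(LIBERTY_PAGE, as_surrogate=False)   # the page as a document

# Two independently authored maps that agree on the surrogate for the statue.
map_a = [Topic(statue, {"https://example.org/a/visiting-hours"})]
map_b = [
    Topic(statue, {"https://example.org/b/construction-history"}),
    Topic(page, {"https://example.org/b/page-changelog"}),
]

merged = federate(map_a, map_b)
```

In this sketch the two maps federate around the statue without either author writing metadata about the other's data: both simply point at the same surrogate. The topic whose subject is the page itself stays separate, even though it uses the same address, because the kind of reference differs.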