Information Integration and Distribution
As already discussed, an organization will have many applications and other kinds of IT systems. Each of these systems hosts its own information processes, information services, and information collections.
We use the INFORMATION NODE pattern to represent the general concept of a system. You may wish to think of this as a physical computer, or server. However, with the increasing use of virtual systems and cloud provisioning, the notion of physical hardware being tied to a particular system is becoming less common. So an information node is simply an identifiable “system” that the organization runs.
The information node is the lead pattern in a large pattern group that describes different types of systems. The application is represented by a pattern from the group called APPLICATION NODE. The STAGING AREA pattern is also in the same group.
The information node provides an execution environment for the information processes, information services, and information collections. Calls between these components can occur entirely within the information node. However, it is also possible for information processes to access information from different information nodes. This capability is provided by a specialist pattern within the information service pattern group called REMOTE INFORMATION SERVICE.
Figure 1.8 illustrates this mechanism. The remote information service uses an INFORMATION REQUEST pattern to retrieve information from an information collection located in another information node. The information request pattern consists of two message flows: one from the remote information service to the information node that hosts the information to request the information, and another flowing in the opposite direction to return the requested information.
Figure 1.8. Accessing information from a different information node.
The information node that receives the request for information routes it to an appropriate information service to extract the information and return a response. In Figure 1.8, this is shown as a LOCAL INFORMATION SERVICE—that is, one that uses information collections from the same information node—but it could be another remote information service.
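The request and response exchange described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the class names mirror the pattern names, but the methods (`handle`, `retrieve`, `local_information_service`), the `available` flag, and the sample data are hypothetical and are not defined by the patterns themselves.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InformationRequest:
    """First message flow: asks a node for one entry in a collection."""
    collection: str
    key: str


@dataclass
class InformationResponse:
    """Second message flow: carries the requested information back."""
    key: str
    value: Optional[dict]


@dataclass
class InformationNode:
    """An identifiable system that hosts information collections."""
    name: str
    collections: dict = field(default_factory=dict)  # collection -> {key: entry}
    available: bool = True  # assumption: models whether the node is running

    def local_information_service(self, collection: str, key: str) -> Optional[dict]:
        # A local service only reads collections hosted on this node.
        return self.collections.get(collection, {}).get(key)

    def handle(self, request: InformationRequest) -> InformationResponse:
        # Route the incoming request to a local information service.
        if not self.available:
            raise ConnectionError(f"{self.name} is not available")
        value = self.local_information_service(request.collection, request.key)
        return InformationResponse(request.key, value)


class RemoteInformationService:
    """Runs on the calling node and hides the request/response exchange."""

    def __init__(self, target: InformationNode):
        self.target = target

    def retrieve(self, collection: str, key: str) -> Optional[dict]:
        response = self.target.handle(InformationRequest(collection, key))
        return response.value


# Hypothetical usage: one node hosts collection A, another node reads from it.
node_2 = InformationNode("node-2", {"A": {"cust-1": {"name": "Ada"}}})
service = RemoteInformationService(node_2)
result = service.retrieve("A", "cust-1")
```

In this sketch the caller receives the entry only while the called node is available; if `node_2.available` is set to `False`, `retrieve` raises an error, which reflects the on-demand nature of the information request.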
The information request pattern retrieves information from its original location on demand. This means both the calling and the called information node must be available at the same time. When information must be copied from one information collection to another (for example, for performance or availability reasons), the INFORMATION FLOW pattern is used instead. This introduces another kind of information node called an INFORMATION BROKER that calls remote information services to extract information from one or more information collections, transform it, and store it in other information collections. The effect is that information flows between the information nodes in what we call an information supply chain.
Figure 1.9 illustrates this flow of information. The numbers on the diagram refer to these notes:
1. Here, an information process calls a remote information service to retrieve information from information collection A. Under the covers, the remote information service uses an information request to contact the information node where information collection A is located.
2. The information process then works with some information users to update the information and store the results in information collection B.
3. Information collection B stores the information as a new entry.
4. An information broker now starts an information process to extract the information from information collection B, transform it, and save it as a new entry in information collection C.
5. Another information process starts to retrieve the information from information collection C.
6. This process may make changes to the information and update it in information collection C.
Figure 1.9. Flowing information between systems.
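The flow in the notes above can be traced with a small Python sketch. It assumes in-memory dictionaries as stand-ins for information collections A, B, and C; the helper names (`retrieve`, `user_update`, `broker_transform`), the sample order entry, and the specific transformation are all hypothetical and chosen only to make the copies visibly diverge.

```python
import copy

# Hypothetical in-memory stand-ins for information collections A, B, and C.
collection_a = {"order-1": {"item": "widget", "qty": "2"}}
collection_b = {}
collection_c = {}


def retrieve(collection, key):
    """Stand-in for an information service; returns an independent copy."""
    return copy.deepcopy(collection[key])


def user_update(entry):
    """Stand-in for information users enriching the entry."""
    entry["approved"] = True
    return entry


def broker_transform(entry):
    """Transformation applied by the information broker (assumed rules)."""
    return {
        "item": entry["item"].upper(),
        "quantity": int(entry["qty"]),
        "approved": entry["approved"],
    }


# Notes 1-3: retrieve from A, update with users, store as a new entry in B.
collection_b["order-1"] = user_update(retrieve(collection_a, "order-1"))

# Note 4: the broker extracts from B, transforms, and saves a new entry in C.
collection_c["order-1"] = broker_transform(retrieve(collection_b, "order-1"))

# Notes 5-6: another process retrieves from C, changes it, and updates C.
local_entry = retrieve(collection_c, "order-1")
local_entry["quantity"] += 1
collection_c["order-1"] = local_entry
```

After these steps the three collections each hold a different version of the same order: A has the original, B has an enriched copy, and C has a reformatted copy with a local change.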
This example illustrates how multiple copies of information are created—and also how these copies quickly become slightly different from one another. The differences could be as follows:
- Superficial—Such as a reformatting
- Enriching—Where additional information is added to the original information
- Localized—Where updates made are only relevant to the location where they are made
- Managed—Where the best source of information (called the authoritative source) is well known at all times
- Conflicting—Where it is hard to know which information collection is the best to use or retrieve the latest information from as changes are coming in to each of the copies in an unpredictable way
A well-defined information supply chain should avoid having information collections with conflicting differences in them. We aim to minimize the number of copies. Where copies are made, each should have a clear purpose and guidelines on when it should be used. Copies should be synchronized when updates are made, and where differences are unavoidable, there should be at least one copy that is known to have all of the latest information in it.
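One simple way to picture this guidance is a registry that knows which copy is the authoritative source and pushes each update out to the others. The sketch below is an assumption-laden illustration, not a pattern from the text: the `ManagedCopies` class, its method names, and the choice to overwrite every copy on synchronization are all hypothetical.

```python
class ManagedCopies:
    """Tracks copies of one piece of information and its authoritative source."""

    def __init__(self, authoritative, copies):
        if authoritative not in copies:
            raise ValueError("authoritative source must be one of the copies")
        self.authoritative = authoritative
        self.copies = copies  # collection name -> current value

    def update(self, value):
        # Apply the change to the authoritative source first, then synchronize.
        self.copies[self.authoritative] = value
        self.synchronize()

    def synchronize(self):
        # Overwrite every copy with the authoritative value (assumed policy).
        latest = self.copies[self.authoritative]
        for name in self.copies:
            self.copies[name] = latest

    def differing(self):
        # Names of copies that currently differ from the authoritative source.
        latest = self.copies[self.authoritative]
        return sorted(n for n, v in self.copies.items() if v != latest)


# Hypothetical usage: collection B is authoritative, collection C is a copy.
managed = ManagedCopies("B", {"B": "v1", "C": "v1"})
managed.copies["C"] = "v1-local-edit"  # an unsynchronized local change
stale = managed.differing()
managed.update("v2")
```

Because one copy is always known to hold the latest information, a local difference such as the edit to C stays a managed difference rather than a conflicting one, and the next synchronized update removes it.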
The INFORMATION SUPPLY CHAIN pattern is the lead pattern in a pattern group that describes different patterns of information movement between the information collections and how to synchronize the information to avoid conflicting differences. Designing information supply chains is a key challenge for both information architects and solution architects.