- The Business Case for a New Design Process
- Improving the Development Process
- Overview of Data Integration Modeling
- Conceptual Data Integration Models
- Logical Data Integration Models
- Physical Data Integration Models
- Tools for Developing Data Integration Models
- Industry-Based Data Integration Models
- Summary
- End-of-Chapter Questions
Overview of Data Integration Modeling
Data integration modeling is a technique that takes into account the types of models needed based on the types of architectural requirements for data integration and the types of models needed based on the Systems Development Life Cycle (SDLC).
Modeling to the Data Integration Architecture
The types of process models or data integration models are dependent on the types of processing needed in the data integration reference architecture. By using the reference architecture as a framework, we are able to create specific process model types for the discrete data integration processes and landing zones, as demonstrated in Figure 3.2.
Figure 3.2 Designing models to the architecture
Together, these discrete data integration layers become process model types that form a complete data integration process. The objective is to develop a technique that will lead the designer to model data integration processes based on a common set of process types.
Data Integration Models within the SDLC
Data integration models follow the same level of requirement and design abstraction refinement that occurs within data models during the SDLC. Just as there are conceptual, logical, and physical data models, there are conceptual, logical, and physical data integration requirements that need to be captured at different points in the SDLC, which could be represented in a process model.
The following are brief descriptions of each of the model types. A more thorough definition along with roles, steps, and model examples is reviewed later in the chapter.
- Conceptual data integration model definition—Produces an implementation-free representation of the data integration requirements for the proposed system that will serve as a basis for determining how they are to be satisfied.
- Logical data integration model definition—Produces a detailed representation of the data integration requirements at the data set (entity/table)level, which details the transformation rules and target logical data sets (entity/tables). These models are still considered to be technology-independent. The focus at the logical level is on the capture of actual source tables and proposed target stores.
- Physical data integration model definition—Produces a detailed representation of the data integration specifications at the component level. They should be represented in terms of the component-based approach and be able to represent how the data will optimally flow through the data integration environment in the selected development technology.
Structuring Models on the Reference Architecture
Structuring data models to a Systems Development Life Cycle is a relatively easy process. There is usually only one logical model for a conceptual data model and there is only one physical data model for a logical data model. Even though entities may be decomposed or normalized within a model, there is rarely a need to break a data model into separate models.
Process models have traditionally been decomposed further down into separate discrete functions. For example, in Figure 3.3, the data flow diagram's top process is the context diagram, which is further decomposed into separate functional models.
Figure 3.3 A traditional process model: data flow diagram
Data integration models are decomposed into functional models as well, based on the data integration reference architecture and the phase of the Systems Development Life Cycle.
Figure 3.4 portrays how conceptual, logical, and physical data integration models are broken down.
Figure 3.4 Data integration models by the Systems Development Life Cycle