Tools for Developing Data Integration Models
One of the first questions about data integration models is, "What do you build them in?" Although diagramming tools such as Microsoft Visio® and even Microsoft PowerPoint® can be used (as they are for the figures throughout the book), we advocate using one of the commercial data integration packages to design and build data integration models.
Models drawn in diagramming tools such as Visio must be created and maintained by hand to keep them in sync with source code and Excel spreadsheets, and that maintenance overhead often outweighs the benefit of the manually created models. In a data integration package, by contrast, existing data integration designs (e.g., an extract data integration model) can be reviewed for potential reuse in other data integration models, and when a design is reused, updating the model also maintains the actual data integration job. In addition, by creating data integration models in a package such as Ab Initio, IBM DataStage®, or Informatica, an organization gets more value from the technology in which it has already invested.
Figure 3.16 provides examples of high-level logical data integration models built in Ab Initio, IBM DataStage, and Informatica.
Figure 3.16 Data integration models by technology
Experience with data integration packages for data integration modeling has shown that data integration projects and Centers of Excellence benefit from greater standardization and higher quality of extract, transform, and load code. Key benefits of using a data integration package include the following:
- End-to-end communications: Using a data integration package speeds the transfer of requirements from a data integration designer to a data integration developer, because both work from the same data integration metadata. Moving from a logical design to a physical design with the same metadata in the same package shortens the handoff and reduces transfer issues and errors. For example, source-to-target data definitions and mapping rules do not have to be transferred between technologies, which reduces mapping errors. The same benefit has been found in data modeling tools that transition from logical data models to physical data models.
- Development of leverageable enterprise models: Capturing data integration requirements as logical and physical data integration models gives an organization the opportunity to combine them into enterprise data integration models, which further matures the Information Management environment and increases overall reuse. It also makes it possible to reuse the source extracts, target data loads, and common transformations held in the data integration software package's metadata engine. Because these physical data integration jobs are stored in the same metadata engine, they can be linked to each other and to other existing metadata objects such as logical data models and business functions.
- Capture of navigational metadata earlier in the process: Storing logical and physical data integration model metadata in a data integration software package lets an organization perform a more thorough impact analysis of a single source or target job. Capturing source-to-target mapping metadata along with transformation requirements earlier in the process also increases the probability of catching mapping errors in unit and system testing. In addition, because the package captures the metadata automatically, the metadata is more likely to be captured and managed.
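The impact analysis mentioned in the last benefit can be illustrated with a minimal Python sketch. It is not the API of Ab Initio, DataStage, or Informatica; the `Job` record, the `impacted_jobs` function, and the job and table names are all hypothetical, and the sketch simply assumes that a metadata repository records which objects each job reads and writes.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Job:
    """Hypothetical record of one data integration job's navigational metadata."""
    name: str
    sources: Tuple[str, ...]  # objects the job reads
    targets: Tuple[str, ...]  # objects the job writes


def impacted_jobs(jobs: Iterable[Job], changed: str) -> List[str]:
    """Return the jobs that read `changed` directly or through upstream jobs.

    Jobs are listed in breadth-first order, nearest to the change first.
    """
    # Index the jobs by the objects they read.
    readers = defaultdict(list)
    for job in jobs:
        for source in job.sources:
            readers[source].append(job)

    impacted: List[str] = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for job in readers.get(current, ()):
            if job.name not in impacted:
                impacted.append(job.name)
            # Anything this job writes may in turn feed other jobs.
            for target in job.targets:
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
    return impacted


# Illustrative metadata only; these jobs and tables are invented for the example.
JOBS = [
    Job("extract_customer", ("crm.customer",), ("stage.customer",)),
    Job("extract_orders", ("erp.orders",), ("stage.orders",)),
    Job("conform_customer", ("stage.customer",), ("clean.customer",)),
    Job("load_sales_mart", ("clean.customer", "stage.orders"), ("mart.sales",)),
]

if __name__ == "__main__":
    print(impacted_jobs(JOBS, "crm.customer"))
    # ['extract_customer', 'conform_customer', 'load_sales_mart']
```

Because every job's sources and targets sit in one place, a change to a single source can be traced through the extract, the common transformation, and the load in one query, which is much harder when the mappings live in separate diagrams and spreadsheets.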