UDM: Linking Relational and Multidimensional Databases
IN THIS CHAPTER
- Unified Dimensional Model (UDM)
The Unified Dimensional Model (UDM) of Microsoft SQL Server Analysis Services 2005 makes it possible for you to set up your system so that different types of client applications can access data from both the relational and the multidimensional databases in your data warehouse, without using separate models for each.
It's been a common industry practice for some time now to build data warehouses that include a relational database for storing data and a multidimensional database for analyzing data. This practice developed because the large volumes of data that multidimensional databases were developed to analyze are typically stored in relational databases. The data would be moved to the multidimensional database for analysis, but relational database would continue to serve as primary storage.
Thus, it makes sense that the interaction between the stored data and the multidimensional database where it can be analyzed has been an important component of multidimensional database architecture. Our goal for Analysis Services 2005, put simply, is speedy analysis of the most up-to-date data possible.
The speedy and up-to-date parts are what present the challenge. The data in OLTP systems is constantly being updated. But we wouldn't want to pour data directly from an OLTP system into a multidimensional database because OLTP data is easily polluted by incomplete transactions or incomplete data entered in a transaction. In addition, you don't want your analysis engine to access the OLTP data directly because that could disrupt work and reduce productivity.
In a data warehouse, OLTP data is typically transformed and stored in a relational database and then loaded into a multidimensional database for analysis. To connect the two databases, you can choose from three methods, each one employing a different kind of interaction:
- Relational systems (ROLAP), in which no data is stored directly in the multidimensional database. It is loaded from the relational database when it is needed.
- Multidimensional systems (MOLAP), in which data is loaded into the multidimensional database and cached there. Future queries are run against the cached data.
- Hybrid systems (HOLAP), in which the aggregated data is cached in the multidimensional database. When the need arises for more detailed information, that data is loaded from the relational database.
In earlier versions of Analysis Services, the multidimensional part of the data warehouse was a passive consumer of data from the relational database, restricted to whatever structures the relational database contained. The functions of storing data and analyzing data were not only separate, but you had to understand two models—one for accessing a relational database and one for accessing a multidimensional database.
Some client applications would use one model and others would use the other model. For example, reporting applications traditionally would access the data in a relational database. On the other hand, an analysis application that has to look at the data in many different ways would probably access the data in the multidimensional database, which is designed specifically for that sort of use.
Now, the UDM offers a substantially redefined structure and architecture so that the one model (UDM) serves the purposes of any client application. You no longer have to understand two models; we're providing a unified model. You can see, in Figure 3.1, how many different client applications can use UDM to access data in a variety of different data stores.
Figure 3.1 The UDM provides a unified model for accessing and loading data from varied data sources.
Analysis Services uses proactive caching to ensure that the user of the client application is always working with predictable data latency. In essence, proactive caching is a mechanism by which the user can schedule switching from one connection mode (ROLAP, MOLAP, or HOLAP) to another. For example, the user might set his system to switch from MOLAP to ROLAP if the data in the MOLAP system is older than, say, four hours.
UDM owes its flexibility to the expanded role of dimension attributes in Analysis Services 2005. (For more information about dimension attributes, see Chapter 6, "Dimensions in the Conceptual Model.") Because of this expanded role of attributes in dimensions, you can base your UDM on the logic of your business functions instead of the logic of your database. With this flexibility, the UDM produces data that is considerably more informative than the relational model—and more understandable to the user.
With UDM at the center of the multidimensional model, you no longer need to have different methods of data access for different data sources. Before UDM, every system had a number of specialized data stores, each one containing data that was stored there for a limited number of users. Each of these data sources would likely require specific methods of data access for loading data into the multidimensional model. With Analysis Services 2005 all the data of the enterprise is available through the UDM, even if those data sources are located on different types of hardware running different operating systems or different database systems. OLAP now serves as an intermediate system to guarantee effective access to the data.
More detailed information about UDM is spread across the chapters of Part II, "Creating Multidimensional Models"; Part IV, "Creating a Data Warehouse"; Part V, "Bringing Data into Analysis Services"; and Part VI, "Analysis Services Architecture."
Summary
UDM makes it possible to set up a system so that different types of client applications can access data from both the relational and the multidimensional databases. With UDM at the center of the multidimensional model, you no longer need to have different methods of data access for different data sources. UDM offers a substantially redefined structure and architecture so that one model (UDM) serves the purposes of any client application. The expanded role of dimension attributes in Analysis Services 2005 makes possible the flexibility of UDM.