- SQL Server Reference Guide
- Introduction
- SQL Server Reference Guide Overview
- Table of Contents
- Microsoft SQL Server Defined
- SQL Server Editions
- SQL Server Access
- Informit Articles and Sample Chapters
- Online Resources
- Microsoft SQL Server Features
- SQL Server Books Online
- Clustering Services
- Data Transformation Services (DTS) Overview
- Replication Services
- Database Mirroring
- Natural Language Processing (NLP)
- Analysis Services
- Microsot SQL Server Reporting Services
- XML Overview
- Notification Services for the DBA
- Full-Text Search
- SQL Server 2005 - Service Broker
- Using SQL Server as a Web Service
- SQL Server Encryption Options Overview
- SQL Server 2008 Overview
- SQL Server 2008 R2 Overview
- SQL Azure
- The Utility Control Point and Data Application Component, Part 1
- The Utility Control Point and Data Application Component, Part 2
- Microsoft SQL Server Administration
- The DBA Survival Guide: The 10 Minute SQL Server Overview
- Preparing (or Tuning) a Windows System for SQL Server, Part 1
- Preparing (or Tuning) a Windows System for SQL Server, Part 2
- Installing SQL Server
- Upgrading SQL Server
- SQL Server 2000 Management Tools
- SQL Server 2005 Management Tools
- SQL Server 2008 Management Tools
- SQL Azure Tools
- Automating Tasks with SQL Server Agent
- Run Operating System Commands in SQL Agent using PowerShell
- Automating Tasks Without SQL Server Agent
- Storage – SQL Server I/O
- Service Packs, Hotfixes and Cumulative Upgrades
- Tracking SQL Server Information with Error and Event Logs
- Change Management
- SQL Server Metadata, Part One
- SQL Server Meta-Data, Part Two
- Monitoring - SQL Server 2005 Dynamic Views and Functions
- Monitoring - Performance Monitor
- Unattended Performance Monitoring for SQL Server
- Monitoring - User-Defined Performance Counters
- Monitoring: SQL Server Activity Monitor
- SQL Server Instances
- DBCC Commands
- SQL Server and Mail
- Database Maintenance Checklist
- The Maintenance Wizard: SQL Server 2000 and Earlier
- The Maintenance Wizard: SQL Server 2005 (SP2) and Later
- The Web Assistant Wizard
- Creating Web Pages from SQL Server
- SQL Server Security
- Securing the SQL Server Platform, Part 1
- Securing the SQL Server Platform, Part 2
- SQL Server Security: Users and other Principals
- SQL Server Security – Roles
- SQL Server Security: Objects (Securables)
- Security: Using the Command Line
- SQL Server Security - Encrypting Connections
- SQL Server Security: Encrypting Data
- SQL Server Security Audit
- High Availability - SQL Server Clustering
- SQL Server Configuration, Part 1
- SQL Server Configuration, Part 2
- Database Configuration Options
- 32- vs 64-bit Computing for SQL Server
- SQL Server and Memory
- Performance Tuning: Introduction to Indexes
- Statistical Indexes
- Backup and Recovery
- Backup and Recovery Examples, Part One
- Backup and Recovery Examples, Part Two: Transferring Databases to Another System (Even Without Backups)
- SQL Profiler - Reverse Engineering An Application
- SQL Trace
- SQL Server Alerts
- Files and Filegroups
- Partitioning
- Full-Text Indexes
- Read-Only Data
- SQL Server Locks
- Monitoring Locking and Deadlocking
- Controlling Locks in SQL Server
- SQL Server Policy-Based Management, Part One
- SQL Server Policy-Based Management, Part Two
- SQL Server Policy-Based Management, Part Three
- Microsoft SQL Server Programming
- An Outline for Development
- Database
- Database Services
- Database Objects: Databases
- Database Objects: Tables
- Database Objects: Table Relationships
- Database Objects: Keys
- Database Objects: Constraints
- Database Objects: Data Types
- Database Objects: Views
- Database Objects: Stored Procedures
- Database Objects: Indexes
- Database Objects: User Defined Functions
- Database Objects: Triggers
- Database Design: Requirements, Entities, and Attributes
- Business Process Model Notation (BPMN) and the Data Professional
- Business Questions for Database Design, Part One
- Business Questions for Database Design, Part Two
- Database Design: Finalizing Requirements and Defining Relationships
- Database Design: Creating an Entity Relationship Diagram
- Database Design: The Logical ERD
- Database Design: Adjusting The Model
- Database Design: Normalizing the Model
- Creating The Physical Model
- Database Design: Changing Attributes to Columns
- Database Design: Creating The Physical Database
- Database Design Example: Curriculum Vitae
- NULLs
- The SQL Server Sample Databases
- The SQL Server Sample Databases: pubs
- The SQL Server Sample Databases: NorthWind
- The SQL Server Sample Databases: AdventureWorks
- The SQL Server Sample Databases: Adventureworks Derivatives
- UniversalDB: The Demo and Testing Database, Part 1
- UniversalDB: The Demo and Testing Database, Part 2
- UniversalDB: The Demo and Testing Database, Part 3
- UniversalDB: The Demo and Testing Database, Part 4
- Getting Started with Transact-SQL
- Transact-SQL: Data Definition Language (DDL) Basics
- Transact-SQL: Limiting Results
- Transact-SQL: More Operators
- Transact-SQL: Ordering and Aggregating Data
- Transact-SQL: Subqueries
- Transact-SQL: Joins
- Transact-SQL: Complex Joins - Building a View with Multiple JOINs
- Transact-SQL: Inserts, Updates, and Deletes
- An Introduction to the CLR in SQL Server 2005
- Design Elements Part 1: Programming Flow Overview, Code Format and Commenting your Code
- Design Elements Part 2: Controlling SQL's Scope
- Design Elements Part 3: Error Handling
- Design Elements Part 4: Variables
- Design Elements Part 5: Where Does The Code Live?
- Design Elements Part 6: Math Operators and Functions
- Design Elements Part 7: Statistical Functions
- Design Elements Part 8: Summarization Statistical Algorithms
- Design Elements Part 9:Representing Data with Statistical Algorithms
- Design Elements Part 10: Interpreting the Data—Regression
- Design Elements Part 11: String Manipulation
- Design Elements Part 12: Loops
- Design Elements Part 13: Recursion
- Design Elements Part 14: Arrays
- Design Elements Part 15: Event-Driven Programming Vs. Scheduled Processes
- Design Elements Part 16: Event-Driven Programming
- Design Elements Part 17: Program Flow
- Forming Queries Part 1: Design
- Forming Queries Part 2: Query Basics
- Forming Queries Part 3: Query Optimization
- Forming Queries Part 4: SET Options
- Forming Queries Part 5: Table Optimization Hints
- Using SQL Server Templates
- Transact-SQL Unit Testing
- Index Tuning Wizard
- Unicode and SQL Server
- SQL Server Development Tools
- The SQL Server Transact-SQL Debugger
- The Transact-SQL Debugger, Part 2
- Basic Troubleshooting for Transact-SQL Code
- An Introduction to Spatial Data in SQL Server 2008
- Performance Tuning
- Performance Tuning SQL Server: Tools and Processes
- Performance Tuning SQL Server: Tools Overview
- Creating a Performance Tuning Audit - Defining Components
- Creating a Performance Tuning Audit - Evaluation Part One
- Creating a Performance Tuning Audit - Evaluation Part Two
- Creating a Performance Tuning Audit - Interpretation
- Creating a Performance Tuning Audit - Developing an Action Plan
- Understanding SQL Server Query Plans
- Performance Tuning: Implementing Indexes
- Performance Monitoring Tools: Windows 2008 (and Higher) Server Utilities, Part 1
- Performance Monitoring Tools: Windows 2008 (and Higher) Server Utilities, Part 2
- Performance Monitoring Tools: Windows System Monitor
- Performance Monitoring Tools: Logging with System Monitor
- Performance Monitoring Tools: User Defined Counters
- General Transact-SQL (T-SQL) Performance Tuning, Part 1
- General Transact-SQL (T-SQL) Performance Tuning, Part 2
- General Transact-SQL (T-SQL) Performance Tuning, Part 3
- Performance Monitoring Tools: An Introduction to SQL Profiler
- Performance Tuning: Introduction to Indexes
- Performance Monitoring Tools: SQL Server 2000 Index Tuning Wizard
- Performance Monitoring Tools: SQL Server 2005 Database Tuning Advisor
- Performance Monitoring Tools: SQL Server Management Studio Reports
- Performance Monitoring Tools: SQL Server 2008 Activity Monitor
- The SQL Server 2008 Management Data Warehouse and Data Collector
- Performance Monitoring Tools: Evaluating Wait States with PowerShell and Excel
- Practical Applications
- Choosing the Back End
- The DBA's Toolbox, Part 1
- The DBA's Toolbox, Part 2
- Scripting Solutions for SQL Server
- Building a SQL Server Lab
- Using Graphics Files with SQL Server
- Enterprise Resource Planning
- Customer Relationship Management (CRM)
- Building a Reporting Data Server
- Building a Database Documenter, Part 1
- Building a Database Documenter, Part 2
- Data Management Objects
- Data Management Objects: The Server Object
- Data Management Objects: Server Object Methods
- Data Management Objects: Collections and the Database Object
- Data Management Objects: Database Information
- Data Management Objects: Database Control
- Data Management Objects: Database Maintenance
- Data Management Objects: Logging the Process
- Data Management Objects: Running SQL Statements
- Data Management Objects: Multiple Row Returns
- Data Management Objects: Other Database Objects
- Data Management Objects: Security
- Data Management Objects: Scripting
- Powershell and SQL Server - Overview
- PowerShell and SQL Server - Objects and Providers
- Powershell and SQL Server - A Script Framework
- Powershell and SQL Server - Logging the Process
- Powershell and SQL Server - Reading a Control File
- Powershell and SQL Server - SQL Server Access
- Powershell and SQL Server - Web Pages from a SQL Query
- Powershell and SQL Server - Scrubbing the Event Logs
- SQL Server 2008 PowerShell Provider
- SQL Server I/O: Importing and Exporting Data
- SQL Server I/O: XML in Database Terms
- SQL Server I/O: Creating XML Output
- SQL Server I/O: Reading XML Documents
- SQL Server I/O: Using XML Control Mechanisms
- SQL Server I/O: Creating Hierarchies
- SQL Server I/O: Using HTTP with SQL Server XML
- SQL Server I/O: Using HTTP with SQL Server XML Templates
- SQL Server I/O: Remote Queries
- SQL Server I/O: Working with Text Files
- Using Microsoft SQL Server on Handheld Devices
- Front-Ends 101: Microsoft Access
- Comparing Two SQL Server Databases
- English Query - Part 1
- English Query - Part 2
- English Query - Part 3
- English Query - Part 4
- English Query - Part 5
- RSS Feeds from SQL Server
- Using SQL Server Agent to Monitor Backups
- Reporting Services - Creating a Maintenance Report
- SQL Server Chargeback Strategies, Part 1
- SQL Server Chargeback Strategies, Part 2
- SQL Server Replication Example
- Creating a Master Agent and Alert Server
- The SQL Server Central Management System: Definition
- The SQL Server Central Management System: Base Tables
- The SQL Server Central Management System: Execution of Server Information (Part 1)
- The SQL Server Central Management System: Execution of Server Information (Part 2)
- The SQL Server Central Management System: Collecting Performance Metrics
- The SQL Server Central Management System: Centralizing Agent Jobs, Events and Scripts
- The SQL Server Central Management System: Reporting the Data and Project Summary
- Time Tracking for SQL Server Operations
- Migrating Departmental Data Stores to SQL Server
- Migrating Departmental Data Stores to SQL Server: Model the System
- Migrating Departmental Data Stores to SQL Server: Model the System, Continued
- Migrating Departmental Data Stores to SQL Server: Decide on the Destination
- Migrating Departmental Data Stores to SQL Server: Design the ETL
- Migrating Departmental Data Stores to SQL Server: Design the ETL, Continued
- Migrating Departmental Data Stores to SQL Server: Attach the Front End, Test, and Monitor
- Tracking SQL Server Timed Events, Part 1
- Tracking SQL Server Timed Events, Part 2
- Patterns and Practices for the Data Professional
- Managing Vendor Databases
- Consolidation Options
- Connecting to a SQL Azure Database from Microsoft Access
- SharePoint 2007 and SQL Server, Part One
- SharePoint 2007 and SQL Server, Part Two
- SharePoint 2007 and SQL Server, Part Three
- Querying Multiple Data Sources from a Single Location (Distributed Queries)
- Importing and Exporting Data for SQL Azure
- Working on Distributed Teams
- Professional Development
- Becoming a DBA
- Certification
- DBA Levels
- Becoming a Data Professional
- SQL Server Professional Development Plan, Part 1
- SQL Server Professional Development Plan, Part 2
- SQL Server Professional Development Plan, Part 3
- Evaluating Technical Options
- System Sizing
- Creating a Disaster Recovery Plan
- Anatomy of a Disaster (Response Plan)
- Database Troubleshooting
- Conducting an Effective Code Review
- Developing an Exit Strategy
- Data Retention Strategy
- Keeping Your DBA/Developer Job in Troubled Times
- The SQL Server Runbook
- Creating and Maintaining a SQL Server Configuration History, Part 1
- Creating and Maintaining a SQL Server Configuration History, Part 2
- Creating an Application Profile, Part 1
- Creating an Application Profile, Part 2
- How to Attend a Technical Conference
- Tips for Maximizing Your IT Budget This Year
- The Importance of Blue-Sky Planning
- Application Architecture Assessments
- Transact-SQL Code Reviews, Part One
- Transact-SQL Code Reviews, Part Two
- Cloud Computing (Distributed Computing) Paradigms
- NoSQL for the SQL Server Professional, Part One
- NoSQL for the SQL Server Professional, Part Two
- Object-Role Modeling (ORM) for the Database Professional
- Business Intelligence
- BI Explained
- Developing a Data Dictionary
- BI Security
- Gathering BI Requirements
- Source System Extracts and Transforms
- ETL Mechanisms
- Business Intelligence Landscapes
- Business Intelligence Layouts and the Build or Buy Decision
- A Single Version of the Truth
- The Operational Data Store (ODS)
- Data Marts – Combining and Transforming Data
- Designing Data Elements
- The Enterprise Data Warehouse — Aggregations and the Star Schema
- On-Line Analytical Processing (OLAP)
- Data Mining
- Key Performance Indicators
- BI Presentation - Client Tools
- BI Presentation - Portals
- Implementing ETL - Introduction to SQL Server 2005 Integration Services
- Building a Business Intelligence Solution, Part 1
- Building a Business Intelligence Solution, Part 2
- Building a Business Intelligence Solution, Part 3
- Tips and Troubleshooting
- SQL Server and Microsoft Excel Integration
- Tips for the SQL Server Tools: SQL Server 2000
- Tips for the SQL Server Tools – SQL Server 2005
- Transaction Log Troubles
- SQL Server Connection Problems
- Orphaned Database Users
- Additional Resources
- Tools and Downloads
- Utilities (Free)
- Tool Review (Free): DBDesignerFork
- Aqua Data Studio
- Microsoft SQL Server Best Practices Analyzer
- Utilities (Cost)
- Quest Software's TOAD for SQL Server
- Quest Software's Spotlight on SQL Server
- SQL Server on Microsoft's Virtual PC
- Red Gate SQL Bundle
- Microsoft's Visio for Database Folks
- Quest Capacity Manager
- SQL Server Help
- Visual Studio Team Edition for Database Professionals
- Microsoft Assessment and Planning Solution Accelerator
- Aggregating Server Data from the MAPS Tool
In this series on Business Intelligence (BI) landscapes, I've explained what BI is, some information to help you develop a security plan, and how to create your requirements. In the last tutorial I explained how to prepare your source systems to deliver data. And we haven't even used any software yet!
I say that half-humorously, but it does show that creating a Business Intelligence infrastructure requires an enormous amount of planning. In the current "do it now" workplace, planning isn't often held in high esteem. But I can promise you this: if you don't thoroughly plan your BI implementation, it will fail. Oh, everything will load, process and create reports, but the information won't be correct. Guaranteed. The reason for this sad fact is that IT simply can't beat the solution out of the technology – Business Information (even when used in science or other disciplines) is information, not data.
Think of a bank. Banks will keep your money, tell you how much you have, move it into other accounts for you, or even distribute it as you say. But they can't tell you what that means for your financial situation. The bank doesn't call you and say "you're not spending enough on Buck's database books this month!"
That's just like IT. Our part of the organization stores the organization's Microsoft Excel spreadsheets, moves them around, backs them up, secures them, and even allows users to add and delete them. But we don't know what they mean, or how the user will interpret them. So the only planning we normally do involves the mechanics of storing, moving and securing the organization's data. The users know what they want to do with the data, so they assign the meaning to it. They make data into information in their heads.
Not so with the BI system. In the BI system, IT assigns meaning to the data. We have to do this because we will extract data from various source systems and then transform the data prior to loading it into a series of larger systems. All this transforming and loading is to present the data in a coherent way, which involves knowing what the data means. You'll see what I mean by the end of this article.
To move the data from the source systems (see the last tutorial in this series) through to the end reports, there is a variety of mechanisms to choose from.
File Store
The first choice is to extract the data using source system tools into files. The main advantage here is that there is nothing to buy. Most enterprise software packages have mechanisms to pump out data from their databases using the included platform tools. If they don't, you can use the export tools built into every commercial database engine on the market.
Another advantage in exporting data from the source system to a set of files is that you don't have to ask for permission to access a source system to get them. Many administrators face licensing costs for creating a new login, and would rather give you what you want rather than let you come in and get it. They're also leery of letting a foreign system into their carefully balanced software.
Still another advantage is that you have a lot of control over how and when the data comes out of the system. You can "trickle" the data out frequently, or wait and "batch" the whole process at a time when the system isn't as busy.
Using a file store isn't perfect, however. If you use this method you should prepare for the inevitable issues when the export program on the source system doesn't run. Do you stop all imports until the problem on the source system is fixed, or do you continue importing from the other systems and let the data "catch up"? What if the source system that failed holds the parts information, and the next import to run is the purchase orders that depend on the parts information in the first place?
Recall that in the last tutorial I mentioned that you'll want to get an "on call" list for the administrators of the source systems. You'll need that in case the exports don't work. It's no good finding out that the lady you dealt with when you designed the extract system is on vacation when the source extract fails.
Also, leaving files laying around in an FTP site or some other share comes with its own set of problems, not to mention the complexity of the various steps of moving the data out, storing it, picking it up, and importing it. Remember, the fewer steps you have, the easier it is to comprehend the system, diagnose problems and correct them.
But there are times when a file store is a great solution. If you decide to use a file store as your extract method, make sure that you keep the files as generic as possible. They should be a "straight throw" from the original tables or views in the source system. Resist the urge to put intelligence for the combination of data at this level of the process. This is because you want experts in the source system to be able to help you when changes in their systems occur. You don't want to have to educate them on all of the changes you made to the extract process. Leave that for your side of the equation. Just get the data, agree on a format, and ensure that you have a good method to handle interruptions in the data flow.
That often means naming the files (or rows in tables, if you're extracting data into a client-side database) with a unique identifier, usually involving a date/time stamp. This will keep the extracts from stepping on one another if they do fail and then catch up.
Make sure that the staging area where you receive the files is reliable and secure. Document that security, and the timeline that the files will be ready for pickup in a meta-data tool or document.
ETL Tool
Some BI systems have built-in extraction tools. You can also purchase tools that can reach into a source system natively and move the data to your system.
The advantage to this kind of software is that they've built the intelligence for dealing with changes and failures right into the functionality it provides. This kind of software often uses a simple point-and-click paradigm that is easy to understand, explain and document.
Another advantage is that these tools normally have a good meta-data store built right in. This allows you to enter lots of information about the data structure, how it is changed in the process, what stage it is in at a given moment, and more. You'd have to code all this yourself if you use a file store.
But there are disadvantages here as well. These tools cost money, and they aren't always cheap. There's also a learning curve involved in installing yet another software package (YASP). The source system may require another license, and security becomes an issue.
If you are using same-to-same platforms, this choice becomes a little easier. SQL Server can talk directly to another SQL Server, and you can even set up a replication schema between systems to keep them up to date all the time. SQL Server can also talk to Oracle by using distributed queries or Data Transformation Services.
That brings up the choice of where to transform the data, and how much. You should work very hard to leave the data as generic as possible for as long as you can. The first time you change a "True" to "T" or "1", you're diluting the original meaning of the data a little. To be sure, at some point you're going to have to combine data from system A with data from system B, and unless they are the same kind of software odds are good that they won't store the same data the same way. But leave the data alone during the transfer process. Get it into a series of tables in the destination system (more on those next time) as virgin as possible. Once you've transferred the data you can begin to transform it, and if you do it on your side there you're the only one that is affected if the transformation isn't correct. I don't advocate ETL as defined by Extract, Transform and Load. I actually prefer to define it as Extract, Transfer and Load. Transform the data using the rich set of tools in your destination systems, and the process will be much smoother.
Finally you'll need to develop a timeline for the data to be current. The data is current when all systems have reported in, and you've moved the data beyond the staging tables (landing places for the data in your BI system) and on up into the first levels where it can be used.
In the next tutorial I'll explain those levels and where the ETL process should place the data. You have a few choices in your landscape, depending on your needs. I'll explain those choices and how you select the one for your situation.
Informit Articles and Sample Chapters
Larissa T. Moss and Shaku Atre have a great book on BI that also discusses security called Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications.
Online Resources
The ETL guy no longer keeps his site up to date, but it's a wealth of information on Extracts, Transforms and Loads. He's got a bunch of links from ETL products as well.