- SQL Server Reference Guide
- Introduction
- SQL Server Reference Guide Overview
- Table of Contents
- Microsoft SQL Server Defined
- SQL Server Editions
- SQL Server Access
- Informit Articles and Sample Chapters
- Online Resources
- Microsoft SQL Server Features
- SQL Server Books Online
- Clustering Services
- Data Transformation Services (DTS) Overview
- Replication Services
- Database Mirroring
- Natural Language Processing (NLP)
- Analysis Services
- Microsot SQL Server Reporting Services
- XML Overview
- Notification Services for the DBA
- Full-Text Search
- SQL Server 2005 - Service Broker
- Using SQL Server as a Web Service
- SQL Server Encryption Options Overview
- SQL Server 2008 Overview
- SQL Server 2008 R2 Overview
- SQL Azure
- The Utility Control Point and Data Application Component, Part 1
- The Utility Control Point and Data Application Component, Part 2
- Microsoft SQL Server Administration
- The DBA Survival Guide: The 10 Minute SQL Server Overview
- Preparing (or Tuning) a Windows System for SQL Server, Part 1
- Preparing (or Tuning) a Windows System for SQL Server, Part 2
- Installing SQL Server
- Upgrading SQL Server
- SQL Server 2000 Management Tools
- SQL Server 2005 Management Tools
- SQL Server 2008 Management Tools
- SQL Azure Tools
- Automating Tasks with SQL Server Agent
- Run Operating System Commands in SQL Agent using PowerShell
- Automating Tasks Without SQL Server Agent
- Storage – SQL Server I/O
- Service Packs, Hotfixes and Cumulative Upgrades
- Tracking SQL Server Information with Error and Event Logs
- Change Management
- SQL Server Metadata, Part One
- SQL Server Meta-Data, Part Two
- Monitoring - SQL Server 2005 Dynamic Views and Functions
- Monitoring - Performance Monitor
- Unattended Performance Monitoring for SQL Server
- Monitoring - User-Defined Performance Counters
- Monitoring: SQL Server Activity Monitor
- SQL Server Instances
- DBCC Commands
- SQL Server and Mail
- Database Maintenance Checklist
- The Maintenance Wizard: SQL Server 2000 and Earlier
- The Maintenance Wizard: SQL Server 2005 (SP2) and Later
- The Web Assistant Wizard
- Creating Web Pages from SQL Server
- SQL Server Security
- Securing the SQL Server Platform, Part 1
- Securing the SQL Server Platform, Part 2
- SQL Server Security: Users and other Principals
- SQL Server Security – Roles
- SQL Server Security: Objects (Securables)
- Security: Using the Command Line
- SQL Server Security - Encrypting Connections
- SQL Server Security: Encrypting Data
- SQL Server Security Audit
- High Availability - SQL Server Clustering
- SQL Server Configuration, Part 1
- SQL Server Configuration, Part 2
- Database Configuration Options
- 32- vs 64-bit Computing for SQL Server
- SQL Server and Memory
- Performance Tuning: Introduction to Indexes
- Statistical Indexes
- Backup and Recovery
- Backup and Recovery Examples, Part One
- Backup and Recovery Examples, Part Two: Transferring Databases to Another System (Even Without Backups)
- SQL Profiler - Reverse Engineering An Application
- SQL Trace
- SQL Server Alerts
- Files and Filegroups
- Partitioning
- Full-Text Indexes
- Read-Only Data
- SQL Server Locks
- Monitoring Locking and Deadlocking
- Controlling Locks in SQL Server
- SQL Server Policy-Based Management, Part One
- SQL Server Policy-Based Management, Part Two
- SQL Server Policy-Based Management, Part Three
- Microsoft SQL Server Programming
- An Outline for Development
- Database
- Database Services
- Database Objects: Databases
- Database Objects: Tables
- Database Objects: Table Relationships
- Database Objects: Keys
- Database Objects: Constraints
- Database Objects: Data Types
- Database Objects: Views
- Database Objects: Stored Procedures
- Database Objects: Indexes
- Database Objects: User Defined Functions
- Database Objects: Triggers
- Database Design: Requirements, Entities, and Attributes
- Business Process Model Notation (BPMN) and the Data Professional
- Business Questions for Database Design, Part One
- Business Questions for Database Design, Part Two
- Database Design: Finalizing Requirements and Defining Relationships
- Database Design: Creating an Entity Relationship Diagram
- Database Design: The Logical ERD
- Database Design: Adjusting The Model
- Database Design: Normalizing the Model
- Creating The Physical Model
- Database Design: Changing Attributes to Columns
- Database Design: Creating The Physical Database
- Database Design Example: Curriculum Vitae
- NULLs
- The SQL Server Sample Databases
- The SQL Server Sample Databases: pubs
- The SQL Server Sample Databases: NorthWind
- The SQL Server Sample Databases: AdventureWorks
- The SQL Server Sample Databases: Adventureworks Derivatives
- UniversalDB: The Demo and Testing Database, Part 1
- UniversalDB: The Demo and Testing Database, Part 2
- UniversalDB: The Demo and Testing Database, Part 3
- UniversalDB: The Demo and Testing Database, Part 4
- Getting Started with Transact-SQL
- Transact-SQL: Data Definition Language (DDL) Basics
- Transact-SQL: Limiting Results
- Transact-SQL: More Operators
- Transact-SQL: Ordering and Aggregating Data
- Transact-SQL: Subqueries
- Transact-SQL: Joins
- Transact-SQL: Complex Joins - Building a View with Multiple JOINs
- Transact-SQL: Inserts, Updates, and Deletes
- An Introduction to the CLR in SQL Server 2005
- Design Elements Part 1: Programming Flow Overview, Code Format and Commenting your Code
- Design Elements Part 2: Controlling SQL's Scope
- Design Elements Part 3: Error Handling
- Design Elements Part 4: Variables
- Design Elements Part 5: Where Does The Code Live?
- Design Elements Part 6: Math Operators and Functions
- Design Elements Part 7: Statistical Functions
- Design Elements Part 8: Summarization Statistical Algorithms
- Design Elements Part 9:Representing Data with Statistical Algorithms
- Design Elements Part 10: Interpreting the Data—Regression
- Design Elements Part 11: String Manipulation
- Design Elements Part 12: Loops
- Design Elements Part 13: Recursion
- Design Elements Part 14: Arrays
- Design Elements Part 15: Event-Driven Programming Vs. Scheduled Processes
- Design Elements Part 16: Event-Driven Programming
- Design Elements Part 17: Program Flow
- Forming Queries Part 1: Design
- Forming Queries Part 2: Query Basics
- Forming Queries Part 3: Query Optimization
- Forming Queries Part 4: SET Options
- Forming Queries Part 5: Table Optimization Hints
- Using SQL Server Templates
- Transact-SQL Unit Testing
- Index Tuning Wizard
- Unicode and SQL Server
- SQL Server Development Tools
- The SQL Server Transact-SQL Debugger
- The Transact-SQL Debugger, Part 2
- Basic Troubleshooting for Transact-SQL Code
- An Introduction to Spatial Data in SQL Server 2008
- Performance Tuning
- Performance Tuning SQL Server: Tools and Processes
- Performance Tuning SQL Server: Tools Overview
- Creating a Performance Tuning Audit - Defining Components
- Creating a Performance Tuning Audit - Evaluation Part One
- Creating a Performance Tuning Audit - Evaluation Part Two
- Creating a Performance Tuning Audit - Interpretation
- Creating a Performance Tuning Audit - Developing an Action Plan
- Understanding SQL Server Query Plans
- Performance Tuning: Implementing Indexes
- Performance Monitoring Tools: Windows 2008 (and Higher) Server Utilities, Part 1
- Performance Monitoring Tools: Windows 2008 (and Higher) Server Utilities, Part 2
- Performance Monitoring Tools: Windows System Monitor
- Performance Monitoring Tools: Logging with System Monitor
- Performance Monitoring Tools: User Defined Counters
- General Transact-SQL (T-SQL) Performance Tuning, Part 1
- General Transact-SQL (T-SQL) Performance Tuning, Part 2
- General Transact-SQL (T-SQL) Performance Tuning, Part 3
- Performance Monitoring Tools: An Introduction to SQL Profiler
- Performance Tuning: Introduction to Indexes
- Performance Monitoring Tools: SQL Server 2000 Index Tuning Wizard
- Performance Monitoring Tools: SQL Server 2005 Database Tuning Advisor
- Performance Monitoring Tools: SQL Server Management Studio Reports
- Performance Monitoring Tools: SQL Server 2008 Activity Monitor
- The SQL Server 2008 Management Data Warehouse and Data Collector
- Performance Monitoring Tools: Evaluating Wait States with PowerShell and Excel
- Practical Applications
- Choosing the Back End
- The DBA's Toolbox, Part 1
- The DBA's Toolbox, Part 2
- Scripting Solutions for SQL Server
- Building a SQL Server Lab
- Using Graphics Files with SQL Server
- Enterprise Resource Planning
- Customer Relationship Management (CRM)
- Building a Reporting Data Server
- Building a Database Documenter, Part 1
- Building a Database Documenter, Part 2
- Data Management Objects
- Data Management Objects: The Server Object
- Data Management Objects: Server Object Methods
- Data Management Objects: Collections and the Database Object
- Data Management Objects: Database Information
- Data Management Objects: Database Control
- Data Management Objects: Database Maintenance
- Data Management Objects: Logging the Process
- Data Management Objects: Running SQL Statements
- Data Management Objects: Multiple Row Returns
- Data Management Objects: Other Database Objects
- Data Management Objects: Security
- Data Management Objects: Scripting
- Powershell and SQL Server - Overview
- PowerShell and SQL Server - Objects and Providers
- Powershell and SQL Server - A Script Framework
- Powershell and SQL Server - Logging the Process
- Powershell and SQL Server - Reading a Control File
- Powershell and SQL Server - SQL Server Access
- Powershell and SQL Server - Web Pages from a SQL Query
- Powershell and SQL Server - Scrubbing the Event Logs
- SQL Server 2008 PowerShell Provider
- SQL Server I/O: Importing and Exporting Data
- SQL Server I/O: XML in Database Terms
- SQL Server I/O: Creating XML Output
- SQL Server I/O: Reading XML Documents
- SQL Server I/O: Using XML Control Mechanisms
- SQL Server I/O: Creating Hierarchies
- SQL Server I/O: Using HTTP with SQL Server XML
- SQL Server I/O: Using HTTP with SQL Server XML Templates
- SQL Server I/O: Remote Queries
- SQL Server I/O: Working with Text Files
- Using Microsoft SQL Server on Handheld Devices
- Front-Ends 101: Microsoft Access
- Comparing Two SQL Server Databases
- English Query - Part 1
- English Query - Part 2
- English Query - Part 3
- English Query - Part 4
- English Query - Part 5
- RSS Feeds from SQL Server
- Using SQL Server Agent to Monitor Backups
- Reporting Services - Creating a Maintenance Report
- SQL Server Chargeback Strategies, Part 1
- SQL Server Chargeback Strategies, Part 2
- SQL Server Replication Example
- Creating a Master Agent and Alert Server
- The SQL Server Central Management System: Definition
- The SQL Server Central Management System: Base Tables
- The SQL Server Central Management System: Execution of Server Information (Part 1)
- The SQL Server Central Management System: Execution of Server Information (Part 2)
- The SQL Server Central Management System: Collecting Performance Metrics
- The SQL Server Central Management System: Centralizing Agent Jobs, Events and Scripts
- The SQL Server Central Management System: Reporting the Data and Project Summary
- Time Tracking for SQL Server Operations
- Migrating Departmental Data Stores to SQL Server
- Migrating Departmental Data Stores to SQL Server: Model the System
- Migrating Departmental Data Stores to SQL Server: Model the System, Continued
- Migrating Departmental Data Stores to SQL Server: Decide on the Destination
- Migrating Departmental Data Stores to SQL Server: Design the ETL
- Migrating Departmental Data Stores to SQL Server: Design the ETL, Continued
- Migrating Departmental Data Stores to SQL Server: Attach the Front End, Test, and Monitor
- Tracking SQL Server Timed Events, Part 1
- Tracking SQL Server Timed Events, Part 2
- Patterns and Practices for the Data Professional
- Managing Vendor Databases
- Consolidation Options
- Connecting to a SQL Azure Database from Microsoft Access
- SharePoint 2007 and SQL Server, Part One
- SharePoint 2007 and SQL Server, Part Two
- SharePoint 2007 and SQL Server, Part Three
- Querying Multiple Data Sources from a Single Location (Distributed Queries)
- Importing and Exporting Data for SQL Azure
- Working on Distributed Teams
- Professional Development
- Becoming a DBA
- Certification
- DBA Levels
- Becoming a Data Professional
- SQL Server Professional Development Plan, Part 1
- SQL Server Professional Development Plan, Part 2
- SQL Server Professional Development Plan, Part 3
- Evaluating Technical Options
- System Sizing
- Creating a Disaster Recovery Plan
- Anatomy of a Disaster (Response Plan)
- Database Troubleshooting
- Conducting an Effective Code Review
- Developing an Exit Strategy
- Data Retention Strategy
- Keeping Your DBA/Developer Job in Troubled Times
- The SQL Server Runbook
- Creating and Maintaining a SQL Server Configuration History, Part 1
- Creating and Maintaining a SQL Server Configuration History, Part 2
- Creating an Application Profile, Part 1
- Creating an Application Profile, Part 2
- How to Attend a Technical Conference
- Tips for Maximizing Your IT Budget This Year
- The Importance of Blue-Sky Planning
- Application Architecture Assessments
- Transact-SQL Code Reviews, Part One
- Transact-SQL Code Reviews, Part Two
- Cloud Computing (Distributed Computing) Paradigms
- NoSQL for the SQL Server Professional, Part One
- NoSQL for the SQL Server Professional, Part Two
- Object-Role Modeling (ORM) for the Database Professional
- Business Intelligence
- BI Explained
- Developing a Data Dictionary
- BI Security
- Gathering BI Requirements
- Source System Extracts and Transforms
- ETL Mechanisms
- Business Intelligence Landscapes
- Business Intelligence Layouts and the Build or Buy Decision
- A Single Version of the Truth
- The Operational Data Store (ODS)
- Data Marts – Combining and Transforming Data
- Designing Data Elements
- The Enterprise Data Warehouse — Aggregations and the Star Schema
- On-Line Analytical Processing (OLAP)
- Data Mining
- Key Performance Indicators
- BI Presentation - Client Tools
- BI Presentation - Portals
- Implementing ETL - Introduction to SQL Server 2005 Integration Services
- Building a Business Intelligence Solution, Part 1
- Building a Business Intelligence Solution, Part 2
- Building a Business Intelligence Solution, Part 3
- Tips and Troubleshooting
- SQL Server and Microsoft Excel Integration
- Tips for the SQL Server Tools: SQL Server 2000
- Tips for the SQL Server Tools – SQL Server 2005
- Transaction Log Troubles
- SQL Server Connection Problems
- Orphaned Database Users
- Additional Resources
- Tools and Downloads
- Utilities (Free)
- Tool Review (Free): DBDesignerFork
- Aqua Data Studio
- Microsoft SQL Server Best Practices Analyzer
- Utilities (Cost)
- Quest Software's TOAD for SQL Server
- Quest Software's Spotlight on SQL Server
- SQL Server on Microsoft's Virtual PC
- Red Gate SQL Bundle
- Microsoft's Visio for Database Folks
- Quest Capacity Manager
- SQL Server Help
- Visual Studio Team Edition for Database Professionals
- Microsoft Assessment and Planning Solution Accelerator
- Aggregating Server Data from the MAPS Tool
A data dictionary is, at its most simple, a mapping of data elements your organization stores, where it stores them, and what each element means. While it’s simple to describe, it’s a bit more complicated and time-consuming to implement.
There are multiple reasons to take the time to document the data elements in your organization. The first is for High-Availability and Disaster Recovery (HA/DR) or what is often called “Business Continuity.” If you don’t know where your data is or how it is recorded, you certainly can’t back it up or restore it in a timely manner when needed.
Another reason for a data dictionary is that you need to determine authoritative sources for your data elements when you need systems to interact. At one location I worked as a data professional, we had no less than three “authoritative” inventory reports — all of which displayed a different number for “inventory on hand”. This was a shop desperately in need of a good data dictionary. The report data discrepancy only came to light when a vendor got two of the reports from different parts of our company — and asked which number she should use to send us new material.
Whenever you implement a Business Intelligence (BI) system in your organization, you will need a data dictionary to gather the disparate elements and cull them into a single reporting entity. In other words, you need a single source of data documented somewhere so that you can put all of it together to report and analyze it.
Another interesting reason you need a data dictionary is for systems optimization. You’ll find in your discovery efforts that I detail below that you are more than likely storing the same data multiple times. That may be entirely acceptable, but in some cases you’ll find a department re-recording data from a database for their own downstream use. While that’s acceptable from a read-only standpoint, it becomes problematic if the department modifies that original data, and doesn’t feed it back into the general data flow. In this case, you may be able to integrate their department-level system into the larger one, cleaning the data and de-duplicating it at the same time. That saves backup and optimization time, since the department no longer has to maintain a set of maintenance processes for themselves. By cleaning the data real-time, you save the process of re-importing back into the main data system. It reduces size, time and effort all the way around.
Creating a data dictionary goes beyond simply identifying broad data elements. And you’re not able to do this alone. From the very first step, you’ll need to involve multiple teams to get an accurate data dictionary. No one group knows where all of the data is, which of it is authoritative, and which you should track. This will definitely be a place where you want to involve business teams, development teams and data teams, however those are laid out in your organization.
So a data dictionary has multiple uses in an organization, and it will be a cooperative effort to create one. Here are some practical steps on implementing the effort.
Selecting a Format for the Data
This might not seem like the obvious place to start. In fact, I normally caution against picking a tool or some other implement as the starting place for a technical challenge, but this is an exception.
The reason you want to start with the format of data — often persisted using a particular tool or methodology — is that it will define the rest of your efforts. For instance, if you are dealing with another organization, and they have standardized on ISO 11179, for instance, you will need to ensure that your efforts match that standard. Using that standard as an example, there are well-recognized elements and descriptions for laying out the data.
It’s best if you define the “meta-data” (data about your data) that you want to collect at the outset of the effort. Since this work will not be trivial to do, you want to originate the effort once and then maintain the system and adapt it as you move along. This isn’t to say that you can’t create your own standards for the data dictionary. The point is to have a single referencing, maintainable system so that you can query and locate where data is stored in your organization.
Once you’ve defined the tool or standard you plan to use, it’s on to the next step of defining the data elements for your data dictionary.
Defining Data Elements for the Data Dictionary
A data dictionary is normally restricted to a particular set of data. After all, every column in every spreadsheet on every user’s desk, every cell in a word-processing document, even lines in a text file is a data element. It doesn’t make a lot of sense to try and track every single one of those elements, but there are some that you should track.
There are several methods to use to define the elements for your data dictionary. If you have picked an automated tool for your system, you may find that it can “watch” the data flowing through your systems and help you identify them very quickly. If you’re using a manual process, follow these tips to help as you go along.
First, keep the elements you want to collect as small as possible. Start broadly — have the Business Analysts help you define the “mission critical” systems (meaning logical systems, not physical hardware or software systems) you have, and tease out the large elements from there, such as “Purchase Order”, “Inventory”, “Time Spent on Project”, Billing” and so on. At this point the task won’t be as daunting, because you’re only working with a few scenarios — things that the organization tells you they must know in order to do business. And keep it at a high-level at this point, assuming your tool or process allows you to do that.
Second, let the business drive the original requirements. They should understand the need for this data, for all of the reasons I outlined above. It will take time and money to perform this audit, so they need to understand why you’re doing this work instead of other work. Once they understand the reason for the effort, they can drive the data requirements for you. Keep in mind that creating the data dictionary is not always the most difficult part of this exercise — maintaining it is.
Finally, make sure you have buy-in from as many teams as possible. You cannot do this work alone, even when “alone” means the entire technical team. Your focus is on technology and where it is applied, but there is a surprising amount of data that is not captured in formal IT systems — they are in spreadsheets, on Access databases and “in the cloud”. You’ll need help locating these. The development teams also need this information, since they are often asked to include information on a single screen from multiple systems. The temptation is to re-create those data systems, rather than looking to see if they already exist.
With the broad areas defined, It’s time to discover where the authoritative bits and bytes live, and who owns them.
Locating Sources for the Data Elements
If you’re using an automated Master Data Management (MDM) system such as SQL Server’s Master Data Services, you may perform this step a bit out of order. Those automated systems normally have a checklist of tasks that identify the source systems by watching data patterns, and where they flow to and from.
You can simulate this yourself, by starting at the user’s end of the data flow. Once the business identifies the critical systems, it’s a simple matter to find out what the users (or systems) do to get that data into those systems. Find the application owners for those programs, and trace the data being entered back to their source systems.
Repeat this process, but keep in mind that the data will probably be very normalized at first. You’ll need to find out what data store (or data stores) reach element ends up in, and then look for views or stored procedures that bring the elements back into a meaningful whole. This is usually the granularity you’re more interested in. For instance, a Purchase Order object is usually represented by several elements joined together, such as the Purchase Order header, line items and so on. Unless these are coalesced from several systems, you normally only need to know the PO object, not each detail breakout, at least for the data dictionary.
So at this point you have the following defined:
- What you want to track
- Where it is
Next, you need to define the meaning to each of these larger elements.
Assigning Meaning to the Elements
It is at this point that the data professional must involve the business. A database is similar to a bank. A bank can store your money, allow you to put money in and take it out, or even covert it to another currency. What a bank can’t do is tell you if a certain amount of money is “good” or “bad”, without having more background. In fact, even with that it’s usually not the place to ask that question.
Where the database is similar is that it stores data. As data professionals we make it possible to store data, edit it, remove it, report on it, even convert it to other formats. But we can’t give it meaning — nor should we. Our guarantee is that what goes into the database is what we store.
So this is where you need a representative from the business, usually a Business Analyst that will give the elements meaning. You might think that if a data element is marked “Purchase Order” then you can state that it’s a Purchase Order — no reason to involve a Business Analyst at that point, correct?
But you’d be wrong. Recall that company I worked at that had multiple numbers for the word “inventory”. In fact, the issue revolved around the fact that it was “on-hand” inventory. It turns out that each of the reports with the different numbers on them were correct! Each was using a different definition of the word “on-hand”. In one country, anything on the shelf of the company was considered (and taxed) as on-hand inventory. In another country, the source manufacturer would stage parts on our company’s shelves, but we actually only entered them into on-hand inventory when we paid for them for our use. Reporting came from multiple systems, and each was correct, hence the different definitions for on-hand inventory.
And as the data professional, you can’t really be expected to track international tax law. So the point is that you need to make sure a business representative evaluates your list and gives each data element meaning.
Developing a System for Updating the Data Dictionary
Remember, the Data Dictionary you’re creating is not the process of using it. In fact, the more difficult work is yet to come — maintaining and using the Data Dictionary is actually far more time consuming.
This is where a Master Data Management system is superior to using your own manually created one. In an ideal setup, developers no longer even use the database system — they point directly at the MDM system and feed and take data from that. Of course, this isn’t always possible if the application is “canned” or not controlled by your organization.
In that case, you need to ensure that as data systems are added, removed or altered within your organization that your team is made aware of the changes. Otherwise the data dictionary will be out of date, and could even cause errors.
The keys to keeping the system up to date are cooperation and keeping the meta-data set as small as possible. Whenever a new system is put into place or an older one is changed, make sure there’s a tight process for notifying your team. Again, the entire data dictionary process feeds important business processes like HA/DR, BI and integration, so the importance is there.
Also, remember that it is comparatively easy (and tempting) to add more and more elements to track in a data dictionary. Resist this urge. Every element you add brings an exponential level of maintenance and upkeep.