Designing for BLOB Storage
As mentioned in the earlier section on filegroups, special-purpose I/O technologies are available for those use cases where standard approaches to I/O management are less optimized. A very common situation is found in handling BLOB (binary large object) files. Suppose you are building an application to correlate terrain maps with overlays of data about all the objects that are noteworthy on the map. Your database has data describing all the roads, railroads, electricity, infrastructure, towns, buildings, and so forth. The database also has tables for the metadata and geospatial data types that correlate all of the important objects to exact positions on the maps. The maps themselves, however, are huge multigigabyte image files.
Now the problem arises: Should we store the map image file in a varbinary(max) column within the database, or should we store only a pointer in the database linking the metadata to the map image file on the Windows file system? If we choose the first option, the database will be enormously bloated and take a very long time to back up, restore, and run other preventive maintenance tasks. If we choose the second option, we have to deal with a much more complex backup and recovery scenario, and we risk a situation where transactions involving both the map files and the database data may not fully commit or roll back, depending on the responsiveness of the Windows file system. What’s a person to do?
The good news is that Microsoft has implemented FILESTREAM data storage specifically for this scenario.
Managing Filestream Data
In previous years, organizations had to invent and maintain their own mechanisms to store unstructured data. Now SQL Server supports FILESTREAM, a storage option that enables organizations to store unstructured data such as bitmap images, text files, videos, and audio files under a single data type, which is more secure and manageable.
From an internal perspective, FILESTREAM creates a bridge between the Database Engine and the NTFS file system of the Windows Server. It stores varbinary(max) binary large object (BLOB) data as files on the file system, while enabling the Database Engine to interact with the file system through Transact-SQL inserts, updates, and queries, as well as full backup and recovery of FILESTREAM data.
In other words, you get all the benefits of a full, transactional database with all the benefits of storing the BLOBs on the file system. FILESTREAM can be enabled at the instance level and the database level.
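To make this concrete, the following is a minimal sketch of a FILESTREAM-enabled database and table for the terrain map scenario. The database name MapArchive, the filegroup MapFS, the table dbo.TerrainMap, and the C:\Data paths are all hypothetical, and the sketch assumes FILESTREAM has already been enabled on the instance.

```sql
-- Hypothetical names and paths throughout; adjust to your environment.
-- Assumes FILESTREAM is already enabled on the instance and C:\Data exists.
CREATE DATABASE MapArchive
ON PRIMARY
    ( NAME = MapArchive_data, FILENAME = 'C:\Data\MapArchive.mdf' ),
FILEGROUP MapFS CONTAINS FILESTREAM
    -- For a FILESTREAM container, FILENAME is a directory, not a file.
    -- The parent path must exist; the last folder must not exist yet.
    ( NAME = MapArchive_fs, FILENAME = 'C:\Data\MapArchiveFS' )
LOG ON
    ( NAME = MapArchive_log, FILENAME = 'C:\Data\MapArchive.ldf' );
GO

USE MapArchive;
GO

-- A table with a FILESTREAM column needs a unique, non-null ROWGUIDCOL column.
CREATE TABLE dbo.TerrainMap
(
    MapId    UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    MapName  NVARCHAR(200) NOT NULL,
    MapImage VARBINARY(MAX) FILESTREAM NULL   -- BLOB stored on the file system
) FILESTREAM_ON MapFS;
GO

-- Ordinary Transact-SQL works against the FILESTREAM column.
INSERT INTO dbo.TerrainMap (MapName, MapImage)
VALUES (N'Sample map', CAST('placeholder bytes' AS VARBINARY(MAX)));

SELECT MapName,
       DATALENGTH(MapImage) AS ImageBytes,
       MapImage.PathName()  AS FilestreamPath  -- logical path used for streaming access
FROM dbo.TerrainMap;
GO
```

The metadata columns and the image travel together in one row, so a single transaction covers both, while the image bytes themselves live in the FILESTREAM container on disk.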
Enabling FILESTREAM Data at the Instance Level
The first step for managing FILESTREAM data is to enable it on the instance of SQL Server. The following steps indicate how to enable FILESTREAM data:
- Choose Start, All Programs, Microsoft SQL Server 2012, Configuration Tools, and then select SQL Server Configuration Manager.
- In SQL Server Configuration Manager, highlight SQL Server Services, and then double-click the SQL Server Instance for which you want to enable FILESTREAM. The SQL Server Instance is located in the right pane.
- In the SQL Server Properties dialog box, select the FileStream tab.
- Enable the desired FILESTREAM settings, and then click OK. The options include Enable FILESTREAM for Transact-SQL Access, Enable FILESTREAM for File I/O Streaming Access, and Allow Remote Clients to Have Streaming Access to FILESTREAM Data.
- The final step is to run the following Transact-SQL code in Query Editor:
EXEC sp_configure filestream_access_level, 2
RECONFIGURE
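After running the configuration step, it can be useful to confirm that the setting took effect. The following is a small sketch that reads the option back from the sys.configurations catalog view; the comment listing the meaning of each value reflects the documented access levels.

```sql
-- Read back the FILESTREAM access level set through sp_configure.
-- value        = the configured value
-- value_in_use = the value the running instance is actually using
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'filestream access level';
-- 0 = disabled
-- 1 = enabled for Transact-SQL access
-- 2 = enabled for Transact-SQL and Win32 streaming access
```

If value and value_in_use differ, the change has most likely not been applied yet, for example because RECONFIGURE was not run.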
Using the Database Properties Option Page to Enable FILESTREAM
The following options are also available in the SSMS Object Explorer by right-clicking a specific database name, selecting Properties, and then selecting the Database Properties Option page:
- FILESTREAM Directory Name—The directory name for FILESTREAM data in the specified database.
- FILESTREAM Nontransacted Access—The value to be entered may be OFF, READ_ONLY, or FULL. OFF disables nontransacted access to FileTable data. READ_ONLY and FULL are used to enable FileTables. FileTables are a way to programmatically interact with BLOBs on the Windows Server file system. FileTables build on FILESTREAM but are configured separately. Because they are programming constructs, they are beyond the scope of this book. Refer to SQL Server Books Online for more information.
Following is an example using Transact-SQL. To enable nontransacted access for the AdventureWorks2012 database, you can use this ALTER DATABASE syntax:
USE [master]
GO
ALTER DATABASE [AdventureWorks2012]
SET FILESTREAM ( NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'AdventureWorksFST' )
WITH NO_WAIT
GO
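To check the result of the ALTER DATABASE statement above, one option is to query the sys.database_filestream_options catalog view, which is available starting with SQL Server 2012. This is a sketch only; it assumes the AdventureWorks2012 database exists on the instance.

```sql
-- Show the nontransacted access setting and directory name for one database.
SELECT DB_NAME(database_id)       AS database_name,
       non_transacted_access_desc,            -- OFF, READ_ONLY, or FULL
       directory_name                         -- e.g. AdventureWorksFST
FROM sys.database_filestream_options
WHERE database_id = DB_ID(N'AdventureWorks2012');
```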
Administering the FILESTREAM from the Advanced Page of the Server Properties Dialog
The Advanced Page, shown in Figure 3.11, contains the SQL Server general settings that can be configured. The only important settings on this page, in terms of I/O, are the FILESTREAM settings.
Figure 3.11. Administering the Server Properties Advanced Settings page.
Two items can be configured via the Advanced page:
- Filestream Access Level—This setting displays how the SQL Server instance will support FILESTREAM. FILESTREAM allows for the storage of unstructured data. The global server options associated with FILESTREAM configuration include the following:
  - Disabled—The Disabled setting does not allow Binary Large Object (BLOB) data to be stored in the file system.
  - Transact-SQL Access Enabled—FILESTREAM data is accessed only by Transact-SQL and not by the file system.
  - Full Access Enabled—FILESTREAM data is accessed by both Transact-SQL and the file system.
- FILESTREAM Share Name—This setting displays the read-only share name configured during installation and setup of the SQL Server instance.
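The same two items can also be read with Transact-SQL rather than through the dialog. The following sketch uses the SERVERPROPERTY function; the column aliases are illustrative only.

```sql
-- Read the instance-wide FILESTREAM settings without opening the dialog.
SELECT SERVERPROPERTY('FilestreamConfiguredLevel') AS configured_level,
       SERVERPROPERTY('FilestreamEffectiveLevel')  AS effective_level,
       SERVERPROPERTY('FilestreamShareName')       AS share_name;
```

The effective level can differ from the configured level when a change has been made but a restart is still pending, so comparing the two is a quick way to spot a setting that has not yet taken hold.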
Enhancements to FILESTREAM in SQL Server 2012
Previous versions of SQL Server allowed only one FILESTREAM container per filegroup. This limitation hampered I/O performance and scalability. SQL Server 2012 supports multiple FILESTREAM containers per filegroup. Other improvements include the ability to set a maximum size for a container, a DBCC SHRINKFILE EMPTYFILE command to shrink and empty FILESTREAM containers, support for spreading containers across multiple storage drives, and enhancements to the CREATE DATABASE and ALTER DATABASE statements to support the new features.
Just as database performance improves with multiple data files, FILESTREAM I/O scalability and performance improve with multiple disks. Many users are reporting a doubling in I/O performance for write speed and as much as a fivefold improvement in read speed, when writing and reading 1MB BLOB files in a test environment, compared to SQL Server 2008 R2. Of course, your mileage may vary.
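As an illustration of these enhancements, the following sketch adds a second, size-capped FILESTREAM container on another drive to an existing FILESTREAM filegroup, and then empties the original container. The database MapArchive, the filegroup MapFS, the logical file names, and the D:\Data path are hypothetical and assume a database that already has one FILESTREAM container named MapArchive_fs.

```sql
-- Hypothetical names and paths; assumes MapArchive already has a FILESTREAM
-- filegroup named MapFS with one container named MapArchive_fs.

-- Add a second container on a different drive and cap its size.
ALTER DATABASE MapArchive
ADD FILE
    ( NAME = MapArchive_fs2,
      FILENAME = 'D:\Data\MapArchiveFS2',   -- directory on a second drive
      MAXSIZE = 50GB )                      -- maximum size for this container
TO FILEGROUP MapFS;
GO

-- Move data out of the original container into the remaining container(s).
USE MapArchive;
GO
DBCC SHRINKFILE (N'MapArchive_fs', EMPTYFILE);
GO
```

Treat this as a starting point rather than a finished procedure: the cap and drive layout depend on your storage, and the emptying step may need to be repeated before the old container can be removed.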
Not only can you enable FILESTREAM at the server level, as shown in Figure 3.11, you may also enable FILESTREAM at the database level. To do so, right-click on a database name in the SSMS Object Explorer and select Properties; then click the Options setting. The Options Page, similar to that shown in Figure 3.12, will appear:
Figure 3.12. Administering FILESTREAM on the Database Options page.
Refer to the earlier descriptions associated with Figure 3.11, since the options are identical in meaning.
Overview of Remote BLOB Store (RBS)
The Remote BLOB Store, or RBS, is an optional add-on component to manage and store BLOBs on cheap, commodity storage. It is frequently used with SQL Servers that support SharePoint and is not, by default, installed with SQL Server. You can find RBS online on the SQL Server 2008 R2 Feature Pack page at http://go.microsoft.com/fwlink/?LinkID=210168.
RBS provides benefits similar to those of FILESTREAM by moving BLOB files to cheaper storage. RBS provides a number of programmatic means of manipulating the data. RBS supplies a FILESTREAM provider to store the BLOB data within SQL Server, and it also offers the flexibility to use storage solutions other than SQL Server. RBS provides an API so that you can write your own providers, such as one for another database platform, and includes a sample provider for the Windows NTFS file system, complete with source code. RBS can be found on Codeplex at http://sqlrbs.codeplex.com/.