How to Deploy a SAN
The introduction of digital technology has propelled document printing from a mechanical to a computerized function, bringing opportunities for new imaging techniques, customized documents specific to each ad campaign or promotion, and online order processing. These new functions, however, take up vast amounts of storage space. The original computer systems purchased to handle office applications were not intended to accommodate the huge imaging files that digital printing requires, nor did they have backup systems in place to protect 24-hour online transaction data. Print vendors are scrambling to find cost-effective ways to increase storage and backup capabilities yet stay competitive in a very close market. The three vendors discussed here are very happy with their completely distinct solutions to the same basic storage problem.
Expanding Storage the SAN Way
The printing industry's data storage needs have surged since the onset of print-on-demand (POD) order fulfillment. The introduction of digital printing technology brought the ability to quickly change and customize printed documents on demand. Businesses no longer need to inventory printed in-stock brochures now that they've discovered how easily customized trade-show handouts, menus, and promotional pieces can be obtained. The cost-effectiveness of POD digital printing allows companies to do smaller runs of specific materials instead of ordering large amounts of generic brochures (50% of which lose relevancy before being used).
For example, MediaFlex [2] of Campbell, California, provides the e-commerce infrastructure for a number of print vendors and their customers. A user can now log on to his print vendor's e-business Web site, connect to a Web page of his own previously ordered print templates, select a product, make changes to his company's documents right on the screen, and place his order. He can check price quotes, get order confirmation, and view a proof copy in real time, 24 x 7.
To provide the customer that ease requires a data storage methodology with 100% file availability, large storage capacity (a single typical document template is 25MB to 50MB), scalability for a growing client base, and complete security for online server data. MediaFlex chose a switched-fabric storage area network (SAN) to meet its needs. Four existing enterprise servers were integrated with a new RAID device in a SAN Fibre Channel switching environment. Redundant hardware throughout the file-delivery subsystem over a dual-switched, multiple server-to-RAID storage channel provides automatic failover protection and fault isolation, guaranteeing continual operation. In that way, the configuration ensures network availability, allows centralized backup functions with quick failover disaster-recovery capabilities, permits file sharing between servers, and provides scalability without online interruptions.
The SAN contains four Sun 400MHz E450 enterprise servers running version 7 of the Solaris operating system. Each server is equipped with Fibre Channel host bus adapters (HBAs), 124MB of RAM, and 9GB of hard disk space. Web traffic is funneled to two servers, and the other two are reserved for network files and applications. All four are active, but a failure in one will prompt another to take over in approximately 15 seconds.
NOTE
All of the HBAs available today can interface to both fabric and loop topologies.
A Hitachi 5846 full Fibre Channel RAID with 140GB of total usable file storage space is connected directly to the Sun servers through two Brocade Fibre Channel switches. The RAID device is equipped with 2 controllers and 10 drives, and it is used for all storage needs, freeing up the servers' hard disks for applications. The controllers connect to each other and to the servers' host bus adapter fabric cards via redundant Fibre Channel bus paths. If one controller fails, the RAID automatically and transparently switches to the remaining controller.
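The controller failover described above can be illustrated with a small model. This is only a sketch: the class names, the `read` method, and the controller labels are hypothetical and do not correspond to the Hitachi firmware or to any vendor API. It shows a caller issuing the same read before and after one controller fails.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    """Hypothetical stand-in for one RAID controller on its own bus path."""
    name: str
    healthy: bool = True

    def read(self, block: int) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"block {block} via {self.name}"


@dataclass
class DualControllerArray:
    """Hypothetical array that hides a single controller failure from callers."""
    controllers: list

    def read(self, block: int) -> str:
        # Try each controller in order; fall through to the next on failure.
        for controller in self.controllers:
            try:
                return controller.read(block)
            except ConnectionError:
                continue
        raise ConnectionError("all controllers are down")


array = DualControllerArray([Controller("ctrl-A"), Controller("ctrl-B")])
first = array.read(7)                  # served by ctrl-A
array.controllers[0].healthy = False   # simulate a controller failure
second = array.read(7)                 # transparently served by ctrl-B
print(first)
print(second)
```

The caller's code is identical in both reads, which is the sense in which the switch to the remaining controller is transparent.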
In addition, a tape library allows near-online secondary storage, conserving online storage space by migrating rarely used files from the RAID through storage zoning and a continual automated backup process. Veritas Volume Manager software handles all storage-management and backup tasks.
The SAN storage solution was installed and running in two days. To date, MediaFlex, its print vendors, and their customers continue to benefit from the system's speed (the SAN and the RAID, short for Redundant Array of Independent [or Inexpensive] Disks, can read a 25MB file in 1 second), enhanced file accessibility (client files must be available to several applications and both Web servers), state-of-the-art online scalability, and complete reliability (system downtime is virtually eliminated).
A Packaged Solution
Dallas-based Blanks Color Imaging [3], which specializes in printing, also found its storage capacity inadequate for its expanding business. Founded in 1940 as a letterpress shop, Blanks expanded and diversified by combining prepress, sheet-fed printing, and digital photography services into a turnkey print delivery format. The company operates seven days a week to produce brochures, advertising supplements, posters, and other marketing materials for direct mail businesses, department stores, and large national printing companies. Blanks's data storage capacity could not meet the company's growing workload.
The company had two system needs. First, Blanks was still relying on 2GB digital audio tape (DAT) and a 1.2GB optical disk, which could not keep up with the large, image-based files generated by the current turnkey format. Second, too much time was spent shuttling 100MB to 150MB high-resolution files over the 10Mbps Ethernet network. Initially, the company attempted to save time by placing active files on a 2GB removable hard drive that was hand carried as needed between imaging workstations, but that method proved very inefficient.
After researching other options, Blanks resolved its system needs with the purchase of a Scitex Server System package. The company's 100 nodes now reside on a 100Mbps Ethernet LAN. The new server is a Scitex Ripro 5000 AIX with a 640GB RAID disk array, backed up by a Breece Hill automated tape library with four Quantum Digital Linear Tape (DLT) 7000 drives. Each cartridge holds up to 70GB of compressed data, and the library currently holds 4.2TB (1TB = 1,024GB) of data in near-online storage.
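The effect of the network upgrade on the 100MB to 150MB image files can be estimated with simple arithmetic. The sketch below assumes the Ethernet figures are line rates of 10 and 100 megabits per second and ignores protocol overhead and contention, so real transfers would be slower; the function name is illustrative.

```python
def transfer_seconds(file_megabytes: float, link_megabits_per_second: float) -> float:
    """Ideal time to move a file: 8 bits per byte, no overhead or contention."""
    return file_megabytes * 8 / link_megabits_per_second


for size_mb in (100, 150):
    old = transfer_seconds(size_mb, 10)    # original Ethernet network
    new = transfer_seconds(size_mb, 100)   # upgraded LAN
    print(f"{size_mb}MB file: {old:.0f}s at 10Mbps, {new:.0f}s at 100Mbps")
```

Under these assumptions, a 150MB file that occupied the old network for about two minutes moves in roughly 12 seconds on the new one.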
Legato NetWorker software electronically manages all files, migrating them at predetermined intervals from the RAID device to the tape library. That archiving and backup software, linked with the Scitex Timna database, manages 150 DLTape cartridges that replace the original 1,800 DAT tapes. The cartridges provide ready availability to 7TB of recent and current images, layouts, and advertisements (representing $6 million in business). Designers waited three to six hours for that same data when it was stored on DAT. Seventy-five DLTape cartridges are reserved for RAID array backup; 50 are used to archive past jobs and image files.
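The tape figures can be cross-checked with a little arithmetic. The sketch below uses only numbers given in the text (70GB of compressed data per cartridge, 1TB = 1,024GB) and assumes every cartridge is filled to its best-case compressed capacity; the function names are illustrative.

```python
import math

GB_PER_TB = 1024     # conversion stated in the text
CARTRIDGE_GB = 70    # best-case compressed capacity of one DLT 7000 cartridge


def cartridges_needed(terabytes: float) -> int:
    """Smallest number of full cartridges that can hold the given volume."""
    return math.ceil(terabytes * GB_PER_TB / CARTRIDGE_GB)


def capacity_tb(cartridges: int) -> float:
    """Best-case capacity, in TB, of a set of cartridges."""
    return cartridges * CARTRIDGE_GB / GB_PER_TB


print(cartridges_needed(4.2))        # cartridges for the 4.2TB currently held
print(cartridges_needed(7))          # cartridges for the 7TB of recent work
print(round(capacity_tb(150), 2))    # ceiling for all 150 cartridges, in TB
```

Because compression ratios vary with image content, these are upper bounds on capacity rather than a description of how Blanks actually fills its cartridges.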
Twice-a-day incremental backups are implemented. A full system backup, which requires eight to nine hours, is performed weekly, and files are safeguarded by storing tape copies both onsite and offsite. Three sets of all files are maintained and are rotated continuously. At any one time, one set is in the RAID device, one is in the tape library, and one is offsite. That provides excellent protection for client files, and, if a large disaster ever occurs, no more than six hours of work can be lost.
With the Legato and Scitex software, employees are able to track archived files. The automation software, faster network, and DLTape library system have improved efficiency, increasing throughput by 30% to 40% and dramatically improving turnaround time to clients.
Using HSM Technology
Banta Digital Group's [4] expertise involves customized digital imaging and content-management services. Customers' desktops are connected directly to Banta's WAN, from which they're able to manage their own documents' entire creative design, digital prepress, and digital printing processes. The company guarantees its customers 100% file availability, complete data protection, and ample storage capacity. Over 200 users may be logged on to the Banta network at any one time, resulting in an enormous number of files being updated and saved continuously. The company needed a system tailored specifically for its unique and extremely high-volume data storage and availability requirements.
Banta chose to install a hierarchical storage-management (HSM) data-protection and availability system consisting of an enterprise server connected to three levels of storage. The HSM software automatically, seamlessly, and transparently manages the network's three-level storage hierarchy. New and frequently used files are saved on a RAID device (online storage), while less frequently used files are migrated to a large-capacity, near-online magneto-optical (MO) jukebox. Rarely used files are archived to a high-capacity 34TB DLT tape library.
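The three-level hierarchy can be pictured as a placement rule driven by how recently a file was used. The sketch below is not Banta's HSM software: the idle-time thresholds (30 and 180 days), the tier labels, and the file names are all hypothetical, chosen only to show how a migration plan might be derived from last-access dates.

```python
from datetime import date

# Hypothetical thresholds; the source does not state the real migration policy.
ONLINE_MAX_IDLE_DAYS = 30      # up to this long idle: stay on RAID
NEARLINE_MAX_IDLE_DAYS = 180   # up to this long idle: MO jukebox, else tape


def choose_tier(last_access: date, today: date) -> str:
    """Pick a storage tier from the number of days since the last access."""
    idle_days = (today - last_access).days
    if idle_days <= ONLINE_MAX_IDLE_DAYS:
        return "raid"
    if idle_days <= NEARLINE_MAX_IDLE_DAYS:
        return "mo_jukebox"
    return "tape_library"


def plan_migrations(catalog: dict, today: date) -> list:
    """Return (name, from_tier, to_tier) for every file on the wrong tier."""
    moves = []
    for name, (current_tier, last_access) in catalog.items():
        target = choose_tier(last_access, today)
        if target != current_tier:
            moves.append((name, current_tier, target))
    return moves


today = date(2000, 6, 30)
catalog = {
    "catalog_spring.tif": ("raid", date(2000, 6, 25)),
    "ad_layout_q1.tif": ("raid", date(2000, 3, 1)),
    "brochure_1998.tif": ("mo_jukebox", date(1998, 11, 2)),
}
for move in plan_migrations(catalog, today):
    print(move)
```

A production HSM package would typically also weigh file size and free space on each tier; last-access age is used here only because it maps directly onto the "frequently", "less frequently", and "rarely" used categories in the text.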
As an added protection, the HSM software package is configured to automatically mirror RAID data onto the MO library within minutes of being received. In the case of extremely large customer files, RAID space is conserved by saving only 40GB of the file on the RAID device, automatically offloading the overage to MO. A second copy of the entire file is immediately saved with the archived files on tape.
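The handling of very large files amounts to a simple split rule. The sketch below models only that rule, using the 40GB RAID limit given in the text; the function name and dictionary keys are hypothetical, and the routine RAID-to-MO mirroring is deliberately left out.

```python
RAID_LIMIT_GB = 40   # per-file RAID allowance stated in the text


def place_file(size_gb: float) -> dict:
    """Split one file across the tiers: RAID first, overflow to MO, full copy on tape."""
    on_raid = min(size_gb, RAID_LIMIT_GB)
    return {
        "raid": on_raid,                    # portion kept online
        "mo_overflow": size_gb - on_raid,   # overage offloaded to the MO jukebox
        "tape_copy": size_gb,               # second copy of the entire file
    }


print(place_file(100))
print(place_file(10))
```

A file under the limit stays entirely on the RAID device, while a 100GB file keeps 40GB online and sends the remaining 60GB to MO; in both cases the full file is also written to tape.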
Banta's existing LAN has been upgraded to Fast Ethernet/ATM. The connections coming from external nodes into the network were upgraded from T1 and Frame Relay to higher-capacity DS3 lines. To ensure complete file availability, redundant hardware was installed throughout the file delivery subsystem (access paths, controllers, and so on).
The enterprise server chosen by Banta is a Sun 336MHz E6500, containing 10GB of RAM and running on Solaris 2.6. The E6500 has 10 CPUs, eliminating the possibility that a CPU malfunction could result in system failure or downtime.
The first tier of online data storage is a StorageTek CBNT-CO1 Fibre Channel RAID device with a 1TB capacity and 100MBps access speed. Banta is currently using 60 18GB drives, but the RAID device is scalable to 120 drives for each of its two controllers. Because each controller has its own Fibre Channel bus path, if one fails, the other (failover) controller can take over automatically.
An SCSI-attached DISC (alternative spelling of disk) El050 read/write jukebox is the second storage tier. The jukebox contains 1,000 platters (5TB of data storage) and 16 disc drives; it's scalable to 32 drives to accommodate future needs. Offloading RAID online storage onto the near-online MO jukebox is a low-cost method of improving the performance and access speed of the RAID device while continuing to keep less frequently used files accessible.
The final tier of storage in the hierarchy is a StorageTek 9710 tape library. Banta uses 6 of the possible 10 DLT 7000 drives. The library has a total data capacity of 28TB. Although the rarely used files are now archived, all data in the HSM-controlled tape library remains available to the user. In contrast with typical tape archives, the HSM system eliminates the need to search for and reload offline tapes.
The bulk of Banta Digital Group's customers are cataloguers, advertisers, and creative designers, all of whom have benefited from the new HSM system. Besides enjoying 100% guaranteed file access and the disaster-recovery safety net provided by the backup hierarchy and redundant hardware, Banta has seen new product introduction time drop from 32 to 19 days, and the prepress production cycle has been shortened from 3 days to 1 day. Transparent to the clients, Banta has also been able to reduce staffing requirements by eight people, giving the company another advantage in a competitive industry.
Each company's unique situation dictates the storage solution that's best for it; no single solution works in every case. The three approaches described here (a switched-fabric SAN, a packaged server system, and an HSM hierarchy) differ widely, but each combines current hardware and software to deliver high file availability, ample storage capacity, and strong data protection.