Implementation
The National Science Foundation (NSF), the Department of Energy, and other government agencies, often in collaboration with industry and academia, have virtually unlimited resources for developing and maintaining networks for bioinformatics applications. For small- to medium-sized biotech firms and bioinformatics departments within pharmaceutical companies, however, implementing in-house databases presents a formidable challenge. Part of this challenge is that the traditional information services department is ill-equipped to deal with the throughput issues that a bioinformatics-compatible network typically must address. The typical corporate CIO needs background education on how to implement gigabit fiber networks dedicated to data storage, as well as high-speed routers and associated network electronics.
Despite the differences between bioinformatics computing and traditional institutional computing, the process for implementing a high-speed bioinformatics network follows the same major steps as implementing any other major network, regardless of whether the work is performed by staff in the bioinformatics laboratory or by corporate information services staff. These steps include:
Create a Requirements Specification. This document includes a high-level description of the tasks to be supported by the network, such as routing sequencing data from sequencing machines to analysis workstations and data warehouses, as well as the desired response times and storage capacities. For example, the requirements specification document may stipulate the need to support 35 workstations, to provide access to more than 1 terabyte of storage with an access time of less than 50 milliseconds, to enforce tiered password protection, and to provide secure, high-speed access to the Internet.
Create a Functional Specifications Document. The functional specifications document defines, in detail, how the high-level needs outlined in the requirements specification will be met. This document quantifies many of the qualitative terms in the requirements specification to the degree that anyone competent in information sciences can determine exactly what equipment, personnel, and costs will be associated with the project. Once the functional specifications document has been finalized, the remaining steps are largely straightforward.
Select Hardware. Assuming the functional specifications document is complete, the next step is selecting network and workstation electronics and media. Often the functional specifications document is authored with particular hardware and software in mind, which further simplifies the selection process.
Select Software. Again, following the functional specifications document, this step of the implementation process involves selecting the network operating system, as well as database publishing software and tools such as PHP, XML, CGI, Java, or JavaScript editors and runtime systems.
Select Utilities. Software and hardware utilities, such as network monitors and antivirus utilities, should be defined during the design process, not as an afterthought.
Select Internet Access Service. Most large institutions have high-speed Internet access available throughout their offices. However, bandwidth requirements may necessitate additional Internet services, such as supplementing a corporate-wide cable modem service with a high-speed dedicated line, satellite link, or high-speed microwave link.
Each of the steps in the implementation process requires different levels of expertise with the bioinformatics requirements, the information technology capabilities, and the likely return on investment of each approach. As a result, network implementation is necessarily a collaborative process involving programmers, hardware technicians, vendors, management, and perhaps the assistance of a consultant.
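To make the requirements specification step more concrete, the following minimal Python sketch records the example figures given above (35 workstations, more than 1 terabyte of storage, access time under 50 milliseconds) and checks a candidate design against them. The class names, the function unmet_requirements, and both candidate designs are hypothetical and are introduced here only for illustration; the design figures are invented placeholders, not measurements of any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Quantified items from a requirements specification."""
    workstations: int      # number of workstations that must be supported
    storage_tb: float      # storage must exceed this many terabytes
    max_access_ms: float   # access time must be below this many milliseconds


@dataclass(frozen=True)
class CandidateDesign:
    """Hypothetical summary of a proposed network design."""
    name: str
    workstation_ports: int
    storage_tb: float
    access_ms: float


def unmet_requirements(req: Requirements, design: CandidateDesign) -> list[str]:
    """Return a description of each requirement the design fails to meet."""
    problems = []
    if design.workstation_ports < req.workstations:
        problems.append(
            f"supports {design.workstation_ports} workstations, "
            f"needs {req.workstations}"
        )
    if design.storage_tb <= req.storage_tb:
        problems.append(
            f"offers {design.storage_tb} TB, needs more than {req.storage_tb} TB"
        )
    if design.access_ms >= req.max_access_ms:
        problems.append(
            f"access time {design.access_ms} ms, "
            f"needs under {req.max_access_ms} ms"
        )
    return problems


# Figures taken from the example requirements in the text.
REQ = Requirements(workstations=35, storage_tb=1.0, max_access_ms=50.0)

# Hypothetical candidate designs (assumed values, for illustration only).
design_a = CandidateDesign("Design A", workstation_ports=48,
                           storage_tb=2.0, access_ms=12.0)
design_b = CandidateDesign("Design B", workstation_ports=24,
                           storage_tb=1.0, access_ms=80.0)

if __name__ == "__main__":
    for design in (design_a, design_b):
        problems = unmet_requirements(REQ, design)
        status = "meets all requirements" if not problems else "; ".join(problems)
        print(f"{design.name}: {status}")
```

Qualitative items such as tiered password protection and secure Internet access are deliberately left out of this sketch; as described above, quantifying such terms is the job of the functional specifications document.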