Security
Network security is an increasingly important factor in bioinformatics because of the central role that online databases, applications, and groupware such as e-mail play in the day-to-day operation of a bioinformatics facility. Opening an intranet to the outside world through username and password-protected restricted access may be the basis for collaboration as well as a weak point in the security of the organization. In addition, because many biometric laboratories are involved, even if indirectly, with applied genomics, there is a group of politically active opponents to this research. The computer-savvy members of these activist groups represent a potential threat to network security.
Every network presents a variety of security holes through which potential hackers and disgruntled or simply curious employees can implement random threats, such as viruses. Many of these threats are network- and operating system-specific. For example, Microsoft typically announces a service pack within a few weeks after the introduction of a server-based operating system to patch security holes discovered by users.
The most secure method, physical isolation from outside networks, isn't usually a viable option. Even a closed network without dial-in or any other wired access to other networks can be breached by someone with enough motivation and time. For example, wireless networks are notorious for their potential to disseminate data to nearby listeners. A hacker with a high-gain antenna, receiver, and laptop computer can monitor wireless network activity from a mile or more away. A similar setup, configured to a slightly different frequency, can be used to reconstruct whatever data is displayed on a video screen, including username and password information. Every cable, peripheral, and display device emits a radio frequency signal that can be captured, amplified, and read. For this reason, computer facilities used by military contractors are frequently located in shielded, windowless rooms that minimize the chances of the radiation emitted from a computer reaching someone who is monitoring the building.
Although it may be practically impossible to maintain security from professional industrial spies, a variety of steps can be taken to minimize the threat posed by modestly computer-savvy activists and the most common non-directed security threats. These steps include using antiviral utilities, controlling access through the use of advanced user-authentication technologies, firewalls, and, most importantly, low-level encryption technologies.
Antiviral Utilities
In addition to threats from hackers, there is a constant threat of catastrophic loss of data from viruses attached to documents from outside sources, even those from trusted collaborators. The risk of virus infection can be minimized by installing virus-scanning software on servers and locally on workstations. This often-unavoidable precaution has two downsides: the computers running antiviral programs perform more slowly, and the virus-detection software must be maintained to ensure that the latest virus definitions are installed.
Authentication
The most often used method of securing access to a network is to verify that users are who they say they are. However, simple username and password protection at the firewall and server levels can be defeated by someone who either can guess or otherwise has access to the username and password information. A more secure option is to use a synchronized, pseudorandom number generator for passwords. In this scheme, two identical pseudorandom number generators, one running on a credit card-sized computer and one running on a secure server, generate identical number sequences that appear to be random to an observer.
The user carries a credit card-sized secure ID card that displays the sequence on an LCD screen. When a user logs in to the computer network, she uses the displayed number sequence for her password, which is compared to the current number generated by a program running on the server. If the sequences match, she is allowed access to the server. Otherwise, she is locked out of the network. Because the number displayed on the ID card (and in the server) changes every 30 seconds, the current password doesn't provide a potential intruder with a way into the system. The major security hole is that a secure ID card can be stolen, which will provide the thief with the password, but not the username.
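The synchronized scheme described above can be sketched in a few lines. This is a minimal illustration, not the algorithm of any particular commercial ID card: it assumes that the card and the server share a secret seed and that both derive a six-digit code from an HMAC of the current 30-second time step, in the style of time-based one-time passwords. The names SHARED_SEED, current_code, and verify are hypothetical.

```python
import hashlib
import hmac
import struct
import time

STEP_SECONDS = 30  # the code changes every 30 seconds, as described in the text
SHARED_SEED = b"example-seed-known-to-card-and-server"  # hypothetical secret


def current_code(seed: bytes, now: float, digits: int = 6) -> str:
    """Derive the code for the 30-second time step that contains `now`."""
    step = int(now // STEP_SECONDS)
    digest = hmac.new(seed, struct.pack(">Q", step), hashlib.sha256).digest()
    number = int.from_bytes(digest[:4], "big") % (10 ** digits)
    return str(number).zfill(digits)


def verify(seed: bytes, submitted: str, now: float) -> bool:
    """Server-side check: recompute the code and compare in constant time."""
    return hmac.compare_digest(current_code(seed, now), submitted)


if __name__ == "__main__":
    now = time.time()
    code_on_card = current_code(SHARED_SEED, now)
    # Checked within the same time step as the card.
    print(verify(SHARED_SEED, code_on_card, now))
    # Checked two time steps later, when the server expects a new code.
    print(verify(SHARED_SEED, code_on_card, now + 2 * STEP_SECONDS))
```

Because both sides compute the code from the seed and the clock, nothing secret crosses the network except a value that expires within seconds. A practical system would also have to tolerate small clock differences between the card and the server, which this sketch ignores.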
More sophisticated methods of user authentication involve biometrics, the automated recognition of fingerprint, voice, retina, or facial features. Authentication systems based on these methods aren't completely accurate, however, and there are often false positives (an imposter is accepted as someone else) and false negatives (an authentic user is incorrectly rejected by the system) involved in the process. In addition to errors in recognition, there are often ways of defeating biometric devices by bypassing the image-processing components of the systems. For example, fingerprints are converted into a number and letter sequence that serves as the key to gaining access to network assets; anyone who can intercept that sequence and enter it directly into the system can gain access to the network.
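The two kinds of recognition error can be made concrete with a small calculation. The sketch below assumes that a biometric device reduces each comparison to a similarity score and accepts the user when the score reaches a threshold. The scores are made up for illustration, and the function name error_rates is hypothetical.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_positive_rate, false_negative_rate) at a match threshold.

    A score at or above the threshold is treated as a match.
    """
    false_positives = sum(1 for s in impostor_scores if s >= threshold)
    false_negatives = sum(1 for s in genuine_scores if s < threshold)
    return (false_positives / len(impostor_scores),
            false_negatives / len(genuine_scores))


# Made-up similarity scores in the range 0 to 1, for illustration only.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.12, 0.40, 0.71, 0.30, 0.55]

for threshold in (0.5, 0.7, 0.9):
    fpr, fnr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold}: "
          f"false positives={fpr:.0%}, false negatives={fnr:.0%}")
```

With these sample scores, raising the threshold lowers the false positive rate and raises the false negative rate, which is the tradeoff an administrator has to tune.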
A researcher employed by a biotech firm to analyze nucleotide sequences probably has no need to examine the files in a 3D protein visualization system in the laboratory a few doors down from his office. Similarly, payroll, human resources, and other administrative data may be of concern to the CFO, but not to the manager of the microarray laboratory. Authentication provides the information necessary to provide tiered access to networked resources. This access can be controlled at the workstation, the server, and firewall levels to limit access to specific databases, applications, or network databases.
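Tiered access of this kind amounts to a lookup from an authenticated user's role to the resources that role may reach. The following is a minimal sketch under that assumption; the role names, resource names, and the is_allowed function are hypothetical and only loosely follow the examples in the text.

```python
# Hypothetical roles and resources, loosely following the examples in the text.
ACCESS_TIERS = {
    "sequence_analyst": {"nucleotide_db"},
    "microarray_manager": {"microarray_db"},
    "cfo": {"payroll", "human_resources"},
}


def is_allowed(role: str, resource: str) -> bool:
    """Grant access only if the authenticated user's role lists the resource."""
    return resource in ACCESS_TIERS.get(role, set())


print(is_allowed("sequence_analyst", "nucleotide_db"))
print(is_allowed("sequence_analyst", "protein_visualization"))
print(is_allowed("microarray_manager", "payroll"))
```

An unknown role falls through to an empty set, so the default is to deny access.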
Firewalls
As introduced in the discussion of network hardware, firewalls are stand-alone devices or programs running on a server that block unauthorized access to a network. Dedicated hardware firewalls are more secure than a software-only solution, but are also considerably more expensive.
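At its simplest, blocking unauthorized access means comparing each incoming connection against a list of rules and refusing anything that no rule permits. The sketch below assumes a rule is just a source network and an optional destination port. The rule set, the addresses (taken from ranges reserved for documentation and private use), and the function name filter_packet are hypothetical, and real firewalls inspect far more than these two fields.

```python
import ipaddress

# Hypothetical rule set: (action, source network, destination port or None for any).
RULES = [
    # Collaborator address range, HTTPS only.
    ("allow", ipaddress.ip_network("192.0.2.0/24"), 443),
    # Internal addresses, any port.
    ("allow", ipaddress.ip_network("10.0.0.0/8"), None),
]


def filter_packet(source_ip: str, dest_port: int) -> str:
    """Return the action of the first matching rule, or 'deny' if none matches."""
    address = ipaddress.ip_address(source_ip)
    for action, network, port in RULES:
        if address in network and (port is None or port == dest_port):
            return action
    return "deny"


print(filter_packet("192.0.2.17", 443))
print(filter_packet("192.0.2.17", 22))
print(filter_packet("203.0.113.9", 443))
```

The final return statement is the important design choice: traffic that matches no rule is denied by default.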
Firewalls are commonly used in conjunction with proxy servers to mirror servers inside a firewall, thereby intercepting requests and data originally intended for an internal server. In this way, outside users can access copies of some subset of the data on the system without ever having direct access to the data. This practice provides an additional layer of security against hackers.
Encryption
Encryption, the process of making a message unintelligible to all but the intended recipient, is one of the primary means of ensuring the security of messages sent through the Internet and even in the same building. It's also one of the greatest concerns, and limitations, of network professionals. Many information services professionals are reluctant to install wireless networks because of security concerns, for example.
Although cryptography, the study of encryption and decryption, predates computers by several millennia, no one has yet devised a system that can't be defeated, given enough time and resources. Every form of encryption has tradeoffs of security versus processing and management overhead, and different forms of encryption are used in different applications (see Table 3-4).
Of the encryption standards developed for the Internet, most are based on public key encryption (PKE) technology. One reason that PKE is so prominent is that it's supported by the Microsoft Internet Explorer and Netscape Navigator browsers. PKE is a form of asymmetric encryption, in that the keys used for encryption and decryption are different. Aside from the complexity added by the use of different keys on the sending and receiving ends, the two forms of encryption and decryption are virtually identical. As such, the illustration of PKE in Figure 3-12 assumes symmetric encryption for the purpose of clarity.
Figure 3-12 PKE. Workstations in California and Massachusetts exchange bioinformatics data by first exchanging public keys. These public keys are then used with private keys to generate a session key, which defines encryption and decryption. Although true PKE is asymmetric, the session keys illustrated here are identical (symmetrical) for clarity.
Table 3-4 Encryption Standards. PGP (Pretty Good Privacy) is one of the more popular encryption standards used on the Internet. Most of these standards are based on PKE technology.
Standard | Description
AES | Advanced Encryption Standard. Eventual replacement for DES, based on 128-bit encryption.
DES | Data Encryption Standard. Used by the government, based on 64-bit encryption.
IDEA | International Data Encryption Algorithm. Used by the banking industry, developed by the Swiss Federal Institute of Technology, 128-bit encryption.
PGP | Pretty Good Privacy. Popular on the Internet, effective, free, simple to use.
RSA | Rivest-Shamir-Adleman System. Popular in business and government.
S-HTTP | Secure Hypertext Transfer Protocol. For transmitting individual messages over the Internet.
SSL | Secure Sockets Layer. Developed by Netscape Communications Corp. for the Internet.
PKE allows two sequencing laboratories (in Figure 3-12, one in a biotech firm in San Francisco on the left and one in a research facility in Cambridge on the right) to securely exchange data. Assuming a researcher in San Francisco wants to send a message to the lab in Cambridge, he first acquires the public key (26) of the facility in Cambridge and, using his private key, generates a session key (2). That is, the private key for the lab in San Francisco is 8, the lab's public key is 16, and the key for this particular session with the lab in Cambridge is 2. A subsequent communication with the lab in Cambridge might use a session key of 4, 7, or some other random number. Similarly, the private key for the lab in Cambridge is 6 and the public key is 26. The session key is 2, identical to the session key used by the lab in San Francisco.
To decrypt a message from the lab in San Francisco, the lab in Cambridge uses its private key (6) and the public key (16) from the lab in San Francisco to generate a session key (2) that is identical to the key used by the lab in San Francisco to encrypt the message. Note that only their respective owners know the value of the private keys and that the public keys are generally available. Each lab computes the session key from its own private key and the other lab's public key. For clarity, not shown is the public key infrastructure, which provides authentication of the public and private keys.
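One well-known way for two parties to arrive at the same session key from their own private key and the other party's public key is Diffie-Hellman key agreement. The sketch below is a toy version of that technique, offered as an illustration rather than as the method behind Figure 3-12. The modulus and base are assumptions chosen to be tiny, and although the private keys 8 and 6 are borrowed from the figure, the other values computed here are not expected to match the figure's illustrative numbers.

```python
# Toy Diffie-Hellman key agreement. The prime and generator are deliberately
# tiny and chosen only for illustration; real systems use far larger values.
PRIME = 23       # assumed public modulus
GENERATOR = 5    # assumed public base


def public_key(private_key: int) -> int:
    """Public key derived from a private key: GENERATOR^private mod PRIME."""
    return pow(GENERATOR, private_key, PRIME)


def session_key(own_private: int, other_public: int) -> int:
    """Session key: other party's public key raised to our private key."""
    return pow(other_public, own_private, PRIME)


san_francisco_private = 8
cambridge_private = 6

san_francisco_public = public_key(san_francisco_private)
cambridge_public = public_key(cambridge_private)

# Each lab combines its own private key with the other lab's public key.
key_in_san_francisco = session_key(san_francisco_private, cambridge_public)
key_in_cambridge = session_key(cambridge_private, san_francisco_public)

print(key_in_san_francisco == key_in_cambridge)
```

Only the public keys cross the network. An eavesdropper who sees both of them still needs one of the private keys to compute the session key, and with realistically large numbers that is computationally impractical.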
A more secure symmetrical encryption approach, and one used by most governments and corporations to send secure communications over networks, is to use a multi-digit key. The greater the key length, the more difficult and time-consuming it is to crack. The goal is to create a key that is long enough to either deter someone from attempting to hack the code, or one that requires so much computer time to decrypt that the encrypted message is of no value by that time.
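The effect of key length can be estimated with simple arithmetic: each additional bit doubles the number of possible keys, and so doubles the time needed to try them all. The sketch below assumes a brute-force attacker who tests keys at a fixed rate. The rate of one trillion guesses per second is a hypothetical figure, not a measurement.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def keyspace(bits: int) -> int:
    """Number of distinct keys of the given length in bits."""
    return 2 ** bits


def years_to_search(bits: int, guesses_per_second: float) -> float:
    """Years needed to try every key at a fixed (assumed) guessing rate."""
    return keyspace(bits) / guesses_per_second / SECONDS_PER_YEAR


ASSUMED_RATE = 1e12  # hypothetical attacker: one trillion guesses per second

for bits in (56, 64, 128):
    print(f"{bits}-bit key: {years_to_search(bits, ASSUMED_RATE):.3g} years")
```

On average an attacker finds the key after searching half the keyspace, so the expected time is half of each figure printed.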
Process
More important than the specific encryption algorithm or user-authentication technology used is the process of implementing a security strategy. For example, the best firewall, proxy server, and user-authentication system are valueless if a researcher has a habit of losing his secure ID card. Similarly, a wireless hub capable of supporting the latest security standards is vulnerable to attack if the person who configures the hub doesn't take the time to enable the security features. Likewise, a researcher who leaves her username and passwords on a Post-It Note stuck to her monitor provides a security hole for everyone from the janitorial staff to a visitor who happens to walk past her office.