Security Principles: Limitations
NSM is not a panacea; it suffers from limitations that affect how monitoring can be performed. The factors discussed in this section recognize that all decisions impose costs on those who implement monitoring operations. In-depth solutions to these issues are saved for the chapters that follow, but here I preview NSM's answers.
Collecting Everything Is Ideal but Problematic
Every NSM practitioner dreams of being able to collect every packet traversing his or her network. This may have been possible for a majority of Internet-enabled sites in the mid-1990s, but it's becoming increasingly difficult (or impossible) in the mid-2000s. It is possible to buy or build robust servers with fast hard drives and well-engineered network interface cards. Collecting all the traffic creates its own problems, however. The difficulty shifts from traffic collection to traffic analysis. If you can store hundreds of gigabytes of traffic per day, how do you make sense of it? This is the same problem that national intelligence agencies face. How do you pick out the phone call or e-mail of a terrorist within a sea of billions of conversations?
Despite these problems, NSM principles recommend collecting as much as you can, regardless of your ability to analyze it. Because intruders are smart and unpredictable, you never know what piece of data hidden on a logging server will reveal the compromise of your most critical server. You should record as much data as you possibly can, up to the limits created by bandwidth, disk storage, CPU processing power, and local policies, laws, and regulations. You should archive that information for as long as you can because you never know when a skilled intruder's presence will be unearthed. Organizations that perceive a high level of risk, such as financial institutions, frequently pay hundreds of thousands of dollars to deploy multi-terabyte collection and storage equipment. While this is overkill for most organizations, it's still wise to put dedicated hardware to work storing network data. Remember that all network traffic collection constitutes wiretapping of one form or another.
The advantage of collecting as much data as possible is the creation of options. Collecting full content data gives the ultimate set of options, like replaying traffic through an enhanced IDS signature set to discover previously overlooked incidents. Rich data collections provide material for testing people, policies, and products. Network-based data may provide the evidence to put a criminal behind bars.
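As a sketch of the replay idea, the following Python fragment rereads an archived capture and checks TCP payloads against a pattern added after the traffic was collected. The capture file name and the pattern are hypothetical, and scapy here stands in for whatever detection engine you would actually rerun.

    # Retrospective detection sketch: reread stored full content data and
    # apply a signature that did not exist when the traffic was captured.
    # archive.pcap and NEW_SIGNATURES are hypothetical examples.
    from scapy.all import rdpcap, IP, TCP, Raw

    NEW_SIGNATURES = [b"/awstats.pl?configdir="]  # pattern added after the fact

    for pkt in rdpcap("archive.pcap"):
        if pkt.haslayer(IP) and pkt.haslayer(TCP) and pkt.haslayer(Raw):
            payload = bytes(pkt[Raw].load)
            if any(sig in payload for sig in NEW_SIGNATURES):
                print(f"possible overlooked incident: {pkt[IP].src} -> {pkt[IP].dst}")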
NSM's answer to the data collection issue is not to rely on a single tool to detect and escalate intrusions. While a protocol analyzer like Ethereal is well suited to interpreting a dozen individual packets, it's not the best tool for understanding millions of packets. Turning to session data, or to statistics on the ports and addresses in use, is a better way to identify suspicious activity. No scientist studies an elephant by first using an electron microscope! Similarly, while NSM encourages collection of enormous amounts of data, it also recommends using the best tool for the job of interpretation and escalation.
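To make that concrete, here is a hedged sketch of the statistical approach: instead of inspecting millions of packets one at a time, it collapses a capture into flow-like tuples and reports the busiest ones. The capture file name is again only a placeholder.

    # Statistical summarization sketch: reduce a capture to counts of
    # (source, destination, destination port) tuples.
    from collections import Counter
    from scapy.all import rdpcap, IP, TCP

    flows = Counter()
    for pkt in rdpcap("archive.pcap"):  # placeholder capture file
        if pkt.haslayer(IP) and pkt.haslayer(TCP):
            flows[(pkt[IP].src, pkt[IP].dst, pkt[TCP].dport)] += 1

    # The busiest tuples are a far better starting point than raw packets.
    for (src, dst, dport), count in flows.most_common(10):
        print(f"{count:8d} packets  {src} -> {dst}:{dport}")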
Real Time Isn't Always the Best Time
As a captain in the U.S. Air Force, I led the Air Force Computer Emergency Response Team's real-time intrusion detection crew. Through all hours of the night we watched hundreds of sensors deployed across the globe for signs of intrusion. I was so proud of my crew that I made a note in my flight notebook saying, "Real time is the best time." Five years later I don't believe that, although I'm still proud of my crew. Most forms of real-time intrusion detection rely on signature matching, which is largely backward looking. Signature matching is a detection method that relies on observing telltale patterns of characters in packets or sessions. Most signatures look for attacks known to the signature writers. While it's possible to write signatures that apply to more general events, such as an outbound TCP session initiated from an organization's Web server, the majority of signatures are attack-oriented. They concentrate on matching patterns in inbound traffic indicative of exploitation.
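The outbound-session example can be expressed as a simple check: a Web server should accept connections, not initiate them. The sketch below flags any connection attempt (a SYN without an ACK) whose source is the organization's Web server. The address is an assumption, and in practice this logic would live in an IDS rule rather than a script.

    # General-event signature sketch: flag outbound TCP sessions initiated
    # by the Web server itself. The server address is hypothetical.
    from scapy.all import sniff, IP, TCP

    WEB_SERVER = "192.0.2.80"  # assumed address of the organization's Web server

    def check(pkt):
        if pkt.haslayer(IP) and pkt.haslayer(TCP):
            flags = pkt[TCP].flags
            # SYN set, ACK clear: the server is opening a session outbound.
            if pkt[IP].src == WEB_SERVER and flags.S and not flags.A:
                print(f"outbound session from Web server to {pkt[IP].dst}:{pkt[TCP].dport}")

    sniff(filter="tcp", prn=check, store=False)  # sniffing requires privileges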
The majority of high-end intrusions are caught using batch analysis. Batch analysis is the process of interpreting traffic well after it has traversed the network. Batch analysts may also examine alerts, sessions, and statistical data to discover truly stealthy attackers. This work requires people who can step back to see the big picture, tying individual events together into a cohesive representation of a high-end intruder's master plan. Batch analysis is the primary way to identify "low-and-slow" intruders; these attackers use time and diversity to their advantage. By spacing out their activities and using multiple independent source addresses, low-and-slow attackers make it difficult for real-time analysts to recognize malicious activity.
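A hedged illustration of why batch analysis catches what real-time watching misses: given weeks of stored session records (the record format below is invented), counting the distinct source addresses that probed the same service over a long span exposes a distributed slow scan that no single alert would reveal.

    # Batch analysis sketch: correlate session records over a long window
    # to expose low-and-slow activity. The record format is hypothetical.
    from collections import defaultdict
    from datetime import timedelta

    def find_slow_scans(sessions, min_sources=5, min_span=timedelta(days=7)):
        """sessions: iterable of (timestamp, src_ip, dst_ip, dst_port) tuples."""
        by_target = defaultdict(list)
        for ts, src, dst, dport in sessions:
            by_target[(dst, dport)].append((ts, src))

        suspects = {}
        for target, hits in by_target.items():
            sources = {src for _, src in hits}
            times = sorted(ts for ts, _ in hits)
            # Many distinct sources spread over days: time and address
            # diversity used to evade real-time review.
            if len(sources) >= min_sources and times[-1] - times[0] >= min_span:
                suspects[target] = sources
        return suspects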
Despite the limitations of real-time detection, NSM relies on an event-driven analysis model. Event-driven analysis has two components. First, emphasis is placed on individual events, which serve as indicators of suspicious activity. Explaining the difference between an event and an alert is important. An event is the action of interest. It includes the steps taken by intruders to compromise systems. An alert is a judgment made by a product describing an event. For example, the steps taken by an intruder to perform reconnaissance constitute an event. The IDS product's assessment of that event might be its report of a "port scan." That message is an alert.
Alert data from intrusion detection engines like Snort usually provides the first indication of malicious events. While other detection methods also use alert data to discover compromises, many products concentrate on alerts in the aggregate and present summarized results. For example, some IDS products categorize a source address causing 10,000 alerts as more "harmful" than a source address causing 10 alerts. Frequently these counts bear no resemblance to the actual risk posed by the event. A benign but misconfigured network device can generate tens of thousands of "ICMP redirect" alerts per hour, while a truly evil intruder could trigger a single "buffer overflow" alert. NSM tools, particularly Sguil, use the event-driven model, while an application like ACID relies on the summarization model. (Sguil is an open source NSM interface discussed in Chapter 10.)
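To see why counting misleads, consider this small sketch. Ranking sources by alert volume buries the single dangerous event under the chatty misconfigured device, while ranking by worst observed severity surfaces it; the addresses, alert names, and severity values are invented.

    # Prioritization sketch: alert volume is a poor proxy for risk.
    # Addresses, alert names, and severities are invented for illustration.
    alerts = (
        [("10.1.1.1", "ICMP redirect", 1)] * 10_000   # noisy, benign misconfiguration
        + [("172.16.5.9", "buffer overflow", 10)]     # one truly dangerous event
    )

    by_count, by_severity = {}, {}
    for src, name, severity in alerts:
        by_count[src] = by_count.get(src, 0) + 1
        by_severity[src] = max(by_severity.get(src, 0), severity)

    print(max(by_count, key=by_count.get))        # 10.1.1.1: the noisy device "wins"
    print(max(by_severity, key=by_severity.get))  # 172.16.5.9: the real threat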
The second element of event-driven analysis is looking beyond the individual alert to validate intrusions. Many commercial IDS products give you an alert and that's all. The analyst is expected to make all validation and escalation decisions based on the skimpy information the vendor chose to provide. Event-driven NSM analysis, however, offers much more than the individual alert. As mentioned earlier, NSM relies on alert, session, full content, and statistical data to detect and validate events. This approach could be called holistic intrusion detection because it relies on more than raw alert data, incorporating host-based information with network-based data to describe an event.
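As a minimal sketch of that validation step, assuming invented record layouts rather than any product's schema, the fragment below enriches a single alert with the session records that involve the same hosts near the alert time, giving the analyst context instead of a bare alert.

    # Validation sketch: enrich one alert with surrounding session data.
    # Both record formats are hypothetical stand-ins for real NSM data stores.
    from datetime import timedelta

    def sessions_near_alert(alert, sessions, window=timedelta(minutes=30)):
        """alert: dict with 'time', 'src', and 'dst' keys.
        sessions: iterable of dicts with 'start', 'src', and 'dst' keys."""
        hosts = {alert["src"], alert["dst"]}
        return [s for s in sessions
                if {s["src"], s["dst"]} & hosts                 # same parties
                and abs(s["start"] - alert["time"]) <= window]  # close in time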
Extra Work Has a Cost
IDS interface designers have a history of ignoring the needs of analysts. They bury the contents of suspicious packets under dozens of mouse clicks or perhaps completely hide the offending packets from analyst inspection. They require users to copy and paste IP addresses into new windows to perform IP-to-host-name resolution or to look up IP ownership at the American Registry for Internet Numbers (http://www.arin.net/). They give clunky options to create reports and force analysis to be performed through Web browsers. The bottom line is this: Every extra mouse click costs time, and time is the enemy of intrusion detection. Every minute spent navigating a poorly designed graphical user interface is a minute less spent doing real work: identifying intrusions.
NSM analysts use tools that offer the maximum functionality with the minimum fuss. Open source tools are unusually well suited to this approach; many are single-purpose applications and can be selected as best-of-breed data sources. NSM tools are usually customized to meet the needs of the local user, unlike commercial tools, which offer features that vendors deem most important. Sguil is an example of an NSM tool designed to minimize analyst mouse clicks. The drawback of relying on multiple open source tools is the lack of a consistent framework integrating all products. Currently most NSM operators treat open source tools as stand-alone applications.