- Understanding Key Points of the Follow-Up Phase
- Acquiring the Evidence
- Authenticating, Preserving, and Analyzing Incident Data
- Conducting Post-Incident Activities
- Using Legal, Investigative, and Government Recourses
- Article Series
- References
- Acknowledgments
- About the Author
- Ordering Sun Documents
- Accessing Sun Documentation Online
Conducting Post-Incident Activities
Following recovery from an incident, there are other activities to perform. These activities must be supervised by the geo-based security officer and tracked by the organization's worldwide security manager.
Inventory of System Assets
Some customers might not have equipment owned by the incident servicing organization at their sites. However, for those customers who do lease or maintain vendor-owned equipment, an inventory of the servicing vendor organization's system assets should be maintained per site by the customer so that a record is available to the incident response team's organization, if necessary, for examination during an incident investigation. In addition, the enterprise IT department must maintain an inventory of the pertinent LAN, VLAN, WAN, and WLAN systems that are connected with the organization's constituents. The organization's line management and worldwide security team should also monitor these systems.
Vulnerability Discovery and Removal
Some or all of the discovery and removal of vulnerabilities might occur in the eradication and/or recovery phases. However, the following items can also be identified during the follow-up phase:
New unauthorized user accounts
Processes owned by unfamiliar users
Modified or deleted data, or modified binaries of system or application executables or libraries
Denial of services (for example, a customer's system suddenly goes into single user mode)
Poor system performance
Accounting discrepancies
Suspicious login attempts
While network-level snooping tools passively observe and analyze network activity, network vulnerability scanners actively send packets over a network in search of vulnerabilities and malicious code on the hosts of the network.
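To illustrate the active side of that distinction, the following sketch performs a minimal TCP connect probe in Python. It is a hypothetical helper, not a substitute for a real vulnerability scanner; the host and ports in the usage comment are placeholders, and only systems you are authorized to test should be probed.

```python
import socket

def tcp_connect_scan(host, ports, timeout=0.5):
    """Actively probe each port with a TCP connect attempt.

    Unlike a passive sniffer, this sends real packets to the target,
    which is what distinguishes an active scanner's approach.
    """
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 when the connection succeeds
            if s.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

# Example (placeholder address; scan only hosts you are authorized to test):
# print(tcp_connect_scan("192.0.2.10", [22, 80, 443]))
```

A full scanner layers service fingerprinting and vulnerability checks on top of this basic reachability test.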
With readily available tools and information on the Internet, finding specific vulnerabilities of an operating system is easy. A person with malicious intent can find information about vulnerabilities from one of several organizational sites that provide full disclosure, such as CVE (http://cve.mitre.org). The hacker can then trace a vulnerability number from CVE to a source code clip on the SecurityFocus web site (http://www.securityfocus.com), and find corresponding detailed instructions regarding the exploit on the SANS web site (http://www.sans.org) or some other public site. This can all be done in minutes.
Backdoors and Malicious Code
The VCSIRT should detect possible backdoors and malicious code introduced into the customer's network. An example is a rootkit, a set of software tools or scripts that enable a hacker to reenter the network for further misuse or damage. For more information on backdoors and malicious code, refer to the second article in this series, "Responding to a Customer's Security Incidents, Part 2: Executing a Policy."
Enabling Vulnerabilities
Enabling vulnerabilities are usually not caused by any malicious act; they tend to be genuine mistakes by system administrators or users. Nevertheless, they enable intruders to reenter the system or network through configuration weaknesses. Examples of enabling vulnerabilities on UNIX systems are poor or commonly used passwords, unused guest accounts, accounts with default passwords, misconfigured anonymous FTP, and inappropriate settings or entries in files such as /etc/ttys, /etc/ttytab, or /etc/aliases. In addition, unpatched system software can present unforeseen vulnerabilities.
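As a rough illustration, a sketch like the following could flag two of the enabling vulnerabilities named above in passwd-style account data. The account names and checks are simplified assumptions for illustration, not a complete audit.

```python
def find_enabling_vulnerabilities(passwd_lines, risky_accounts=("guest", "demo")):
    """Flag passwd-style entries that weaken a system through configuration
    mistakes rather than malice: empty password fields and unused default
    accounts. (The risky account names are illustrative; tailor the list
    to the site's policy.)
    """
    findings = []
    for line in passwd_lines:
        fields = line.strip().split(":")
        if len(fields) < 7:
            continue  # not a well-formed passwd entry
        user, password = fields[0], fields[1]
        if password == "":
            findings.append((user, "empty password field"))
        if user in risky_accounts:
            findings.append((user, "default/guest account present"))
    return findings
```

On shadowed systems the password field holds only a placeholder, so a real audit would also examine the shadow file and password aging settings.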
Combination Threats
Recently seen combination or blended threats such as CodeRed, Nimda, and BugBear combined the characteristics of viruses, worms, trojan horses, and malicious code with server and Internet vulnerabilities to initiate, transmit, and spread an attack. By using multiple methods and techniques, these threats often spread rapidly, causing widespread damage.
For example, the recent BugBear.B mass-mailer worm exploited a vulnerability in Internet Explorer. After it entered the network perimeter, it spread by using network shares. On an unpatched, vulnerable host, the worm executed automatically when the email message was previewed, so simply receiving the email could trigger an infection; the worm then mailed itself onward with a spoofed From address. The worm was polymorphic in nature. It also attempted to evade and deactivate antivirus applications, along with host- or desktop-based firewall applications, and it flooded printing devices. Another example of a blended threat is the IIS/sadmind worm that attacked the Windows and Solaris operating systems.
Vulnerability Best Practices
The primary concepts for best practices against vulnerabilities are as follows:
A CSIRT must employ tools and procedures to verify and execute up-to-date vendor patches on all the customer systems in the response team's constituency.
A thorough scan with multiple vendor tools is effective.
Employees must be screened, and IT responsibilities must be segregated.
For more information on patch management, refer to "A Patch Management Strategy for the Solaris Operating Environment" (Sun BluePrints OnLine, January 2003). For example, for mission-critical or business-critical environments, the recommendation is to use three patch rollout schemes: regularly scheduled, rapid, and emergency. For an emergency rollout, testing is done in a few hours in the unit and integration test environment before the patch is rolled out.
Multiple vendor tools are recommended because the customer should not fall into the trap of relying on a single vendor's tools, which can succumb to oversights by that vendor. The scans should include a thorough check of all possible vulnerabilities known within the customer's network, involving hosts from multiple vendors.
Nessus (http://www.nessus.org) is well-maintained and widely popular. It scans systems and evaluates vulnerabilities present in the services offered by each system. Although it has command-line and GUI modes, its GUI mode is more convenient. There are other related tools that can be useful (for example, nessQuick, a tool to manage Nessus reports using a database, and NessusWeb, a web interface to Nessus). Nessus now makes better use of CERT advisories and CVE (common vulnerabilities and exposures) references.
There are other excellent tools in the public domain. SATAN and Nmap are popular network scanners. SATAN, which has undergone many changes since its initial release, is limited in its usage. A better derivative, SAINT, is aggressively maintained by SAINT Corporation (http://www.saintcorporation.com) to keep up with the latest vulnerabilities. When vulnerabilities are detected, the SAINT vulnerability scanner categorizes the results in several ways, allowing customers to target the data they find most useful. SAINT can group vulnerabilities according to severity, type, or count, and it can provide information about a particular host or group of hosts. SAINT describes each of the vulnerabilities it locates; references CVEs, CERT advisories, and information assurance vulnerability alerts (IAVAs); and describes ways to correct the vulnerabilities. In many cases, it provides links to sites where you can download patches or new versions of software that will eliminate the detected vulnerabilities.
Nmap is a superb low-level port scanner and a must-have for any CSIRT to scan different protocols (for example, TCP, UDP, ICMP, RPC, and Reverse-Ident). For more information, refer to the http://www.insecure.org/nmap site.
Insider attacks usually account for a significant percentage of the total registered attacks in any thoroughly collected statistical security incident data. This warrants cautious screening of employees without infringing on their right to privacy. IT responsibilities should be segregated for clear accountability, and internal auditing must take place routinely.
Customers must be advised on careful categorization of event data from devices such as firewalls and intrusion detection systems. Of particular importance are:
Attacks and attack patterns
Events (realize that a single event might consist of multiple attacks)
Unique attackers (a single IP address or a set of repeatedly used addresses)
The segregation and classification process for events and attacks
Determination of vulnerabilities, their severity levels, and counts in a specified period
Automated references to industry-known vulnerability indices such as CVEs or IAVAs
At least two levels of summary reports are necessary: one technical report for IT administrators and one for management.
In addition, reporting should contain information for evaluating a solution (for example, a software patch) against a threat, along with the solution's potential impact on the stability of the monitored device. There should be a policy statement covering the actions to be taken, the channel of communication, the time period, and the responsible contacts. TABLE 4 contains a brief report policy implementation example.
TABLE 4 Reporting Recommendations
Severity and Issue | Action | Contact | Communication
Critical: DNS00023 | Immediately install patches and/or other solutions to fix the internal DNS server (in less than 24 hours) | Security policy owner and system administrator | Within eight hours
Moderately critical: WebServer00405 | Install patches and/or other solutions on the web server in the data center (in less than 48 hours) | Security policy owner and system administrator | Within 16 hours
Noncritical: Solaris10067 | In the next maintenance window, update the Solaris 9 OS kernel on the directory server in the DMZ (in less than one week) | System administrator | Within 48 hours
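A reporting policy like the one in TABLE 4 can be encoded so that responders look up the required actions mechanically. The sketch below mirrors the table's severity levels, deadlines, and contacts; the data structure and function name are illustrative, not part of any standard tool.

```python
# Severity labels, time limits, and contacts mirror the example policy
# in TABLE 4; the issue IDs there (for example, DNS00023) are the
# table's illustrative values, not entries in a real index.
REPORT_POLICY = {
    "critical": {
        "fix_within_hours": 24,
        "notify_within_hours": 8,
        "contacts": ["security policy owner", "system administrator"],
    },
    "moderately critical": {
        "fix_within_hours": 48,
        "notify_within_hours": 16,
        "contacts": ["security policy owner", "system administrator"],
    },
    "noncritical": {
        "fix_within_hours": 24 * 7,  # next maintenance window, under one week
        "notify_within_hours": 48,
        "contacts": ["system administrator"],
    },
}

def report_actions(severity):
    """Look up the fix deadline, notification window, and contacts."""
    return REPORT_POLICY[severity]
```

Keeping the policy as data rather than prose makes it easy to audit and to revise when the security policy is upgraded.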
Host-based analysis tools help customers examine specific systems when following up on an incident. They are useful for detecting malicious code or backdoors, and they can assist in detecting unauthorized changes to system files or applications. Tripwire is a useful tool in this regard.
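A minimal sketch of the before-and-after comparison that tools such as Tripwire automate, assuming SHA-256 checksums over a list of monitored files. A real integrity checker also records permissions, owners, and inode data, and protects its baseline from tampering.

```python
import hashlib
from pathlib import Path

def snapshot(paths):
    """Record a SHA-256 checksum for each file (a Tripwire-like baseline)."""
    return {str(p): hashlib.sha256(Path(p).read_bytes()).hexdigest()
            for p in paths}

def changed_files(baseline, paths):
    """Compare current checksums against the baseline; return files whose
    contents differ (or that are missing from the baseline)."""
    current = snapshot(paths)
    return [p for p, digest in current.items() if baseline.get(p) != digest]
```

The baseline must be captured on a known-good system and stored offline; a baseline taken after a compromise only certifies the attacker's changes.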
Damage Assessment
As soon as a security breach has occurred, the entire system, its network, and all of its components should be considered suspect by the organization's VCSIRT. Analyzing the extent of the damage can be time-consuming, but the organization's customer account manager and the assigned security officer must drive the work with two main reasons in mind: it could yield insight into the motives and nature of the incident, and most prosecutors ask for an estimate of the loss when discussing a case or considering sentencing guidelines.
A security incident of any kind has several cost components associated with it. The lead of the response team (VCSIRT) and the geo-based security officer have to determine the extent of damage due to the break-in and notify the customer's site administrator, users, and the servicing organization's worldwide security team. The damage should be taken into account for future precautions, risk assessment, cleanup, and estimating the effect on users. Loss to the customer's enterprise in the form of decreased productivity must also be considered.
The proposed Senate Bill S2448, "The Internet Integrity and Critical Infrastructure Protection Act," clarifies how loss should be calculated. It states that "the term 'loss' means any reasonable cost to any victim, including the cost of responding to an offense, conducting a damage assessment, and restoring the data, program, system, or information to its condition prior to the offense, and any revenue lost, cost incurred, or other consequential damages incurred because of interruption of service." Thus, the costs that must be tallied include:
Time spent by all the servicing organization's advisory personnel (such as the SAG), engaged VCSIRTs, the worldwide security team, and the customer's staff in cleaning up the damage and bringing systems back online (with tasks such as analyzing what has occurred, re-installing the operating system, restoring installed programs and data files).
Lost productivity of system, network, and site administrators and of end users who were prevented from using the systems, whether during downtime or during any DoS attacks involving the compromised systems or networks at the affected site.
Replacement of hardware, software, and/or other material or intellectual property that was damaged or stolen.
To assess the damage, you must determine the needed changes for a tainted system. That, in turn, will help estimate the amount of work involved. The following list contains some examples:
Employing checksums of all associated media and usage of tools that provide before and after comparisons
Looking at all centralized and decentralized logs for abnormalities
Examining patterns of system usage in system accounting records for abnormalities
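For the log review step above, a sketch like the following could surface one common abnormality: repeated failed logins from a single source. The log line format and the threshold are simplified assumptions; real syslog formats vary by platform and daemon.

```python
import re
from collections import Counter

# A simplified sshd/syslog-style pattern; adapt it to the site's log format.
FAILED_LOGIN = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")

def suspicious_sources(log_lines, threshold=5):
    """Count failed logins per source address and flag sources at or
    above the threshold (an arbitrary illustrative cutoff)."""
    counts = Counter(
        m.group(2)
        for line in log_lines
        if (m := FAILED_LOGIN.search(line))
    )
    return {src: n for src, n in counts.items() if n >= threshold}
```

Centralizing logs before running checks like this matters, because an intruder with root access can edit the logs on the compromised host itself.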
The following two tables contain examples of cost calculations for the VCSIRT and for the users of the affected customer site. The cost-per-hour value must be based on estimated salaries plus overhead and any indirect costs. The cost calculation must take into account a variance based on known, determinable factors, one for the cost to the servicing team members and one for the cost to the users. (The variances are shown in the tables as x% and y%.)
TABLE 5 Security Incident Cost Analysis for CSIRT for Incident Tracking #100001
VCSIRT Worker | Hours | Cost per Hour | Total | -x% (with variance) | +x% (with variance)
Geo-based security officer | | | | |
Security engineer | | | | |
Field systems engineer | | | | |
Total labor cost | | | | |
Median cost +/- x% | | | | |
TABLE 6 Security Incident Cost Analysis for Users of the Affected Customer Site for Incident Tracking #100001
Number of Users | Hours | Cost per Hour | Total | -y% | +y%
Web site users | | | | |
Application users | | | | |
System administrators | | | | |
Total cost (lost productivity) | | | | |
Median cost to users +/- y% | | | | |
In addition to the above costs, other relevant costs, such as materials used, travel, public relations, legal fees, and investigative or government agency consulting, must be taken into account in a similar way, with variances.
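The cost tallies in TABLES 5 and 6 reduce to a simple calculation: sum hours times cost per hour, then apply the x% or y% variance band. A sketch of that arithmetic follows; the figures in the example are illustrative, not from the article.

```python
def labor_cost(entries, variance_pct):
    """Total labor cost with a +/- variance band, as in TABLES 5 and 6.

    entries: iterable of (hours, cost_per_hour) pairs, one per worker
    or user group; variance_pct: the x%/y% variance, a known and
    determinable factor for the site.
    Returns (low, total, high).
    """
    total = sum(hours * rate for hours, rate in entries)
    delta = total * variance_pct / 100.0
    return total - delta, total, total + delta

# Illustrative figures only:
low, median, high = labor_cost(
    [(10, 120.0), (16, 95.0), (8, 80.0)], variance_pct=15
)
```

The same function covers both tables; only the entries (team members versus user groups) and the variance percentage differ.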
Risk Analysis
A new risk analysis must be conducted by the organization's Security Advisory Group (SAG), working with the organization's worldwide security team. No matter what risk analysis process is used (for example, qualitative versus quantitative), the overall method should remain the same. In general, the process should include the following activities:
Identifying the asset to be reviewed at the customer site
Ascertaining the threats and associated risks
Determining priorities on the risks
Implementing corrective measures
Monitoring the effectiveness of the controls or corrective measures implemented earlier
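For the prioritization step in the process above, a simple qualitative scheme scores each risk as likelihood times impact on a 1-to-5 scale and sorts the results. The scale, the cutoffs, and the asset names in the test are illustrative assumptions, not prescribed by the article.

```python
def prioritize_risks(risks):
    """Order (asset, likelihood, impact) triples by descending risk score.

    This covers steps 1 through 3 of the process: the assets have been
    identified, threats ascertained (likelihood and impact rated 1-5),
    and the output determines priorities for corrective measures.
    """
    scored = [(asset, likelihood * impact) for asset, likelihood, impact in risks]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

A quantitative process would replace the 1-to-5 ratings with annualized loss figures, but the overall method, as the text notes, remains the same.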
Fundamentally, risk analysis can be used to review any task, project, or idea. It can also relate to a recent event (that is, a lesson learned). As an arbitrary example, after a conversation with Japan's JPCERT, a member of the Italian VCSIRT inadvertently distributed information about a serious bug in a Japanese vendor's operating system. Later, this turned out to be false information. The vendor was not pleased, and now the VCSIRT's parent organization and its enterprise are liable for wrongful disclosure.
Consider a hypothetical example. A VCSIRT member advised its customer to modify and reorder its boundary firewall rules to solve an IP-level filter performance problem. However, the fix silently opened up the customer's LAN to Internet intruders, which could subsequently result in a break-in. In such cases, processes need to be fixed with regard to the team's internal review cycles and the customer's review cycles, as appropriate. Reviews of information disclosures and configuration changes must reduce future liability risks arising from wrongful disclosures and advice.
In the context of the follow-up phase, two things need to happen:
First, the servicing organization's SAG should use risk analysis to determine if a security architecture, design, development, process, or procedural project should be undertaken to improve the security of the entity where the root cause of the compromise (that triggered the incident response process and risk analysis) was determined.
Second, the SAG's advice to the worldwide security team should follow up with a decision making process within the team, keeping in mind that the risk analysis process is required to support the business or mission of the customer's enterprise.
Lessons Learned
As depicted in FIGURE 2, lessons need to be captured throughout the incident response process and then fed back into the process at every step, as deemed necessary. A lesson learned in the Recovery phase could suggest improvements in the Evaluation and Containment phases.
FIGURE 2 Lessons Learned in the Computer Security Incident Response Process
There is no predefined correlation as to which phase's input will drive improvements in other phases. The organization's geo-based security officers need to oversee the capture of this vital information. They should work with the geo-based customer account managers to understand the lessons resulting from the incident and document them clearly in a standard, agreed-upon format.
The content of the document must include:
Incident description with date and time
Location, site, network, and host system
Names of all of the parties involved
Recommended solutions
Implemented solutions, with details about who, when, and how it was implemented, as well as which solutions succeeded and which failed
Pending issues that need further investigation, if any
Recommended changes in the security policy
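The document contents listed above map naturally onto a structured record. A sketch of such a record in Python follows; the field names are illustrative, since the agreed-upon format is whatever the organization standardizes.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LessonLearned:
    """A lessons-learned record; the fields mirror the required
    document contents listed above."""
    incident_description: str          # description with date and time
    date_time: str
    location: str                      # location, site, network, host system
    parties: List[str]                 # names of all parties involved
    recommended_solutions: List[str]
    implemented_solutions: List[str]   # who, when, how; note successes/failures
    pending_issues: List[str] = field(default_factory=list)
    policy_changes: List[str] = field(default_factory=list)
```

A structured record like this is straightforward to circulate over the organization's security alias and to fold into the revised security plan for SAG review.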
You should include the Lessons Learned document in the revised security plan to prevent a similar incident from recurring and for review by the organization's SAG. You should also communicate any changes over the organization's security alias and ensure that all of the organization's security personnel are aware of changes in the policy, procedure, or administrative practice.
Upgrades of Policy, Processes, and Procedures
The organization's SAG is responsible for advising on upgrades to the policies, processes, and procedures. The procedure should be well-documented in the security incident response policy. The following are some specific examples of actions that need to be taken:
Establishing mechanisms for updates of policies, procedures, and tools. The customer's enterprise corporate security principles must be considered by the servicing VCSIRT.
Employing a standardized security review, specifically when introducing or upgrading applications and business partners in the organization's product or services delivery infrastructure.
Establishing channels of communication to make all of the appropriate teams of the organization aware of the latest upgrades to the security incident response policy. At least a quarterly meeting must be held by the incident servicing geo-based security officer and monitored by the organization's worldwide security manager.