IDS04-J. Limit the size of files passed to ZipInputStream
Check inputs to java.util.zip.ZipInputStream for cases that cause consumption of excessive system resources. Denial of service can occur when resource usage is disproportionately large in comparison to the input data that causes the resource usage. The nature of the zip algorithm permits the existence of zip bombs, in which a small file, such as a ZIP archive, a GIF image, or gzip-encoded HTTP content, consumes excessive resources when uncompressed because of extreme compression.
The zip algorithm is capable of producing very large compression ratios [Mahmoud 2002]. Figure 2–1 shows a file that was compressed from 148MB to 590KB, a ratio of more than 200 to 1. The file consists of arbitrarily repeated data: alternating lines of 'a' characters and 'b' characters. Even higher compression ratios can easily be obtained by using input data that is targeted to the compression algorithm, by using more (untargeted) input data, or by using other compression methods.
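To make the ratio concrete, the following is a minimal sketch that zips alternating lines of 'a' and 'b' characters in memory and reports the sizes. The class name ZipRatioDemo, the 80-character line length, and the 10,000 repetitions are assumptions chosen for illustration; they are not the parameters of the file described above, and the exact ratio depends on the JDK's deflate implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; all names and sizes are illustrative assumptions.
final class ZipRatioDemo {

    static final int LINE = 80; // assumed line length, not taken from the rule

    // Returns one line of 'a' characters followed by one line of 'b' characters.
    static byte[] linePair() {
        byte[] pair = new byte[2 * (LINE + 1)];
        Arrays.fill(pair, 0, LINE, (byte) 'a');
        pair[LINE] = '\n';
        Arrays.fill(pair, LINE + 1, 2 * LINE + 1, (byte) 'b');
        pair[2 * LINE + 1] = '\n';
        return pair;
    }

    // Builds a one-entry zip archive in memory containing `pairs` line pairs.
    static byte[] zipAlternatingLines(int pairs) throws IOException {
        byte[] pair = linePair();
        ByteArrayOutputStream archive = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(archive)) {
            zos.putNextEntry(new ZipEntry("lines.txt"));
            for (int i = 0; i < pairs; i++) {
                zos.write(pair, 0, pair.length);
            }
            zos.closeEntry();
        }
        return archive.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        int pairs = 10_000; // roughly 1.6 MB of uncompressed data
        long rawBytes = (long) pairs * linePair().length;
        byte[] zip = zipAlternatingLines(pairs);
        System.out.println("uncompressed: " + rawBytes + " bytes");
        System.out.println("zipped:       " + zip.length + " bytes");
        System.out.println("ratio:        " + (rawBytes / zip.length) + " to 1");
    }
}
```

Because the archive is built in memory, the sketch touches no files; increasing pairs is a simple way to observe how the uncompressed size grows much faster than the archive.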
Any entry in a zip file whose uncompressed file size is beyond a certain limit must not be uncompressed. The actual limit is dependent on the capabilities of the platform.
This rule is a specific instance of the more general rule MSC07-J.
Noncompliant Code Example
This noncompliant code fails to check the resource consumption of the file that is being unzipped. It permits the operation to run to completion or until local resources are exhausted.
static final int BUFFER = 512;
// ...

// external data source: filename
BufferedOutputStream dest = null;
FileInputStream fis = new FileInputStream(filename);
ZipInputStream zis = new ZipInputStream(new BufferedInputStream(fis));
ZipEntry entry;
while ((entry = zis.getNextEntry()) != null) {
  System.out.println("Extracting: " + entry);
  int count;
  byte data[] = new byte[BUFFER];
  // write the files to the disk
  FileOutputStream fos = new FileOutputStream(entry.getName());
  dest = new BufferedOutputStream(fos, BUFFER);
  while ((count = zis.read(data, 0, BUFFER)) != -1) {
    dest.write(data, 0, count);
  }
  dest.flush();
  dest.close();
}
zis.close();
Compliant Solution
In this compliant solution, the code inside the while loop uses the ZipEntry.getSize() method to find the uncompressed file size of each entry in a zip archive before extracting the entry. It throws an exception if the entry to be extracted is too large—100MB in this case.
static final int TOOBIG = 0x6400000; // 100MB
// ...

  // write the files to the disk, but only if file is not insanely big
  if (entry.getSize() > TOOBIG) {
    throw new IllegalStateException("File to be unzipped is huge.");
  }
  if (entry.getSize() == -1) {
    throw new IllegalStateException("File to be unzipped might be huge.");
  }
  FileOutputStream fos = new FileOutputStream(entry.getName());
  dest = new BufferedOutputStream(fos, BUFFER);
  while ((count = zis.read(data, 0, BUFFER)) != -1) {
    dest.write(data, 0, count);
  }
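ZipEntry.getSize() reports the size recorded in the archive (or -1 when it is not known), so it reflects what the archive claims rather than what decompression actually produces. As a complementary sketch, and not part of the solution above, the extraction loop can also count the bytes it really reads and stop once a limit is passed. The class BoundedUnzip, the method copyEntry, and its limit parameter are hypothetical names introduced here for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper; names and limits are illustrative assumptions.
final class BoundedUnzip {

    static final int BUFFER = 512;

    // Copies the current entry of zis to out and returns the number of bytes
    // copied. Throws IllegalStateException as soon as the decompressed data
    // would exceed `limit` bytes, before the excess is written.
    static long copyEntry(ZipInputStream zis, OutputStream out, long limit)
            throws IOException {
        byte[] data = new byte[BUFFER];
        long total = 0;
        int count;
        while ((count = zis.read(data, 0, BUFFER)) != -1) {
            total += count;
            if (total > limit) {
                throw new IllegalStateException("File to be unzipped is huge.");
            }
            out.write(data, 0, count);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Build a small in-memory archive with one 10,000-byte entry.
        ByteArrayOutputStream archive = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(archive)) {
            zos.putNextEntry(new ZipEntry("zeros.bin"));
            zos.write(new byte[10_000]);
            zos.closeEntry();
        }
        try (ZipInputStream zis = new ZipInputStream(
                new ByteArrayInputStream(archive.toByteArray()))) {
            zis.getNextEntry();
            long copied = copyEntry(zis, new ByteArrayOutputStream(), 100_000);
            System.out.println("copied " + copied + " bytes");
        }
    }
}
```

The check runs before each write, so at most limit bytes reach the output. An appropriate limit still depends on the capabilities of the platform, as noted earlier.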
Risk Assessment
Rule | Severity | Likelihood | Remediation Cost | Priority | Level
--- | --- | --- | --- | --- | ---
IDS04-J | low | probable | high | P2 | L3
Related Guidelines
MITRE CWE | CWE-409. Improper handling of highly compressed data (data amplification)
Secure Coding Guidelines for the Java Programming Language, Version 3.0 | Guideline 2-5. Check that inputs do not cause excessive resource consumption
Bibliography
[Mahmoud 2002] | Compressing and Decompressing Data Using Java APIs