Data Structure Resizing
Java applications tend to make high use of Java SE's StringBuilder or StringBuffer for assembling Strings and also make high use of Java objects that act as containers of data such as the Java SE Collections classes. Both StringBuilder and StringBuffer use an underlying char[] for their data storage. As elements are added to a StringBuilder or StringBuffer, the underlying char[] data storage, may be subject to resizing. As a result of resizing, a new larger char[] array is allocated, the char elements in the old char[] are copied into the new larger char[] array, and the old char[] discarded, that is, available for garbage collection. Similar resizing can also occur in Java SE Collections classes that use an array for their underlying data store.
This section explores ways to identify data structure resizing, in particular StringBuilder, StringBuffer, and Java SE Collections classes resizing.
StringBuilder/StringBuffer Resizing
When a StringBuilder or StringBuffer becomes large enough to exceed the underlying data storage capacity, a new char array of a larger size, 2x larger in the OpenJDK StringBuilder and StringBuffer implementation (used by Java HotSpot Java 6 JDK/JRE), is allocated, the old char array elements are copied into the new char array, and the old char array is discarded. A version of the implementation used by StringBuilder and StringBuffer follows:
char[] value; int count; public AbstractStringBuilder append(String str) { if (str == null) str = "null"; int len = str.length(); if (len == 0) return this; int newCount = count + len; if (newCount > value.length) expandCapacity(newCount); str.getChars(0, len, value, count); count = newCount; return this; } void expandCapacity(int minimumCapacity) { int newCapacity = (value.length + 1) * 2; if (newCapacity < 0) { newCapacity = Integer.MAX_VALUE; } else if (minimumCapacity > newCapacity) { newCapacity = minimumCapacity; } value = Arrays.copyOf(value, newCapacity); }
Continuing with the fictitious tax payer program example from the previous section (full listing of the source code used in this section can be found in Appendix B in the section "First Resizing Variant"), StringBuilder objects are used to assemble random Strings representing tax payer names, addresses, cities, states, social security numbers, and a tax payer id. It also uses the no argument StringBuilder constructor. Hence, the program is likely to be subject to StringBuilder's underlying char[] being resized. A capture of a memory or heap profile with a profiler such as NetBeans Profiler confirms that is the case. Figure 6-18 shows a heap profile from NetBeans Profiler.
Figure 6-18 Heap profile
In Figure 6-18, you can see that char[], StringBuilder, and String are the most highly allocated objects and also have the largest amount of live objects. In the NetBeans Profiler, selecting and right-clicking on the char[] class name in the far left column as shown in Figure 6-19 shows the allocation stack traces for all char[] objects.
Figure 6-19 Showing allocation stack traces
In the char[] stack allocation traces, shown in Figure 6-20, you can see an entry for java.lang.AbstractStringBuilder.expandCapacity(int), which is called from AbstractStringBuilder.append(char) and AbstractStringBuilder.append(String) methods. The expandCapacity(int) method calls java.util.Arrays.copyOf(char[], int). Looking back at the previous source code listing, you can see where AbstractStringBuilder.append(String str) calls expandCapacity(int) and calls Arrays.copyOf(char[] int).
Figure 6-20 char[] allocations from expanding StringBuilders
You can also see from Figure 6-20, over 11% of the current live char[] objects are from resized StringBuilder char[]. In addition, there are a total of 2,926,048 char[] objects that have been allocated, and of those, 390,988 char[] allocations occurred as a result of StringBuilder char[] resizing. In other words, about 13% (390,988/2,926,048) of all char[] allocations are coming from resized StringBuilder char[]s. Eliminating these char[] allocations from resizing improves the performance of this program by saving the CPU instructions needed to perform the new char[] allocation, copying the characters from the old char[] into the new char[], and the CPU instructions required to garbage collect the old discarded char[].
In the Java HotSpot JDK/JRE distributions, both the StringBuilder and StringBuffer offer no argument constructors that use a default size of 16 for their underlying char array data storage. These no argument constructors are being used in this program. This can be seen in the profile by expanding the java.lang.AbstractStringBuilder.<init>(int) entry seen in Figure 6-20. The expansion of the java.lang.AbstractStringBuilder.<init>(int) entry, shown in Figure 6-21, shows it is called by a no argument StringBuilder constructor.
Figure 6-21 Uses of StringBuilder default constructor
In practice, few StringBuilder or StringBuffer object instances result in having consumed 16 or fewer char array elements; 16 is the default size used with the no argument StringBuilder or StringBuffer constructor. To avoid StringBuilder and StringBuffer resizing, use the explicit size StringBuilder or StringBuffer constructor.
A modification to the example program follows, which now uses explicit sizes for constructing StringBuilder objects. A full listing of the modified version can be found in Appendix B in the section "Second Resizing Variant."
public static String getRandomTaxPayerId() { StringBuilder sb = new StringBuilder(20); for (int i = 0; i < 20; i++) { int index = threadLocalRandom.get().nextInt(alphabet.length); sb.append(alphabet[index]); } return sb.toString(); } public static String getRandomAddress() { StringBuilder sb = new StringBuilder(24); int size = threadLocalRandom.get().nextInt(14) + 10; for (int i = 0; i < size; i++) { if (i < 5) { int x = threadLocalRandom.get().nextInt(8); sb.append(x + 1); } int index = threadLocalRandom.get().nextInt(alphabet.length); char c = alphabet[index]; if (i == 5) { c = Character.toUpperCase(c); } sb.append(c); } return sb.toString(); }
Recent optimizations in Java 6 update releases of the Java HotSpot VM analyze the usage of StringBuilder and StringBuffer and attempt to determine the optimal char array size to use for a given StringBuilder or StringBuffer object allocation as means to reduce unnecessary char[] object allocations resulting from StringBuilder or StringBuffer expansion.
Measuring the performance impact after addressing StringBuilder and StringBuffer resizing will be done in combination with addressing any Java Collection classes resizing, the topic of the next section.
Java Collections Resizing
The addition of the Java Collections to Java SE offered an enormous boost to developer productivity by providing containers with interfaces allowing the ability to easily switch between alternative concrete implementations. For example, the List interface offers an ArrayList and LinkedList concrete implementation.
Some of the Collections' concrete implementations are subject to potential expensive resizing as the number of elements added to the Collection grows such as ArrayList, Vector, HashMap, and ConcurrentHashMap since their underlying data store is an array. Other Collections such as LinkedList or TreeMap often use one or more object references between the elements stored to chain together the elements managed by the Collection. The former of these, those that use an array for the Collection's underlying data store, can be subject to performance issues when the underlying data store is resized due to the Collection growing in the number of elements it holds. Although these Collections classes have constructors that take an optional size argument, these constructors are often not used, or the size provided in an application program is not optimal for the Collection's use.
As is the case with StringBuilder or StringBuffer, resizing of a Java Collections class that uses an array as its data storage requires additional CPU cycles to allocate a new array, copy the old elements from the old array, and at some point in the future garbage collect the old array. In addition, the resizing can also impact Collection's field access time, the time it takes to dereference a field, because a new underlying data store, again typically an array, for the Collection's underlying data store may be allocated in a location in the JVM heap away from the object references stored within the data store and the other fields of the Collection. After a Collection resize occurs, it is possible an access to its resized field can result in CPU cache misses due to the way a modern JVM allocates objects in memory, in particular how those objects are laid out in memory. The way objects and their fields are laid out in memory can vary between JVM implementations. Generally, however, since an object and its fields tend to be referenced frequently together, an object and its fields laid out in memory within close proximity generally reduce CPU cache misses. Hence, the impact of Collections resizing (this also applies to StringBuffer and StringBuilder resizing) may extend beyond the additional CPU instructions spent to do the resizing and the additional overhead put on the JVM's memory manager to having a lingering higher field access time due to a change in the layout of the Collection's fields in memory relative the Collection object instance.
The approach to identifying Java Collections resizing is similar to what was described earlier for identifying StringBuilder and StringBuffer resizing, collecting heap or memory profile with a profiler such as NetBeans Profiler. Looking at the source code for the Java Collection classes helps identify the method names that perform the resizing.
Continuing with the fictitious tax payer program, the program variant in which tax payer records were populated into multiple HashMaps using a tax payer's state of residence as a key into a second HashMap where a tax payer's id is used as an index is a good example of where Collections resizing can occur. A full source code listing from this variant can be found in Appendix B in the section "First Resizing Variant." The source code, found in TaxPayerBailoutDbImpl.java, that allocates the HashMaps follows:
private final Map<String, Map<String,TaxPayerRecord>> db; public TaxPayerBailoutDbImpl(int numberOfStates) { db = new HashMap<String,Map<String,TaxPayerRecord>>(); for (int i = 0; i < numberOfStates; i++) { Map<String,TaxPayerRecord> map = Collections.synchronizedMap( new HashMap<String,TaxPayerRecord>()); db.put(BailoutMain.states[i], map); } }
Here you can see the HashMaps are using a HashMap constructor that takes no arguments. As a result, the HashMap relies on a default size for its underlying mapping array. The following is a portion of OpenJDK's HashMap.java source code that shows the default size chosen for a HashMap's underlying data storage.
static final int DEFAULT_INITIAL_CAPACITY = 16; static final float DEFAULT_LOAD_FACTOR = 0.75f; public HashMap() { this.loadFactor = DEFAULT_LOAD_FACTOR; threshold = (int)(DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR); table = new Entry[DEFAULT_INITIAL_CAPACITY]; init(); } void init() { }
Two factors decide when the data storage for a HashMap is resized: the capacity of the data storage and the load factor. The capacity is the size of the underlying data storage. That's the HashMap.Entry[]'s size. And the load factor is a measure of how full the HashMap is allowed to reach before the HashMap's data storage, the Entry[], is resized. A HashMap resize results in a new Entry[] being allocated, twice as large as the previous Entry[], the entries in the Entry[] are rehashed and put in the Entry[]. The CPU instructions required to resize a HashMap are greater than what is required by StringBuilder or StringBuffer resizing due to the rehashing of the Entry[] elements.
In Figure 6-18, you can see a row for java.util.HashMap$Entry[]. For this entry you can see there are 67 allocated objects, and 37 of them are live at the time of the profile snapshot. This suggests that 37/67, about 55%, are still live. That also suggests 45% of those Entry[] objects that had been allocated have been garbage collected. In other words, the HashMaps are experiencing resizing. Notice that the total bytes consumed by HashMap.Entry[] objects is much less than those consumed by char[] objects. This suggests the impact of eliding the HashMap resizing is likely to be less than the impact realized from eliding the StringBuilder resizing.
Figure 6-22 shows the allocation stack traces for HashMap.Entry[]. Here you can see some of those HashMap.Entry[] allocations result from a HashMap.resize(int) method call. In addition, you can see the no argument HashMap constructor is being used, which also allocates a HashMap.Entry[].
Figure 6-22 HashMap.Entry[] allocation stack traces
Since this example program populates 50 different HashMaps with a total of 2,000,000 fictitious records, each of those 50 HashMaps hold about 2,000,000 / 50 = 40,000 records. Obviously, 40,000 is much greater than the default size of 16 used by the no argument HashMap constructor. Using the default load factor of .75, and the fact that each of the 50 HashMap holds 40,000 records, you can determine a size for the HashMaps so they will not resize (40,000 / .75 = ~ 53,334). Or simply passing the total number of records to store divided by the number of states, divided by the default load factor, i.e., (2,000,000 / 50) / .75, to the HashMap constructor that holds the records. Following is the modified source code for TaxPayerBailoutDbImpl.java that elides HashMap resizing:
private final Map<String, Map<String,TaxPayerRecord>> db; private final int dbSize = 2000000; public TaxPayerBailoutDbImpl(int dbSize, int numberOfStates) { final int outerMapSize = (int) Math.ceil(numberOfStates / .75); final int innerMapSize = (int) (Math.ceil((dbSize / numberOfStates) / .75)); db = new HashMap<String,Map<String,TaxPayerRecord>>(outerMapSize); for (int i = 0; i < numberOfStates; i++) { Map<String,TaxPayerRecord> map = Collections.synchronizedMap( new HashMap<String,TaxPayerRecord>(innerMapSize)); db.put(BailoutMain.states[i], map); } }
In this example program, both StringBuilder and HashMap resizing occur during the initialization phase of the program, the phase of the program that populates a Map of Maps with fictitious, randomly generated tax payer records. Hence, to measure the performance impact of eliding the StringBuilder and HashMap resizing, the initialization phase of this program has been instrumented with a time stamp at the beginning of the program and after the Map of Maps has been populated. A modified version of this example program, one that uses the no argument HashMap constructor, calculates and reports the time it takes to populate the HashMaps with 2,000,000 records, can be found in Appendix B in the section "First Resizing Variant."
When this variant of the program is run on a Sun SPARC Enterprise T5120 Server configured with 64 virtual processors (the same value as that returned by the Java API Runtime.availableProcessors()), the amount of time it takes to complete the initialization phase is 48.286 seconds.
Updating this program variant with the changes described in this section to address both StringBuilder and HashMap resizing and running on the same UltraSPARC T5120 system with the same JVM command line options reports it takes 46.019 seconds to complete its initialization phase. That's about a 5% improvement in elapsed time. The source code for this variant can be found in Appendix B in the section "Second Resizing Variant."
Applying the data resizing strategy reduces the application's path length, the total number of CPU instructions required to execute the program, and potentially more efficient use of CPU cycles due to fewer possibilities of CPU cache misses as a result of frequently accessed data structure fields being laid out in memory next to each other.
You may have noticed that the initialization phase in this program is single threaded. But the system it is being executed on has a CPU that is multicore and multithreaded per core. The Sun SPARC Enterprise T5120 Server this program is executing on has 8 cores, and 8 hardware threads per core. It is a chip multithreading type of CPU chip, CMT for short. In other words, 8 cores and 8 hardware threads per core means it has 64 virtual processors. That also means the Java API, System.availableProcessors(), returns a value of 64. A next step to improve the performance of the initialization phase of this program is to refactor it to utilize all of those 64 virtual processors. This is the topic of the next section.