Increasing Parallelism
Modern CPU architectures have brought multiple cores and multiple hardware execution threads to developer desktops. This means there are more CPU resources available to do additional work. However, to take advantage of those additional CPU resources, programs executed on them must be able to do work in parallel. In other words, those programs need to be constructed or designed in a multithreaded manner to take advantage of the additional hardware threads.
Java applications that are single threaded cannot take advantage of additional hardware threads on modern CPU architectures. Those applications must be refactored to be multithreaded to do their work in parallel. In addition, many Java applications have single-threaded phases, or operations, especially initialization or startup phases. Therefore, many Java applications can improve initialization or startup performance by doing tasks in parallel, that is, making use of multiple threads at the same time.
The example program used in the previous sections "Lock Contention" and "Data Structure Resizing" has a single-threaded initialization phase where random fictitious tax payer records are created and added to a Java Map. This single-threaded initialization phase could be refactored to being multithreaded. The single-threaded form, as it was run in the "Lock Contention" and "Data Structure Resizing" sections, when run on the same Sun SPARC Enterprise T5120 Server, reports it takes about 45 to 48 seconds for the initialization phase to complete. Since there are 64 virtual processors on an a Sun SPARC Enterprise T5120 Server, 63 of those 64 virtual processors are idle doing little or no work during the initialization phase. Therefore, if the initialization phase could be refactored to utilize those additional 63 virtual processors, the elapsed time it takes to execute the initialization phase should be significantly less.
The key to being able to refactor single-threaded phases of a program to be multithreaded is constrained by the program's logic. If there is a loop of execution involved, and much of the work performed within that loop is independent of what happens within each loop iteration, it may be a good candidate to be refactored into a multithreaded version. In the case of the fictitious tax payer program, Map records are added to a ConcurrentMap. Since a ConcurrentMap can handle multiple threads adding records to it and the records can be created independently of each other, the work performed in the single-threaded loop can be broken up and spread among multiple threads. With a Sun SPARC Enterprise T5120 Server that has 64 virtual processors, the work that is being done in the single-threaded loop could be spread across those 64 virtual processors.
Here is the core part of the single-threaded loop logic (full implementation can be found in Appendix B in the section "Increasing Parallelism Single-Threaded Implementation"):
// allocate the database TaxPayerBailoutDB db = new TaxPayerBailoutDbImpl(dbSize); // allocate list to hold tax payer names List<String>[] taxPayerList = new ArrayList[numberOfThreads]; for (int i = 0; i < numberOfThreads; i++) { taxPayerList[i] = new ArrayList<String>(taxPayerListSize); } // populate the database and tax payer list with random records populateDatabase(db, taxPayerList, dbSize); ... private static void populateDatabase(TaxPayerBailoutDB db, List<String>[] taxPayerIdList, int dbSize) { for (int i = 0; i < dbSize; i++) { // make random tax payer id and record String key = getRandomTaxPayerId(); TaxPayerRecord tpr = makeTaxPayerRecord(); // add tax payer id & record to database db.add(key, tpr); // add tax payer id to to tax payer list int index = i % taxPayerIdList.length; taxPayerIdList[index].add(key); } }
The core part of refactoring the for/loop to be multithreaded results in creating a Runnable, or Callable, along with an ExecutorService to execute the Runnables or Callables in addition to ensuring the implementation of a TaxPayerBailoutDB and taxPayerIdList are thread safe. That is, the data they hold will not be corrupted as a result of having multiple threads writing data to them simultaneously. Following are segments of source code that contain the most relevant parts to the multithreaded refactoring (full implementation can be found in Appendix B in the section "Increasing Parallelism Multithreaded Implementation"):
// allocate the database TaxPayerBailoutDB db = new TaxPayerBailoutDbImpl(dbSize); List<String>[] taxPayerList = new List[numberOfThreads]; for (int i = 0; i < numberOfThreads; i++) { taxPayerList[i] = Collections.synchronizedList( new ArrayList<String>(taxPayerListSize)); } // create a pool of executors to execute some Callables int numberOfThreads = System.availableProcessors(); ExecutorService pool = Executors.newFixedThreadPool(numberOfThreads); Callable<DbInitializerFuture>[] dbCallables = new DbInitializer[numberOfThreads]; for (int i = 0; i < dbCallables.length; i++) { dbCallables[i] = new DbInitializer(db, taxPayerList, dbSize/numberOfThreads); } // start all db initializer threads running Set<Future<DbInitializerFuture>> dbSet = new HashSet<Future<DbInitializerFuture>>(); for (int i = 0; i < dbCallables.length; i++) { Callable<DbInitializerFuture> callable = dbCallables[i]; Future<DbInitializerFuture> future = pool.submit(callable); dbSet.add(future); } // A Callable that will execute multi-threaded db initialization public class DbInitializer implements Callable<DbInitializerFuture> { private TaxPayerBailoutDB db; private List<String>[] taxPayerList; private int recordsToCreate; public DbInitializer(TaxPayerBailoutDB db, List<String>[] taxPayerList, int recordsToCreate) { this.db = db; this.taxPayerList = taxPayerList; this.recordsToCreate = recordsToCreate; } @Override public DbInitializerFuture call() throws Exception { return BailoutMain.populateDatabase(db, taxPayerList, recordsToCreate); } } static DbInitializerFuture populateDatabase(TaxPayerBailoutDB db, List<String>[] taxPayerIdList, int dbSize) { for (int i = 0; i < dbSize; i++) { String key = getRandomTaxPayerId(); TaxPayerRecord tpr = makeTaxPayerRecord(); db.add(key, tpr); int index = i % taxPayerIdList.length; taxPayerIdList[index].add(key); } DbInitializerFuture future = new DbInitializerFuture(); future.addToRecordsCreated(dbSize); return future; }
After applying the refactoring to make the initialization phase multithreaded by dividing up the number of records to be added to the Map to run in 64 threads rather than 1 thread, the time it takes to perform the initialization phase drops from about 45 seconds to about 3 seconds on the Sun SPARC Enterprise T5120 Server. A higher clock rate dual or quad core desktop system may not observe as much of an improvement. For example, the author's dual core desktop system realized about a 4 second improvement, 16 seconds down to about 12. The larger the number of virtual processors that additional parallel work can be spread among, the greater the potential performance improvement.
This simple example illustrates the potential benefit of being able to take advantage of additional virtual processors on a system that may be idle for some phase of an application by making that phase multithreaded.