Showing posts with label Java. Show all posts

Wednesday, January 18, 2012

Singleton Pattern - The Correct Usage

The singleton pattern is perhaps the most widely (mis)used pattern. Developers who program in object-oriented languages reach for a singleton whenever they think of a service or manager class. I have seen many projects use singletons even when the class encapsulates no private data that would need protection from multiple instantiations. The singleton pattern - a private constructor paired with a getInstance() method - ensures that only a single instance of a class exists and that all threads use that single instance. A database manager class that maintains a local data cache is an ideal candidate for a singleton, as we wouldn't want to maintain multiple local caches; the singleton cache should be loaded once and preserved in memory for faster access, with thread-safe operations on it. If a class does not hold and control access to private data, then simple object instantiation makes perfect sense.

Another common mistake in singleton implementations is to instantiate the object lazily inside the getInstance() method, making the method thread-safe and adding a check to see whether the object has already been created. The implementation logic in Java is given below; getInstance() is made thread-safe using the synchronized keyword. This approach degrades application performance, as a lock is obtained on the object every time any thread calls getInstance().


private static MySingleton mySingleton = null;


public static synchronized MySingleton getInstance() {
  if (mySingleton == null) {
    mySingleton = new MySingleton();
  }
  return mySingleton;
}


A better alternative is to instantiate the singleton at the time of declaration. In this case, the instance is created when the class is initialized by the ClassLoader, before any thread can call getInstance(). Since getInstance() is no longer synchronized, this offers better performance.



private static final MySingleton mySingleton = new MySingleton();  // Object instantiated during declaration


// No need to synchronize the method
public static MySingleton getInstance() {
  return mySingleton;
}


The concurrency-optimized implementation of the Singleton pattern is explained in many Java Concurrency books, but hopefully now you won't have to read the whole book to learn this neat trick.
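For reference, the lazy variant those concurrency books typically recommend is the initialization-on-demand holder idiom, sketched below. It relies on the JVM's guarantee that class initialization is thread-safe, so it is both lazy and lock-free:

```java
public class MySingleton {

    // Private constructor prevents outside instantiation
    private MySingleton() {
    }

    // The JVM initializes the nested class only on the first call to
    // getInstance(), and class initialization is guaranteed thread-safe,
    // so no synchronization is needed.
    private static class Holder {
        private static final MySingleton INSTANCE = new MySingleton();
    }

    public static MySingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```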


Tuesday, December 14, 2010

Concurrency - Optimistic vs. Pessimistic Approach

Whenever developers think of concurrency, the first things that come to mind are the semaphores and mutexes that provide serial access to a critical section of code. Most languages provide an extensive API for thread synchronization, and very often folks start using the synchronization primitives without much thought about what they are trying to accomplish. The most abused concurrency primitive is probably the "synchronized" keyword in Java, which is often put anywhere and everywhere a developer feels there is a possibility of concurrent access. "Synchronized" is a monitor, and as such it doesn't require explicit lock and release statements, as a semaphore or mutex would. This is perhaps why people add the synchronized keyword to any method they feel does something that needs protection from concurrent access. It is not uncommon to come across deeply nested method calls, with each method having the synchronized keyword in its declaration. Synchronized implicitly obtains and releases a lock on the object every time a method with the modifier is called, and this overhead makes the application slower than it needs to be. Java 5 provides some powerful concurrency primitives, but before jumping on the bandwagon and using those primitives all over the code, it is better to evaluate the concurrency needs of the application being built.
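As a hypothetical illustration of that anti-pattern (the AccountService class and its methods are invented for this sketch), note how a single call to transferFee() passes through four synchronized entry points, each one acquiring and releasing the object's monitor:

```java
public class AccountService {

    private double balance = 100.0;

    // Every one of these methods acquires the monitor on entry and
    // releases it on exit, even when called from a method that already
    // holds it (Java monitors are reentrant, but each entry still
    // pays the lock-management cost).
    public synchronized void transferFee() {
        debit(5.0);                           // monitor entry #2
    }

    public synchronized void debit(double amount) {
        setBalance(getBalance() - amount);    // monitor entries #3 and #4
    }

    public synchronized void setBalance(double b) {
        balance = b;
    }

    public synchronized double getBalance() {
        return balance;
    }
}
```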

There are generally two approaches to handle concurrency in a software program, each with its pros and cons. An engineering team should consider and evaluate both approaches and decide to use either one, or both, based on the needs of the product they are building. The two approaches are:

Optimistic approach: In this approach, there are no semaphores or mutexes to protect a critical section of code that handles the shared data. There is a master copy of the shared data, with each thread getting a local copy to work on. When a thread wishes to commit an update from its local copy, the local copy is compared with the master copy to ascertain whether the data has been modified since the thread last read it. If not, the update succeeds; if the data has indeed been modified, a concurrent modification exception is thrown and the user is expected to re-apply the modifications to the fresh copy of the data. This approach is common in databases; Java's non-thread-safe collections (HashMap, HashSet, ArrayList) use a related idea in their fail-fast iterators, which throw a ConcurrentModificationException when the collection is structurally modified during iteration.

Pros:
  1. Due to the absence of semaphores and mutex, the application exhibits better performance and scalability.
  2. The modifications of the first thread that performs the update are persisted, whereas the other threads are informed of the change in data and requested to repeat the update on the modified data.
Cons:
  1. A user may need to perform the modifications again if another thread updated the data after the user's thread read it. This can cause frustration in a multi-user, transaction-heavy environment.
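The retry loop at the heart of the optimistic approach can be sketched with the compare-and-swap primitives in java.util.concurrent.atomic (OptimisticCounter is an illustrative name, not a library class):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OptimisticCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Optimistic update: read a local copy, compute, then commit only
    // if the master value is still unchanged; otherwise retry.
    public void increment() {
        while (true) {
            int current = value.get();        // take a local copy
            int updated = current + 1;        // modify the local copy
            if (value.compareAndSet(current, updated)) {
                return;                       // no other thread intervened
            }
            // another thread changed the value; loop and retry
        }
    }

    public int get() {
        return value.get();
    }

    public static void main(String[] args) throws InterruptedException {
        OptimisticCounter counter = new OptimisticCounter();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) counter.increment();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(counter.get()); // 4000 - no increments lost
    }
}
```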

Pessimistic approach: This approach requires the use of a semaphore, mutex or monitor to ensure serial access to a critical section in code. In this approach, a single copy of the data is maintained and serial access is provided to threads requesting access to this data. When a given thread enters the critical section, no other thread is allowed to access this data until the thread exits the critical section.

Pros:
  1. Suitable for situations where there is no shared data but serial access needs to be provided to a shared resource, such as a socket.
Cons:
  1. If the semaphore or mutex is not released properly, threads waiting on it remain blocked indefinitely. This degrades the application performance over time.
  2. Another problem with semaphores and mutex is the possibility of a deadlock, which occurs when a circular dependency is introduced between two threads, each requesting a lock on a resource that is currently held by the other.
  3. Since serial access is provided to concurrently executing threads that wish to update shared data, the changes made by the last thread are persisted, whereas the other threads are unaware of what happened to their modifications.
A given application may use either one, or both, of the above-mentioned approaches. For shared data access among multiple threads, it is preferable to use the optimistic approach, whereas, for shared resource access (socket etc.), it is generally better to use the pessimistic approach.
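A minimal sketch of the pessimistic approach is given below, assuming a hypothetical SerialLogger standing in for a shared resource such as a socket. The synchronized block guarantees that only one thread is inside the critical section at a time:

```java
import java.util.ArrayList;
import java.util.List;

public class SerialLogger {
    private final Object lock = new Object();             // guards the resource
    private final List<String> sink = new ArrayList<>();  // stands in for a socket

    // Critical section: only one thread writes to the resource at a time.
    public void write(String message) {
        synchronized (lock) {
            sink.add(message);
        }
    }

    public int messageCount() {
        synchronized (lock) {
            return sink.size();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SerialLogger logger = new SerialLogger();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> {
                for (int j = 0; j < 250; j++) logger.write("msg");
            });
            writers[i].start();
        }
        for (Thread t : writers) t.join();
        System.out.println(logger.messageCount()); // 1000 - no writes lost
    }
}
```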


Monday, January 11, 2010

Throttling the SwingWorker using an ExecutorService

The SwingWorker is a utility class that ships with Java 6. It allows a Swing application to perform lengthy background computation on a separate worker thread, in effect freeing the event dispatch thread to interact with the user. Even though the SwingWorker utility is an important addition to the Java SDK, it does increase the resource overhead of the application, since each lengthy computation runs on its own background worker thread. Because the event dispatch thread remains free to accept user input, the user - in the absence of a prompt response - may invoke the same functionality repeatedly. This results in a large number of worker threads being instantiated, and for a J2EE application, this in turn increases the number of threads spawned by the servlet container to process the client requests. The increased number of threads on the server side typically results in server overload and performance degradation. Even though the SwingWorker utility provides a cancel() method to stop the execution of an existing worker thread, there is no way to cancel the execution of the server-side thread created by the servlet container. The solution to this problem is to throttle the SwingWorker usage by using the ExecutorService, which was added in Java 5 to execute Runnables using a thread pool. A fixed-size thread pool ExecutorService allows only a certain number of SwingWorker threads to be active at any time, with new threads having to wait for the earlier ones to finish before getting a chance to execute. The thread pool size is specific to the application and depends primarily on how many SwingWorker threads are expected to be active at any given time.

The code sample given below depicts a typical Swing application that uses the SwingWorker utility to retrieve data from the server. SwingWorker<T, V> is parameterized with two types: T, the result type returned from the doInBackground() method, and V, the type of the intermediate results used by the publish() and process() methods to depict - if required - the progress to the user. The doInBackground() method is executed by the background worker thread that performs the lengthy computation, while the event dispatch thread continues to handle user interaction. Once the lengthy background computation is complete, the done() method is invoked on the event dispatch thread, where the get() method returns the result of doInBackground(), which is then used to update the results on the Swing UI.

As explained above, once a SwingWorker thread is submitted for execution, it may subsequently be cancelled by invoking the cancel() method on the SwingWorker instance created. However, it is not possible to cancel the server-side thread that is spawned by the servlet container to process the client request. To avoid this problem, it is advisable to throttle the number of threads being created by using an ExecutorService with a fixed-size thread pool. Therefore, instead of calling the execute() method on the SwingWorker instance, the SwingWorker instance - which is a Runnable - is submitted to an implementation of the ExecutorService.

// Create a background worker thread (String is chosen here as an example result type)
SwingWorker<String, Void> swingWorker = new SwingWorker<String, Void>() {

    // This method executes on the background worker thread
    @Override
    protected String doInBackground() throws Exception {
        String result = fetchDataFromServer();  // the lengthy computation
        return result;
    }

    // This method executes on the event dispatch thread
    @Override
    protected void done() {
        try {
            String result = get();  // returns the value computed by doInBackground()
            // update the Swing UI with the result
        } catch (Exception e) {
            // handle cancellation or computation failure
        }
    }
};

// Submit to the executor instead of calling swingWorker.execute()
SwingWorkerExecutor.getInstance().execute(swingWorker);

Given below is a very simple implementation of the SwingWorkerExecutor that creates an ExecutorService with a fixed thread pool size of 3, which allows only three worker threads to be active at any given time. New Runnable instances of SwingWorker wait in the queue and are selected for execution only when a previous instance has completed execution. This strategy effectively avoids the spawning of numerous threads on the server, and therefore prevents any possible performance degradation.

public class SwingWorkerExecutor {

    private static final int MAX_WORKER_THREAD = 3;

    private static final SwingWorkerExecutor executor = new SwingWorkerExecutor();

    // Thread pool for worker thread execution
    private final ExecutorService workerThreadPool =
        Executors.newFixedThreadPool(MAX_WORKER_THREAD);

    /**
     * Private constructor required for the singleton pattern.
     */
    private SwingWorkerExecutor() {
    }

    /**
     * Returns the singleton instance.
     * @return SwingWorkerExecutor - Singleton.
     */
    public static SwingWorkerExecutor getInstance() {
        return executor;
    }

    /**
     * Adds the SwingWorker to the thread pool for execution.
     * @param worker - The SwingWorker thread to execute.
     */
    public void execute(SwingWorker<?, ?> worker) {
        workerThreadPool.submit(worker);
    }
}

Monday, August 17, 2009

Comparing Java Runtime Analysis (or Profiling) Tools

Runtime analysis is a practice aimed at understanding software component behavior by using data collected during the execution of the component. The analysis provides an understanding of the following aspects of the application execution environment:
  • Execution paths and code coverage.
  • Memory utilization and memory leaks.
  • Execution performance and performance bottlenecks.
  • Thread analysis and related concurrency issues.
Enterprise Java applications that are designed to run on modern multi-core processors typically benefit from the use of a Java runtime analysis tool, as it provides information on memory leaks, performance bottlenecks and even concurrency issues such as deadlocks.

Recently at work, I got the opportunity to evaluate the leading Java profiling tools. An initial review of the leading tools in this arena resulted in the following list - JProfiler, YourKit Java Profiler, Java Visual VM, DevPartner Java Edition, Rational Purify, JProbe and OptimizeIt. A preliminary investigation shortlisted the candidates to JProfiler, YourKit Java Profiler and Java Visual VM. Both JProfiler and YourKit Profiler are leading award-winning tools and I was basically looking to compare them with the Java Visual VM, a free tool available with JDK 6.0 Update 7 (Windows). The other tools were rejected for various reasons - the DevPartner for Java product kept crashing, Rational didn't have a standalone Java edition, OptimizeIt only worked well with JBuilder, and finally, based on reviews posted on the web, JProbe was considered somewhat inferior to JProfiler and YourKit Profiler.

The three selected candidates (JProfiler 5.2.2, YourKit Java Profiler 8.0.13 and Java Visual VM 1.6.0_14) were compared against the following evaluation criteria:
  • License Cost: While Visual VM is free, both JProfiler and YourKit Profiler are commercial tools that provide an option to purchase both standalone (node-locked) licenses and floating licenses. The license cost for both these products is more-or-less in the same range.
  • Ease-of-use: This was an important consideration, as the profiling tool should be intuitive to use, with the results presented in an easily understandable format. All the tools fared equally in this category.
  • Performance (CPU) Profiling: CPU profiling helps to identify hotspots (methods) that result in higher CPU usage. All three tools provide comprehensive analysis; however, YourKit and JProfiler have better presentation options that display the data using call graphs and trees.
  • Memory Utilization: This form of analysis presents information regarding the memory usage. The three tools under consideration provide a view which lists all the objects and their associated memory consumption. Again, the presentation of JProfiler and YourKit is slightly better than Visual VM.
  • Thread Analysis: Provides a view of the threads running in the VM. All the tools under consideration provide very good thread analysis capabilities and also detect concurrency issues such as deadlocks.
  • Code Coverage: This criterion was also under consideration; however, it was deemed less important in comparison with the other criteria. None of the tools being evaluated provided code coverage analysis.
  • Remote Profiling: This is the ability to perform the runtime analysis from a remote machine. JProfiler and YourKit provide all the features that are available for local analysis; however, Visual VM only provides a limited set of features for remote analysis.
  • IDE Integration: All three tools integrate well with Eclipse and other major Java IDEs.
  • Supported Platforms: The three tools selected support all the major versions of the common operating systems - Windows, Linux, Mac OS X and Solaris.

Based on the comparative analysis, it was clear that both JProfiler and YourKit were slightly superior products with some great profiling features. While Visual VM may not have all the features provided by the other two products, it is extremely intuitive and provides all the basic features desired from a runtime analysis tool. It is also important to note that since Visual VM is extensible via a plug-in architecture (just like Eclipse), it is poised for growth, and contributions from the open-source community will eventually make it a compelling product, possibly on par with (or even better than) JProfiler and YourKit.

Therefore, we decided to use Java Visual VM, as it satisfies our current needs and there is already some contribution from the development community in the form of some really useful plug-ins.