Monday, January 11, 2010
Throttling the SwingWorker using an ExecutorService
The SwingWorker is a utility class that ships with Java 6. It allows a Swing application to perform lengthy background computation on a separate worker thread, in effect freeing the event dispatch thread to interact with the user.

Even though the SwingWorker utility is an important addition to the Java SDK, it increases the resource overhead of the application by creating an additional worker thread for each lengthy computation - the worker thread performs the actual background work, and the done() method is then invoked on the event dispatch thread to update the results on the UI. Since the event dispatch thread is free to accept user input, the user - in the absence of a prompt response - may invoke the same functionality repeatedly. This results in a large number of worker threads being instantiated, and for a J2EE application, this in turn increases the number of threads spawned by the servlet container to process the client requests. The increased number of threads on the server side typically results in server overload and performance degradation. Even though the SwingWorker utility provides a cancel() method to stop the execution of an existing worker thread, there is no way to cancel the execution of the server-side thread created by the servlet container.

The solution to this problem is to throttle the SwingWorker utility by using the ExecutorService, which was added in Java 5 to execute Runnables using a thread pool. A fixed-size thread pool ExecutorService allows only a certain number of SwingWorker threads to be active at any time, with new threads having to wait for the earlier ones to finish before getting a chance to execute. The value of the thread pool size is specific to the application and depends primarily on how many SwingWorker threads are expected to be active at any given time.
The code sample given below depicts a typical Swing application that uses the SwingWorker utility to retrieve data from the server. The SwingWorker is parameterized with the desired result type (shown here as Result) and an intermediate type.
As explained above, once a SwingWorker thread is submitted for execution, it may subsequently be cancelled by invoking the cancel() method on the SwingWorker instance. However, it is not possible to cancel the server-side thread that is spawned by the servlet container to process the client request. To avoid this problem, it is advisable to throttle the number of threads being created by using an ExecutorService with a fixed thread pool of a certain size. Therefore, instead of calling the execute() method on the SwingWorker instance, the SwingWorker instance - which is a Runnable - is submitted to an implementation of the ExecutorService.
// Create a background worker thread
SwingWorker<Result, Void> swingWorker = new SwingWorker<Result, Void>() {

    // This method executes on the background worker thread
    @Override
    protected Result doInBackground() throws Exception {
        Result result = computeResult(); // placeholder for the lengthy computation
        return result;
    }

    // This method executes on the UI (event dispatch) thread
    @Override
    protected void done() {
        try {
            result = get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (ExecutionException e) {
            // handle the failure of the background computation
        }
    }
};
// Submit to the executor
SwingWorkerExecutor.getInstance().execute(swingWorker);
Given below is a very simple implementation of the SwingWorkerExecutor that creates an ExecutorService with a fixed thread pool size of 3, which allows only three worker threads to be active at any given time. New Runnable instances of SwingWorker wait in the queue and are selected for execution only when a previous instance has completed execution. This strategy effectively avoids the spawning of numerous threads on the server, and therefore prevents any possible performance degradation.
public class SwingWorkerExecutor {

    private static final int MAX_WORKER_THREAD = 3;
    private static final SwingWorkerExecutor executor = new SwingWorkerExecutor();

    // Thread pool for worker thread execution
    private final ExecutorService workerThreadPool =
            Executors.newFixedThreadPool(MAX_WORKER_THREAD);

    /**
     * Private constructor required for the singleton pattern.
     */
    private SwingWorkerExecutor() {
    }

    /**
     * Returns the singleton instance.
     * @return SwingWorkerExecutor - Singleton.
     */
    public static SwingWorkerExecutor getInstance() {
        return executor;
    }

    /**
     * Adds the SwingWorker to the thread pool for execution.
     * @param worker - The SwingWorker thread to execute.
     */
    public void execute(SwingWorker<?, ?> worker) {
        workerThreadPool.submit(worker);
    }
}
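The throttling behaviour of a fixed-size pool can be observed in isolation. The sketch below leaves out the Swing classes (so it runs headless) and submits ten short tasks to a pool of three threads, recording the peak number of tasks running at the same time; the ThrottleDemo name, task count, and sleep duration are illustrative assumptions, not part of the original application.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ThrottleDemo {

    // Submits 'tasks' short-lived jobs to a fixed pool of 'poolSize' threads
    // and returns the peak number of jobs observed running concurrently.
    public static int runAndMeasurePeak(int tasks, int poolSize) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        final AtomicInteger active = new AtomicInteger();
        final AtomicInteger peak = new AtomicInteger();
        final CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.submit(new Runnable() {
                public void run() {
                    int now = active.incrementAndGet();
                    // Record the high-water mark of concurrently active tasks
                    int seen;
                    while ((seen = peak.get()) < now && !peak.compareAndSet(seen, now)) {
                        // retry until the peak reflects this observation
                    }
                    try {
                        Thread.sleep(100); // simulate a lengthy computation
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    active.decrementAndGet();
                    done.countDown();
                }
            });
        }
        done.await();
        pool.shutdown();
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // With a pool of 3, the peak can never exceed 3, however many tasks are queued
        System.out.println("Peak concurrent tasks: " + runAndMeasurePeak(10, 3));
    }
}
```

Regardless of how many tasks are queued, the fixed pool guarantees that no more than three ever run at once, which is exactly the guarantee the SwingWorkerExecutor relies on.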
Monday, August 17, 2009
Comparing Java Runtime Analysis (or Profiling) Tools
Java runtime analysis (or profiling) tools typically provide insight into the following aspects of a running application:
- Execution paths and code coverage.
- Memory utilization and memory leaks.
- Execution performance and performance bottlenecks.
- Thread analysis and related concurrency issues.
The three tools compared - Java Visual VM, JProfiler, and YourKit Profiler - were evaluated against the following criteria:
- License Cost: While Visual VM is free, both JProfiler and YourKit Profiler are commercial tools that offer both standalone (node-locked) licenses and floating licenses. The license cost for these two products is more-or-less in the same range.
- Ease-of-use: This was an important consideration, as the profiling tool should be intuitive to use, with the results presented in an easily understandable format. All the tools fared equally in this category.
- Performance (CPU) Profiling: CPU profiling helps to identify hotspots (methods) that result in higher CPU usage. All three tools provide comprehensive analysis; however, YourKit and JProfiler have better presentation options that display the data using call graphs and trees.
- Memory Utilization: This form of analysis presents information regarding the memory usage. The three tools under consideration provide a view which lists all the objects and their associated memory consumption. Again, the presentation of JProfiler and YourKit is slightly better than Visual VM.
- Thread Analysis: Provides a view of the threads running in the VM. All the tools under consideration provide very good thread analysis capabilities and also detect concurrency issues such as deadlocks.
- Code Coverage: This criterion was also under consideration; however, it was deemed less important in comparison with the other criteria. None of the tools being evaluated provided code coverage analysis.
- Remote Profiling: This is the ability to perform the runtime analysis from a remote machine. JProfiler and YourKit provide all the features that are available for local analysis; however, Visual VM only provides a limited set of features for remote analysis.
- IDE Integration: All three tools integrate well with Eclipse and other major Java IDEs.
- Supported Platforms: The three tools selected support all the major versions of the common operating systems - Windows, Linux, Mac OS X and Solaris.
Based on the comparative analysis, it was clear that both JProfiler and YourKit were slightly superior products with some great profiling features. While Visual VM may not have all the features provided by the other two products, it is extremely intuitive and provides all the basic features that are desired from a runtime analysis tool. It is also important to note that since Visual VM is extensible via a plug-in architecture (just like Eclipse), it is poised for growth, and contributions from the open-source community will eventually make it a compelling product, possibly on par with (or even better than) JProfiler and YourKit.
Therefore, we decided to use Java Visual VM, as it satisfies our current needs and the development community has already contributed some really useful plug-ins.
Monday, August 3, 2009
Comparison of Java source code analysis tools
The tools compared - Coverity Prevent, FindBugs, and PMD - were evaluated on the following criteria:
- License Cost: An important consideration in any evaluation. Being a commercial product, Coverity Prevent has an associated license cost, whereas the other two are basically free.
- Quality of the Analysis: Obviously, concurrency and resource utilization issues are deemed more important than unused methods / variables. This was a difficult comparison to make, as all three tools reported a wide spectrum of problems. In general, however, I found the Coverity Prevent and FindBugs analyses to be better than PMD's.
- Speed of the Analysis: Since the objective is to integrate the analysis with the nightly build, a short analysis time is preferred. While Coverity Prevent took hours to analyze the codebase, both FindBugs and PMD were done in minutes.
- Eclipse Integration: Having an Eclipse plugin is essential to report defects during day-to-day development. Fortunately, the three tools selected provide one.
- Rule Customization / Extension: The ability to customize the existing rulesets and add new ones was considered a desirable feature. While all three provided the option to add / drop certain rulesets from the analysis, only FindBugs and PMD allowed the user to create new customized rulesets.
- Defect Reporting: This criterion considered the ability of the tools to report defects in the most intuitive and convenient manner. Coverity Prevent has a great web-based defect manager that allows the user to remotely look at the defects, review the associated source code and act on them. FindBugs has a GUI option that displays the defects with associated source markups; however, PMD doesn't provide source markups and is only available as a command-line program. All three tools provide an option to export the defects in different output formats (HTML, XML).
Monday, July 6, 2009
Effective Clustering
The term *Clustering*, in the context of computing, refers to a group of one or more servers (or nodes) connected via a high-speed interconnect, with the objective of offering the illusion of a single computing platform. Let’s look at the benefits of creating an effective computing cluster, and the considerations that surround it.
Some of the benefits of a computing cluster are:
1. Better Scalability: Clustering is a way to augment the horizontal scalability of the service provided by the computing platform. Vertical scalability may be augmented by hardware enhancements (better processor, more memory), and/or by using good design practices and code refactoring to remove performance bottlenecks in the service offered. Efforts to achieve vertical scalability work up to a point, and after that the only available option is horizontal scalability, which is achieved by adding more nodes and forming a computing cluster. However, if a single node supports N users, having five such nodes doesn't automatically mean support for 5×N users. Linear scalability depends on certain other considerations, such as load balancing and continuous monitoring, which are discussed below.
2. High-Availability / Failover: Another important benefit of clustering is redundancy of data and service. In the event of a failure of one or more nodes in the cluster, it is expected that the other nodes continue to offer access to the service and associated data, perhaps not at the same level of performance. Data availability for certain clusters that don’t have a shared database is not trivial, as it generally involves some form of data replication to synchronize the data on all the nodes in the cluster.
3. Improved Performance: Certain clusters are set up to perform lengthy computation tasks in parallel. Having more than one node concurrently work on a task may significantly improve the performance of the overall computation, provided that the gains surpass the overhead involved in task allocation and collating the results.
Now that we have reviewed some of the benefits of clustering, let's review some considerations for setting up effective clusters.
1. Cluster Type: The very first consideration to address is the type of cluster that we desire to set up - should it be an active-passive cluster or an active-active cluster? An active-passive cluster has only a single processing node and one or more standby nodes, one of which is designated as a fail-over for the primary (sometimes referred to as the hot standby). An active-passive cluster is generally targeted towards high-availability and failover, and it offers limited or no scalability. An active-active cluster is basically a cluster of peers and it offers true scalability, as well as high-availability and failover. An active-active cluster generally requires the synchronization of shared resources (data, session) across all the nodes in the cluster.
2. Load Balancing: A good load balancing mechanism is one of the most important tenets of an effective cluster. It is imperative to distribute the processing load equally across all the nodes in the cluster. While using an external load balancer, such as a commercial one from F5 or the free Apache mod_proxy_balancer, is a viable option, it adds to the cost of the deployment. Good load balancers (F5) easily cost a couple of thousand dollars, and even the free ones (mod_proxy_balancer) require an additional dedicated machine. Some low-end load balancers do nothing more than round-robin the client requests, and they perform only a shallow ping on the node - checking for the availability of the node, not of the offered service. While external load balancers are a viable option in some cases (for thin browser-based clients), for proprietary clients it is often better to build the load balancing into the client itself. Implementing a load balancing algorithm in the proprietary client provides the opportunity to determine the real-time load on the cluster via a connection set-up (or handshake) phase.
3. Handling Shared Resources: Certain resources may need to be shared across the cluster; however, they may be node-specific by nature – such as data kept in local databases, user session data, and distributed task processing data. Synchronization of this data across the cluster involves certain data replication and distributed locking mechanisms. For a cluster to perform at optimal levels, the data replication and distributed locking algorithms need to be well designed.
4. Continuous Monitoring: A given node in the cluster should be aware of the state of the other nodes in the cluster. This may be achieved using a heartbeat mechanism, where periodic ping messages are exchanged between the various nodes in the cluster. In the event of a node being down, other members of the cluster may attempt to restart it and even temporarily redistribute the load of the failed node among them.
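To make the heartbeat idea in item 4 concrete, here is a minimal sketch of the bookkeeping a node might keep about its peers. The class name, node identifiers, and timeout value are illustrative assumptions; a real implementation would also schedule the outgoing pings (for example, with a ScheduledExecutorService) and hook node-down events to the restart and load-redistribution logic described above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class HeartbeatMonitor {

    // Last time (in millis) a heartbeat ping was received from each peer node
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<String, Long>();
    private final long timeoutMillis;

    public HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Invoked whenever a heartbeat ping arrives from a peer
    public void recordPing(String node, long nowMillis) {
        lastSeen.put(node, nowMillis);
    }

    // A node is considered up only if it has pinged within the timeout window
    public boolean isUp(String node, long nowMillis) {
        Long seen = lastSeen.get(node);
        return seen != null && (nowMillis - seen) <= timeoutMillis;
    }
}
```

For instance, a monitor constructed with a 500 ms timeout reports a peer as up only while pings keep arriving within that window; once the window elapses without a ping, the other members of the cluster can treat the node as failed and react accordingly.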
By meticulously addressing these clustering considerations, we can realize the tangible benefits of an effective clustering solution.