Issue 58

Java Performance: Tools

Lucian Torje
Senior Java Developer @ Siemens

The goal of this article is to provide insight into the most widely used Java performance tools. As with any tool, a profiler's usefulness depends on how it is used and on the skills of the people using it. We used the Spring PetClinic application during the tests, since Spring MVC & Boot is the most used web framework according to ZeroTurnaround's Java Tools and Technologies Landscape 2016.


HPROF

HPROF is a free tool packaged as a native library (a .dll on Windows) and distributed with the JDK (it was removed in JDK 9). It is built on top of JVM TI and provides heap and CPU profiling. According to Brendan Gregg, it has a couple of reported issues:

HPROF writes its report when the JVM exits, so in order to profile the Spring PetClinic sample, the app needs to be extended with an exit endpoint that closes the application context. Profiling with HPROF is impractical if only specific areas are targeted, because it lacks start/pause/stop capabilities.

$ mvn package
$ java -Xrunhprof:cpu=samples -jar target/spring-petclinic-1.5.1.jar
2017-03-20 23:00:49.565  INFO 11640 --- [  restartedMain] s.w.s.m.m.a.RequestMappingHandlerMapping : Mapped "{[/close]}" onto public void
Dumping CPU usage by sampling running threads ... done.
$ cat java.hprof.txt
JAVA PROFILE 1.0.1, created Mon Mar 20 23:00:22 2017
CPU SAMPLES BEGIN (total = 22941) Mon Mar 20 23:04:20 2017
rank   self  accum   count trace method
   1 56.37% 56.37%    5784 301590 $SubSelector.poll0
   2 16.73% 73.10%    1716 301591
   3  6.67% 79.77%     684 300352 org.springframework.boot.loader.jar.JarURLConnection.getInputStream
   4  5.47% 85.23%     561 300356 sun.misc.URLClassPath$Loader.getResource
   5  2.16% 87.40%     222 300231
   6  1.39% 88.79%     143 300131 java.lang.ClassLoader.defineClass1
   7  1.02% 89.81%     105 300502 sun.misc.URLClassPath$Loader.findResource

NetBeans profiler

NetBeans profiler is a free Java profiling tool integrated into the NetBeans IDE. It enables the following profiling tasks:

Overall, the NetBeans profiler is a useful tool that has all the features needed for the job. It is a fine choice if you are looking for a good, free profiling tool.


VisualVM

VisualVM is a lightweight Java profiler. It ships with the JDK and can also be installed separately from the VisualVM download page. It is a better alternative to JConsole, offering the same capabilities, and it is extensible through plugins.

VisualVM includes the following features:

VisualVM is started from the terminal by running the following command (if it is in $PATH):

$ jvisualvm

VisualVM is definitely a tool that belongs in every Java developer's toolbox. This belief is reinforced by ZeroTurnaround's report from Nov 2015, which states that 46% of the respondents use VisualVM.

Mission Control

Java Mission Control (JMC) and Java Flight Recorder (JFR), formerly known as JRockit Mission Control and JRockit Flight Recorder, are advertised as offering near-zero-overhead profiling and diagnostics in production environments.

Java Mission Control is free to use; Java Flight Recorder is not. In order to use JFR, you must agree to the Oracle commercial terms.

JMC & JFR offer all the basic features, such as CPU, heap and thread monitoring (CPU usage, deadlock detection and thread count). Among the advanced features, the ones worth mentioning are triggers, I/O monitoring, MBeans, hot method statistics, exception and event monitoring, as well as time period filtering.
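Several of the basic figures listed above (heap usage, thread count, deadlock detection) come from the platform MXBeans that ship with the JDK. The following is a minimal sketch of reading them in-process; the class name `JvmSnapshot` is our own, and a monitoring tool would typically read the same beans remotely over JMX rather than like this.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

/** Reads a few of the platform MXBeans that monitoring tools display. */
final class JvmSnapshot {

    /** Heap currently in use, in bytes. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Number of live threads (daemon and non-daemon). */
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Number of threads currently stuck in a deadlock, 0 if there is none. */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock exists
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %d bytes%n", usedHeapBytes());
        System.out.printf("live threads: %d%n", liveThreadCount());
        System.out.printf("deadlocked threads: %d%n", deadlockedThreadCount());
    }
}
```

Polling these values at an interval gives a rough picture of what the graphical tools plot over time.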

Mission Control is accessible with the following command (its UI offers a nice Java Flight Recording wizard and also JMX access to JFR methods):

$ jmc

In order to prepare for recordings, the following arguments need to be added to your app's java command line:

-XX:+UnlockCommercialFeatures -XX:+FlightRecorder

In order to schedule a flight recording, add the following parameters:


Recordings can be started from the terminal using the following command:

$ jcmd 9200 JFR.start delay=20s duration=60s name=MyRecordings 
Recording 4 scheduled to start in 20 s. The result will be written to:

Checking status from the terminal can be accomplished by running the following command:

$ jcmd 9200 JFR.check
Recording: recording=4 name="MyRecordings" duration=1m filename="c:/TEMP/myrecording.jfr,settings=profile" compress=false (running)

Making a JFR dump is as easy as calling:

$ jcmd 9200 JFR.dump name=MyRecordings filename=c:/TEMP/dump.jfr
Dumped recording "MyRecordings", 265.2 kB written to: C:\temp\dump.jfr

Java Mission Control and Flight Recorder are definitely great tools, and if you already own a license for "Oracle Java SE Advanced", "Oracle Java SE Advanced Desktop" or "Oracle Java SE Suite", there is no reason to switch to other profiling tools - JMC and JFR will do the job well.


JMH

JMH stands for Java Microbenchmark Harness. It is a microbenchmarking framework, available as an OpenJDK project since 2013. It is mentioned among the profiling tools because of the functionality it offers: gathering performance statistics for a test object (from a piece of code to a full app). JMH should be configured with warmup time, which yields better results; this is comparable to the calibration step that some profilers perform before starting the actual measurement.

Depending on the configuration, JMH benchmarks report method run time (including percentiles), average time or throughput.
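To illustrate why warmup iterations are kept apart from measured ones, here is a hand-rolled sketch in plain Java. It is not JMH and not how JMH is implemented; loops like this are what JMH is meant to replace. The class name `WarmupSketch` and the workload are hypothetical.

```java
import java.util.function.IntSupplier;

/** Naive throughput loop with separate warmup and measurement phases. */
final class WarmupSketch {

    /** Sink that keeps the JIT from discarding the workload's result. */
    static volatile int sink;

    /** Runs the task for roughly iterationMillis and returns operations per second. */
    static double opsPerSecond(IntSupplier task, long iterationMillis) {
        long budgetNanos = iterationMillis * 1_000_000L;
        long start = System.nanoTime();
        long ops = 0;
        long elapsed;
        do {
            sink = task.getAsInt();
            ops++;
            elapsed = System.nanoTime() - start;
        } while (elapsed < budgetNanos);
        return ops * 1e9 / Math.max(1, elapsed);
    }

    /** Discards the warmup iterations and returns only the measured scores. */
    static double[] measure(IntSupplier task, int warmups, int measurements, long iterationMillis) {
        for (int i = 0; i < warmups; i++) {
            opsPerSecond(task, iterationMillis); // result intentionally ignored
        }
        double[] scores = new double[measurements];
        for (int i = 0; i < measurements; i++) {
            scores[i] = opsPerSecond(task, iterationMillis);
        }
        return scores;
    }

    public static void main(String[] args) {
        IntSupplier workload = () -> Integer.toString(12345).hashCode(); // hypothetical workload
        for (double score : measure(workload, 3, 3, 200)) {
            System.out.printf("%.0f ops/s%n", score);
        }
    }
}
```

The first iterations run partly interpreted code, so their scores are usually lower and are thrown away; a real harness also handles forking, dead-code elimination and statistics, which this sketch does not.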

The easiest and recommended way to use JMH is to create a separate project using the following command:

$ mvn archetype:generate \
          -DinteractiveMode=false \
          -DarchetypeGroupId=org.openjdk.jmh \
          -DarchetypeArtifactId=jmh-java-benchmark-archetype \
          -DgroupId=org.sample \
          -DartifactId=test-pet-clinic

Now we have a project inside the test-pet-clinic folder, which we can build and start:

$ cd test-pet-clinic/
$ mvn clean install
$ java -jar target/benchmarks.jar
# JMH 1.9.3 (released 667 days ago, please consider updating!)
# VM invoker: C:\Program Files\Java\jre1.8.0_111\bin\java.exe
# VM options: 
# Warmup: 20 iterations, 1 s each
# Measurement: 20 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.test.spring.RestImplementationsBenchmark.owners
# Parameters: (path = /owners, port = 8080, server = localhost)

# Run progress: 0.00% complete, ETA 00:06:40
# Fork: 1 of 10
# Warmup Iteration   1: 7.829 ops/s
# Warmup Iteration  19: 91.628 ops/s
# Warmup Iteration  20: 93.504 ops/s
Iteration   1: 92.267 ops/s
Iteration   2: 34.738 ops/s
Iteration   3: 92.347 ops/s

Result " owners ":
  98.147 ±(99.9%) 3.851 ops/s [Average]
  (min, avg, max) = (34.738, 98.147, 113.290), stdev = 15.129
  CI (99.9%): [94.296, 101.998] (assumes normal distribution)

# Run complete. Total time: 00:06:08

Benchmark  (path)  (port)   (server)   Mode  Cnt   Score   Error  Units
RestImpl   /owners  8080   localhost   thrpt 173  98.147 ± 3.851  ops/s
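The Error column in the summary can be related to the other figures in the output. As a rough check, assuming the half-width of the interval is a Student's t quantile times the standard error (the multiplier of about 3.35 for 172 degrees of freedom is an approximation, not a value printed by JMH):

```latex
\[
\text{error} \;=\; t_{0.9995,\,n-1}\cdot\frac{s}{\sqrt{n}}
\;\approx\; 3.35\cdot\frac{15.129}{\sqrt{173}}
\;\approx\; 3.35\cdot 1.150
\;\approx\; 3.85
\]
\[
\text{CI}_{99.9\%} \;=\; [\,98.147-3.851,\; 98.147+3.851\,] \;=\; [\,94.296,\; 101.998\,]
\]
```

Here n = 173 is the sample count (Cnt), s = 15.129 is the reported standard deviation, and 98.147 is the average score. The single slow iteration (34.738 ops/s) inflates s and therefore widens the interval.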

Among the uses of JMH, we mention TubeMogul's benchmarks for Java 8 and Pavel Samolysov's findings, which conclude that: "EJB is up to 15% faster than the Spring Framework while CDI is up to 19% slower".

In his Elasticsearch benchmarking presentation, Daniel Mitterdorfer identified seven deadly benchmarking sins that need to be handled when performing benchmarks:


XRebel

XRebel Trial comes as a .zip file containing the xrebel.jar Java agent, the license and the license agreements. The integration is straightforward - just run your application on the server with the following parameter: -javaagent:[path/to/]xrebel.jar - this will enable the XRebel toolbar. Once the web application is started with XRebel enabled, the toolbar is displayed on the bottom left side. The toolbar has the following menus:

It tracks events from sources such as HTTP, Quartz, JMS, Periodic Task, RabbitMQ and Unidentified.

It supports the following frameworks: Spark, Spring MVC, Spring Boot, JSF, Vaadin, Spark Framework, Grails, Struts and Jersey, and it is capable of aggregating data from other XRebel-enabled microservices used by the current frontend/backend application. It also supports the following application servers: GlassFish, JBoss, Jetty, Tomcat, TomEE, WebLogic, WebSphere, WebSphere Liberty Profile and WildFly. The supported NoSQL databases include Cassandra, Couchbase Server, MongoDB and Redis; the supported relational databases include Apache Derby, H2, HSQLDB, Microsoft SQL Server, MySQL, Oracle, PostgreSQL, SAP MaxDB and SQLite.


JProfiler

JProfiler is a commercial product developed by ej-technologies. It supports all the basic features:

Some of the advanced features that we liked are:

The UI is intuitive and easy to use and, in a short amount of time, we were able to make use of its full power.


JMeter

JMeter provides load statistics from the user's perspective (user feel), and through its extensions it can also be used to check internal methods. JMeter terminology defines the following items:

HTTP requests can be added by the user or recorded using the JMeter proxy server. Usually Firefox is used for the recording. If your app uses specific headers, such as the x-csrf-token header used to detect forgery, JMeter can extract tokens/IDs from previous calls and reuse them through referenceable variables (e.g. ${my_csrf}). JMeter has support for the following extractors:

The measurement contains the following performance-relevant items:

The result of a typical JMeter HTTP request looks like this:

Thread Name: Thread Group 1-1
Sample Start: 2017-03-26 13:33:01 EET
Load time: 1559
Connect Time: 1
Latency: 1559
Sample Count: 1
Error Count: 0

Response headers:
HTTP/1.1 200 OK
Set-Cookie: JSESSIONID=11D10081A56BB9FA911E7350E57429A1; Path=/owners
Content-Type: application/json;charset=utf-8
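The token extraction mentioned earlier (pulling a value such as the CSRF token out of one response so that it can be sent with the next request) can be sketched in plain Java. This only illustrates the idea behind a regular-expression extractor and is not JMeter code; the HTML snippet, the `_csrf` field name and the class name `CsrfTokenExtractor` are assumptions made for the example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Pulls a token out of a response body, the way a regex extractor would. */
final class CsrfTokenExtractor {

    // Hypothetical markup; the real pattern depends on how the app renders its token.
    private static final Pattern TOKEN =
            Pattern.compile("name=\"_csrf\"\\s+value=\"([^\"]+)\"");

    /** Returns capture group 1 of the first match, or empty if nothing matches. */
    static Optional<String> extract(String responseBody) {
        Matcher matcher = TOKEN.matcher(responseBody);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String body = "<input type=\"hidden\" name=\"_csrf\" value=\"abc-123\"/>";
        String myCsrf = extract(body).orElse("NOT_FOUND");
        // A follow-up request would send this value as the x-csrf-token header.
        System.out.println("x-csrf-token: " + myCsrf);
    }
}
```

In JMeter the extracted value would be stored under a variable name and referenced as ${my_csrf} in later samplers; the fallback string plays the role of an extractor's default value.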

JMeter is a nice surprise, and its integration with the most important continuous integration tools (Bamboo JMeter Aggregator, JMeter plugin for TeamCity, CircleCI JMeter package or JMeter Jenkins Plugin) makes it a profiler worth considering.


YourKit

YourKit profiler is a standard profiler offering the basics:

Out of the advanced features, YourKit offers:

Overall the interaction was a pleasant one. In order to get the high-level statistics, the app needs to work with CPU tracing or CPU sampling.


Note: The grades should be taken lightly - they represent our own user-driven perspective. Each grade should be taken in a [0.5] convergence interval. The final grade was calculated as an average of grades (where Yes/No correspond to 0/5).


Table 1 - Java profilers evaluation

Knowing as much as possible about the JVM and your app will help you understand the profiler's results. One of the best sources on the HotSpot lifecycle and the optimizations it performs is Doug Hawkins's JVM Mechanics presentation. For those curious enough, there are some nice tools for viewing the Java bytecode and the natively generated assembly code (javap & JITWatch with hsdis enabled).

GC hiccups can also be a source of downtime (although the GC gets the blame most of the time). If the GC is the culprit, there is always the option of switching to a different implementation - several are available for the Oracle JVM and for OpenJDK. The most notable is Shenandoah. GC tuning is explained here.
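Before blaming the GC, it helps to check how much time the collectors have actually consumed. A minimal sketch using the standard GarbageCollectorMXBean follows; the class name `GcStats` is our own, and the computed fraction is only a coarse indicator, not a pause-time analysis.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Summarises how much time the JVM has spent in garbage collection. */
final class GcStats {

    /** Accumulated collection time over all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Approximate share of JVM uptime spent collecting, clamped to [0.0, 1.0]. */
    static double gcTimeFraction() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) totalGcMillis() / uptimeMillis);
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("GC share of uptime: %.2f%%%n", gcTimeFraction() * 100);
    }
}
```

A small share suggests the slowdown lies elsewhere; a large one is a reason to look at GC logs or to try a different collector.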

Instrumenting an app alters the measurements, sometimes by an order of magnitude. This happens because the profiled code is modified during instrumentation, which makes the JVM behave differently (different optimized areas, different safepoints, different code layout and added overhead). This is not an isolated case, since most profilers make use of the Java Virtual Machine Tool Interface; in most cases, the only difference lies in the way each profiler performs calibration.

Sampling works by observing the system from the outside at regular intervals. It is usually done by creating a thread responsible for taking thread dumps and analyzing the results. This approach itself uses the CPU; therefore, the data will be biased. Smart profilers like Richard Warburton's Honest Profiler use parts from both worlds - they check the running code regularly by calling an internal OpenJDK API that performs asynchronous call tracing (AsyncGetCallTrace).
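The thread-dump approach described above can be sketched with nothing but the JDK. This is a deliberately naive sampler, not how any of the profilers in this article are implemented; the class name `ThreadDumpSampler` is our own. Thread.getAllStackTraces() brings threads to a safepoint, so the sketch carries exactly the bias mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

/** Naive sampling profiler: periodic thread dumps, counting the top frame. */
final class ThreadDumpSampler {

    /** Takes the given number of dumps and counts the top frame of every runnable thread. */
    static Map<String, Integer> sample(int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
                Thread thread = entry.getKey();
                StackTraceElement[] stack = entry.getValue();
                if (thread == Thread.currentThread() || stack.length == 0) {
                    continue; // skip the sampler itself and threads without Java frames
                }
                if (thread.getState() != Thread.State.RUNNABLE) {
                    continue; // waiting or blocked threads are not burning CPU
                }
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop sampling
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Thread busy = new Thread(() -> {
            long x = 0;
            while (!Thread.currentThread().isInterrupted()) {
                x += System.nanoTime();
            }
        }, "busy-worker");
        busy.setDaemon(true);
        busy.start();
        sample(50, 10).forEach((method, count) -> System.out.println(count + " " + method));
        busy.interrupt();
    }
}
```

The method with the most hits is the likely hot spot, in the same spirit as the "count" column of a CPU sampling report. Note that RUNNABLE also covers threads blocked in native I/O, which is one more reason such results need careful reading.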

Measuring the Java app performance from outside can also be done by using the logs, JMX monitoring, application actuators or simply by checking the system performance counters.

There is another great source related to performance methodologies and tools for CPU issues in Aleksey Shipilëv's performance mind map.

