15 Jun 2007 Java Real-Time System: Separating Performance from Scalability
The Java Posse recently podcast an interview that they did at JavaOne with a couple of guys from the Java Real-Time System (RTS) project. This interview provided me with an important insight about RTS that I want to share with you – in particular, how it could relate to performance in the enterprise.
When I first heard about the RTS, I found myself interested in the theory of it, but never envisaged that it would be relevant to my work as an enterprise Java developer. I vaguely imagined RTS development as something practiced by genius software engineers working in exotic fields like avionics or robotics – far from the more mundane realities of my Java EE life.
However, when RTS 2.0 was announced at the JavaOne keynote, I got a small glimpse of how it could pertain to enterprise Java. The CIO of NASDAQ was brought out and explained how RTS will help them guarantee response times on trades. Trading systems are one kind of enterprise application where guaranteed response times are critical – a second’s delay can result in the loss of millions of dollars.
The Java Posse interview then took things a bit further by exploring in more detail what RTS can do when it comes to the performance and scalability of our systems – something enterprise Java developers routinely have to deal with.
When we build an enterprise system, we often have both performance and scalability requirements (it’s important not to confuse the two, something I’ve discussed before). Usually performance requirements are of the form “each request must be handled within x milliseconds/seconds”, whilst scalability requirements are of the form “must be able to handle z concurrent requests”. In other words, a performance requirement states a desired response time for a particular load, and the scalability of a system is its ability to keep responding within that time as the load upon it increases.
Generally speaking, if we increase the load, the average response time will increase. In testing whether our system meets its performance requirements, we usually write some load test that checks the average response time for a particular rate of requests. If the response time falls under a particular threshold for our maximum expected load, we pass the test. If the response time exceeds the threshold, the test has failed and we have to take action. This action usually involves profiling our code, tweaking our garbage collector settings or increasing resources like CPU or memory.
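To make that kind of check concrete, here is a minimal sketch of a pass/fail test on average response time. Everything in it is an assumption made for illustration: the class name LoadTestSketch, its methods, the 1 ms threshold and the stand-in "request" are all hypothetical, and the requests are issued one after another on a single thread, so it measures response time without simulating concurrent load. It also reports the worst-case time, since a healthy average can hide the occasional long pause.

```java
import java.util.Arrays;

// Hypothetical sketch of an average-response-time check; not a real load-testing tool.
public class LoadTestSketch {

    /** Runs the request `count` times in sequence and returns each latency in nanoseconds. */
    static long[] measure(Runnable request, int count) {
        long[] latencies = new long[count];
        for (int i = 0; i < count; i++) {
            long start = System.nanoTime();
            request.run();
            latencies[i] = System.nanoTime() - start;
        }
        return latencies;
    }

    /** Average latency, or 0.0 if there are no samples. */
    static double average(long[] latencies) {
        return Arrays.stream(latencies).average().orElse(0.0);
    }

    /** Worst (largest) latency, or 0 if there are no samples. */
    static long worst(long[] latencies) {
        return Arrays.stream(latencies).max().orElse(0L);
    }

    /** The test passes when the average latency does not exceed the threshold. */
    static boolean passes(long[] latencies, double thresholdNanos) {
        return average(latencies) <= thresholdNanos;
    }

    public static void main(String[] args) {
        // Stand-in for handling a request; a real test would call the system under test.
        Runnable request = () -> new StringBuilder("request").reverse().toString();

        long[] latencies = measure(request, 10_000);
        double thresholdNanos = 1_000_000.0; // assumed threshold: 1 ms

        System.out.println("average ns: " + average(latencies));
        System.out.println("worst ns:   " + worst(latencies));
        System.out.println(passes(latencies, thresholdNanos) ? "PASS" : "FAIL");
    }
}
```

The measured numbers will vary from run to run and from machine to machine, and the gap between the average and the worst case is often where garbage-collection pauses show up.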
Normally, you profile your system first. But what if it’s still not performing satisfactorily after diligent code optimization? You then have to either invest serious time and analysis in optimizing your garbage collection, or bite the bullet and buy more hardware. Garbage-collector optimization is a notoriously tricky task. And frankly, sometimes it’s just not worth the effort. The temptation can be to just fork out for the hardware, but the question will remain: could I have avoided buying that extra hardware if I’d known how to tweak the garbage collector the right way?
So how might RTS help with this? Well, it enables us to break the problem into two steps: performance first, then scalability. You can use it to guarantee the maximum response time for a request, irrespective of the load (well, not quite guarantee, but greatly increase your confidence in it). Essentially you achieve this by using the garbage collector less. This requires more up-front effort than a non-RTS approach – you’ve got to provide more information to the system at build-time about how you plan on using memory. However, once you’ve done it, you gain much more certainty about how long things will take compared to a standard Java system. And you can be more confident that the best way to increase scalability is to just add resources, not fiddle around with the garbage collector.
Am I saying that we should all start using RTS to build enterprise systems? No – not unless you’ve got very rigid performance requirements, like you would in a real-time trading system. What I am saying is that knowing a little bit about RTS helped me understand how performance and scalability are different, and how this understanding could help distinguish between achieving scalability via garbage-collection configuration versus achieving it through the addition of physical resources. In other words, it brought back to my attention a trade-off that 10+ years of memory-managed Java had caused me to forget: RTS allows us to spend extra effort on development to gain more reliability by reducing our dependence on garbage collection.