30 Sep 2013 JavaOne 2013 Day Two – Polyglot Gauntlet
Day Two of JavaOne was another huge day and the longest of the entire week, finishing at 9.15pm. Looking back over it, it was the strongest day technically, featuring excellent presenters and ideas. The morning began with some sessions on best practices in functional programming and the cloud. The rest of the day was all about polyglot JVM languages. I could not be more proud that the JavaOne organisers have enabled this awakening. Talking to past attendees, much more of the conference schedule has been dedicated to emerging languages this year.
Functional Reactive Programming
The day started off with the highly anticipated Functional Reactive Programming with RxJava by Ben Christensen. RxJava is a port by Netflix of Erik Meijer's Rx (Reactive Extensions) for .NET and Javascript. It is a "library for composing asynchronous and event-based programs using observable sequences for the JVM". In short, it's good for processing streams of different data together without blocking. A few years ago, I saw Erik Meijer speak at YOW on reactive programming, and whilst the idea was revolutionary, there was a bit of .NET bias in his presentation. Without an implementation I could easily use, it didn't seem quite attainable. I'm glad to see that Netflix has embraced this and made it available on the JVM as an open source project.
To give you a feel of how the API is used, I've taken this code sample from the slides. The API was designed with polyglot in mind so that all JVM languages, not just Java, can consume it effectively. This means that code using the library looks very similar across all the popular JVM languages.
https://speakerdeck.com/benjchristensen/functional-reactive-programming-with-rxjava-javaone-2013?slide=8
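Roughly, creating an observable sequence and subscribing to it looks like this. This is a minimal sketch with Java 8 lambdas against the RxJava 1.x API, not the exact code from the slide:

```java
import rx.Observable;

public class HelloRx {
    public static void main(String[] args) {
        // Create an observable sequence and subscribe to it;
        // onNext, onError and onCompleted are the three callbacks.
        Observable.just("Hello", "JavaOne")
                .subscribe(
                        item -> System.out.println("onNext: " + item),
                        error -> System.err.println("onError: " + error),
                        () -> System.out.println("onCompleted"));
    }
}
```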
The premise of reactive programming is that the Observable/Observer pair is the asynchronous dual of the synchronous Iterable/Iterator. Traditionally Java devs load data into memory from a synchronous data source, iterate over a collection, pull items through an iterator via a next() call, throw exceptions if something happens unexpectedly and return to the caller when done.
The alternative is the traditional Observer design pattern, observing a stream coming from a collection. The observer has methods that are notified when a new value arrives (onNext()), when something goes wrong (onError(Exception)) and when the stream is done (onCompleted()). This allows the application to be asynchronous; RxJava decides whether or not the call will block based on the underlying data source.
This example from the slides shows that changing from the Iterable style on the left to the Observable style on the right is quite easy: the terminating forEach operation on the stream is simply replaced with a subscribe call.
We are now observing a stream and can implement onNext, onError and onCompleted in the subscribe block rather than performing blocking operations. The client code no longer has to care whether the data source is synchronous or asynchronous.
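To make that concrete, here is a minimal sketch (my own, not Netflix's code) of the same collection consumed both ways: pulled with forEach, then pushed through an Observable subscription:

```java
import java.util.Arrays;
import java.util.List;
import rx.Observable;

public class PullVsPush {
    public static void main(String[] args) {
        List<String> titles = Arrays.asList("first", "second", "third");

        // Pull: iterate over an in-memory collection, blocking as we go
        titles.forEach(title -> System.out.println("pulled: " + title));

        // Push: subscribe to a stream and react as values arrive
        Observable.from(titles).subscribe(
                title -> System.out.println("onNext: " + title),
                error -> System.err.println("onError: " + error),
                () -> System.out.println("onCompleted"));
    }
}
```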
RxJava provides functional methods (zip, map, flatMap, etc.) that describe how to filter, transform and combine the various streams.
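For example, a small sketch (illustrative names only) composing two streams with map and zip:

```java
import rx.Observable;

public class ComposeStreams {
    public static void main(String[] args) {
        Observable<String> names = Observable.just("metadata", "ratings");
        Observable<Integer> ids = Observable.just(1, 2);

        // map: transform each element as it flows through the stream
        Observable<String> upper = names.map(String::toUpperCase);

        // zip: pair up the two streams element by element
        Observable.zip(ids, upper, (id, name) -> id + ":" + name)
                .subscribe(System.out::println);
    }
}
```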
Netflix has been progressively moving all their APIs to return Observable&lt;DomainObject&gt; rather than just a DomainObject. Using this reactive style takes away the client's responsibility for managing asynchronous or synchronous calls – no more copy/paste variations of similar buggy asynchronous code – and in turn has given them huge performance gains since everything is non-blocking. I can highly recommend taking a look at it.
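A hypothetical service interface in that style might look like this. VideoService and Video are made-up names for illustration, not Netflix's actual API:

```java
import rx.Observable;

// Every call returns an Observable, so the caller doesn't care whether
// the implementation is synchronous, cached, or a remote async call.
interface VideoService {
    Observable<Video> getVideo(long id);
}

class Video {
    final long id;
    Video(long id) { this.id = id; }
}

class VideoClient {
    void show(VideoService service) {
        service.getVideo(42L)
               .subscribe(
                   video -> System.out.println("got video " + video.id),
                   error -> System.err.println("lookup failed: " + error));
    }
}
```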
Building Massively Scalable Applications in the Cloud
This session was by Yoav Landman, CTO of JFrog, the company behind the Maven repository manager Artifactory and the increasingly popular open source software repository Bintray. Bintray is similar to GitHub, but it provides more stats about your projects and hosts your large binaries via a CDN. The talk covered Bintray's experiences during the creation of the service, including which aspects of the service need to be critically available and which don't, and how they chose frameworks and cloud vendors based on this measure of criticality. The talk also touched on logging, backup, restore and similar devops concerns.
For the four areas of the app – a web front end, a REST API, search and indexing, and binary downloads – JFrog elected to use a number of different frameworks and databases for each stack. They knew they would make some mistakes and need to do lots of refactoring, so they kept trying alternatives and avoided getting too attached to a particular vendor or framework if it wasn't going to work for their needs.
For binary downloads, they used Nginx, which let the big downloads scale without tying up a thread per request on the app server. Curiously, some file upload sessions start but never close, and Nginx thankfully doesn't pass the file on to the app server until the whole file has been received, which saves resources on the web server. They used OpenStack's blob bucket store, Swift, which stores each file under a checksum reference (very fast) rather than a hierarchical file path. To map between the checksum and the file path, they used CouchDB.
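The idea, sketched here with made-up names (this is not JFrog's code), is a blob store keyed by checksum plus a separate path-to-checksum index standing in for the CouchDB mapping:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a content-addressed store keyed by checksum, plus a
// path -> checksum index standing in for the CouchDB mapping.
public class ChecksumStore {
    private final Map<String, byte[]> blobsByChecksum = new HashMap<>();
    private final Map<String, String> checksumByPath = new HashMap<>();

    public void put(String path, Path file) throws Exception {
        byte[] content = Files.readAllBytes(file);
        String checksum = sha1Hex(content);
        blobsByChecksum.put(checksum, content);   // Swift-like blob bucket
        checksumByPath.put(path, checksum);       // CouchDB-like path mapping
    }

    public byte[] get(String path) {
        return blobsByChecksum.get(checksumByPath.get(path));
    }

    private static String sha1Hex(byte[] content) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(content)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```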
The application chose Grails for the front end, as AngularJS wasn't as mature at the time. Grizzly and Jersey were selected for the REST API. MongoDB was chosen for the front-end database because it integrates well with Grails' GORM, and they used it for structured search. For full-text search they used Elasticsearch, after finding Solr and Sphinx not up to the job. They did write their own Elasticsearch GORM/MongoDB plugin, as the original Grails Elasticsearch plugin only worked with JPA-backed datasources that send update events, which Mongo didn't do.
When projects get uploaded, they need to be indexed, so they are added to a queue managed by Redis and Jesque. Yes, that's correct, they added another NoSQL store just to do the stats 😉
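Stripped right down, the queueing idea looks something like the following. Note I'm using raw Jedis calls here as a simplified stand-in, not Jesque's actual API, and the queue name and payload are made up:

```java
import redis.clients.jedis.Jedis;

// Simplified stand-in for the Redis-backed indexing queue: the upload path
// pushes a job, a worker pops it and triggers indexing. Jesque wraps this
// pattern with job classes and workers; this is just the bare idea.
public class IndexQueue {
    private static final String QUEUE = "queue:index";

    public static void enqueue(String packageId) {
        try (Jedis jedis = new Jedis("localhost")) {
            jedis.lpush(QUEUE, packageId);
        }
    }

    public static void workerLoop() {
        try (Jedis jedis = new Jedis("localhost")) {
            while (true) {
                // brpop blocks until a job arrives; returns [key, value]
                String packageId = jedis.brpop(0, QUEUE).get(1);
                System.out.println("indexing " + packageId);
            }
        }
    }
}
```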
When it came to getting their platform into the cloud, they wanted to leave hosting to the professionals. They used SoftLayer with Swift to provide the blob storage and the Couch mapping, and MongoHQ for their front-end database. For the other pieces – Elasticsearch, Grails and Redis – there were no SaaS hosts, so they created their own CentOS layer.
The speaker certainly knew his stuff, and amongst the conclusions were: plan for scale, but be realistic, because 'some things just don't scale'; and back up everything and test your backups, especially the ones for failover.
Polyglot, polyglot, polyglot
The next few sessions – Ten Cool Things You Can Do with JVM Languages, The Curious Case of Nashorn on the JVM (Java &lt;-&gt; Javascript interop), Down the Rabbit Hole (JVM bytecode), Teaching Java with Javascript (how to get Javascript user groups looking at Nashorn) and Groovy Kung-fu (demystifying monads for the 'half-asleep, jetlagged and drunk') – all provided some great insights.
In "Down the Rabbit Hole", JRuby lead Charles Nutter looked at how the JVM turns bytecode into assembler, and at which language features carry a large cost. There is an old joke: "If you wait long enough, the JVM will eventually optimise everything perfectly and even terrible code will perform well." But until the JVM becomes that mythical super-optimising compiler, we need to be aware that how we structure our code affects performance. For example, inner classes in Java create objects that take RAM, and serialization, which seems innocuous, is actually quite heavy in the bytecode it generates. Charles took us through a number of examples and showed that the two ways of invoking lambdas in Java 8, method references versus the longer explicit form, are at this point in time optimised differently, and that method references weren't being optimised away. It's fascinating that people are keeping track of these things and benefiting the community by keeping the JVM engineers honest.
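For reference, the two lambda forms he compared look like this (my own trivial example, not the benchmark from the talk):

```java
import java.util.Arrays;
import java.util.List;

public class LambdaForms {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("poly", "glot");

        // Method reference form
        words.forEach(System.out::println);

        // Explicit lambda (longer) form – semantically the same,
        // but at the time of the talk the JIT treated the two differently
        words.forEach(word -> System.out.println(word));
    }
}
```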
Adopt a JSR workshop
After a long day of polyglot goodness, it was time for a change of tack. The final session of the day benefited my role as a JUG leader and was about Oracle's Adopt A JSR program. The focus of the program is to find individuals, JUGs and companies that can provide feedback on JSRs (Java Specification Requests) whilst they are in development, so that they end up relevant and correct. The number of JUGs in the program grew to over 45 last year, and they have jointly contributed to 23 JSRs across all areas of the Java platform.
Anyone can contribute to a JSR's development by downloading early access releases and using them. Each JSR has a mailing list you can join to offer feedback and ideas, and JIRA is used to raise defects. Adopt A JSR members also run hack days within their JUG or organisation.
This is all great, but there is one sticking point when it comes to contributing code. Currently you must sign the JSPA, an agreement that means you or your company won't claim rights to the source code you've contributed. I've not seen how large open source projects deal with this issue, but it appears that the next iteration of the JCP (Java Community Process) rules, JCP.next, will address the JSPA and make this legalese document more accessible so that more people can contribute in future.
Another current challenge for the JCP committee is getting more companies involved. Discussion covered identifying organisations who can contribute and, for the few new JUGs like my own, how we could get involved in the JCP. Unfortunately more time was spent on the former rather than the latter, but it was good to meet the Oracle and JUG leads who take part in this process and learn about the challenges.