Java

The year was 1997. The Red Hot Chili Peppers were musing on love and the motions of amusement park rides, Pathfinder landed on Mars and Leonardo DiCaprio drew Kate Winslet as per one of his French associates.  It was around this time I had heard about a thing called “Java”, a fancy new language everyone was talking about. The word on IRC was that it was based on work Sun Microsystems had originally done for embedded software on set-top boxes and other smart appliances.
[caption id="attachment_20558" align="alignright" width="377"]Taumata_Racer.jpg Waterslide analogy. One input, multiple outputs. Each slide represents a date partition in one table.[/caption] Do you have some data that needs to be fed into BigQuery but the output must be split between multiple destination tables? Using a Cloud Dataflow pipeline, you could define some side outputs for each destination table you need, but what happens when you want to write to date partitions in a table and you're not sure what partitions you need to write to in advance? It gets a little messy. That was the problem I encountered, but we have a solution.
So Java 9 is on its way and it's finally bringing us project Jigsaw. I say finally as this was first meant to be delivered with Java7 and then with Java8. It is now the main new feature for Java 9.

Will this post interest me?

If you use (or intend to use) Google Cloud Dataflow, you've heard about Apache Beam, or if you're simply bored in work today and looking to waste some time, then yes, please do read on. This short post will cover why our team finally took the plunge to start porting some of Dataflow applications (using the 1.x Java SDKs) to the new Apache Beam model (2.x Java SDK). Spoiler - it has something to do with this. It will also highlight the biggest changes we needed to make when making the switch (pretty much just fix some compile errors).
Do you have an unreasonable fear of cronjobs? Find spinning up VMs to be a colossal waste of your towering intellect? Does the thought of checking a folder regularly for updates fill you with an apoplectic rage? If so, you should probably get some help. Maybe find another line of work. In the meantime, here's one way to ease your regular file processing anxieties. With just one application of Google Cloud Functions, eased gently up your Dataflow Pipeline, you can find lasting relief from troublesome cronjobs.

There are applications that execute such complex tasks that if they didn’t use a concurrent processing model, they would be so slow as to be unusable. This group of applications includes data analytics, real-time games and recommendation systems.

Even with modern programming languages that support concurrency, we are faced with the task of coordinating multiple threads, handling synchronisation and the constant possibility of race conditions. These make it difficult to write, test and maintain code, discouraging many developers from implementing better and faster solutions for their problems.

multithreading

In this blog entry we are going to take a quick look at the Akka toolkit, its main concepts and some code examples in Java. For further information about this topic, please check the official documentation at http://akka.io/

I just spent a couple of days at the YOW! Connected conference and had a great time, despite nursing a bit of a cold. There were a tonne of great talks at the conference covering a wide range of topics, but in this post I'm going to briefly reflect on one specific trend that interested me at the event: the way in which UI platforms are advancing to adopt modern languages, and are even influencing each other in the process. The end-result: they're all moving towards languages that are both functional and statically typed.

Full disclosure: This year I was a member of the programme committee for the conference. So in writing this post, there's a bit of a risk that I'm creating an echo chamber for myself. All I can really say in my defence is that I hadn't consciously made these connections in advance - it was only afterwards that I saw a trend!

  Shine's very own Pablo Caif will be rocking the stage at the very first YOW! Data conference in Sydney. The conference will be running over two days (22-23 Sep) and is focused big data, analytics, and machine learning. Pablo will give his presentation on Google BigQuery,...