Getting ‘ya music recommendation groove on with Google Cloud Platform!

Intro

Recommendation systems are found under the hood of many popular services and websites. The e-commerce and retail industries use them to increase their sales, music services use them to surface interesting songs for their listeners, and news sites use them to rank the daily articles based on their readers’ interests. If you really think about it, recommendation systems can be used in pretty much every area of daily life. For example, why not automatically recommend better choices to house investors, guide your friends around your hometown without you being there, or suggest which company to apply to if you are looking for a job?

All pretty cool stuff, right?

But recommendation systems need to be a lot smarter than plain old vanilla software. In fact, the engine is made up of multiple machine learning modules that aim to rank the items of interest for each user based on the user’s preferences and the items’ properties.
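
To make the idea of ranking items by user preferences and item properties a little more concrete, here is a minimal sketch in Python. It is not the engine described in this series: the feature names, weights and song names are made up for illustration, and the scoring is a plain weighted sum rather than a trained machine learning model.

```python
from typing import Dict, List, Tuple

# Hypothetical data shapes: both users and items are described by
# feature -> weight mappings (these names are assumptions for illustration).
Features = Dict[str, float]


def score(user_prefs: Features, item_props: Features) -> float:
    """Weighted sum of the features that the user and the item share."""
    return sum(
        weight * item_props.get(feature, 0.0)
        for feature, weight in user_prefs.items()
    )


def rank_items(user_prefs: Features, catalogue: Dict[str, Features]) -> List[Tuple[str, float]]:
    """Return (item name, score) pairs, best match first."""
    scored = [(name, score(user_prefs, props)) for name, props in catalogue.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up listener and songs.
user_prefs = {"rock": 0.9, "jazz": 0.2, "acoustic": 0.5}
catalogue = {
    "song_a": {"rock": 1.0, "acoustic": 0.0},
    "song_b": {"jazz": 1.0, "acoustic": 1.0},
    "song_c": {"rock": 0.5, "acoustic": 1.0},
}

ranking = rank_items(user_prefs, catalogue)
for name, value in ranking:
    print(f"{name}: {value:.2f}")
```

A real engine would learn these weights from listening behaviour instead of hard-coding them, but the final step (score every candidate, then sort) looks much the same.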

In this blog series, you will gain some insight into how recommendation systems work, how you can harness Google Cloud Platform for scalable systems, and the architecture we used when implementing our music recommendation engine on the cloud. This first post is a light introduction to the overall system, and my follow-up articles will deep dive into each of the machine learning modules and the tech that powers them.

TEL monthly newsletter – February 2017

The TEL group was established in 2011 with the aim of publicising the great technical work that Shine does and raising the company’s profile as a technical thought-leader through blogs, local meetup talks, and conference presentations. Each month, the TEL group gathers up all the awesome things that Shine folk have been getting up to in and around the community. Here’s the latest roundup:

Will Athena slay BigQuery?

*Updated on 16th December 2016 – see below

With the announcement of Amazon Athena at this year’s AWS re:Invent conference, I couldn’t help but notice its striking similarity to a rival cloud offering. I’m talking about Google’s BigQuery. Athena is a managed service that allows customers to query objects stored in an S3 bucket. Unlike other AWS offerings such as Redshift, you only pay for the queries you run. There is no need to manage or pay for infrastructure that you may not be using all the time. All you need to do is define your table schema and reference your files in S3. This works in a similar way to BigQuery’s federated sources, which reference files in Google Cloud Storage.

Given this, I thought it would be interesting to compare the two platforms to see how they stack up against each other. I wanted to find out which one is faster, which is more feature-rich and which is more reliable.

10 amazing secrets university never taught you about how to become a truly professional software developer

Last week I was very lucky to be able to attend the YOW! 2016 Conference in Melbourne. I had never attended a major conference aimed purely at software developers before and when I arrived early on the first day I wasn’t quite sure if I had the right building. Thankfully within 30 seconds of walking in the door I spotted a man wearing a fedora and I knew I had come to the right place!

The conference overall was an extremely well-run affair. The speakers were all very good, and many were either from high-profile companies such as Facebook and Uber or were outright living legends of the industry such as Robert ‘Uncle Bob’ Martin. There were three talks to choose from during each time slot and they covered a wide range of topics. The hardest bit was choosing which talk sounded most interesting, and I suffered from severe ‘Fear of Missing Out’ syndrome when making my selections.

I would highly recommend attending to anyone who is looking to gain a better sense of what’s going on in the software industry. Setting aside two whole days to listen to presentations, talk to other developers and generally ruminate about the craft of developing software is a great way to take a step back from the daily grind and spend some time looking at the forest instead of the trees. I picked up a number of things that I’ll be able to take back and directly apply in my day-to-day development.

I’ve summarised one of my favourite talks below:

Shiner to present at very first YOW! Data conference

 

Shine’s very own Pablo Caif will be rocking the stage at the very first YOW! Data conference in Sydney. The conference will run over two days (22-23 Sep) and is focused on big data, analytics, and machine learning. Pablo will give his presentation on Google BigQuery, along with a killer demo of it in action. You can find more details of his talk here.

High availability, low latency streaming to BigQuery using an SQS Queue.

When you have a Big Data solution that relies upon a high-quality, uninterrupted stream of data to meet the client’s expectations, you need a solution in place that is extremely reliable and has many points of fault tolerance. That all sounds well and good, but how exactly does that work in practice?

Let me start by explaining the problem. About two years ago our team was asked to spike a streaming service that could stream billions of events per month to Google’s BigQuery. The events were to come from an endpoint on our existing Apache web stack, and we would push them to BigQuery using an application written in PHP. We did exactly this. However, we found that requests to BigQuery were taking too long, which resulted in slow response times for users. So we needed to find a way to queue the events before sending them to BigQuery.
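
The general shape of that idea (accept the event quickly, deliver it later in batches) can be sketched as follows. This is only an illustration under stated assumptions: it uses Python’s in-process `queue.Queue` as a stand-in for SQS, and `send_batch` is a hypothetical callable standing in for whatever actually writes rows to BigQuery. It is not the PHP implementation described in this post.

```python
import queue
from typing import Callable, Dict, List

Event = Dict[str, str]


def handle_request(event: Event, buffer: "queue.Queue[Event]") -> None:
    """Called on the request path: enqueue and return without waiting on BigQuery."""
    buffer.put_nowait(event)


def drain(
    buffer: "queue.Queue[Event]",
    send_batch: Callable[[List[Event]], None],
    batch_size: int = 500,
) -> int:
    """Called off the request path: forward queued events in batches.

    `send_batch` is a hypothetical stand-in for the real delivery call.
    Returns the number of events forwarded.
    """
    sent = 0
    while True:
        batch: List[Event] = []
        while len(batch) < batch_size:
            try:
                batch.append(buffer.get_nowait())
            except queue.Empty:
                break
        if not batch:
            return sent
        send_batch(batch)
        sent += len(batch)


# Usage: five events arrive, then a worker drains them two at a time.
buffer: "queue.Queue[Event]" = queue.Queue()
for i in range(5):
    handle_request({"id": str(i)}, buffer)

delivered: List[List[Event]] = []
total = drain(buffer, delivered.append, batch_size=2)
print(total, [len(batch) for batch in delivered])
```

The point of the split is that the user-facing request only pays for the enqueue, while slow or failing deliveries are absorbed by the worker side. A durable queue such as SQS adds the fault tolerance that an in-memory queue like this one lacks.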

Shine hosts a successful Digital Leaders Breakfast

Yesterday Shine hosted a number of Digital Leaders to breakfast in Melbourne, aiming to share experiences and learning.
More than 20 participants from organisations including Coles, Energy Australia, NAB, ANZ, Telstra, Fairfax Media and Australia Post heard presentations from Todd Copeland (GM Digital, NAB), Simon Noonan (CIO, Sportsbet) and Jeff Mentiplay (GM Analytics & Commercial Delivery).
The response was extremely positive: participants greatly valued the presentations, the subsequent discussion and the opportunity to meet peers.
Digital Leaders Breakfast April 2016
There were some key themes from the presentations, which generated a great deal of open discussion amongst the presenters and participants. These topics included the ability to learn and act quickly upon customer needs, and the need to change traditional organisational structures, financial measures and team office locations to accelerate delivery. In addition, the importance of “looking externally” to learn from global digital leaders and the value of sharing platforms and customer insights across organisational silos were discussed.
Based on the overwhelmingly positive feedback and rich discussions, Shine plans to run another Digital Leaders Breakfast later in 2016.

NoSQL in the cloud: A scalable alternative to Relational Databases


With the current move to cloud computing, the need to scale applications presents itself as a challenge for storing data. If you are using a traditional relational database you may find yourself working on a complex policy for distributing your database load across multiple database instances. This solution will often present a lot of problems and probably won’t be great at elastically scaling.

As an alternative you could consider a cloud-based NoSQL database.  Over the past few weeks I have been analysing a few such offerings, each of which promises to scale as your application grows, without requiring you to think about how you might distribute the data and load.

YOW! 2015 Melbourne: A Conference Report

The YOW! 2015 Developer Conference in Melbourne took place a few weeks ago, and once again the organisers did a splendid job curating a selection of both international and local speakers (including Shine’s very own Ben Teese). There were also delicious meals and glorious developer fuel (a.k.a. coffee) to keep the energy going strong between the amazing talks.

This year’s conference felt like it featured a wider variety of topics than previous years; headlining were Mobile Development, Lean and Agile, Performance Testing, Software Architecture and Design, Big Data, Cloud Platforms, and DevOps. There was one topic, however, that took the crown and was presented with an overwhelming sense of urgency and importance: Microservices.

We had talks from big players such as Facebook, Uber, ThoughtWorks and Netflix, each giving an insight into how they are using microservices and how nearly everything they have done is a microservice (over 1,000 services!). It is safe to say that it was this year’s favourite buzzword.

A week in the life of a Google Developer Expert

All the GDEs posing at the Googleplex

A few months back, Shine’s Pablo Caif and Graham Polley were welcomed into the Google Developer Expert (GDE) program as a result of their recent work at Telstra. The projects they are working on consist of building bleeding edge big data solutions using tools like BigQuery and Cloud Dataflow on the Google Cloud Platform (GCP). You can read all about that here.

GDE acceptance comes with many benefits and privileges, one of which is a trip to a private summit held at a different location each year. With Google footing the bill, they bring all the GDEs (around 250 currently) together from around the globe for, let’s admit it, a complete Google geek-out fest for two days!

This year the summit was at the Googleplex in Mountain View. Needless to say, Pablo and Graham were chomping at the bit to go. However, in addition to the summit, Google invited them to fly out prior to the summit itself. They had lined up a few other things especially for the guys. So this was no ordinary trip. Lucky buggers!

We asked both guys to give their individual feedback on the trip, and here’s what they had to say about it. Read on if you want to hear about how the guys spent six days hanging out with Google in America.