Scheduling BigQuery jobs: this time using Cloud Storage & Cloud Functions

Intro

Not so long ago, I wrote a blog post about how you can use Google Apps Script to schedule BigQuery jobs. You can find that post right here. Go have a read of it now. I promise you’ll enjoy it. The post got quite a bit of attention, and I was surprised that people actually take the time to read my drivel.

It’s clear that BigQuery’s popularity is growing fast. I’m seeing more content popping up in my feeds than ever before (mostly from me because that’s all I really blog about). However, as awesome as BigQuery is, one glaring gap in its arsenal of weapons is the lack of a built-in job scheduler, or of an easy way to schedule jobs from outside BigQuery.

That said, I’m pretty sure that the boffins over in Googley-woogley-world are currently working on remedying that – by either adding schedulers to Cloud Functions, or by baking something directly into the BigQuery API itself. Or maybe both? Who knows!

Falsehoods Programmers Believe About Names – With Examples

In 2010, Patrick McKenzie wrote the now-famous blog post “Falsehoods Programmers Believe About Names”, in which he listed 40 things that were not universally true about names.

Did programmers sit up, take notice and change their attitudes to names? Sadly, not really. We still get asked to fill our names out in online forms which assume we have a first name and a last name (in that order) and which refuse to allow us to continue unless we have filled out both. They assume our names can be entered in alphabetic characters, often only ASCII.

I fear that part of the reason that this blog post had less impact than I hoped was that Patrick did not give examples of how each assumption can be false. But having worked in a previous life on IBM’s Global Name Management product, I can assure you that it’s all true.

Still not convinced? In this post I’m going to list all 40 of Patrick’s original falsehoods, but give you an example (or two) drawn from my experiences working in this space. Ready? Let’s go!

Getting ‘ya music recommendation groove on with Google Cloud Platform!

Intro

Recommendation systems are found under the hood of many popular services and websites. The e-commerce and retail industries use them to increase their sales, music services use them to suggest interesting songs to their listeners, and news sites use them to rank the daily articles based on their readers’ interests. If you really think about it, recommendation systems can be used in pretty much every area of daily life. For example, why not automatically recommend better choices to house investors, guide your friends around your hometown without you being there, or suggest which company to apply to if you are looking for a job?

All pretty cool stuff, right?

But recommendation systems need to be a lot smarter than plain old vanilla software. In fact, the engine is made up of multiple machine learning modules that aim to rank the items of interest for each user based on the user’s preferences and the items’ properties.
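To make the idea of ranking items against a user’s preferences concrete, here is a toy sketch in Python. It is not the engine described in this series: it assumes, purely for illustration, that a user’s preferences and each item’s properties are already available as numeric vectors over the same features, and it ranks items by cosine similarity. The function names and the example vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Treat an all-zero vector as having no similarity to anything.
    return dot / norm if norm else 0.0


def rank_items(user_prefs, items):
    """Return item names ordered from most to least similar to the user.

    `items` maps an item name to its property vector (hypothetical layout).
    """
    return sorted(
        items,
        key=lambda name: cosine(user_prefs, items[name]),
        reverse=True,
    )


# Hypothetical feature order: [rock, jazz, electronic]
user = [1.0, 0.0, 0.0]
songs = {
    "song_a": [1.0, 0.0, 0.0],
    "song_b": [0.0, 1.0, 0.0],
    "song_c": [1.0, 1.0, 0.0],
}
ranking = rank_items(user, songs)
```

A real engine would learn these vectors from listening behaviour rather than hard-coding them, which is where the machine learning modules come in.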

In this blog series, you will gain some insight into how recommendation systems work, how you can harness Google Cloud Platform for scalable systems, and the architecture we used when implementing our music recommendation engine on the cloud. This first post will be a light introduction to the overall system, and my follow-up articles will subsequently deep dive into each of the machine learning modules, and the tech that powers them.

Putting together the pieces: Server-side rendering with React Router v4 and Redux

Server-side rendering a React app is a miracle on par with childbirth and modern air travel.

OK, that opening sentence was a little over-the-top. I apologise to birth mothers and those in the aviation industry.

Let me start again: server-side rendering a React app is…kind of cool.

That said, it can be a little tricky to get started, especially if you’re trying to do it with an existing app.

In this post I’ll explain one way you can implement server-side rendering (SSR) for an app that’s using React Router v4 and Redux thunks.

Along the way we’ll discuss the fundamental difference between JavaScript clients and servers, how it forces us to change the way we do routing, and the small “missing-link” that enables us to bridge React Router v4 with Redux thunks.

We’ll build up a simple example to demonstrate. I’m going to assume you’ve got some knowledge of:

  • React
  • Redux
  • React Router v4

However, you are not required to have knowledge of:

  • Childbirth
  • Aeronautics

Let’s do this.

TEL Newsletter – December 2017

Shine’s TEL group was established in 2011 with the aim of publicising the great technical work that Shine does, and raising the company’s profile as a technical thought-leader in the community through blogs, local meetup talks, and conference presentations. Every now and then (it started off as being monthly, but that was too much work), we curate all the noteworthy things that Shiners have been up to, and publish a newsletter. Read on for this month’s edition.

Fun with Serializable Functions and Dynamic Destinations in Cloud Dataflow

Waterslide analogy. One input, multiple outputs. Each slide represents a date partition in one table.

Do you have some data that needs to be fed into BigQuery but the output must be split between multiple destination tables? Using a Cloud Dataflow pipeline, you could define some side outputs for each destination table you need, but what happens when you want to write to date partitions in a table and you’re not sure what partitions you need to write to in advance? It gets a little messy. That was the problem I encountered, but we have a solution.
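As a rough illustration of the routing idea only (not the Dataflow pipeline from the post), the sketch below derives a destination from each record at runtime instead of declaring every output up front. It relies on BigQuery’s partition decorator syntax, `table$YYYYMMDD`. The table name, the `ts` field, and both function names are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical table spec, used only for illustration.
BASE_TABLE = "my_project:my_dataset.events"


def destination_for(record):
    """Return the table spec, with a date partition decorator, for a record.

    Assumes each record carries a Unix timestamp in seconds under "ts".
    """
    day = datetime.fromtimestamp(record["ts"], tz=timezone.utc)
    return f"{BASE_TABLE}${day:%Y%m%d}"


def group_by_destination(records):
    """Bucket records by the partition they should be written to."""
    groups = defaultdict(list)
    for record in records:
        groups[destination_for(record)].append(record)
    return dict(groups)


records = [
    {"ts": 1512086400, "value": 1},  # 2017-12-01 00:00 UTC
    {"ts": 1512090000, "value": 2},  # 2017-12-01 01:00 UTC
    {"ts": 1512172800, "value": 3},  # 2017-12-02 00:00 UTC
]
groups = group_by_destination(records)
```

Because the destination is computed per record, the set of partitions does not need to be known before the data arrives.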

My favourite talks from YOW! 2017 Melbourne

No food reviews here I’m afraid

This year I was incredibly lucky to score a coveted ticket to YOW! in beautiful Melbourne. I was also asked to be a track host for a couple of sessions, so that was quite an honour too. This post is a whirlwind wrap-up of the conference, and only includes my favourite talks from the two-day event. If you’re hoping to hear detailed reviews of the coffee/food/WiFi/venue, then you’ll be greatly disappointed (it was all great BTW).

re:Invent 2017: Day 2

The last time I was fortunate enough to attend AWS’s global conference, re:Invent, was three years ago in 2014. Then there were 14,000 delegates and the conference spanned just two Las Vegas hotels. Lambda was announced during Werner Vogels’ keynote and it seemed that the most in-demand sessions had “Docker” in the title.

In just three years the conference has tripled in size, with 43,000 delegates attending this year, spread across a campus of six Las Vegas hotels. Although not one of the biggest conferences held in Vegas, it’s obviously a significant logistical challenge. After some hiccups on the first day with the inter-venue shuttles and a venue running out of food, everything seemed to settle down and run smoothly from the start of the second day. Whether the improvement was due to human learnings of the hivemind or training of some Machine Learning algorithms is up for debate, but almost certainly it was a combination of the two. No, actually, the transport still is not good and Uber is key to success.

re:Invent 2017: Day 1

What happens in Vegas…

The old adage tells us that what happens in Vegas, stays in Vegas. But for one week a year the reverse becomes true. Thousands of cloud enthusiasts descend on the city of sin and come away filled with renewed vigour to play with, and ultimately implement, the latest toys from Amazon Web Services.

This year I’ve been lucky enough to represent Shine by travelling to Las Vegas and participating in this prestigious event. In this post I’ll be recapping some of the things I’ve seen. I’ll add more as the week goes on with my thoughts and reflections, as well as the latest announcements from AWS.

Scheduling BigQuery jobs using Google Apps Script

Do you recoil in horror at the thought of running yet another mundane SQL script just so a table is automatically rebuilt for you each day in BigQuery? Can you barely remember your name first thing in the morning, let alone remember to click “Run Query” so that your boss gets the latest data refreshed in his fancy Data Studio charts, and then takes all the credit for your hard work?

Well, fear not, my fellow BigQuery’ians. There’s a solution to this madness.

It’s simple.

It’s quick.

Yes, it’s Google Apps Script to the rescue.

Disclaimer: all credit for this goes to the one and only Felipe Hoffa. He ‘da man!