Author: Aaron Brown

Introduction

It’s a simple question, often asked by project managers, data scientists, and quality engineers on every data engineering project when that first data source is ingested. How do we know the data that has been ingested into a data lake is accurate and error-free?

On a recent project we were asked by a client if it would be possible to host a React app using serverless technologies, but also ensure that traffic never left their VPC and corporate network.

In this post I'm going to talk about how we achieved this outcome, and how it proved to be more of a challenge than we first thought it would be.

Recently, Energy Australia (one of Shine's long-standing clients) approached us to help them build an Alexa skill in time for the launch of the Amazon Echo into the Australia/New Zealand market. The skill will allow Energy Australia customers to ask Alexa for information regarding their bills, and to get tips on how to minimise their energy usage. In this blog post I'll give an overview of our solution, and outline some of the tips and pitfalls we discovered during development.

Setting the scene

A couple of months ago my colleague Graham Polley wrote about how we got started analysing 8+ years' worth of WSPR (pronounced 'whisper') data.

What is WSPR?

WSPR, or Weak Signal Propagation Reporter, is a signal reporting network set up by radio amateurs for monitoring the ability of radio signals to get from one place to another.

Why would I care?

I’m a geek and I like data. More specifically, I like the things it can tell us about seemingly complex processes. I’m also a radio amateur, and enjoy the technical aspects of communicating around the globe with equipment I've built myself.

[Image: Homer Simpson as a radio amateur]