databases

The decline of Stack Overflow?

A few months back I read this post from 2015 (yes, I know I'm a little late to the party) about how Stack Overflow (SO) was in serious decline, and heading for total and utter oblivion. In the post, the first item to be called out was that SO "hated new users":
Stack Overflow has always been a better-than-average resource for finding answers to programming questions. In particular, I have found a number of helpful answers to really obscure questions on the site, many of which helped me get past a road block either at work or in my hobby programming. As such, I decided I’d join the site to see if I could help out. Never before has a website given me a worse first impression.
At the time, I remember thinking that this seemed like a somewhat unfair statement. That was mostly because I had fond memories of a smooth on-boarding when I joined the community (many years ago), and I never experienced any snarky remarks on my initial questions. Yes, gaining traction as a noob is very, very hard, but there is a good reason why that barrier exists. For me, SO is invaluable. How else would I be able to pretend to know what I'm doing? How else could I copy and paste code from some other person who's obviously a lot smarter than me, and take all the credit for it? Anyway, once I had read the post and gotten on with my life (e.g. copying and pasting more code from SO), I didn't think too much more about it. Maybe I had just been lucky with my foray into the SO community? However, just last week, I was reminded of that post once again, when I noticed that BigQuery (BQ) now has a public dataset which includes all the data from SO, including user comments and answers. Do you see where I am going with this yet? If not, then don't worry. Neither did I when I started writing this.

Shine's very own Pablo Caif will be rocking the stage at the very first YOW! Data conference in Sydney. The conference will be running over two days (22-23 Sep) and is focused on big data, analytics, and machine learning. Pablo will give his presentation on Google BigQuery,...

At Shine we're big fans of Google BigQuery, which is their flagship big data processing SaaS. Load in your data of any size, write some SQL, and smash through datasets in mere seconds. We love it. It's the one true zero-ops model that we're aware of for grinding through big data without the headache of worrying about any infrastructure. It also scales to petabytes. We've only got terabytes, but you've got to start somewhere, right? If you haven't yet been introduced to the wonderful world of BigQuery, then I suggest you take some time right after reading this post to go and check it out. Your first 1TB is free anyway. Bargain! Anyway, back to the point of this post. There have been a lot of updates to BigQuery in recent months, both internally and via features, and I wanted to capture them all in a concise blog post. I won't go into great detail on each of them, but rather give a quick summary of each, which will hopefully give readers a good overview of what's been happening with the big Q lately. I've pulled together a lot of this stuff from various Google blog posts, videos, and announcements at GCP Next 2016 etc.
Databases are the backbone of most modern web applications, and their performance plays a major role in user experience. Faster response times, even by a fraction of a second, can be the deciding factor for users choosing one option over another. It is therefore important to take response time into consideration when designing your databases, in order to provide the best possible performance. In this article, I’m going to discuss how to optimise DynamoDB database performance by using partitions.
Quite a while back, Google released two new features in BigQuery. One was federated sources. A federated source allows you to query external sources, like files in Google Cloud Storage (GCS), directly using SQL. The other was user-defined functions (UDFs). Essentially, a UDF allows you to ram JavaScript right into your SQL to help you perform the map phase of your query. Sweet! In this blog post, I'll go step-by-step through how I combined BigQuery's federated sources and UDFs to create a scalable, totally serverless, and cost-effective ETL pipeline in BigQuery.
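To give a rough feel for what "the map phase" means here, below is a minimal sketch in plain Python. It is only an analogy, not BigQuery's API and not the code from the post: a real UDF of this kind is written in JavaScript and runs inside BigQuery. The names `split_tags` and `apply_udf`, and the sample rows, are all made up for illustration. The idea is simply that the function receives one input row and emits zero or more output rows.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]


def split_tags(row: Row) -> Iterator[Row]:
    """Hypothetical map-phase function: one row in, zero or more rows out."""
    for tag in row["tags"].split("|"):
        if tag:  # skip empty tags, so a row with no tags emits nothing
            yield {"id": row["id"], "tag": tag}


def apply_udf(rows: Iterable[Row], udf: Callable[[Row], Iterator[Row]]) -> List[Row]:
    """Run the map function over every row and flatten the emitted rows."""
    return [out for row in rows for out in udf(row)]


# Made-up rows standing in for records read from an external file.
raw_rows = [
    {"id": "1", "tags": "bigquery|sql"},
    {"id": "2", "tags": ""},
    {"id": "3", "tags": "etl"},
]

transformed = apply_udf(raw_rows, split_tags)
for row in transformed:
    print(row)
```

In this sketch the first row fans out into two rows, the second is dropped, and the third passes through as one row, which is the kind of reshaping an ETL step typically needs.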

Shine is extremely proud to announce that Pablo Caif has been invited to present at GCP Next 2016, which is Google's largest annual cloud platform event held in San Francisco. Pablo will be presenting on the work Shine have done for Telstra, which involves building solutions on GCP to...

With the current move to cloud computing, the need to scale applications presents itself as a challenge for storing data. If you are using a traditional relational database you may find yourself working on a complex policy for distributing your database load across multiple database instances. This solution will often present a lot of problems and probably won’t be great at elastically scaling. As an alternative you could consider a cloud-based NoSQL database. Over the past few weeks I have been analysing a few such offerings, each of which promises to scale as your application grows, without requiring you to think about how you might distribute the data and load.

Shine Senior Consultant Ben Teese has had a piece published in the latest DZone Guide to Database and Persistence Management. In the article, Ben gives an overview and comparison of the Firebase, Meteor, and Amazon Cognito platforms. These platforms all aim to solve the use-case of...