I was lucky enough to be one of the 6,000 cloud geeks who descended on Vegas last week to attend AWS re:Invent 2012. This inaugural AWS developer conference was broken into 3 days. The first day was a bit of a warm-up, with technical workshops and an AWS partner day. The two subsequent days had keynotes and deep-dive sessions covering all elements of the AWS ecosystem. In this post I’ll cover what I saw during the three days I was there, and what had the biggest impact on me.
My first impression as I went down to register on Tuesday morning was awe at the sheer scale of this event. There were 3 floors of conference rooms, a massive dining area and more Cisco wireless routers than I have ever seen in one place. My second realisation was that this was a true sell-out event, with no tickets available at the door. I heard first-hand how a group had travelled across the country to try and register on the day, only to be told that it had sold out. When they asked the registration staff “what should we do now?”, the answer came back: “I don’t know, maybe do some work….”
The partner sessions started with a keynote from the laid-back but passionate Andy Jassy (VP, AWS), who talked about the “High Scale, Low Margin” philosophy that is in the DNA of Amazon and their dedication to putting the customer first. This point was driven home with 2 key stats:
- Every day AWS provisions enough server capacity to power the entire amazon.com business back in 2003 – when it was ‘only’ a $5.2b global business
- In the last 4 years AWS has delivered 23 cost reductions across the AWS platform
The take-home for me from this was that the model developed by AWS and the scale with which it has been deployed presents an astronomical barrier-to-entry for other competitors in this market. I was left wondering who will really challenge these guys in the next 3-5 years.
Andy also announced a new tier to the partner structure for some of the largest AWS business generators.
Following on from Andy were two more presentations, the first from Tiffani Bova of Gartner, who presented some excellent information (only marred by some non-ironic 1990’s clip art on one of the slides). Tiffani talked about the technology ‘Hype Cycle’ and how right now “Cloud” has moved from the “peak of inflated expectations” – where people believe the cloud will solve all their problems – into the “trough of disillusionment”, where they are inevitably disappointed when it fails to meet those expectations.
Tiffani also discussed how work that is traditionally IT’s is often actually under the control of other groups – for example, marketing departments might be allocated up to 30% of the IT budget.
After lunch, the event was broken down into 3 streams: business, sales and technical.
I opted for the technical stream, which discussed the various ways in which AWS partners could leverage the platform to enhance their own business. To be honest this was a little drawn-out and, while there was some good information in here, it was light on technical detail. However one interesting element was hearing what other companies (particularly in the US) have been doing with AWS services such as Direct Connect (allowing you to have a fiber link into the AWS datacenter) and AWS Marketplace (think app store for AWS).
Day 2 saw the remainder of the 6,000 attendees arrive, and the massive lobbies of the Venetian hotel suddenly were a sea of cloud fanatics.
As I walked into the main auditorium for the opening keynote the scale of this conference was once again driven home. It seemed more like a rock concert than a developer conference, with massive screens, pumping music and a light show. As I sat in the middle of the auditorium watching wave after wave of people enter and feeling the excitement in the room, I could not help but feel that, as the very first official AWS conference, this was a day that I would look back upon as a pivotal moment in IT. I felt very privileged to be there to witness it.
Andy Jassy again took the stage and addressed the crowd in his now-familiar style, looking completely unfazed by the 6,000 people seated in front of him (and another 11,000 watching over the web). Andy used his keynote to talk about the customer-focused view that AWS takes, and how feedback had driven both many of their new services and enhancements to existing ones.
He then introduced some customers to talk about their experiences. The first organisation to stand out was NASA, represented by the CTO of the Jet Propulsion Laboratory and a real-life rocket scientist. These guys were awesome. They talked with passion about how they used the AWS stack to stream the Mars rover landing. The ability to provision massive amounts of infrastructure on-demand for a peak window of 7 minutes while the world watched would have been just a dream a few years ago, yet now it is not only possible but also affordable.
Next up was the CEO of Netflix, Reed Hastings. Just before Reed walked on stage, Andy announced a 25% price drop across the board for all S3 storage. This was obviously a complete shock to Reed, who bluntly stated that the change would save them “a fortune”.
Reed Hastings has to be one of the most technical CEOs I have ever heard speak. He compared the cloud’s current state of evolution to his days writing assembler code and working with registers. Reed discussed the future technology trends he saw and commented that we are just at the start of this very disruptive wave of change in IT. Listening to what Netflix have used the platform for, and the maturity of their development principles (including the now-famous Chaos Monkey, which I’ll cover later), was truly inspiring.
On the home front, Australia’s Commonwealth bank also got a brief mention in the keynote for the massive cost savings they have seen ($20m) in moving some of their infrastructure to AWS.
Finally Andy announced the introduction of the latest AWS service offering: “RedShift”. RedShift provides a low-cost, high-performance data warehousing service and accommodates up to 1.6 petabytes of data per cluster. This to me is gold, as I have spent my whole career seeing large enterprises spend a massive amount of time, energy and cash on mediocre Data Warehouse solutions.
RedShift shows the seriousness with which AWS is treating enterprise customers and provides just another compelling reason for enterprises to start to move to the AWS cloud.
All that and it wasn’t even lunchtime yet!
The rest of the day was broken into technical/functional area deep-dives with about 15 sessions running in any one time slot. My first session was on Big Data and Amazon’s Elastic MapReduce (essentially on-demand Hadoop on EC2). There is no doubt that this is a massive growth area and the potential to gain deeper insights from the information we gather is phenomenal. This was a great session for me, as it answered some of my cloud-related architectural best-practice questions, and gave some good tips and tricks for getting going successfully with AWS Elastic MapReduce.
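The MapReduce model that EMR runs at datacentre scale can be sketched in a few lines of plain Python – a toy illustration only (real EMR jobs are distributed Hadoop jobs, and these function names are mine):

```python
from collections import Counter

# Toy MapReduce word count: the map phase emits (word, 1) pairs,
# the reduce phase sums the counts per word. Hadoop does the same,
# but partitioned across many machines.
def map_phase(lines):
    return ((word, 1) for line in lines for word in line.split())

def reduce_phase(pairs):
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["the cloud", "the future"]
result = reduce_phase(map_phase(lines))
# result == {'the': 2, 'cloud': 1, 'future': 1}
```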
In the afternoon my first session was on “AWS Database Tier Architectural Best Practices”. This session was a little disappointing, as it covered all the Amazon data storage options at a high level but without much technical detail. In short, it did not really give me much to take away.
Next up was “Building Web Scale Architectures with AWS”. This session was hosted by an Australian solution architect from Amazon, along with an appearance from James Hamilton (VP and Distinguished Engineer). They talked in-depth about the design patterns needed to build scalable and fault-tolerant solutions, particularly with AWS components.
This session reinforced the principles we have come to expect in best-practice software development: automated testing, automated deployments and loosely-coupled, stateless architectures. They focused on scalability at every layer, making the point that if you use the correct tools and build scalability in from the start, it’s not that hard to achieve.
One area which always gets forgotten in this arena is the database. Consequently it was good to see them discuss techniques for using sharding to distribute database operations. They also described how Amazon’s DynamoDB gives you sharding for free (I’ve put DynamoDB on my list of things to have a play with).
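The core of sharding is just a stable mapping from key to shard. A minimal sketch (the key names and shard count are hypothetical; DynamoDB does this partitioning for you behind the scenes):

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard using a stable hash.

    hashlib.md5 is used (rather than Python's built-in hash) so the
    mapping is the same across processes and restarts.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Route each write to one of four hypothetical database shards.
writes = ["user:1001", "user:1002", "order:77"]
routing = {key: shard_for(key, 4) for key in writes}
```

The same function run on any application server sends a given key to the same shard, which is what lets reads find the data that writes put there.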
Key to this presentation were the assertions that “all things fail all the time” and that the more scale you have, the more you should expect failure. This led neatly into an introduction of Chaos Monkey – an open-source tool written by Netflix which will randomly terminate EC2 instances in your AWS environment.
Netflix typically use Chaos Monkey in production during the middle of their working day. To me this is the ultimate software maturity model. It also means that they at least have some control over when an issue is likely to be exposed, and that all of their engineers will be there to look into anything that comes up.
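The idea is simple enough to sketch in a few lines. This is my own illustration of the concept, not Netflix’s implementation – note how it only acts during business hours, for exactly the reason above:

```python
import random

def chaos_monkey(instance_ids, hour, rng=random):
    """Pick one instance at random to terminate, but only during
    business hours, so engineers are around when something breaks.
    Returns the chosen instance id, or None if nothing is done."""
    if not instance_ids or not 9 <= hour < 17:
        return None
    return rng.choice(instance_ids)

# During the working day, one instance gets the chop...
victim = chaos_monkey(["i-123", "i-456", "i-789"], hour=11)

# ...but at 3am the monkey stays asleep.
untouched = chaos_monkey(["i-123"], hour=3)
```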
Next up was “The Best EC2 Features You Never Knew About”, which at least partly lived up to its name, as I did come away with a few improvements to the way we have set up EC2 instances to date. Specifically: custom monitoring metrics (and setting these up early); the ability to have multiple network interfaces on an EC2 instance; and finally, the ability to use a High Availability Auto Scaling cluster with only one instance, to ensure that even single-server architectures are always up and running.
This was a good session, and concluded with the development manager for the IT behind the “Obama for America” campaign. He talked about the massive infrastructure they were able to roll out using AWS to support the campaign, in particular the ground crew on election day.
Finally, after their infectious presentation at the keynote earlier that day, I had to get me some more of those NASA boys! This time they did a deep dive into NASA’s use of AWS, and talked at length about how they use multiple cloud providers to perform their work and how, when set up correctly, the environment is more secure than having your own hosted environment.
These guys use a lot of VPCs and are able to provide scientists with on-demand computing power to run simulations at low cost. The overwhelming conclusion for me was that if it’s good enough for NASA, then how can it not be good enough for the enterprises we deal with every day?
Day 3 started with a keynote from Amazon CTO Werner Vogels. Werner asserted that the key to any innovation is that you have to try lots of things, and are bound to fail more often than you succeed. He went on to proclaim that this makes AWS an enabler for innovation, because it allows you to innovate fast: scaling if you’re successful, or failing quickly and inexpensively if you’re not.
Werner went on to talk about his ‘tablets of success’ (or put differently, ‘commandments for the cloud’). Here he hit on some common themes for the whole conference: listening to your customers; the need for loosely coupled architectures; and automation of everything (realestate.com.au got a full slide here as a leader in this space).
At this point, an amazing stat came out: Amazon.com has a release into production every 11.6 seconds on average per weekday. That is real continuous delivery!
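Taken at face value, the arithmetic on that figure is striking (assuming releases continue around the clock on a weekday):

```python
# One production release every 11.6 seconds, over a 24-hour weekday.
seconds_per_day = 24 * 60 * 60
deploys_per_weekday = seconds_per_day / 11.6
print(round(deploys_per_weekday))  # on the order of 7,000+ releases a day
```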
Werner had customer names present on each of his ‘tablets’ – including Pinterest and Animoto. What was clear from this presentation is that amazon.com have learned a lot about fault-tolerance and scalable architectures over the past 18 years, and all of that knowledge has been pumped into AWS. Put differently, truly fault-tolerant and scalable architectures are no longer just the domain of the giants.
To finish, Werner first announced 2 new EC2 instance types aimed particularly at Big Data and high-performance computing, and then the new Data Pipeline service. With these and the RedShift announcement the day before, it’s clear that Amazon believe Big Data is going to be a hot topic for the next few years.
To me, Werner’s style also said a lot about the Amazon culture. He stood in front of the audience wearing jeans, a t-shirt and Volleys. His detailed technical knowledge was clear to all. This was no corporate CIO. It was clear that Werner’s an engineer and always will be.
Following from here I was back to the deep-dive sessions.
The first I opted for was “Packet Plumbing in the Amazon Virtual Private Cloud”. This was a deep technical session, and for me was a great explanation of some of the networking and security black magic that goes on in a VPC. As with most of the sessions, it was initially jam-packed. However, as the discussion got deeper and deeper we did lose some people. I’m not sure what they’d been expecting from a session with a title like that.
From here I headed off to “Failures at Scale and How to Ignore Them” – mainly because it was presented by James Hamilton. This to me was one of the best sessions of the whole conference.
James has been architecting and building fault-tolerant systems for a long time. He talked about the common myths of DR and posed the rather confronting question: “so if you have DR, are you willing to shut off production?”.
I don’t know anyone who would be game to do this in any organisation I have worked for, but at Amazon it’s something that they do on a regular basis just to check it still works. It’s like the ultimate Chaos Monkey.
James went on to talk about failures, stating that for an individual small application the chances of failure are small: around 2% for disk failure and 3% for server failure. However, at scale failure becomes a regular occurrence. By the time you have a full data centre you have servers and disks failing multiple times every day.
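The back-of-the-envelope arithmetic makes the point. A sketch, assuming the quoted 2% and 3% figures are annual failure rates and using a made-up fleet size (the talk didn’t state either, so both are my assumptions):

```python
# Expected daily failures for a fleet, given an annual failure rate.
def expected_failures_per_day(count: int, annual_rate: float) -> float:
    return count * annual_rate / 365

servers = 50_000        # hypothetical data-centre fleet size
disks = servers * 4     # hypothetical: four disks per server

server_failures = expected_failures_per_day(servers, 0.03)  # ~4 a day
disk_failures = expected_failures_per_day(disks, 0.02)      # ~11 a day
```

At one server the odds are comfortably ignorable; at fifty thousand, failure is a daily routine you have to design for.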
Also at scale, some of the more unlikely events start occurring. He talked of how, under certain circumstances in a data centre, the power can fail with an explosion which can look like a short (even though it’s not). Consequently, the backup generator does not get cut in, because if it actually were a short it would likely blow the generator – and generators are expensive.
James said that this is the logic in every power-controlling switch he has seen, and this exact failure scenario has occurred for amazon.com twice. So now they review the code of these switches in each data centre and ensure that their generators cut in no matter what – even if it risks blowing up the generator.
There was lots of great information in James’s presentation, but what kept spinning through my mind was this: any reluctance about moving to the cloud is crazy. By moving to AWS you have guys like James working for you and ensuring your business is properly prepared for disaster recovery. If anything the concern should be in not doing it soon enough.
For the final keynote of the conference Jeff Bezos (CEO) had a chat with Werner on stage. I was very excited to see Jeff and hear what he had to say. The fireside chat style was nicely done, but for me this wasn’t the greatest presentation of the conference. There were some nice anecdotes of running Lean Process in the amazon.com business, but Bezos ended up talking more about that than AWS. He did talk about some of his non-Amazon projects, such as the 10,000 Year Clock and Blue Origin (the vertical launch-and-landing rocket he hopes will one day make space travel affordable for the masses).
What was apparent once again was the customer-centric thinking which is part of Amazon’s DNA. One thing that did come up in Bezos’s keynote was the phrase “get the flywheel turning”. This mindset came up in a number of sessions throughout the conference – it is clearly a Bezos philosophy. The basic premise is this: instead of thinking purely in terms of profit and loss, Amazon think of a lifecycle:
- As more people use Amazon.com, AWS, etc,
- Amazon gains economies of scale, and
- With this scale come price cuts,
- Which in turn bring more people on to Amazon.com, AWS, etc
The flywheel concept also benefits from periodically introducing disturbance into the cycle. In the case of AWS, these disturbances come in the form of new service offerings or storage technologies (for example, Glacier). New features such as these also encourage more people to come on to the platform, thus accelerating the cycle. I think this is an awesome way of thinking, and one that could only come out of a low-margin, high-scale business.
My final session before I had to start making my way back to Australia was a more in-depth look at Amazon RedShift. In his keynote the previous day Andy Jassy had discussed a benchmark they had run at amazon.com comparing RedShift to their existing data warehouse. They found that RedShift ran ten times faster, at a fraction of the cost (actually it turned out that RedShift was sometimes up to 150 times faster, but even the AWS guys thought that was too outlandish to publish in their major keynote).
The fundamental driver behind RedShift is that for some time now databases have been IO-bound. Whilst CPU and memory performance is doubling every couple of years (in line with Moore’s Law), disk speed is only doubling every 10 years. The AWS team thus thought that if they could leverage the distributed storage already available in AWS for a column-store model, they could massively improve performance. Another good and informative session.
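Why a column store helps an IO-bound scan is easy to see with toy numbers (all made up for illustration): an aggregate over one column only has to read that column’s bytes, not every full row.

```python
# Toy comparison of bytes read for a full-table scan of one column.
rows = 1_000_000_000
row_width_bytes = 400      # hypothetical: 50 columns of 8 bytes each
column_width_bytes = 8     # the single column the query touches

row_store_scan = rows * row_width_bytes        # must read whole rows
column_store_scan = rows * column_width_bytes  # reads just one column

io_reduction = row_store_scan / column_store_scan  # 50x less disk IO
```

With disks the bottleneck, a 50x reduction in bytes read translates almost directly into query speed – before you even add the compression wins that come from storing similar values together.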
Leaving the conference and heading for the airport I reflected on the conference and the conversations I had over the 3 days. Whilst not all of the sessions were great (a common complaint I heard was the sessions were not technical enough) everyone was excited to be there and excited about the future of AWS. Amazon’s pace of innovation, quality of people and business model make it hard to believe that anyone is going to be able to challenge them for at least a few years.
Consequently, I am certain that AWS is going to change IT as we know it. It’s now a matter of when, not if.