I’m back at re:Invent 3 years after the inaugural conference and I’m keen to know what changed, not in the offering of the platform but more about what people are doing with it and what other technology trends there are concerning the cloud.
As we turn up to the partner keynote (the day before the main conference starts), the first thing that is apparent is the sheer scale of the event. Last time around the partner keynote was in a smallish room with a few hundred people. This year there are 6,000 partners in the room – as many people as attended the whole conference in 2012.
On to the main conference itself, where the announcements show Amazon remains committed to features that help startups and emerging technology areas such as IoT (Internet of Things). But equally, these days, there are a number of features aimed directly at the enterprise and at some of the existing enterprise providers. There are by now many articles covering the individual announcements, so I’m not going to go into much detail on these. For a deeper interpretation of the enterprise announcements, check out the blog post Welcome to Starship Enterprise.
One of the key technical trends of the conference got its first mention just 45 minutes into the first keynote. Microservices architectures were quickly followed by DevOps, and these two topics came up time and time again: from customers speaking in keynotes (and trying hard to make it look like they knew what either meant) to detailed sessions with some of the leaders in this field, such as Netflix.
So how are they being used? Are they having an impact? And why is this important to the cloud and, in particular, AWS?
From the customer presentations in the keynotes to a number of breakout sessions and hands-on labs, a recurring theme is that DevOps is being embraced by all. Whilst I know this is not surprising, what was a little more intriguing was that nearly all of the presentations mentioning DevOps made a pitch for people in the audience to apply for jobs with them. The cloud really provides an opportunity to take DevOps to the next level and automate everything. However, it is also clear that DevOps means many different things, and nearly all presentations discussing DevOps started with what it means to the presenters.
Definition 1 – Automate everything
Lots of sessions talked about automation and the services AWS have provided to support it. For example, CodeCommit, CodePipeline and CodeDeploy support developers in fully automating their continuous integration model. Newly announced services such as Inspector and Config Rules then take automation to the next level, performing security checks and raising alarms through the same automated pipelines.
Another key to automation is automation of recovery. A great quote from the conference was “don’t name your instances – they are cattle, not pets.” A certainty when dealing with EC2 instances is that they are going to die at some point – maybe in the next minute, maybe in the next year – but they will die, so you need to build this into your design and automate the recovery (luckily AWS give you lots of tools to help).
Definition 2 – DevOps is a culture of 100% ownership
Automation is critical and a key to everything DevOps, but it’s not the only thing. The best definition I heard at re:Invent was by Dave Hahn (@relix42) from Netflix, who described their DevOps culture. This entails 100% ownership, from coding, testing and deploying through to running and support – all by a single team. “When something goes wrong you don’t want the team that have had a week’s training, you want the world experts in that service on call.” This resonates really well with me and is a belief we have long held at Shine. The critical thing with this model is the feedback loop it provides: there is nothing like being woken at 2am to investigate a bug you introduced to ensure you never do that again. Dave stated that at Netflix they have a rule that any unexpected issue can happen once and only once.
Whatever your definition of DevOps, in the cloud certain things are inevitable: your instances will die; you will share resources; the architecture will change; and you never see the lights, so you need a different operating model to how it’s been run before.
Monitoring and metrics
Another complexity of the cloud is that you now have more monitoring metrics at your disposal than ever before. “Collecting data is cheap; not having it when you need it is expensive…” But how do you deal with this level of information? Finding the signal in the noise is now the problem.
A presentation by Datadog DevOps evangelist Matthew Williams gave some good guidance, essentially stating that there is a difference between the events you “alarm on” and those you “page on”. You should only page on “work metrics” and not on “resource metrics”, where work metrics are the results of system behaviour, such as response rates, error rates or latency. You may alarm on resource events such as CPU, but they don’t necessarily indicate an issue.
One thing is for certain: DevOps is here to stay, and it is the only way to make the most of the cloud.
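The page/alarm split described above can be sketched as a small routing function. This is only an illustration of the idea, not Datadog's API: the metric names, the `Breach` type and the `route` function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical metric names, chosen for illustration only.
WORK_METRICS = {"error_rate", "latency_p99_ms", "requests_per_second"}
RESOURCE_METRICS = {"cpu_percent", "memory_percent", "disk_percent"}


@dataclass(frozen=True)
class Breach:
    """A metric that has crossed its threshold."""
    metric: str
    value: float
    threshold: float


def route(breach: Breach) -> str:
    """Return 'page' for work metrics and 'alarm' for resource metrics."""
    if breach.metric in WORK_METRICS:
        return "page"    # users are affected: wake someone up
    if breach.metric in RESOURCE_METRICS:
        return "alarm"   # record it for investigation, but do not page
    return "ignore"      # unknown metric: nothing to do in this sketch
```

With these assumptions, a breached `error_rate` pages the on-call engineer, while a breached `cpu_percent` only raises an alarm to look at later.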
Possibly the only topic to rival DevOps in terms of coverage and interest level was microservices. It seems to be an architecture that has been adopted by many, if not all, of the startup community and is now working its way into more enterprise system designs. Anyone who knows me will know I’m skeptical about the “Microservices for everything” approach that seems to be the trend at the moment. However, seeing some of the large product development organisations such as Amazon and Netflix discuss how they use microservices provides an insight into some of the real advantages of this architecture. At the development scale of these large product organisations, the breakdown into microservices allows innovation and delivery speed that would be difficult with other architectures. As an example, Amazon made 50,000,000 deployments last year; that’s an average of roughly 1.6 deployments every second. Another well-loved feature of microservices is language independence between services. This supports the culture of “Freedom and Responsibility” which is sought by leading digital organisations such as Netflix and Amazon and is needed to attract the best engineering talent.
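As a quick sanity check on that deployment rate, here is the arithmetic for 50 million deployments spread evenly over a year. The even spread and the 365-day year are simplifying assumptions of mine, not figures from the talk.

```python
DEPLOYMENTS_PER_YEAR = 50_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, ignoring leap years

per_second = DEPLOYMENTS_PER_YEAR / SECONDS_PER_YEAR
seconds_between = SECONDS_PER_YEAR / DEPLOYMENTS_PER_YEAR

print(f"{per_second:.2f} deployments per second")            # 1.59
print(f"{seconds_between:.2f} seconds between deployments")  # 0.63
```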
One of the other advantages of a microservices architecture, which was discussed a great deal, is the ability to provide graceful degradation strategies at a granular level for services. Any service that is not performing at a particular point in time needs a strategy for protecting the end user from that degradation. Dave Hahn from Netflix discussed a detailed example of this with the LOLOM (list of lists of movies) service used at Netflix to return the personalised list of titles to show a Netflix user when they log in. If this service is performing poorly they have many fallback options, including returning a default list of titles, re-routing the service to another region, and a number of others depending on the underlying cause. This approach also helps Netflix to user-test new ideas on a sample set of customers by switching which service implementation they use for a particular user.
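The fallback idea can be sketched as an ordered chain of sources, ending in a static default. This is a minimal illustration and not Netflix's implementation: `first_available`, `personalised`, `other_region` and the placeholder titles are all hypothetical.

```python
from typing import Callable, List, Sequence

Titles = List[str]

# Placeholder data standing in for a non-personalised default list.
DEFAULT_TITLES: Titles = ["Popular title A", "Popular title B"]


def first_available(user_id: str,
                    sources: Sequence[Callable[[str], Titles]]) -> Titles:
    """Try each source in order; fall back to the static default list."""
    for source in sources:
        try:
            titles = source(user_id)
            if titles:
                return titles
        except Exception:
            # A failing source must not break the page; try the next one.
            continue
    return DEFAULT_TITLES


def personalised(user_id: str) -> Titles:
    """Simulates the primary service while it is degraded."""
    raise TimeoutError("personalisation service is degraded")


def other_region(user_id: str) -> Titles:
    """Simulates the same service answering from another region."""
    return [f"Regional pick for {user_id}"]
```

Here `first_available("u1", [personalised, other_region])` skips the failing primary and returns the regional result, and if every source fails the user still sees the default list rather than an error.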
One issue that all organisations with an extensive microservices architecture seem to have to overcome is that of service registry and discovery. It’s no good having a large number of business services if no one knows what they are or who is using them. Netflix have open sourced a number of products they use to help them: service discovery through Eureka, distributed configuration through Archaius, and resilient and intelligent inter-process and service communication through Ribbon. To provide reliability beyond single service calls, Hystrix was created to provide latency isolation and fault tolerance at runtime.
Since microservices and Docker containers also seem to go hand in hand, Amazon announced that the EC2 Container Registry is coming soon to help address some of the service registry complexity.
The other big excitement in the microservices space is serverless systems. To quote Werner Vogels: “No server is easier to maintain than no server.” There were a number of presentations on building microservices architectures on Lambda, and the announcements of Lambda for Python and Lambda scheduled tasks were met with great excitement (although I was hoping for Lambda for Australia).
The move to serverless systems fulfils Reed Hastings’s prediction back in 2012, when he said the cloud would move us away from managing servers or VMs as we do today. Leading cloud providers would handle that for us, and we would no longer need to concern ourselves with the type and number of servers needed to run our services. Well, that pretty much describes Lambda.
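The pattern behind this kind of fault isolation is usually called a circuit breaker. The sketch below shows the general idea in Python; it is not Hystrix's API (Hystrix is a Java library), and the class name, parameters and defaults are my own assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; retries after `reset_after` seconds."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()      # open: fail fast, skip the remote call
            self.opened_at = None      # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # (re)open the circuit
            return fallback()
        self.failures = 0              # success closes the circuit
        return result
```

While the circuit is open, callers get the fallback immediately instead of waiting on a struggling dependency, which is what stops one slow service from dragging down everything that calls it.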
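To give a feel for how little there is to manage, here is a minimal sketch of a Python Lambda function. The `(event, context)` handler signature is the one Lambda uses for Python, and scheduled invocations arrive with a `source` of `aws.events`. The `name` field and the returned dictionaries are hypothetical, chosen only for illustration.

```python
from typing import Any, Dict


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Entry point that Lambda invokes; there is no server process to manage."""
    if event.get("source") == "aws.events":
        # Scheduled invocation: this is where a periodic job would run.
        return {"status": "ok", "trigger": "schedule"}
    # Any other invocation: greet the hypothetical 'name' field.
    name = event.get("name", "world")
    return {"status": "ok", "message": f"Hello, {name}"}
```

The function can be exercised locally by calling `handler({"name": "re:Invent"}, None)`, with no instance type or server count anywhere in sight.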
Whilst microservices and DevOps were my main focus, some other themes are worth a quick mention:
IoT (Internet of Things)
This remains a hot topic and a passion of Werner Vogels. Whilst the stereotypical example of the internet fridge is neither a reality nor ever likely to be, there are big movements in this space. One of the most interesting presentations on this was by John Deere, where microservices were embedded into farming machines to improve efficiency and yield in the agricultural sector. Whilst I was disappointed that the presenter did not ride onto the stage on a stallion, it did leave the impression that we will see more and more of our processes and tasks instrumented, and that we will need to work out what to do with all that data later. With Amazon’s announcement of its managed IoT platform we should see the adoption and pace of innovation in this space accelerate – expect to see an internet-enabled bin or fish tank available soon.
Real Time Streaming
With the announcements of Kinesis Firehose, and Kinesis Analytics to come, it is clear that AWS are committed to data streaming and its promise of real-time data analytics. I attended a great presentation by a guy from AdRoll on how they used Kinesis and Apache Storm to process click and impression data for their real-time ad-bidding system. Some impressive stats: they process 200 million impressions per day and have 300 servers generating logs at 700 records per second. They went from processing this data with a 30-minute delay to almost real time using Kinesis. This is a similar project to one we have run on the Google platform (see Test Driving Google Data Flow), and these new features released by Amazon, together with those stats, make a compelling case for the Kinesis platform.
Making Mobile easier
With many customers talking about how they are moving their digital workloads to AWS, and in particular mobile apps, Amazon just made the whole thing easier with the beta release of AWS Mobile Hub. The tagline was making it easier for mobile developers to build apps without worrying about the server side. It is a bold statement, and it will be interesting to see how successful they are in this goal.
All in all a great conference but the “vibe” to me was more valuable than any one session. Seeing the trends in the industry and in particular the cloud space was really interesting. Also, the continued pace of innovation by Amazon is incredible – there is a clear commitment to making development for engineers easier by taking away some of the heavy lifting and letting the developers concentrate on product development and innovation.
It’s also clear this year that Amazon is as focused on the Enterprise as it is on the startup and has some of the large tech incumbents in its sights with Oracle in particular about to come under intense pressure on cost and innovation.