Deployment Alerting using AWS Services


While we would all like to think everything we do is perfect, mistakes happen from time to time. People can select the wrong versions for deployment by mistake, bugs in the infrastructure we build can result in incorrect artefacts being deployed, or even breakdowns in communication can result in the wrong application being deployed at the wrong time. The only thing worse than the wrong version of an application ending up in production is when nobody notices it for several weeks, until after serious damage has been done.

In this post, I’m going to show you how you can use AWS cloud native solutions to reduce these sorts of issues by increasing the visibility of deployments across an entire environment.

The Problems with Traditional Deployment Alerting

Most traditional methods of deployment monitoring send alerts when a CI service such as Azure DevOps or Jenkins finishes deploying an artefact, or when the service starts on the platform running it. Unfortunately, neither approach handles blue-green deployments properly.

Alerting when a CI service finishes deployments will only tell you when a new version is launched as an inactive service – it can’t easily tell you when the service becomes live, nor if it gets rolled back, both of which are very important scenarios you would want to be informed about. And while alerting when a service starts can inform you when it gets rolled back, it also results in a lot of noise in a cloud environment where services can scale up or be replaced at any time. In cases where an inactive service is warmed up well before going live, possibly even with beta testing, alerting when a service starts also can’t inform you when it starts accepting live traffic.

Fortunately, there’s a simple solution which can solve these problems and be quickly rolled out across your entire environment!

The Solution – Monitoring “Live” Versions

Instead of alerting on deployment or service start, a simple serverless app can be built which reads version information from the live apps themselves. The app only needs to read an endpoint on each app every few minutes, use a data store to keep track of the version history, and detect when the version changes. This app can then send alerts to your preferred notification channel, be that Slack, Teams, or even good-old-fashioned emails.

As this approach will always read the version endpoint of the live apps, it can cover blue-green deployments and rollbacks perfectly. It’s also technology-agnostic, giving you the freedom to use whatever continuous integration service and runtime platform you like. Plus, it allows deployment alerting to be handled in a single place across the whole organisation, making it significantly easier to maintain when compared to traditional alerting methods.

In this post, I hope to provide you with guidance so you can build a deployment alerting solution of your own.

Step one: Set up “info” endpoints

The first step is to configure “info” endpoints on each application which you want to monitor. These endpoints can be very simple, as the only thing we’re interested in is the version the app is currently running.

If you can configure these endpoints to be as consistent as possible, such as all being GET requests which return back a JSON payload containing a version property, it will make the rest of the work much simpler.
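
As a rough illustration, an "info" endpoint in Node.js could look something like the sketch below. The /info path, the APP_VERSION environment variable and the function names are assumptions made for this example, not a required convention; your apps may expose the version differently.

```javascript
// Hypothetical version source: an environment variable set at build or deploy
// time, with a fallback so the sketch runs anywhere.
const APP_VERSION = process.env.APP_VERSION || '0.0.0-dev';

// Builds the JSON payload returned by the "info" endpoint.
function buildInfoPayload(version) {
    return { version };
}

// Request handler: GET /info returns the version, anything else is a 404.
function handleRequest(req, res) {
    if (req.method === 'GET' && req.url === '/info') {
        res.writeHead(200, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify(buildInfoPayload(APP_VERSION)));
        return;
    }
    res.writeHead(404, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ error: 'Not found' }));
}

// To serve it with Node's built-in http module:
//   require('node:http').createServer(handleRequest).listen(3000);
```

Keeping the payload down to a single version property means the alerting Lambda can treat every service the same way.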

This step is likely to consume the most time, especially if you have a large number of services.

Step two: Create a service manifest

Next, you will need a manifest of all your services so the alerting solution knows what to look for. This should contain information such as:

  • The name of the application
  • URI to retrieve the version information
  • Details of any authentication required to hit the version info endpoint

You may also need to include details on how to parse the response if you’re unable to make all endpoints consistent, such as the path to the version attribute inside the response.

In most cases a YAML file in an S3 bucket should be sufficient to store this information, but depending on your environment you may have another solution.
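
To make this more concrete, here is a sketch of what the manifest entries might look like once the YAML file has been downloaded and parsed, along with a small helper for reading the version from inconsistent responses. The field names (infoUrl, versionPath, auth), the service names and the URLs are all hypothetical; use whatever shape suits your environment.

```javascript
// Hypothetical manifest entries, shown as they might look after parsing the
// YAML file. All names and URLs are placeholders.
const exampleServices = [
    {
        name: 'orders-api',
        infoUrl: 'https://orders.example.com/info',
        // No versionPath: the response is assumed to be { "version": "..." }
    },
    {
        name: 'legacy-billing',
        infoUrl: 'https://billing.example.com/status',
        versionPath: 'build.version',
        // Points at a secret rather than storing the credential in the manifest
        auth: { type: 'bearer', secretName: 'billing-info-token' },
    },
];

// Walks a dot-separated path through a parsed JSON response body.
// Returns undefined if any segment along the path is missing.
function extractVersion(responseBody, versionPath = 'version') {
    return versionPath
        .split('.')
        .reduce((node, key) => (node == null ? undefined : node[key]), responseBody);
}
```

With this in place, a call such as extractVersion(body, service.versionPath) works for both entries, because an absent versionPath falls back to the top-level version property.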

Step three: Configure the DynamoDB table

The next step is to create a table in DynamoDB to hold the app version info. This table will not only be used by the alerting solution to determine if a version has changed, it will also maintain the history of versions deployed, which will come in handy if auditing ever has to occur.

An example of a table structure for app version history. appName is the partition key, while deployed is the sort key
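
If you would rather define the table in code, the parameters below are a minimal sketch of that structure in the shape expected by DynamoDB's CreateTable operation. The table name, the use of string attributes for both keys and the on-demand billing mode are my assumptions, so adjust them to your needs.

```javascript
// Builds CreateTable parameters for the version history table.
// The default table name is a placeholder.
function buildVersionTableParams(tableName = 'app-version-history') {
    return {
        TableName: tableName,
        AttributeDefinitions: [
            { AttributeName: 'appName', AttributeType: 'S' },
            // Assumes fixed-format ISO 8601 UTC timestamps (as produced by
            // Date.prototype.toISOString), which sort chronologically as strings
            { AttributeName: 'deployed', AttributeType: 'S' },
        ],
        KeySchema: [
            { AttributeName: 'appName', KeyType: 'HASH' },   // partition key
            { AttributeName: 'deployed', KeyType: 'RANGE' }, // sort key
        ],
        BillingMode: 'PAY_PER_REQUEST',
    };
}

// Example item; "version" is a non-key attribute, so it needs no definition.
const exampleItem = {
    appName: 'orders-api',
    deployed: '2024-01-15T03:20:00.000Z',
    version: '1.4.2',
};
```

Because deployed is the sort key, every version change for an app becomes its own row, which is what preserves the history.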

Step four: Create the Lambda

Now that we have all the building blocks in place, we need to create a lambda in AWS to wire it all together. The lambda has to perform the following simple steps:

  1. Fetch the services manifest
  2. Iterate over each service in the manifest, hitting the info endpoint to retrieve the current live version
  3. Retrieve the previously live version of the service from the DynamoDB table
  4. Compare the live version with the previously live version. If the versions differ, send a message to your preferred notification platform
  5. If a version change was identified, save the new version in the DynamoDB table

Here is a simplified example of how this could be done in NodeJS:

async function identifyAndAlertVersionChanges() {
    // Download the manifest from S3 and retrieve the array of services
    const services = await getServicesFromManifest();
    // The callback must be async so that await can be used inside it;
    // map() then returns one promise per service
    const serviceResults = services.map(async (service) => {
        // Hit the "info" endpoint of the service to retrieve the version
        const appVersion = await fetchCurrentAppVersion(service);
        // Fetch the previous version from DynamoDB. The first run for each service will have
        // no result in DynamoDB. You could either manually populate the initial values in the table
        // or include logic here to always save the version if Dynamo is empty
        const previousVersion = await getPreviousVersionFromDynamo(service);
        if (appVersion !== previousVersion) {
            const notificationMessage = `Version ${appVersion} of ${service.name} has been deployed! Replaces ${previousVersion}`;
            await sendVersionUpdateNotification(notificationMessage);
            await saveNewVersionInDynamo(service, appVersion);
        }
    });
    // Wait for every service check to finish
    await Promise.all(serviceResults);
}
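
The DynamoDB helpers are left undefined in the example above. One possible way to implement them, assuming the appName and deployed key structure described earlier, is to build the request parameters as shown in this sketch. The table name and function names are hypothetical, and the parameter shapes assume the document-client style of the AWS SDK (plain JavaScript values rather than typed attribute values).

```javascript
// Hypothetical table name; match it to your own table.
const TABLE_NAME = 'app-version-history';

// Query parameters that return only the most recent record for one app.
function buildLatestVersionQuery(appName) {
    return {
        TableName: TABLE_NAME,
        KeyConditionExpression: 'appName = :appName',
        ExpressionAttributeValues: { ':appName': appName },
        ScanIndexForward: false, // descending sort key order, so newest first
        Limit: 1,
    };
}

// Put parameters for recording a newly detected version.
function buildVersionPut(appName, version, detectedAt = new Date()) {
    return {
        TableName: TABLE_NAME,
        Item: { appName, deployed: detectedAt.toISOString(), version },
    };
}

// Picks the version out of a Query result; undefined on the first run,
// when no record exists for the app yet.
function previousVersionFrom(queryResult) {
    const items = queryResult.Items || [];
    return items.length > 0 ? items[0].version : undefined;
}
```

Note that the deployed value recorded this way is really the time the change was detected, which can lag the actual switch-over by up to one polling interval.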

Step five: Schedule the lambda

Now that we have set up a lambda which monitors and alerts on version changes, the only thing left is to run it on a regular basis. For this we can use a tool such as Amazon EventBridge Scheduler, which allows the lambda to be triggered at a configured rate. It is important not to trigger the lambda too frequently, to avoid putting unnecessary load on the services or your infrastructure. However, you should also make sure it is triggered frequently enough that you are notified of deployments in a timely manner.

I’ve found that every 5 minutes usually strikes a good balance, but you may find it needs to be more or less frequent depending on your own setup.
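
As an illustration, the schedule could be described with parameters like the ones below, which follow the shape of EventBridge Scheduler's CreateSchedule operation. This is only a sketch: the schedule name is made up, and the Lambda and IAM role ARNs are placeholders you would need to supply yourself.

```javascript
// Builds parameters for EventBridge Scheduler's CreateSchedule operation.
// The schedule name is a placeholder; both ARNs must be supplied by the caller.
function buildScheduleParams({ lambdaArn, roleArn, minutes = 5 }) {
    if (!Number.isInteger(minutes) || minutes < 1) {
        throw new RangeError('minutes must be a positive integer');
    }
    return {
        Name: 'deployment-version-check',
        // Rate expression: run once every N minutes
        ScheduleExpression: `rate(${minutes} minutes)`,
        // Run at the scheduled time rather than anywhere within a window
        FlexibleTimeWindow: { Mode: 'OFF' },
        // The role must allow the scheduler to invoke the Lambda
        Target: { Arn: lambdaArn, RoleArn: roleArn },
    };
}
```

Changing the polling frequency is then a matter of passing a different minutes value.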

Conclusion

In this post, I have shown how to record and alert upon deployments across an entire environment using AWS cloud native solutions. By sending a message to a wide group of people any time a version change is detected, you make it far more likely that any deployment mistake is quickly identified and resolved, hopefully well before it results in serious issues!

francis.greatorex@shinesolutions.com

Associate Principal Software Engineer (SRE Specialty) at Shine Solutions
