Mastering AWS CloudFormation


This post is for those who are working with AWS CloudFormation on a daily basis, for those who are just getting into CloudFormation, or for the curious mind.

CloudFormation is AWS’ way of doing infrastructure as code (IaC). It’s been around since early 2011 and is still very popular among the DevOps community, even though competitor products like HashiCorp’s Terraform are also on the rise.

AWS also offers the Serverless Application Model (SAM) and Cloud Development Kit (CDK) frameworks as optional abstraction layers on top of CloudFormation. These frameworks help lower the barrier to entry for CloudFormation, especially for software developers.

Throughout the years, I have worked on many different projects where AWS CloudFormation was used to manage infrastructure and applications. In this post I’m going to talk about some of the things I’ve learnt along the way, recommendations for best practices, and CloudFormation features I’ve found to be particularly useful.

Standards

You can’t use CloudFormation at scale without first establishing some procedures to ensure your templates are correct and meet project standards.

Validating CF templates

Nothing is more demoralising during a deployment than experiencing a failure caused by a trivial error in a template. Validating your templates before you deploy them can save you a lot of time and energy because you’ll catch errors sooner and it’ll be easier to figure out where they happened.

The easiest way to check if your template contains errors or not is by running the command line tool cfn-lint against your template. Cfn-lint is open-source and is the official CloudFormation linter tool from AWS. It allows you to validate AWS CloudFormation templates against the AWS CloudFormation Resource Specification.

It is available via the Python Package Index (PyPI) and can be installed in any environment where Python is available. The latest version of cfn-lint can be installed with the command:

# pip install cfn-lint

After the installation you can run cfn-lint on the command line against your template

# cfn-lint cloudformation/template.yaml

Cfn-lint will return a zero exit code if no issues have been found with your CloudFormation template. More information about cfn-lint can be found on GitHub.

Another linter for AWS CloudFormation templates is cfn_nag. This is a very powerful linter with a focus on security. It scans the template for any insecure resource configurations and will let you know if your Security Group is misconfigured, if encryption is disabled or if the template is not following security best practices.

Cfn_nag is open-source and written in Ruby, so it requires a Ruby environment to install & run it. It is available via RubyGems as the cfn-nag gem, and the latest version can be installed with the command:

# gem install cfn-nag

After the installation you can simply run cfn_nag on the command line against your template

# cfn_nag cloudformation/template.yaml

A non-zero exit code is returned if the security validation of your template has failed. More information about cfn_nag can be found on GitHub.

For those who are using AWS CloudFormation to deploy AWS Step Functions, I recommend using the tool statelint to validate your state machine definitions. Statelint is also an open-source AWS tool available via RubyGems.

# gem install statelint

Validating your state machine definitions can be done via the command:

# statelint statemachine/template.json

Integrating these linters into your development lifecycle provides you with the confidence that your CloudFormation templates are standardised, safe and working. You will have less to worry about and isn’t it great to not worry about infrastructure? Add them as a git hook before you commit changes or add them to your CI/CD pipeline. Run the validation on each Pull Request to verify that no misconfigured CloudFormation template makes it into your branch.

But validation is only effective if taken seriously. One recommendation I always make is to act on what the linters report and not ignore it. Why bother integrating them if you still ignore what they have to tell you?
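As a rough sketch of the CI/CD integration described above, the following workflow runs both linters on every pull request. It assumes GitHub Actions as the CI system, which the post does not prescribe; the workflow file name, job name and Ruby version are illustrative, and the lint commands are the ones shown earlier.

```yaml
# .github/workflows/lint-cloudformation.yml (hypothetical file name)
name: lint-cloudformation
on: pull_request

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # cfn-lint needs a Python environment
      - uses: actions/setup-python@v5
        with:
          python-version: "3.x"
      - run: pip install cfn-lint
      - run: cfn-lint cloudformation/template.yaml

      # cfn_nag needs a Ruby environment
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: "3.2"
      - run: gem install cfn-nag
      - run: cfn_nag cloudformation/template.yaml
```

Because a step with a non-zero exit code fails the job, a template with lint findings would block the pull request check in this setup.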

Provide a logical structure

Structuring your CloudFormation templates will make them easier to read & understand and give you more flexibility when updating your CloudFormation stacks.

I usually like to structure my templates logically but everyone has their own definition of a logical structure so here is my definition…

  • Group resources within the template by service & resource type
  • Separate resources across templates based on their logical utilisation

If you’re writing your templates in YAML, I recommend using comments to support the grouping of resources within the template.

I usually start my templates like this:

Resources:
  ########################
  # IAM Resources
  ########################
  # IAM Roles
[..]
  # Managed Policies
[..]

  ########################
  # S3 Resources
  ########################
  # S3 Bucket Topic 
[..]
  # S3 Bucket Policy
[..]

This allows me to skim the template and understand it even months or years after I last touched it.

Separating resources across multiple templates based on their logical utilisation is a very tough one because there is no simple answer to it. I believe there are many ways to structure your resources across multiple templates but the following questions help me to understand how:

  • Do I have any reusable resources?
  • What is the logical relationship of my resources?

I like to treat any resource with a 1:n relationship and no prior dependency on other resources as a reusable resource. I define those resources as early as possible in my CloudFormation stack, ideally in a prerequisite stack, which can be created once and then reused as many times as needed. I’ll talk more about the usage of a prerequisite stack in the next section, but some of those resources could be S3 Buckets, Cognito or IAM resources.

Identifying the logical relationship of resources requires you to fully understand all resources and how they are related to each other. For example, if you have resources related to SNS / SQS think about moving them into a messaging stack, if you have resources related to an API Gateway, think about moving those into an API Stack, if you have resources related to a business operation move them into their own stack, and so on.

Modularity & Reusability

I have heard people complain that AWS CloudFormation is not modular, which I don’t necessarily agree with. The better I understand AWS CloudFormation, the more modularity I see in it. I do agree, though, that its modularity might not be as straightforward as in other solutions and you may have to think a bit more creatively.

Using a prerequisite stack

The idea behind utilising a prerequisite stack is based on Cross-Stack References. The concept is that shared resources are managed in a prerequisite stack and are consumed by the main stack. By managing a single prerequisite stack and multiple main stacks, you establish a 1:n relationship. Segregating your stacks provides you with several advantages:

  • Modularity of your CloudFormation stack
  • Permission-Based Resource Segregation
  • Reduced deployment time
  • Cost reduction due to resource reusability
  • Increased flexibility developing and testing your application

Examples of such resources are S3 buckets, IAM Roles, EC2 Security Groups, Cognito resources and many more. To reference resources from your prerequisite stack in your main stack, make use of Import/Export or AWS SSM Parameter Store.

In this example, resources are segregated based on permission & reusability. The main stack is managed by the application team and consumes shared resources from various stacks. The prerequisite stack is also owned by the application team and contains all shared application resources like Cognito. Meanwhile, the Permission & Security stacks are managed by the responsible teams within the organisation.

This approach optimises development & deployment time, as only application resources are managed in the main stack, which can be created as many times as needed. This allows developers to test their own changes without interrupting other developers, while reusing the same authentication configuration from Cognito. This setup may also allow you to do a blue/green deployment.

Nested stack

CloudFormation allows you to define CloudFormation stacks within a CloudFormation template; these are called nested stacks. Nested stacks take care of stack dependencies during deployment & deletion, allow you to modularise your CloudFormation templates, and let you refer to stack outputs without exporting them. If no dependency is defined, nested stacks are deployed simultaneously.

This diagram is an example of defining a nested stack in CloudFormation and the modularity you can achieve with it.

We defined two RDS Stacks and one EC2 Stack, highlighted in yellow. Both RDS Stacks are created simultaneously and use the RDS template, highlighted in pink, to create the RDS instances. They subsequently use the Secrets Manager & SSM templates to store information like username/password and resource names. RDS Stack 1 generates one Output, highlighted in green, with the name Endpoint. By passing the output Endpoint as an input parameter to the EC2 Stack, we define a dependency on RDS Stack 1, and therefore the EC2 Stack gets created after RDS Stack 1 has been successfully created.

The templates in pink contain the actual AWS resource definitions, while the templates in yellow contain references to these templates. A big advantage is user-friendliness: people using the templates don’t have to worry about AWS resource definitions, compliance and standards, as those have already been defined in the templates.

Defined stack outputs are accessible within the nested stacks without exports. You can access the output of a nested stack, defined within the same template, by referring to the stack name & output name using the intrinsic function Fn::GetAtt. In the diagram, the EC2 stack is consuming the output Endpoint from RDS Stack 1 via the main stack, by referring to the output as an input parameter for the EC2 stack.

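Resources:
  RDSStack1:
    Type: AWS::CloudFormation::Stack
    [..]
  EC2Stack:
    Type: AWS::CloudFormation::Stack
    Properties:
      [..]
      Parameters:
        [..]
        # Referring to the RDSStack1 output makes EC2Stack depend on RDSStack1
        RdsEndpoint: !GetAtt RDSStack1.Outputs.Endpoint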

For advanced CloudFormation users: manage your AWS CloudFormation templates in their own GitHub repository and use git submodule to load the necessary template repository into your git project, as AWS does with its AWS Quick Start projects.

Sharing CloudFormation Stack information

AWS CloudFormation allows you to share stack information across CloudFormation stacks within the same AWS account & region by using Export/Import & AWS SSM.

Export & Import Stack Outputs

You can export defined stack outputs in your stack so they become available to import for other stacks within the same AWS account & region. To do so, you need to define the Export field in your stack output and provide an export name that is unique within the region. To import an exported output in another stack, pass the export name to the intrinsic function Fn::ImportValue. One important thing to note about exports: you cannot delete a stack with a defined export while the export is still imported by another stack, which adds another layer of stack dependency.

As in the prerequisite stack example, the IAM Roles stack has a stack output with an export prereq-dev-api-gw-role-arn defined. This export is imported in all API Gateway stacks of the various main stacks. Updating the export name or deleting the export is not possible as long as it is imported by one of the API Gateway stacks.

I personally like to use exports to share stack information within CloudFormation as they are easy to define and you usually don’t have to worry about compliance standards other than the export name. Though I would make the following recommendations for the usage of export/import in your stacks…

  • Avoid using common names as export names, as export names must be unique per account & region.
  • Use export names containing at least the stack & environment name e.g. rdsstack1-dev-endpoint
  • Avoid hardcoding the export names in your imports & parameterise as much as possible instead.
  • When updating an export name, consider creating a new export with the new name to avoid stack dependency conflicts.
  • When deleting an export, remove the import, run a stack update, remove the export & run a stack update.
  • Exports are very useful when you can deploy multiple versions of your stack within the same account e.g. for each branch or blue/green deployments.
  • Avoid using exports when you have only one version of your stack which you keep updating e.g. rolling deployment.
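To make the export/import mechanics concrete, here is a minimal sketch of two separate templates in one listing, split by a YAML document separator. The Environment parameter, the logical IDs and the use of AWS::ApiGateway::Account as the consumer are assumptions made for illustration; only the export name pattern prereq-dev-api-gw-role-arn comes from the example above.

```yaml
# Template 1: IAM Roles (prerequisite) stack, exports the role ARN
Parameters:
  Environment:          # hypothetical parameter, keeps the export name unhardcoded
    Type: String
    Default: dev
Resources:
  ApiGatewayRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: apigateway.amazonaws.com
            Action: sts:AssumeRole
Outputs:
  ApiGatewayRoleArn:
    Value: !GetAtt ApiGatewayRole.Arn
    Export:
      # Resolves to prereq-dev-api-gw-role-arn for Environment=dev
      Name: !Sub prereq-${Environment}-api-gw-role-arn
---
# Template 2: API Gateway stack, imports the exported role ARN
Parameters:
  Environment:
    Type: String
    Default: dev
Resources:
  ApiAccount:           # illustrative consumer of the role ARN
    Type: AWS::ApiGateway::Account
    Properties:
      CloudWatchRoleArn:
        # Long form is used because the import name is built with !Sub
        Fn::ImportValue: !Sub prereq-${Environment}-api-gw-role-arn
```

As long as the second stack exists, CloudFormation will refuse to delete the first stack or change that export, which is the dependency described above.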

AWS Systems Manager Parameter Store

You can also use AWS SSM Parameter Store to share resource information across CloudFormation stacks within the same account & region. Using dynamic references you can resolve values from the parameter store during a stack creation or update.

To store stack information in AWS Systems Manager Parameter Store, define an AWS::SSM::Parameter resource within your template. To resolve the value of an SSM parameter, define the ssm dynamic reference in your template: '{{resolve:ssm:parameter-name:version}}'. As of April 2021, version is an optional field; if it is not defined, the most recent version is always fetched.

In this example, the IAM Roles stack stores the IAM Role ARN as an AWS SSM Parameter, which is resolved during the stack creation of the Main stack. The SSM Parameter can be updated without interrupting the existing stacks. If you update the value of the SSM parameter and you want to use the new value in your existing stacks, execute a stack update operation on them. You will not be able to update an existing stack if it resolves an SSM Parameter which does not exist anymore.
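The same hand-over can be sketched with Parameter Store. Again this shows two separate templates in one listing; the parameter path /prereq/dev/api-gw-role-arn, the Environment parameter and the consuming resource are hypothetical names chosen for illustration, not taken from the original setup.

```yaml
# Template 1: IAM Roles stack, stores the role ARN as an SSM parameter
Parameters:
  Environment:          # hypothetical parameter
    Type: String
    Default: dev
Resources:
  ApiGatewayRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: apigateway.amazonaws.com
            Action: sts:AssumeRole
  ApiGatewayRoleArnParameter:
    Type: AWS::SSM::Parameter
    Properties:
      Name: !Sub /prereq/${Environment}/api-gw-role-arn
      Type: String
      Value: !GetAtt ApiGatewayRole.Arn
---
# Template 2: Main stack, resolves the parameter at create/update time
Parameters:
  Environment:
    Type: String
    Default: dev
Resources:
  ApiAccount:           # illustrative consumer of the role ARN
    Type: AWS::ApiGateway::Account
    Properties:
      # No version given, so the latest parameter version is fetched
      CloudWatchRoleArn: !Sub "{{resolve:ssm:/prereq/${Environment}/api-gw-role-arn}}"
```

Unlike an export, nothing stops the first stack from being deleted here; the second stack would only notice on its next update.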

Using AWS SSM Parameters to pass stack resource information to other CloudFormation stacks is a great alternative to export/import. Here are a few things to consider when using this approach:

  • Just like exports, AWS SSM Parameter names are unique per account & region.
  • Don’t resolve more than 60 dynamic references in your template, otherwise you will experience Rate Limit Exceeded errors during a stack operation.
  • The CloudFormation AWS::SSM::Parameter resource supports only the types String & StringList.
  • They are very useful when you have only one version of your stack.
  • Where possible, resolve SSM Parameters within your Lambda Functions rather than in the CloudFormation template.
  • There is no additional layer of stack dependencies, so the stack managing the SSM parameter can be deleted. Updating a stack that refers to a deleted AWS SSM Parameter will fail while resolving the parameter, though.

Export/Import vs. AWS SSM Parameter

In summary, if your CloudFormation template is designed to create multiple versions of the same stack within your environment, I would recommend using Export/Import. If your CloudFormation template follows a static design pattern, use AWS SSM Parameters.

To avoid stack errors and incompatibility within your stack, don’t delete or rename existing Export or SSM Parameter names. Instead, create new ones and refer to the new ones in your templates, run a stack update and if you are certain that the old ones aren’t used anymore remove them.

One suggestion applies to both approaches: parameterise the import/SSM names as much as possible and don’t hardcode them in your templates.

Use Policy-Based resources

Try to avoid defining in-line policies in your resources and use Policy-Based resources if they are available for your resources. Policy-Based resources give you greater flexibility and support you in parameterising your CloudFormation templates. With Policy-Based resources you move the responsibility from the resource to a smaller entity, e.g. using IAM Managed Policies to manage access to an AWS Lambda Function removes the direct dependency between the IAM Role and the Lambda Function. The IAM Role & Lambda Function can be created simultaneously, while the Managed Policy is created after both resources have been successfully created, which reduces the overall deployment time.

Some of the more common Policy-Based resources:

I like to define Policy-Based resources within the same template where the resource is managed and attach it to the appropriate resources like an IAM role or EC2 Security Group.
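A minimal sketch of this pattern, using an S3 bucket instead of a Lambda function to keep it short. All logical IDs are hypothetical. The role and the bucket do not reference each other, so CloudFormation can create them in parallel; only the managed policy depends on both.

```yaml
Resources:
  AppRole:
    Type: AWS::IAM::Role
    Properties:
      # No in-line Policies property here; access is granted by the policy below
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole

  DataBucket:
    Type: AWS::S3::Bucket

  # Policy-based resource: created after AppRole and DataBucket exist
  DataBucketReadPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Roles:
        - !Ref AppRole
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action: s3:GetObject
            Resource: !Sub ${DataBucket.Arn}/*
```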

Recommendations and Tricks

Here are a few general recommendations and tricks to help you master AWS CloudFormation.

Parameterising

Parameterise your AWS CloudFormation templates as much as possible. Try not to hardcode any information in your templates and design your templates with the aim that you can deploy multiple independent versions of the same stack within the same environment. Use pseudo parameters such as the account id or region when referring to AWS resources, for example in IAM policies, instead of hardcoding those values.
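For instance, an ARN inside an IAM policy can be assembled from pseudo parameters rather than a hardcoded account id and region. This is only a sketch; the Environment parameter, the logical ID and the log group naming scheme are assumptions.

```yaml
Parameters:
  Environment:          # hypothetical parameter
    Type: String
    Default: dev
Resources:
  LogReadPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action: logs:GetLogEvents
            # Partition, region, account id and stack name are filled in by CloudFormation
            Resource: !Sub "arn:${AWS::Partition}:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/lambda/${AWS::StackName}-${Environment}-*:*"
```

The same template can then be deployed to any account, region or partition without edits.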

Avoid cross dependencies

Try to avoid defining any kind of cross dependency in your CloudFormation templates. Using policy-based resources can help you to avoid cross dependencies within the resources defined in your template and across templates.

Updatable resources

Ensure that your resources can be updated during a stack update event. When you integrate cfn_nag you may see the following warning: "Resource found with an explicit name, this disallows updates that require replacement of this resource". In other words, a stack update that requires replacing a resource will fail if that resource has an explicitly defined name. If you need to give a resource a name, provide one via a tag with the key Name.
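As a small illustration, a security group can carry a readable name as a tag while CloudFormation generates the physical name. The logical ID, the VpcId parameter and the tag value are hypothetical.

```yaml
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
Resources:
  AppSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      # GroupName is deliberately omitted so the resource can be replaced on update
      GroupDescription: Application security group
      VpcId: !Ref VpcId
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-app-sg
```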

Since AWS CloudFormation detects changes by comparing the new version of the CloudFormation template with the existing version, it does not automatically recognise modifications made to the code of an AWS Lambda function or AWS Step Functions state machine when the template itself is unchanged. To guarantee that your Lambda functions and state machines are updated, include a value that changes with every build in the S3Key of the code artifact. Such a value might be a commit id, build id or CI/CD pipeline execution id.
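One way to sketch this for a Lambda function is to pass the build identifier in as a parameter and make it part of the object key. The parameter names, the key layout and the runtime are assumptions; the pipeline is expected to upload the artifact under the matching key before deploying.

```yaml
Parameters:
  ArtifactBucket:       # hypothetical: bucket holding the build artifacts
    Type: String
  BuildId:              # hypothetical: commit id, build id or pipeline execution id
    Type: String
  FunctionRoleArn:      # hypothetical: execution role managed elsewhere
    Type: String
Resources:
  AppFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Role: !Ref FunctionRoleArn
      Code:
        S3Bucket: !Ref ArtifactBucket
        # The key changes with every build, so CloudFormation sees a template change
        S3Key: !Sub lambda/${BuildId}/function.zip
```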

CloudFormation Quotas

A few AWS CloudFormation quotas which are good to know. As of October 2020, AWS has increased the limit of five service quotas for AWS CloudFormation:

  • Maximum size of a template passed as an S3 Object: 1 MB
  • Maximum number of resources per template: 500
  • Parameters: 200
  • Mappings: 200
  • Outputs: 200

Should you need to define more than 200 Parameters in your CloudFormation stack, try to group multiple parameters into one parameter of type CommaDelimitedList. If you reach the limit of 200 Outputs, store some of them as AWS SSM Parameters.
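Grouping values into a single list parameter could look like the sketch below. CloudFormation's parameter type for this is CommaDelimitedList, and Fn::Select picks individual entries. The parameter names, logical IDs and CIDR values are illustrative.

```yaml
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
  SubnetCidrs:
    # One parameter instead of one per subnet
    Type: CommaDelimitedList
    Default: "10.0.0.0/24,10.0.1.0/24"
Resources:
  SubnetA:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VpcId
      CidrBlock: !Select [0, !Ref SubnetCidrs]
  SubnetB:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VpcId
      CidrBlock: !Select [1, !Ref SubnetCidrs]
```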

End-to-End Testing

Ever heard of TaskCat? I hadn’t until I started working on Shine’s AWS QuickStart project. TaskCat is an open-source tool for testing your AWS CloudFormation templates, developed by the AWS Integration and Automation team. It automatically deploys your CloudFormation templates in multiple regions and generates a report with pass/fail per region. More information can be found in the TaskCat GitHub project.
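A TaskCat project is driven by a small configuration file in the repository root. The following is a minimal sketch based on my understanding of the current TaskCat configuration format; the project name, the regions and the template path are placeholders, so check the TaskCat documentation for the options your version supports.

```yaml
# .taskcat.yml
project:
  name: cloudformation-demo      # placeholder project name
  regions:                       # regions the template is deployed to
    - us-east-1
    - ap-southeast-2
tests:
  default:
    template: cloudformation/template.yaml
```

With this file in place, running taskcat test run from the repository root should deploy the template to each listed region and produce the pass/fail report.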

Summary

CloudFormation offers a great way of writing modular & reusable infrastructure code by using Cross-Stack References & Nested Stacks. Utilising helper tools like cfn-lint and cfn_nag will help you to catch syntax errors in your CloudFormation templates early in the development phase and also harden the security of your templates. TaskCat allows you to run automated tests across multiple regions on AWS.

There are more CloudFormation features that will help you to write modular & reusable infrastructure code which I haven’t touched on: Conditions, Intrinsic Functions, Pseudo Parameters and many more.

michael.bloch@shinesolutions.com