NewRelic deployment with Terraform

NewRelic is a popular cloud-based platform that helps users monitor the performance of their services via pre-built dashboards and analytics models. Unfortunately, managing NewRelic resources for multiple projects by hand in the console can be very time-consuming.

In this post, I will introduce an Infrastructure as Code (IaC) approach which uses Terraform code to manage NewRelic dashboards and alerts. More specifically, I’ll define all NewRelic resources with Terraform’s HashiCorp Configuration Language (HCL), then use Terraform commands to provision or update them in NewRelic. I’ll present an example that creates a dashboard to monitor the invocation counts, response times and error rates of three AWS Lambda functions. I’ll also have NewRelic send an email alert when a function’s error count spikes.

1. Getting Started

1.1 Create a User Key via the API keys UI

First we’ll get a key that is used to access REST API endpoints, along with the NewRelic account ID. These can be obtained from the NewRelic UI:

In this case, you can see that the key value starts with “NRAK-”. For privacy I’ve concealed the rest of the key, as well as the account ID, but you can substitute your own.

1.2 Define NewRelic resources via Terraform

1.2.1 Define the NewRelic Provider

Next we’ll configure Terraform with the user key and account ID:

terraform {
  # Require Terraform version 0.13.x (recommended)
  required_version = "~> 0.13.0"

  # Require the latest 2.x version of the New Relic provider
  required_providers {
    newrelic = {
      source  = "newrelic/newrelic"
      version = "~> 2.21"
    }
  }
}

provider "newrelic" {
  account_id = <NewRelic Account ID>
  api_key = <NewRelic User Key>
  region = "US"
}

where <NewRelic Account ID> and <NewRelic User Key> are the values we obtained in the previous step.
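
To keep the user key out of version control, the two values can be passed in as input variables rather than written into the file. The following is a minimal sketch, not part of the original setup: the variable names newrelic_account_id and newrelic_api_key are hypothetical, and the provider block would be used in place of the one with literal values.

```hcl
# Hypothetical variable names, chosen for this illustration only.
variable "newrelic_account_id" {
  type        = number
  description = "NewRelic account ID obtained in step 1.1"
}

variable "newrelic_api_key" {
  type        = string
  description = "NewRelic user key (starts with NRAK-)"
}

# Same provider settings as before, but fed from the variables above.
provider "newrelic" {
  account_id = var.newrelic_account_id
  api_key    = var.newrelic_api_key
  region     = "US"
}
```

Terraform reads any environment variable named TF_VAR_<variable name>, so exporting TF_VAR_newrelic_account_id and TF_VAR_newrelic_api_key before running Terraform supplies both values without storing them in the repository.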

1.2.2 Define the NewRelic dashboard

Now we’ll actually set up the dashboard, which is named demo-dev. The three lambdas are named demo-identity-dev, demo-webhook-dev and demo-invalidate-dev. The prefix “demo” is a namespace identifier specific to this example.

We will assume that the lambda functions have already been provisioned in AWS and properly configured to post data to NewRelic (see the NewRelic lambda instrumentation setup documentation for more information on how to do this).

For the next step we’ll use a Terraform dynamic block, which is a convenient way to generate repeated nested blocks (here, one dashboard page per lambda) and queries that follow a specific naming convention. Within the block, we’ll use NewRelic’s grid system, which is twelve columns wide, to lay out three NewRelic dashboard widgets side by side:

resource "newrelic_one_dashboard" "service_dashboard" {
    name = "demo-dev"
    dynamic "page" {
        for_each = ["identity", "webhook", "invalidate"]

        content {
            name = page.value

            widget_markdown {
                title = "${page.value} API"
                text = "# ${page.value} API"
                row = 1
                column = 1
                width = 12
                height = 1
            }

            widget_line {
                title = "Total API Invocations [${page.value}]"
                row = 2
                column = 1
                width = 4

                nrql_query {
                    query       = "SELECT count(*) as 'Invocations' FROM AwsLambdaInvocation WHERE (entityName = 'demo-${page.value}-dev') SINCE 1 DAYS AGO TIMESERIES EXTRAPOLATE"
                }
            }

            widget_billboard {
                title = "Lambda response time (percentile) [${page.value}]"
                row = 2
                column = 5
                width = 4

                nrql_query {
                    query       = "SELECT percentile(cwDuration, 50) as 'ms (p50)', percentile(cwDuration, 90) as 'ms (p90)', percentile(cwDuration, 99) as 'ms (p99)' FROM AwsLambdaInvocation WHERE (entityName = 'demo-${page.value}-dev') SINCE 1 DAYS AGO"
                }
            }

            widget_line {
                title = "Call Error (percentage) [${page.value}]"
                row = 2
                column = 9
                width = 4
                height = 3

                nrql_query {
                    query       = "SELECT percentage(count(*), WHERE error IS true) as 'Error rate (%)' FROM AwsLambdaInvocation WHERE (entityName = 'demo-${page.value}-dev') SINCE 1 DAYS AGO TIMESERIES EXTRAPOLATE"
                }
            }
        }
    }
}

Here is the resulting dashboard, over which I have laid a rough outline of the grid columns and rows:

You can see that the first row is for the dashboard title, and the second row is split into three sections, each of which is four cells wide and contains a separate widget.
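
The column values 1, 5 and 9 used above follow directly from dividing the twelve-column grid evenly between three widgets. As a rough sketch, they could be derived in a locals block instead of being typed by hand; the local names below are hypothetical and do not appear in the original configuration.

```hcl
locals {
  # NewRelic dashboards use a grid that is twelve columns wide.
  grid_columns = 12

  # Number of widgets placed side by side in one row.
  widgets_per_row = 3

  # Width of each widget in grid columns: 12 / 3 = 4.
  widget_width = floor(local.grid_columns / local.widgets_per_row)

  # Starting column of each widget (columns are 1-based): [1, 5, 9].
  widget_columns = [
    for i in range(local.widgets_per_row) : 1 + i * local.widget_width
  ]
}
```

Each widget could then set width = local.widget_width and column = local.widget_columns[0], local.widget_columns[1] or local.widget_columns[2], so changing the number of widgets per row only requires editing one value.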

1.2.3 Define the NewRelic alerts and notification channel

Next we’ll move on to the alerts. To make alerts work properly, we need to set up a NewRelic alert policy, a notification channel and an alert condition. The policy groups all of the related alert conditions, and the channel determines where notifications are sent. In this case we will use the for_each meta-argument on the alert condition resource to create one error-spike alert per lambda function:

resource "newrelic_alert_policy" "service_alert_policy" {
  name = "demo-dev"
}

# Creates an email alert channel.
resource "newrelic_alert_channel" "email_channel" {
  name = "email-demo-dev@test.com"
  type = "email"

  config {
    recipients              = <notification email address>
    include_json_attachment = "1"
  }
}

resource "newrelic_alert_policy_channel" "service_alert_policy_channel" {
  policy_id   = newrelic_alert_policy.service_alert_policy.id
  channel_ids = [newrelic_alert_channel.email_channel.id]
}

resource "newrelic_nrql_alert_condition" "service_alert_error_rate" {
  for_each = toset(["identity", "webhook", "invalidate"])

  policy_id                    = newrelic_alert_policy.service_alert_policy.id
  type                         = "static"
  name                         = "${each.value} Error Spike [dev]"
  description                  = "Alert ${each.value} lambda function errors [dev]"
  enabled                      = true
  value_function               = "sum"
  violation_time_limit_seconds = 3600

  fill_option          = "static"
  fill_value           = 0

  critical {
    operator              = "above"
    threshold             = 5
    threshold_duration    = 900
    threshold_occurrences = "AT_LEAST_ONCE"
  }

  nrql {
    query             = "SELECT count(*) FROM AwsLambdaInvocationError WHERE (entityName = 'demo-${each.value}-dev') EXTRAPOLATE"
    evaluation_offset = 3
  }
} 
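
The list of lambda names appears twice, once in the dashboard and once in the alert condition. One way to keep the two in sync is to declare it once in a locals block. This is a sketch under the naming convention used in this post (demo-<name>-dev); the local names are hypothetical.

```hcl
locals {
  # Environment suffix used in every resource name in this example.
  environment = "dev"

  # Short names of the three lambda functions being monitored.
  lambda_names = ["identity", "webhook", "invalidate"]

  # Map of short name => entity name as it appears in the NRQL queries,
  # e.g. "identity" => "demo-identity-dev".
  lambda_entities = {
    for name in local.lambda_names : name => "demo-${name}-${local.environment}"
  }
}

# Printed after terraform apply; handy for checking the generated names.
output "lambda_entities" {
  value = local.lambda_entities
}
```

With this in place, the dashboard’s dynamic block could use for_each = local.lambda_names and the alert condition could use for_each = toset(local.lambda_names), so adding a fourth lambda means changing a single line.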

1.3 Provision via Terraform commands

After creating the Terraform resource definition file, we can run terraform init, terraform plan and terraform apply in sequence to provision or update the resources in NewRelic. Once the apply completes, the result will look like this in the NewRelic console:

I’ve highlighted the section that confirms we are using the correct user key.
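
To jump straight from the terminal to the new dashboard, an output value can print its URL at the end of terraform apply. This is a minimal sketch that assumes the installed provider version exposes the permalink attribute on newrelic_one_dashboard; the output name is hypothetical.

```hcl
# Assumption: newrelic_one_dashboard exports a "permalink" attribute
# in the provider version in use.
output "dashboard_permalink" {
  description = "URL of the demo-dev dashboard in the NewRelic console"
  value       = newrelic_one_dashboard.service_dashboard.permalink
}
```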

2. Conclusion

In this post, I’ve introduced an efficient method for setting up NewRelic monitoring resources via Terraform. This IaC technique is another step towards deploying application infrastructure without human intervention. In addition, the fact that it is code means changes can be kept in sync with the underlying application versioning. Unfortunately, Terraform does not support references for NewRelic resources (these can be very useful when working with AWS resources, for example). Nevertheless, Terraform has proven to be a great tool for automatically provisioning our monitoring and alerting infrastructure.

rongjun.xie@shinesolutions.com