Azure Application Insights series part 3: How to configure monitoring alerts

This is the third and final post in a series I’m writing on Azure’s Application Insights (AI) service. In the previous post we looked at how to create monitoring dashboards in Azure.

In this post we run through some examples of how to configure monitoring alerts using built-in Azure resource metrics as well as custom instrumented events and metrics.

Series links

Part 1: Azure Application Insights series part 1: How to instrument your application code for monitoring.

Part 2: Azure Application Insights series part 2: How to create application monitoring dashboards in Azure.

Azure Action Groups

Before we can create alerts we need to define some Azure action groups. An action group defines the set of actions that can be invoked when an alert is triggered. For example: sending a text message, starting an Azure function, or invoking a webhook.

If you use an incident management system, this is where you would connect it so that alerts are routed there. For demo purposes I’m just going to have Azure send me text message alerts directly.

Manually creating an action group

1. Open your Application Insights resource in the Azure portal.

2. Click on the Alerts tab, then click on Manage actions.

Screenshot: The ‘Manage actions’ button on the App Insights resource.

3. Click on “+ Add action group”. Provide a full name and a short name for the action group, along with a resource group and target Azure subscription.

4. Add at least one action to get started with alerts. For our sample scenario, I have added a single action named ‘Text’ with the Email/SMS/Push/Voice action type.

The wizard will ask you for the contact information you want the alerts to flow to. Click OK when you are done filling out the forms, and Azure will create your action group.

Creating an action group via automation

You can create action groups in your CI pipeline by using Azure Resource Manager (ARM) templates, or by calling the Azure REST API operations for action groups.
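
As a rough illustration of the ARM route, here is a minimal sketch of a template that creates an action group with a single SMS receiver, mirroring the ‘Text’ action from the manual steps. The parameter names (actionGroupShortName, smsCountryCode, smsPhoneNumber) and the apiVersion are my own assumptions for this example, so check them against the current Microsoft.Insights/actionGroups reference before relying on it. Note that the short name is limited to 12 characters.

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "actionGroupName": {
      "type": "string"
    },
    "actionGroupShortName": {
      "type": "string",
      "maxLength": 12
    },
    "smsCountryCode": {
      "type": "string"
    },
    "smsPhoneNumber": {
      "type": "string"
    }
  },
  "resources": [
    {
      "type": "Microsoft.Insights/actionGroups",
      "apiVersion": "2019-06-01",
      "name": "[parameters('actionGroupName')]",
      "location": "Global",
      "properties": {
        "groupShortName": "[parameters('actionGroupShortName')]",
        "enabled": true,
        "smsReceivers": [
          {
            "name": "Text",
            "countryCode": "[parameters('smsCountryCode')]",
            "phoneNumber": "[parameters('smsPhoneNumber')]"
          }
        ]
      }
    }
  ]
}
```

The alert templates later in this post refer to an action group only by name, so a template like this would need to be deployed to the same resource group first.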

Kusto Query Language

The third alert example below uses the log query features of Application Insights. Those queries are written in the Kusto Query Language (KQL), which is covered in Microsoft's Azure Data Explorer documentation. I mention it now so the log queries make sense when we get to them.

Alert example: Built-in resource metric

Scenario: You have a deployed Azure Web App and you want to trigger an alert when a built-in Azure resource metric reaches a certain threshold. For example, when too many HTTP 500 errors are returned to clients.

Creating metric alerts in the Azure portal is pretty easy, but I definitely recommend creating them via Azure Resource Manager (ARM) templates so they can be created like regular objects alongside your other resources in a CI pipeline.

The ARM resource we want to create is a Microsoft.Insights/metricAlerts object (documented in the Azure ARM template reference). Here is the ARM template definition, using the most current schema at the time of writing and three input parameters:

{
  "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "webSiteName": {
      "type": "string"
    },
    "webSiteRegion": {
      "type": "string"
    },
    "actionGroupName": {
      "type": "string"
    }
  },
  "variables": { },
  "functions": [ ],
  "resources": [
    {
      "name": "[concat(parameters('webSiteName'), '-Throwing-500-Errors')]",
      "type": "Microsoft.Insights/metricAlerts",
      "apiVersion": "2018-03-01",
      "location": "global",
      "tags": {},
      "properties": {
        "description": "The web application has returned more than 5 HTTP 5xx errors in the last 5 minutes.",
        "severity": 2,
        "enabled": true,
        "scopes": [
          "[concat(resourceGroup().id, '/providers/Microsoft.Web/sites/', parameters('webSiteName'))]"
        ],
        "evaluationFrequency": "PT1M",
        "windowSize": "PT5M",
        "targetResourceType": "Microsoft.Web/sites",
        "targetResourceRegion": "[parameters('webSiteRegion')]",
        "criteria": {
          "allOf": [
            {
              "metricName": "Http5xx",
              "metricNamespace": "Microsoft.Web/sites",
              "operator": "GreaterThan",
              "timeAggregation": "Total",
              "name": "Metric1",
              "dimensions": [],
              "monitorTemplateType": "8",
              "criterionType": "StaticThresholdCriterion",
              "threshold": 5
            }
          ],
          "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria"
        },
        "autoMitigate": true,
        "actions": [
          {
            "actionGroupId": "[concat(resourceGroup().id, '/providers/microsoft.insights/actiongroups/', parameters('actionGroupName'))]",
            "webhookProperties": {}
          }
        ]
      }
    }
  ],
  "outputs": { }
}

I deployed the template with this PowerShell code:

# azure login
Login-AzureRmAccount

# run the deployment
New-AzureRmResourceGroupDeployment -ResourceGroupName TestRG -Name TestDeployment -TemplateFile 'built-in-metric-alert.json' -Verbose -webSiteName "dashboards-testing-webapp" -webSiteRegion "westus2" -actionGroupName "paging-alerts-actiongroup"

A few things to note when examining this template:

  • The scopes property specifies the target resource the metric is found on. In this case we are providing the fully qualified resource ID of a sample website in the resource group being deployed to.
  • The criteria property describes the target metric, how it is aggregated over the evaluation window, and the threshold it must exceed for the alert to fire.
  • The actions property references, by resource ID, the action group that should be invoked when this alert fires.
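
The deployment command above passes each template parameter as an inline switch. A parameters file is another option, and it keeps the values in source control next to the template. A minimal sketch, assuming a file named built-in-metric-alert.parameters.json (a hypothetical name) that reuses the sample values from the command:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "webSiteName": {
      "value": "dashboards-testing-webapp"
    },
    "webSiteRegion": {
      "value": "westus2"
    },
    "actionGroupName": {
      "value": "paging-alerts-actiongroup"
    }
  }
}
```

The file would be passed to New-AzureRmResourceGroupDeployment with -TemplateParameterFile in place of the three inline parameters.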

If we did everything correctly, we should see a new alert rule show up in the resource group (under hidden resources), or on the Alerts tab of the Application Insights resource.

Alert example: Custom metrics

Scenario: You have a deployed Azure Web App and you want to trigger an alert when one of your custom metrics reaches a threshold. These are metrics that you are logging directly to Application Insights through your instrumented application code, instead of a metric baked into an Azure resource.

For the example template I am creating an alert that fires whenever a custom metric called ‘EmailQueue’ exceeds a value of 1000. The ARM template object is still Microsoft.Insights/metricAlerts, but the template looks slightly different because the scope is now set to the fully qualified resource ID of the Application Insights resource, and the metric namespace is different.

{
  "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "webSiteName": {
      "type": "string"
    },
    "aiResourceName": {
      "type": "string"
    },
    "actionGroupName": {
      "type": "string"
    }
  },
  "variables": { },
  "functions": [ ],
  "resources": [
    {
      "name": "[concat(parameters('webSiteName'), '-email-queue-is-too-high')]",
      "type": "Microsoft.Insights/metricAlerts",
      "apiVersion": "2018-03-01",
      "location": "global",
      "tags": {},
      "properties": {
        "description": "The custom metric for EmailQueue has exceeded 1000 in the last 5 minutes.",
        "severity": 2,
        "enabled": true,
        "scopes": [
          "[concat(resourceGroup().id, '/providers/microsoft.insights/components/', parameters('aiResourceName'))]"
        ],
        "evaluationFrequency": "PT1M",
        "windowSize": "PT5M",
        "targetResourceType": "microsoft.insights/components",
        "criteria": {
          "allOf": [
            {
              "metricName": "EmailQueue",
              "metricNamespace": "Azure.ApplicationInsights",
              "operator": "GreaterThan",
              "timeAggregation": "Maximum",
              "name": "Metric1",
              "dimensions": [],
              "monitorTemplateType": "8",
              "criterionType": "StaticThresholdCriterion",
              "threshold": 1000
            }
          ],
          "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria"
        },
        "autoMitigate": true,
        "actions": [
          {
            "actionGroupId": "[concat(resourceGroup().id, '/providers/microsoft.insights/actiongroups/', parameters('actionGroupName'))]",
            "webhookProperties": {}
          }
        ]
      }
    }
  ],
  "outputs": { }
}

I can trip this alert by logging some test metrics beyond the threshold, which invokes my text message action group. Then I receive this message a few moments later:

Screenshot: an incoming text message alert.

The text message alert follows the format below, so keep this in mind when naming the alerts and action groups:

[Name of action group]:(Alert Status):(Alert Severity) Azure Monitor Alert [Name of alert] on [Name of Azure Resource]

Alert example: Log search alerts

Scenario: You have a deployed Azure Web App and you want to trigger an alert when a custom event is fired. These are events that you are logging directly to Application Insights through your instrumented application code.

For this example I am creating an alert that fires whenever a custom event called ‘StorageLayerCapacityLimitReached’ has been logged, a disastrous (but fictional) event that surely requires on-call engineer intervention.

The ARM template object has changed to Microsoft.Insights/scheduledQueryRules (documented in the Azure ARM template reference), and we are using a Kusto query string to search the log for recent occurrences of this event.

{
  "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "aiResourceName": {
      "type": "string"
    },
    "actionGroupName": {
      "type": "string"
    }
  },
  "variables": { },
  "functions": [ ],
  "resources": [
    {
      "type": "Microsoft.Insights/scheduledQueryRules",
      "apiVersion": "2018-04-16",
      "name": "StorageLayerCapacityLimitReachedAlert",
      "location": "westus2",
      "properties": {
        "description": "The storage layer capacity limit has been reached. New storage must be provisioned immediately.",
        "enabled": "true",
        "source": {
          "query": "customEvents | where name == 'StorageLayerCapacityLimitReached'",
          "authorizedResources": [],
          "dataSourceId": "[concat(resourceGroup().id, '/providers/microsoft.insights/components/', parameters('aiResourceName'))]",
          "queryType": "ResultCount"
        },
        "schedule": {
          "frequencyInMinutes": 5,
          "timeWindowInMinutes": 5
        },
        "action": {
          "odata.type": "Microsoft.WindowsAzure.Management.Monitoring.Alerts.Models.Microsoft.AppInsights.Nexus.DataContracts.Resources.ScheduledQueryRules.AlertingAction",
          "severity": "2",
          "aznsAction":{
            "actionGroup": [ "[concat(resourceGroup().id, '/providers/Microsoft.Insights/actionGroups/', parameters('actionGroupName'))]" ]
          },
          "trigger":{
            "thresholdOperator": "GreaterThan",
            "threshold": 0
          }
        }
      }
    }
  ],
  "outputs": { }
}

The resulting text message for this alert follows the same format as the metric alert:

[Name of action group]:(Alert Status):(Alert Severity) Azure Monitor Alert [Name of alert] on [Name of Azure Resource]

ScheduledQueryRule alerts are powerful because you can base alerts on the results of any log query, not just queries looking for events. Once you are familiar with Kusto you can write queries that detect more complex alert situations in your application.
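
To give a flavor of a query that is not tied to a single custom event, here is a sketch of a rule that fires when more than 10 error-or-worse trace messages are logged within 5 minutes. It is a resource object meant to sit in the resources array of a template like the one above, reusing the same aiResourceName and actionGroupName parameters. The rule name, threshold, and query are illustrative assumptions on my part. The query relies on the severityLevel column of the Application Insights traces table, where 3 is Error and 4 is Critical.

```json
{
  "type": "Microsoft.Insights/scheduledQueryRules",
  "apiVersion": "2018-04-16",
  "name": "HighErrorTraceVolumeAlert",
  "location": "westus2",
  "properties": {
    "description": "More than 10 error or critical traces were logged in the last 5 minutes.",
    "enabled": "true",
    "source": {
      "query": "traces | where severityLevel >= 3",
      "authorizedResources": [],
      "dataSourceId": "[concat(resourceGroup().id, '/providers/microsoft.insights/components/', parameters('aiResourceName'))]",
      "queryType": "ResultCount"
    },
    "schedule": {
      "frequencyInMinutes": 5,
      "timeWindowInMinutes": 5
    },
    "action": {
      "odata.type": "Microsoft.WindowsAzure.Management.Monitoring.Alerts.Models.Microsoft.AppInsights.Nexus.DataContracts.Resources.ScheduledQueryRules.AlertingAction",
      "severity": "2",
      "aznsAction": {
        "actionGroup": [ "[concat(resourceGroup().id, '/providers/Microsoft.Insights/actionGroups/', parameters('actionGroupName'))]" ]
      },
      "trigger": {
        "thresholdOperator": "GreaterThan",
        "threshold": 10
      }
    }
  }
}
```

Only the query and the trigger threshold differ from the custom event rule, which is what makes this alert type easy to adapt.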

Conclusion

This concludes the 3rd and final post in the Application Insights series, where we have covered code instrumentation, monitoring dashboards, and alerts. Hopefully you learned a few things along the way. Good luck and happy monitoring!
