Set Up Anomaly Detection

When you enable anomaly detection for a metric, CloudWatch applies statistical and machine learning algorithms. These algorithms continuously analyze metrics of systems and applications, determine normal baselines, and surface anomalies with minimal user intervention.

The algorithms generate an anomaly detection model. The model generates a range of expected values that represent normal metric behavior.

Set Up Anomaly Detection on a CloudWatch Metric

  1. In the AWS Management Console on the Services menu, click CloudWatch.
  2. In the left navigation menu, click on Metrics.
  3. Click on Container Insights, then select the ClusterName, Namespace, PodName dimension set.

This will take you to the metrics of the EKS deployment of PetSite.

  1. Type petsite-deployment into the search bar.
  2. Check the checkbox for the metric with the Metric Name pod_cpu_utilization.

Your screen should look similar to the one below:

AD1

  1. Click on the Graphed metrics tab.
  2. Click on the Anomaly Detection icon as shown in the picture below.

AD2

Anomaly Detection (AD) is enabled immediately. The model is trained on up to two weeks of metric data points. AD can be enabled even if the metric does not yet have two full weeks of data.

AD3

Modifying the Anomaly Detection expression

Notice the expression below in the Details tab:

ANOMALY_DETECTION_BAND(m1, 2)

This indicates that AD is enabled for the metric with Id m1, with a band width of 2 standard deviations (the default). You can raise this number to widen the band and give the metric data points more room before they are flagged as anomalous. To do so, edit the expression as shown below.

AD4
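The same expression can also be evaluated from the command line. The following is a minimal sketch rather than a workshop step. It assumes anomaly detection is already enabled for the metric, that the metric is addressed with only the ClusterName dimension (as in the CLI commands later in this module), and that GNU date is available, as it is in CloudShell. The function names band_query and get_band, the 300-second period, and the 3-hour window are illustrative choices.

```shell
# Build a GetMetricData query list: m1 is the raw metric, ad1 is its band.
# Arguments: cluster name, band width in standard deviations (default 2).
band_query() {
  cluster="$1"
  width="${2:-2}"
  cat <<EOF
[
  {
    "Id": "m1",
    "MetricStat": {
      "Metric": {
        "Namespace": "ContainerInsights",
        "MetricName": "pod_cpu_utilization",
        "Dimensions": [{"Name": "ClusterName", "Value": "${cluster}"}]
      },
      "Period": 300,
      "Stat": "Average"
    }
  },
  {
    "Id": "ad1",
    "Expression": "ANOMALY_DETECTION_BAND(m1, ${width})",
    "Label": "Expected range"
  }
]
EOF
}

# Fetch the last 3 hours of the metric together with its expected range.
get_band() {
  aws cloudwatch get-metric-data \
    --metric-data-queries "$(band_query "$1" "$2")" \
    --start-time "$(date -u -d '3 hours ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}
```

For example, get_band petsite 3 would request the metric with a band 3 standard deviations wide, while get_band petsite falls back to the default of 2.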

Editing and deleting an existing model

You can edit a model by clicking the Edit model link in the Actions row. This takes you to the edit screen, where you can exclude a specific time period from the model calculation. For example, if you have a deployment coming up and you expect the metrics during that time to skew the AD model, add that window to the excluded periods so that AD ignores metrics from that timeframe.

AD5
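An excluded period can also be supplied from the CLI through the --configuration option of put-anomaly-detector. The sketch below is an illustration, not a workshop step. It assumes the model is defined on pod_cpu_utilization with only the ClusterName dimension, as in the programmatic example later in this module. The function name exclude_window is hypothetical, and the timestamps in the usage note are placeholders.

```shell
# Put the model with one excluded window (ISO 8601 timestamps, UTC).
# Arguments: cluster name, window start, window end.
exclude_window() {
  aws cloudwatch put-anomaly-detector \
    --namespace ContainerInsights \
    --metric-name pod_cpu_utilization \
    --stat Average \
    --dimensions "Name=ClusterName,Value=$1" \
    --configuration "{\"ExcludedTimeRanges\":[{\"StartTime\":\"$2\",\"EndTime\":\"$3\"}]}"
}
```

A call such as exclude_window petsite 2024-01-15T00:00:00Z 2024-01-15T06:00:00Z would ask CloudWatch to leave that six-hour window out of training. The excluded range should then appear under ExcludedTimeRanges in the describe-anomaly-detectors output.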

To delete a model, simply click on Delete model.
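Deletion has a CLI counterpart as well, delete-anomaly-detector, which identifies the model by the same namespace, metric name, statistic, and dimensions used to create it. The following sketch assumes the model was created on pod_cpu_utilization with only the ClusterName dimension. The function name delete_model is hypothetical.

```shell
# Delete the AD model for pod_cpu_utilization on the given cluster.
# Argument: cluster name.
delete_model() {
  aws cloudwatch delete-anomaly-detector \
    --namespace ContainerInsights \
    --metric-name pod_cpu_utilization \
    --stat Average \
    --dimensions "Name=ClusterName,Value=$1"
}
```

Running delete_model petsite would remove the model for the petsite cluster. The metric itself and its data points are not affected.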

Creating a model programmatically

You can use the put-anomaly-detector CLI command to create an Anomaly Detection model on a metric programmatically.

  1. In the AWS Management Console on the Services menu, click CloudShell.
  2. Execute the following command in the terminal:

If you deployed petsite on EKS, the following command creates an AD model on the metric pod_cpu_utilization in the ContainerInsights namespace under the ClusterName dimension. If your cluster name is different, replace petsite with the appropriate name.

 aws cloudwatch put-anomaly-detector --namespace ContainerInsights --metric-name pod_cpu_utilization --stat Average --dimensions Name=ClusterName,Value=petsite

If you deployed petsite on ECS, the following command creates an AD model on the metric CpuUtilized in the ECS/ContainerInsights namespace under the ClusterName dimension. Make sure to enter the name of one of your ECS clusters in the Value parameter.

 aws cloudwatch put-anomaly-detector --namespace ECS/ContainerInsights --metric-name CpuUtilized --stat Average --dimensions Name=ClusterName,Value=<REPLACE_THE_ECS_CLUSTER_NAME_HERE>

For the full syntax of this command, see put-anomaly-detector in the AWS CLI Command Reference.

List all AD models in your account

  1. Execute the following command in the terminal:

This command lists all AD models in your account, along with their configuration and training state.

aws cloudwatch describe-anomaly-detectors

You should see a result similar to the one below.

{
    "AnomalyDetectors": [
        {
            "Namespace": "ContainerInsights",
            "MetricName": "pod_cpu_utilization",
            "Dimensions": [
                {
                    "Name": "ClusterName",
                    "Value": "petsite"
                }
            ],
            "Stat": "Average",
            "Configuration": {
                "ExcludedTimeRanges": []
            },
            "StateValue": "TRAINED_INSUFFICIENT_DATA"
        }
    ]
}
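The StateValue field reports the training state of each model. The API defines the values PENDING_TRAINING, TRAINED_INSUFFICIENT_DATA, and TRAINED, so a freshly created model on a metric with limited history is likely to show TRAINED_INSUFFICIENT_DATA, as above. To see only that state for each model, the output can be narrowed with the CLI's --query option. This is a minimal sketch and the function name list_model_states is hypothetical.

```shell
# Print one line per model: namespace, metric name, statistic, training state.
list_model_states() {
  aws cloudwatch describe-anomaly-detectors \
    --query 'AnomalyDetectors[].[Namespace,MetricName,Stat,StateValue]' \
    --output text
}
```

Calling list_model_states again later is a simple way to check whether a model has moved on to TRAINED.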

This concludes the Anomaly Detection module.