You can create an alarm based on CloudWatch anomaly detection, which mines past metric data and creates a model of expected values. The expected values take into account the typical hourly, daily, and weekly patterns in the metric.
You set a value for the anomaly detection threshold, and CloudWatch uses this threshold with the model to determine the “normal” range of values for the metric. A higher value for the threshold produces a thicker band of “normal” values.
You can choose whether the alarm is triggered when the metric value is above the band of expected values, below the band, or either above or below the band.
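As a rough illustration of how the threshold shapes the band, the sketch below widens a band around an expected value in proportion to the threshold and then checks a datapoint against each of the three alarm conditions. It is a simplified model, not CloudWatch's actual algorithm: `expected` and `stddev` are hypothetical inputs standing in for the output of the trained model. The condition names follow the `ComparisonOperator` values that the CloudWatch API accepts for anomaly detection alarms.

```python
def band(expected, stddev, threshold):
    """Return (lower, upper) bounds; a larger threshold gives a wider band."""
    half_width = threshold * stddev
    return expected - half_width, expected + half_width


def breaches(value, lower, upper, condition):
    """Check one datapoint against the band for one of the three alarm conditions."""
    if condition == "GreaterThanUpperThreshold":  # above the band
        return value > upper
    if condition == "LessThanLowerThreshold":  # below the band
        return value < lower
    if condition == "LessThanLowerOrGreaterThanUpperThreshold":  # outside the band
        return value < lower or value > upper
    raise ValueError(f"unknown condition: {condition}")


# Hypothetical numbers: expected value 50, standard deviation 4, threshold 2.
lower, upper = band(expected=50.0, stddev=4.0, threshold=2)
print(lower, upper)  # 42.0 58.0
print(breaches(61.0, lower, upper, "LessThanLowerOrGreaterThanUpperThreshold"))  # True
```

Raising the threshold from 2 to 3 in this sketch widens the band to 38.0 through 62.0, so the same datapoint of 61.0 would no longer count as a breach.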
Learn more about Anomaly Detection here.
Go to CloudWatch Metrics
This will take you to the Metrics home page, where you will see all the Namespaces available in the account. Select
ClusterName dimension and select one of the CPU Utilization metrics. If you see
pod_memory_utilization, select it; your screen should then look similar to the one below.
Open the Graphed Metrics tab and select the Anomaly Detection button, as shown below.
Get a deep dive experience on Anomaly Detection in the workshop module here.
Your screen should look like the one below, with the Anomaly Detection band created immediately.
Now click the create alarm button (🔔), which takes you to a screen similar to the one below.
Notice that the Anomaly detection option is selected. Under Whenever pod_memory_utilization is..., you can select any of the three options presented there. Select Outside of the band.
You can also set the standard deviation value in the Anomaly detection threshold section. Change the value and watch the band around the metric timeline change dynamically.
Under Additional configuration, you can indicate how many occurrences of the breach qualify for the alarm to be triggered. Set the values to 2 out of 5, so that the alarm is triggered when 2 breaches occur within 5 evaluation periods. Notice the message at the top of the graph describing the setting as
This alarm will trigger when the blue line goes above the red line for 2 datapoints within 25 minutes.
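The "2 out of 5" rule can be sketched in a few lines. This is only an approximation of the evaluation logic, assuming each datapoint has already been classified as breaching or not; it ignores how CloudWatch treats missing data. The 5-minute period is an assumption inferred from the "25 minutes" in the message above, and `should_alarm` is a hypothetical helper name.

```python
def should_alarm(breach_flags, datapoints_to_alarm=2, evaluation_periods=5):
    """Return True when enough of the most recent datapoints are breaching."""
    recent = breach_flags[-evaluation_periods:]  # only the latest N periods count
    return sum(recent) >= datapoints_to_alarm


PERIOD_SECONDS = 300  # assumed 5-minute period
window_minutes = 5 * PERIOD_SECONDS // 60
print(window_minutes)  # 25

# Two breaches inside the last five periods: the alarm fires.
print(should_alarm([False, True, False, False, True]))  # True

# Two older breaches have dropped out of the five-period window: no alarm.
print(should_alarm([True, True, False, False, False, False, False]))  # False
```

Requiring 2 breaches rather than 1 helps keep a single spike outside the band from triggering a notification.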
On the Configure actions screen, you can set which action to take when the alarm changes to a different state.
The available options for actions include:
- Send a notification to an SNS topic
- Take an Auto Scaling action
- Take an EC2 action, if the metric is from an EC2 instance
Select Create a new topic to create a new SNS topic for the notification, and provide your email address.
Click Create topic to create the SNS topic.
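If you prefer to script this step, the following is a minimal sketch using the boto3 SNS client. The helper name `create_email_topic`, the topic name, and the email address are placeholders chosen for illustration; the client is passed in as an argument so the function can be exercised without AWS credentials.

```python
def create_email_topic(sns_client, topic_name, email_address):
    """Create an SNS topic and subscribe an email address to it.

    Returns the topic ARN, which can later be used as an alarm action.
    """
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="email",
        Endpoint=email_address,
    )
    return topic_arn


# Example usage (requires boto3, AWS credentials, and a default region):
#   import boto3
#   arn = create_email_topic(boto3.client("sns"), "anomaly-alarm-topic", "you@example.com")
```

As with a topic created in the console, the email subscription stays pending until the recipient confirms it from the message SNS sends.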
Next, give the alarm a name and click Next again to review the configuration.
Click Create to create the alarm.
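The console steps above correspond roughly to a single `put_metric_alarm` call. The sketch below only builds the request parameters so you can inspect them before sending anything. Several values are assumptions rather than settings taken from this walkthrough: the helper name `anomaly_alarm_params`, the namespace and cluster name in the usage example, the `Average` statistic, and the 5-minute period.

```python
def anomaly_alarm_params(alarm_name, namespace, metric_name, dimensions,
                         band_width=2, topic_arn=None):
    """Build keyword arguments for the CloudWatch put_metric_alarm API."""
    params = {
        "AlarmName": alarm_name,
        # "Outside of the band": alarm on values above or below the band.
        "ComparisonOperator": "LessThanLowerOrGreaterThanUpperThreshold",
        "EvaluationPeriods": 5,
        "DatapointsToAlarm": 2,
        # The band expression below plays the role of the threshold.
        "ThresholdMetricId": "ad1",
        "Metrics": [
            {
                "Id": "m1",
                "ReturnData": True,
                "MetricStat": {
                    "Metric": {
                        "Namespace": namespace,
                        "MetricName": metric_name,
                        "Dimensions": [
                            {"Name": name, "Value": value}
                            for name, value in dimensions.items()
                        ],
                    },
                    "Period": 300,      # assumed 5-minute period
                    "Stat": "Average",  # assumed statistic
                },
            },
            {
                "Id": "ad1",
                "ReturnData": True,
                "Expression": f"ANOMALY_DETECTION_BAND(m1, {band_width})",
                "Label": f"{metric_name} (expected)",
            },
        ],
    }
    if topic_arn:
        params["AlarmActions"] = [topic_arn]
    return params


# Hypothetical values; substitute the namespace and dimensions shown in your account.
params = anomaly_alarm_params(
    alarm_name="pod-memory-anomaly",
    namespace="ContainerInsights",
    metric_name="pod_memory_utilization",
    dimensions={"ClusterName": "my-cluster"},
)
print(params["Metrics"][1]["Expression"])  # ANOMALY_DETECTION_BAND(m1, 2)

# To create the alarm (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

The second argument of `ANOMALY_DETECTION_BAND` is the same threshold value you adjusted in the console, so a larger `band_width` produces a wider band.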
Once you have created the alarm, you will notice that it is in the Insufficient data state, which indicates that there is not yet enough data to evaluate the alarm. After about 5 minutes, the alarm state changes to OK, shown in green.
Click the alarm to see its details, as shown below. Notice that the alarm state changed to ‘OK’ once the evaluation was complete.
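The same state transition can be observed from a script instead of the console. The sketch below polls `describe_alarms` until the alarm reaches a target state; `wait_for_state` is a hypothetical helper, and the attempt count and delay are arbitrary choices. The API reports states as `OK`, `ALARM`, and `INSUFFICIENT_DATA`.

```python
import time


def wait_for_state(cloudwatch_client, alarm_name, target="OK",
                   attempts=10, delay_seconds=60, sleep=time.sleep):
    """Poll an alarm until it reaches the target state; return True on success."""
    for _ in range(attempts):
        response = cloudwatch_client.describe_alarms(AlarmNames=[alarm_name])
        alarms = response["MetricAlarms"]
        state = alarms[0]["StateValue"] if alarms else None
        if state == target:
            return True
        sleep(delay_seconds)  # wait before checking again
    return False


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   wait_for_state(boto3.client("cloudwatch"), "pod-memory-anomaly")
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can replace the delay with a no-op.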