Prometheus metrics

For background, see the CloudWatch documentation on Prometheus metrics support.

CloudWatch Container Insights monitoring for Prometheus automates the discovery of Prometheus metrics from containerized systems and workloads. Prometheus is an open-source systems monitoring and alerting toolkit.

The PetAdoptions application already exposes metrics in the Prometheus exposition format. Append /metrics to the application URL to view the metrics in a browser. You should see a screen similar to the one below.

Prometheus Metrics
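If you prefer the command line to the browser, the same output can be filtered with standard tools. The following is a minimal sketch, not part of the workshop: the sample text is invented for illustration (only the metric names come from this page), and `sample_metrics`, `filter_metrics`, and `PETSITE_URL` are hypothetical names.

```shell
# Illustrative sample only: metric names are taken from this page,
# the values and help text are invented.
sample_metrics() {
  cat <<'EOF'
# HELP petsite_petadoptions_total Hypothetical help text
# TYPE petsite_petadoptions_total counter
petsite_petadoptions_total 12
# TYPE petsite_pets_waiting_for_adoption gauge
petsite_pets_waiting_for_adoption 7
# TYPE process_num_threads gauge
process_num_threads 23
EOF
}

# Drop comment lines (# HELP / # TYPE) and keep samples whose
# metric name starts with the given prefix ($1).
filter_metrics() {
  grep -v '^#' | grep "^$1"
}

sample_metrics | filter_metrics petsite_

# Against the live application, assuming PETSITE_URL holds the application URL:
#   curl -s "${PETSITE_URL}/metrics" | filter_metrics petsite_
```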

Deploy the CloudWatch Prometheus agent on the Petsite ECS cluster

Make sure you are in the /cdk/pet_stack folder, then run the following commands:


STACK_NAME=$(aws ssm get-parameter --name '/petstore/stackname' --region $AWS_REGION | jq .Parameter.Value -r)

PETSITE_ECS_CLUSTER=$(aws cloudformation describe-stack-resources  --stack-name $STACK_NAME | jq -r '.StackResources[] | select(.ResourceType == "AWS::ECS::Cluster") | select(.LogicalResourceId | contains("PetSite")) | .PhysicalResourceId')

PETSITE_ECS_SG=$(aws cloudformation describe-stack-resources  --stack-name $STACK_NAME | jq -r '.StackResources[] | select(.ResourceType == "AWS::EC2::SecurityGroup") | select(.LogicalResourceId | contains("petsiteserviceecsserviceServiceSecurityGroup")) | .PhysicalResourceId')

PETSITE_ECS_SUBNET=$(aws cloudformation describe-stack-resources  --stack-name $STACK_NAME | jq -r '[.StackResources[] | select(.ResourceType == "AWS::EC2::Subnet") | select(.LogicalResourceId | contains("MicroservicesPrivateSubnet"))][0] | .PhysicalResourceId')

aws ec2 authorize-security-group-ingress --group-id ${PETSITE_ECS_SG} --protocol tcp --port 80 --source-group ${PETSITE_ECS_SG}

aws cloudformation create-stack --stack-name CWProm-ECS-${PETSITE_ECS_CLUSTER} \
    --template-body file://./resources/cwagent-ecs-prometheus-metric-for-awsvpc.yaml \
    --parameters ParameterKey=ECSClusterName,ParameterValue=${PETSITE_ECS_CLUSTER} \
                 ParameterKey=CreateIAMRoles,ParameterValue=True \
                 ParameterKey=ECSLaunchType,ParameterValue=FARGATE \
                 ParameterKey=SecurityGroupID,ParameterValue=${PETSITE_ECS_SG} \
                 ParameterKey=SubnetID,ParameterValue=${PETSITE_ECS_SUBNET} \
                 ParameterKey=TaskRoleName,ParameterValue=CWProm-Task-${PETSITE_ECS_CLUSTER} \
                 ParameterKey=ExecutionRoleName,ParameterValue=CWProm-Exec-${PETSITE_ECS_CLUSTER} \
    --capabilities CAPABILITY_NAMED_IAM \
    --region ${AWS_REGION}
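The create-stack call returns before the stack has finished deploying. As a rough sketch, you could block until creation completes and then print the final status. `wait_for_stack` is a hypothetical helper name; the `aws cloudformation wait` and `describe-stacks` subcommands are standard AWS CLI.

```shell
# Block until the stack finishes creating, then print its status.
# Arguments: $1 = stack name, $2 = AWS region.
wait_for_stack() {
  aws cloudformation wait stack-create-complete \
    --stack-name "$1" --region "$2" || return 1
  aws cloudformation describe-stacks --stack-name "$1" --region "$2" \
    --query 'Stacks[0].StackStatus' --output text
}

# Usage, with the variables set earlier on this page:
#   wait_for_stack "CWProm-ECS-${PETSITE_ECS_CLUSTER}" "${AWS_REGION}"
```

If the waiter fails (for example because the stack rolled back), the function returns a non-zero status without printing anything.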

Once the stack is deployed, run the following command to verify that the service was created successfully.

aws ecs list-services --cluster $PETSITE_ECS_CLUSTER

You should see output similar to the one below. Notice the service whose name starts with cwagent-prometheus. That is the new service deployed by the stack you just created.

Validate agent deployment
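To check more than the service's presence, you could also compare its desired and running task counts. This is a sketch under the assumption that the agent service name contains `cwagent-prometheus`; both function names are hypothetical.

```shell
# Print the ARN of the CloudWatch agent service in a cluster.
# Argument: $1 = ECS cluster name.
find_cwagent_service() {
  aws ecs list-services --cluster "$1" \
    --query "serviceArns[?contains(@, 'cwagent-prometheus')]" --output text
}

# Print the service status, desired task count, and running task count
# (tab-separated) for that service.
cwagent_service_counts() {
  aws ecs describe-services --cluster "$1" \
    --services "$(find_cwagent_service "$1")" \
    --query 'services[0].[status, desiredCount, runningCount]' --output text
}

# Usage:
#   cwagent_service_counts "${PETSITE_ECS_CLUSTER}"
```

A healthy deployment would typically report a status of ACTIVE with the running count equal to the desired count.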

Explore Prometheus metrics

The CloudWatch Prometheus agent creates a new log group named /aws/ecs/containerinsights/<ECS_CLUSTER_NAME>/prometheus and publishes the metrics it collects from the environment to that log group as log events in embedded metric format (EMF).

Go to the CloudWatch Log Groups console and open the log group. You should see a log stream named petsite-webapp that contains the collected log events, as shown below.

Prometheus Log events
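The same check can be done from the terminal. The sketch below first discovers the log group by its name prefix and suffix rather than hard-coding it, then prints recent events from the petsite-webapp stream. The function names are hypothetical, and `aws logs tail` assumes AWS CLI version 2.

```shell
# List Container Insights log groups whose name ends in /prometheus.
# Argument: $1 = AWS region.
find_prometheus_log_groups() {
  aws logs describe-log-groups --region "$1" \
    --log-group-name-prefix /aws/ecs/containerinsights/ \
    --query "logGroups[?ends_with(logGroupName, '/prometheus')].logGroupName" \
    --output text
}

# Print the last 10 minutes of events from the petsite-webapp stream.
# Arguments: $1 = log group name, $2 = AWS region.
tail_petsite_events() {
  aws logs tail "$1" --region "$2" \
    --log-stream-names petsite-webapp --since 10m
}

# Usage:
#   LOG_GROUP=$(find_prometheus_log_groups "${AWS_REGION}")
#   tail_petsite_events "${LOG_GROUP}" "${AWS_REGION}"
```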

Now go to CloudWatch Metrics. You will see a new namespace named ECS/ContainerInsights/Prometheus, with the metrics grouped under two different sets of dimensions.

Prometheus Metrics dimensions
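As a rough command-line equivalent, you could list what has been published to the namespace so far. The two function names are hypothetical; `aws cloudwatch list-metrics` and its `--namespace` and `--query` options are standard AWS CLI.

```shell
# Print the distinct metric names in the Prometheus namespace, one per line.
# Argument: $1 = AWS region.
list_prometheus_metric_names() {
  aws cloudwatch list-metrics --region "$1" \
    --namespace ECS/ContainerInsights/Prometheus \
    --query 'Metrics[].MetricName' --output text | tr '\t' '\n' | sort -u
}

# Print the distinct dimension names used in the namespace, one per line.
list_prometheus_dimension_names() {
  aws cloudwatch list-metrics --region "$1" \
    --namespace ECS/ContainerInsights/Prometheus \
    --query 'Metrics[].Dimensions[].Name' --output text | tr '\t' '\n' | sort -u
}

# Usage:
#   list_prometheus_metric_names "${AWS_REGION}"
#   list_prometheus_dimension_names "${AWS_REGION}"
```

New metrics can take a few minutes to appear in list-metrics, so an empty result right after deployment is not necessarily an error.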

You can graph the collected Prometheus metrics and use them like any other CloudWatch metrics, for example in dashboards and alarms.

Prometheus Metrics Graph

How are metrics being collected?

Open the cwagent-ecs-prometheus-metric-for-awsvpc.yaml file in the resources folder. You will see the following JSON configuration embedded in the YAML file.

The section below is part of the agent's service discovery settings. We use task definition based service discovery here, so we provide a task definition ARN pattern; the agent uses the ECS APIs to look up matching task definitions and retrieve the information it needs. The section also sets the port (80) and the path (/metrics) from which the agent scrapes Prometheus metrics.

The ECS CloudWatch Prometheus agent also supports Docker label based service discovery. See the CloudWatch documentation for details.

{
    "sd_job_name": "petsite-webapp",
    "sd_metrics_ports": "80",
    "sd_task_definition_arn_pattern": ".*:task-definition/Servicespetsite.*:[0-9]+",
    "sd_metrics_path": "/metrics"
}
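To get a feel for which task definitions the ARN pattern above would select, you can try it against a few ARNs with grep. This is only an approximation: grep's extended regular expressions stand in for the agent's own regex engine, anchoring the whole ARN is an assumption of this sketch, and both ARNs are made up.

```shell
# Pattern copied from sd_task_definition_arn_pattern above.
PATTERN='.*:task-definition/Servicespetsite.*:[0-9]+'

# Succeed if the whole ARN ($1) matches the pattern. Anchoring with ^...$
# is an assumption of this sketch, not a statement about the agent.
arn_matches() {
  printf '%s\n' "$1" | grep -Eq "^(${PATTERN})\$"
}

# Hypothetical ARNs for illustration only.
for arn in \
  'arn:aws:ecs:us-east-1:123456789012:task-definition/ServicespetsiteExample:3' \
  'arn:aws:ecs:us-east-1:123456789012:task-definition/SomeOtherTask:1'
do
  if arn_matches "$arn"; then
    echo "match:    $arn"
  else
    echo "no match: $arn"
  fi
done
```

Only the first ARN matches, because its task definition family begins with Servicespetsite and it ends in a numeric revision.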

This section indicates the specific metrics to be collected from the metrics endpoint, along with the CloudWatch dimensions attached to them. See the documentation to learn more about scrape configuration options.

{
    "source_labels": ["container_name"],
    "label_matcher": "container",
    "dimensions": [["ClusterName","TaskDefinitionFamily"]],
    "metric_selectors": [
        "^process_cpu_seconds_total$",
        "^process_open_handles$",
        "^process_virtual_memory_bytes$",
        "^process_start_time_seconds$",
        "^process_private_memory_bytes$",
        "^process_working_set_bytes$",
        "^process_num_threads$"
    ]
},
{
    "source_labels": ["container_name"],
    "label_matcher": "container",
    "dimensions": [["ClusterName","TaskDefinitionFamily"]],
    "metric_selectors": [
        "^dotnet_total_memory_bytes$",
        "^dotnet_collection_count_total$",
        "^dotnet_gc_finalization_queue_length$",
        "^dotnet_jit_method_seconds_total$",
        "^dotnet_jit_method_total$",
        "^dotnet_threadpool_adjustments_total$",
        "^dotnet_threadpool_io_num_threads$",
        "^dotnet_threadpool_num_threads$",
        "^dotnet_gc_pinned_objects$",
        "^dotnet_gc_allocated_bytes_total$"
    ]
},
{
    "source_labels": ["container_name"],
    "label_matcher": "container",
    "dimensions": [["ClusterName","TaskGroup"]],
    "metric_selectors": [
        "petsite_pet_bunny_searches_total",
        "petsite_pets_waiting_for_adoption",
        "petsite_petadoptions_total"
    ]
}
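Note that the first two blocks anchor their selectors with ^ and $, while the petsite selectors are unanchored. The sketch below illustrates the practical difference using grep, assuming the selectors behave as ordinary unanchored regular expressions (we have not verified the agent's exact matching semantics here). The metric names in the list are partly hypothetical.

```shell
# Print the metric names from stdin that match a selector regex ($1).
select_metrics() {
  grep -E "$1"
}

# Hypothetical metric names, loosely based on the selectors above.
names='process_num_threads
process_num_threads_peak
petsite_petadoptions_total
my_petsite_petadoptions_total_v2'

# Anchored selector: matches the exact name only.
printf '%s\n' "$names" | select_metrics '^process_num_threads$'

# Unanchored selector: also matches names that merely contain the text.
printf '%s\n' "$names" | select_metrics 'petsite_petadoptions_total'
```

Under that assumption, anchoring is the safer choice when metric names share a common prefix or suffix.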

Clean up the stack

aws cloudformation delete-stack --stack-name CWProm-ECS-${PETSITE_ECS_CLUSTER} --region ${AWS_REGION}
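Deleting the stack does not remove the security group ingress rule that was added by hand earlier on this page. If you want a fuller cleanup, a sketch like the following deletes the stack, waits for deletion to finish, and then revokes that rule. `cleanup_cwprom` is a hypothetical helper; skip the revoke step if other parts of your environment rely on the rule.

```shell
# Delete the agent stack, wait for deletion, then remove the ingress rule.
# Arguments: $1 = ECS cluster name, $2 = security group ID, $3 = AWS region.
cleanup_cwprom() {
  aws cloudformation delete-stack \
    --stack-name "CWProm-ECS-$1" --region "$3" || return 1
  aws cloudformation wait stack-delete-complete \
    --stack-name "CWProm-ECS-$1" --region "$3" || return 1
  aws ec2 revoke-security-group-ingress --group-id "$2" \
    --protocol tcp --port 80 --source-group "$2" --region "$3"
}

# Usage:
#   cleanup_cwprom "${PETSITE_ECS_CLUSTER}" "${PETSITE_ECS_SG}" "${AWS_REGION}"
```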