In the previous section, we collected metrics from EKS. In this module, we will collect application and platform metrics from services running on ECS using the AWS Distro for OpenTelemetry (ADOT) collector.
Execute the following command to see the raw metrics from the application.
```shell
curl -s "$(aws ssm get-parameter --name /petstore/petlistadoptionsmetricsurl --query Parameter.Value --output text)" | grep petlistadoptions
```
You should see outputs similar to this:
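The exact metric names and values depend on the application, so purely as an illustration of the Prometheus exposition format you should expect, here is a made-up sample (not the real PetListAdoptions metrics):

```
# HELP petlistadoptions_requests_total Total requests served (illustrative name)
# TYPE petlistadoptions_requests_total counter
petlistadoptions_requests_total{method="GET"} 42
```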
Let’s now use the ADOT collector to scrape those metrics. Execute the following script to build an ECR image with a custom configuration, called a pipeline, that collects application and ECS platform metrics.
```shell
WORKSPACE_ID=$(aws amp list-workspaces --alias observability-workshop | jq -r '.workspaces[0].workspaceId')
ECR_REPOSITORY_URI=$(aws ecr create-repository --repository-name aws-otel-collector-petlistadoptions --query repository.repositoryUri --output text)
./resources/build-adot-ecr.sh "$WORKSPACE_ID" "$ECR_REPOSITORY_URI" "$AWS_REGION"
```
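Note that `aws amp list-workspaces` returns a `workspaces` array, so the workspace ID must be taken from an element of that array. A minimal sketch of the `jq` extraction against an illustrative response (the workspace ID below is made up):

```shell
# Illustrative response shape from `aws amp list-workspaces --alias observability-workshop`
cat > /tmp/workspaces.json <<'EOF'
{"workspaces": [{"alias": "observability-workshop", "workspaceId": "ws-0a1b2c3d", "status": {"statusCode": "ACTIVE"}}]}
EOF

# Extract the ID of the first (and only) matching workspace
jq -r '.workspaces[0].workspaceId' /tmp/workspaces.json
```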
The previous step created an Amazon ECR image with a custom configuration. This configuration describes a pipeline: the path data follows through the collector, starting at reception, then further processing or modification, and finally exiting the collector via exporters. Let’s walk through the configuration. Execute the following script to view the configuration file.
Our pipeline consists of four sections:
Receivers are in charge of getting telemetry data into the collector. In this case, we have two receivers:

- `prometheus`, with a standard Prometheus configuration to scrape the application metrics
- `awsecscontainermetrics`, which reads task metadata and Docker stats from the Amazon ECS Task Metadata Endpoint and generates resource usage metrics (such as CPU, memory, network, and disk) from them
```yaml
receivers:
  prometheus:
    config:
      global:
        scrape_interval: 15s
        scrape_timeout: 10s
      scrape_configs:
        - job_name: "test-prometheus-sample-app"
          static_configs:
            - targets: [0.0.0.0:80]
  awsecscontainermetrics:
    collection_interval: 15s
```
A pipeline can contain sequentially connected processors. The first processor gets the data from one or more receivers configured for the pipeline, and the last processor sends the data on to one or more exporters configured for the pipeline.
After getting data from the receiver, the processors section specifies exactly which metrics we expect from ECS using `filter`; we then apply `metricstransform` to rename our metrics; and finally, we use `resource` to rename resource attributes, which will be stored as metric labels.
```yaml
processors:
  filter:
    metrics:
      include:
        match_type: strict
        metric_names:
          - ecs.task.memory.utilized
          - ecs.task.memory.reserved
          - ecs.task.memory.usage
          - ecs.task.cpu.utilized
          - ecs.task.cpu.reserved
          - ecs.task.cpu.usage.vcpu
          - ecs.task.network.rate.rx
          - ecs.task.network.rate.tx
          - ecs.task.storage.read_bytes
          - ecs.task.storage.write_bytes
  metricstransform:
    transforms:
      - metric_name: ecs.task.memory.utilized
        action: update
        new_name: MemoryUtilized
      - metric_name: ecs.task.memory.reserved
        action: update
        new_name: MemoryReserved
      - metric_name: ecs.task.memory.usage
        action: update
        new_name: MemoryUsage
      - metric_name: ecs.task.cpu.utilized
        action: update
        new_name: CpuUtilized
      - metric_name: ecs.task.cpu.reserved
        action: update
        new_name: CpuReserved
      - metric_name: ecs.task.cpu.usage.vcpu
        action: update
        new_name: CpuUsage
      - metric_name: ecs.task.network.rate.rx
        action: update
        new_name: NetworkRxBytes
      - metric_name: ecs.task.network.rate.tx
        action: update
        new_name: NetworkTxBytes
      - metric_name: ecs.task.storage.read_bytes
        action: update
        new_name: StorageReadBytes
      - metric_name: ecs.task.storage.write_bytes
        action: update
        new_name: StorageWriteBytes
  resource:
    attributes:
      - key: ClusterName
        from_attribute: aws.ecs.cluster.name
        action: insert
      - key: aws.ecs.cluster.name
        action: delete
      - key: ServiceName
        from_attribute: aws.ecs.service.name
        action: insert
      - key: aws.ecs.service.name
        action: delete
      - key: TaskId
        from_attribute: aws.ecs.task.id
        action: insert
      - key: aws.ecs.task.id
        action: delete
      - key: TaskDefinitionFamily
        from_attribute: aws.ecs.task.family
        action: insert
      - key: aws.ecs.task.family
        action: delete
```
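As a rough analogy (not the collector's actual implementation), each insert-then-delete pair in the `resource` processor renames an attribute key. The same effect can be sketched with `jq` over a sample attribute map (the values are made up):

```shell
# Sample resource attributes as emitted by the awsecscontainermetrics receiver (illustrative values)
cat > /tmp/attributes.json <<'EOF'
{"aws.ecs.cluster.name": "PetListAdoptions", "aws.ecs.task.id": "abcd1234"}
EOF

# insert ClusterName from aws.ecs.cluster.name, then delete the original key
jq '.ClusterName = .["aws.ecs.cluster.name"] | del(.["aws.ecs.cluster.name"])' /tmp/attributes.json
```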
Exporters are responsible for forwarding the data to one or more destinations. In this case, we use the `awsprometheusremotewrite` exporter to send the collected metrics to an AMP workspace.
```yaml
exporters:
  awsprometheusremotewrite:
    endpoint: https://aps-workspaces.eu-west-1.amazonaws.com/workspaces/<workspace-id>/api/v1/remote_write
    aws_auth:
      region: eu-west-1
      service: aps
    resource_to_telemetry_conversion:
      enabled: true
  logging:
    loglevel: debug
```
The `service` section glues everything together and defines the pipelines:
```yaml
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [logging, awsprometheusremotewrite]
    metrics/ecs:
      receivers: [awsecscontainermetrics]
      processors: [filter]
      exporters: [logging, awsprometheusremotewrite]
```
Visit the OpenTelemetry collector design for more details.
In the Amazon ECS console, go to Task Definitions, search for *listadoptionsservicetaskDefinition*, select the latest version, open the IAM task role in a new browser tab, and create a new revision.
Replace the image `public.ecr.aws/aws-observability/aws-otel-collector:latest` with the ECR image URI created earlier.
The script we used to create the image prints the image URI; look for output similar to: `The push refers to repository [123456789012.dkr.ecr.eu-west-1.amazonaws.com/aws-otel-collector-petlistadoptions]`.
In order to successfully push metrics into the AMP workspace, the ECS task will need some IAM permissions.
Go to the previously opened tab with the task IAM role. You can also search for `listadoptionsservicetaskRole` in the IAM console.
Click Attach policies, search for `AmazonPrometheusRemoteWriteAccess`, and proceed.
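If you prefer the CLI, the same policy attachment can be sketched as follows; the role name is taken from the console step above, and the `aws iam` call itself requires AWS credentials, so it is shown commented out:

```shell
# AWS managed policy granting remote-write access to AMP
POLICY_ARN="arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess"
ROLE_NAME="listadoptionsservicetaskRole"
echo "$POLICY_ARN"

# Requires credentials with iam:AttachRolePolicy permission:
# aws iam attach-role-policy --role-name "$ROLE_NAME" --policy-arn "$POLICY_ARN"
```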
*PetListAdoptions* cluster, and select the
Open the CloudWatch console and navigate to CloudWatch Logs Insights. Select the log group `/ecs/PetListAdoptions` and enter the following query:
```
fields @message
| filter @logStream like 'aws-otel-collector'
| sort @timestamp desc
| limit 100
```
For more information on how to query logs using CloudWatch Logs Insights, please refer to this section of the workshop.
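To verify end to end that metrics actually reached the workspace, you can query AMP directly with a SigV4-signed request, for example using the `awscurl` tool. A sketch that builds the query URL (the workspace ID below is illustrative; substitute your own, and note the signed call itself requires AWS credentials, so it is commented out):

```shell
WORKSPACE_ID="ws-0a1b2c3d"   # illustrative; substitute your workspace ID
AWS_REGION="eu-west-1"
QUERY_URL="https://aps-workspaces.${AWS_REGION}.amazonaws.com/workspaces/${WORKSPACE_ID}/api/v1/query?query=MemoryUtilized"
echo "$QUERY_URL"

# Requires AWS credentials and awscurl (pip install awscurl):
# awscurl --service aps --region "$AWS_REGION" "$QUERY_URL"
```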
This concludes this section. To visualize the metrics collected, you may continue on to the next sections covering self-managed Grafana and Amazon Managed Service for Grafana.