AWS CloudWatch logs for Container Insights contain no CPU usage metrics when setting collection_interval to more than 300s #36109

@oleksandr-san

Component(s)

receiver/awscontainerinsight

What happened?

Description

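We tried to increase the collection_interval parameter of the awscontainerinsightreceiver receiver to reduce AWS CloudWatch costs. With any value above 300s, the Container Insights log events in CloudWatch no longer contain CPU usage metrics.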

I traced this to the TTL of the map used to store metric deltas: when the collection interval is more than 5 minutes, delta calculation breaks because older entries are removed before new deltas are applied.

Increasing the cleanInterval to 15 minutes helps.
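
To illustrate the suspected mechanism, here is a minimal sketch of a TTL-bound delta store. It is not the receiver's actual code: the names deltaStore, entry, rate and emitted are hypothetical, and cleaning up on every call is a simplification of a periodic cleaner. The 5-minute and 15-minute TTLs and the 600s interval come from this report; the 60s interval is only an example of a value below the TTL.

```go
package main

import (
	"fmt"
	"time"
)

// entry holds the last raw counter value seen for a key and when it was stored.
type entry struct {
	value float64
	seen  time.Time
}

// deltaStore is a hypothetical stand-in for a TTL-bound map of previous samples.
type deltaStore struct {
	ttl     time.Duration
	entries map[string]entry
}

func newDeltaStore(ttl time.Duration) *deltaStore {
	return &deltaStore{ttl: ttl, entries: make(map[string]entry)}
}

// cleanUp drops every entry older than the TTL.
func (s *deltaStore) cleanUp(now time.Time) {
	for k, e := range s.entries {
		if now.Sub(e.seen) > s.ttl {
			delete(s.entries, k)
		}
	}
}

// rate returns the per-second rate since the previous sample.
// ok is false when no previous sample survived, so no metric can be emitted.
func (s *deltaStore) rate(key string, value float64, now time.Time) (r float64, ok bool) {
	s.cleanUp(now)
	prev, found := s.entries[key]
	s.entries[key] = entry{value: value, seen: now}
	if !found {
		return 0, false
	}
	return (value - prev.value) / now.Sub(prev.seen).Seconds(), true
}

// emitted counts how many of the given scrapes, spaced intervalSeconds apart,
// produce a rate when previous samples expire after ttlSeconds.
func emitted(ttlSeconds, intervalSeconds, scrapes int) int {
	s := newDeltaStore(time.Duration(ttlSeconds) * time.Second)
	now := time.Unix(0, 0)
	count := 0
	for i := 0; i < scrapes; i++ {
		if _, ok := s.rate("container_cpu_usage_total", float64(i)*100, now); ok {
			count++
		}
		now = now.Add(time.Duration(intervalSeconds) * time.Second)
	}
	return count
}

func main() {
	fmt.Println(emitted(300, 60, 4))  // 3: every scrape after the first finds a previous sample
	fmt.Println(emitted(300, 600, 4)) // 0: the previous sample always expires first
	fmt.Println(emitted(900, 600, 4)) // 3: a 15-minute TTL keeps the previous sample alive
}
```

In this model, any interval longer than the TTL yields no rate at all, which would match the missing CPU usage fields, while a TTL above the interval restores them.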

Steps to Reproduce

  1. Create any EKS cluster
  2. Install the OpenTelemetry Collector, configured to collect AWS Container Insights metrics
  3. Set receivers.awscontainerinsightreceiver.collection_interval to 600s
  4. Restart the daemonset
  5. Wait for 15-20 minutes

Expected Result

Log events in CloudWatch contain CPU usage metrics

Actual Result

Log events in CloudWatch do not contain CPU usage metrics

Collector version

0.41.1

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

extensions:
  health_check:

receivers:
  awscontainerinsightreceiver:
    collection_interval: 600s

processors:
  batch/metrics:
    timeout: 60s

exporters:
  awsemf:
    namespace: ContainerInsights
    log_group_name: '/aws/containerinsights/{ClusterName}/performance'
    log_stream_name: '{NodeName}'
    resource_to_telemetry_conversion:
      enabled: true
    dimension_rollup_option: NoDimensionRollup
    parse_json_encoded_attr_values: [Sources, kubernetes]
    metric_declarations:
      # cluster metrics
      - dimensions: [[ClusterName]]
        metric_name_selectors:
          - cluster_node_count
          - cluster_failed_node_count

service:
  pipelines:
    metrics:
      receivers: [awscontainerinsightreceiver]
      processors: [batch/metrics]
      exporters: [awsemf]

  extensions: [health_check]

Log output

No response

Additional context

Log event with collection_interval == 600s:

{
    "AutoScalingGroupName": "eks-agent-ng-arm64-4ac815a7-3a71-20b4-a604-aa35acfabcd4",
    "ClusterName": "cluster-with-agent",
    "InstanceId": "i-019f99ea685e48c83",
    "InstanceType": "t4g.medium",
    "Namespace": "kube-system",
    "NodeName": "ip-172-31-28-91.eu-north-1.compute.internal",
    "PodName": "aws-node",
    "Sources": [
        "cadvisor",
        "pod",
        "calculated"
    ],
    "Timestamp": "1730302312567",
    "Type": "Container",
    "Version": "0",
    "container_memory_cache": 106377216,
    "container_memory_failcnt": 0,
    "container_memory_mapped_file": 811008,
    "container_memory_max_usage": 160075776,
    "container_memory_rss": 28655616,
    "container_memory_swap": 0,
    "container_memory_usage": 136433664,
    "container_memory_utilization": 1.1755803143695827,
    "container_memory_working_set": 47341568,
    "container_status": "Running",
    "kubernetes": {
        "container_name": "aws-node",
        "containerd": {
            "container_id": "aabb7c4bea02cfe72371bb5a36bbcd23eff478078c6e920b77e1e9e0ade591b9"
        },
        "host": "ip-172-31-28-91.eu-north-1.compute.internal",
        "labels": {
            "app.kubernetes.io/instance": "aws-vpc-cni",
            "app.kubernetes.io/name": "aws-node",
            "controller-revision-hash": "588469c5c6",
            "k8s-app": "aws-node",
            "pod-template-generation": "2"
        },
        "namespace_name": "kube-system",
        "pod_id": "c3476737-e9d4-44cb-a20f-dcb812ac9091",
        "pod_name": "aws-node-wghkn",
        "pod_owners": [
            {
                "owner_kind": "DaemonSet",
                "owner_name": "aws-node"
            }
        ]
    },
    "number_of_container_restarts": 0
}

Log event with the default configuration:

{
    "AutoScalingGroupName": "eks-agent-ng-1ac79c42-2aa5-ff45-0c1e-b03d703c0d47",
    "ClusterName": "cluster-with-agent",
    "InstanceId": "i-0becbf3535f001cb4",
    "InstanceType": "t3.medium",
    "Namespace": "kube-system",
    "NodeName": "ip-172-31-25-41.eu-north-1.compute.internal",
    "PodName": "aws-node",
    "Sources": [
        "cadvisor",
        "pod",
        "calculated"
    ],
    "Timestamp": "1730371819323",
    "Type": "Container",
    "Version": "0",
    "container_cpu_request": 25,
    "container_cpu_usage_system": 1.3264307613654849,
    "container_cpu_usage_total": 2.9252373450029627,
    "container_cpu_usage_user": 1.393591812573864,
    "container_cpu_utilization": 0.14626186725014814,
    "container_memory_cache": 24600576,
    "container_memory_failcnt": 0,
    "container_memory_hierarchical_pgfault": 267.61999880258816,
    "container_memory_hierarchical_pgmajfault": 0,
    "container_memory_mapped_file": 270336,
    "container_memory_max_usage": 56954880,
    "container_memory_pgfault": 267.61999880258816,
    "container_memory_pgmajfault": 0,
    "container_memory_rss": 26337280,
    "container_memory_swap": 0,
    "container_memory_usage": 52269056,
    "container_memory_utilization": 1.1655047122298874,
    "container_memory_working_set": 47063040,
    "container_status": "Running",
    "kubernetes": {
        "container_name": "aws-node",
        "containerd": {
            "container_id": "b038c0f909602224fa9e1b1351379ff2dc48d0de3e96f720ed80316ada28aca2"
        },
        "host": "ip-172-31-25-41.eu-north-1.compute.internal",
        "labels": {
            "app.kubernetes.io/instance": "aws-vpc-cni",
            "app.kubernetes.io/name": "aws-node",
            "controller-revision-hash": "588469c5c6",
            "k8s-app": "aws-node",
            "pod-template-generation": "2"
        },
        "namespace_name": "kube-system",
        "pod_id": "5e453328-d24c-45d8-9451-7274248cd447",
        "pod_name": "aws-node-wt85g",
        "pod_owners": [
            {
                "owner_kind": "DaemonSet",
                "owner_name": "aws-node"
            }
        ]
    },
    "number_of_container_restarts": 0
}
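
For anyone reproducing this, a quick way to tell the two kinds of events apart is to look for the container_cpu_usage fields. The following is a rough sketch rather than part of the collector: hasCPUUsage is a hypothetical helper, and the sample events are shortened versions of the two events above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// hasCPUUsage reports whether a JSON log event has any top-level key
// that starts with "container_cpu_usage".
func hasCPUUsage(event []byte) (bool, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(event, &fields); err != nil {
		return false, err
	}
	for name := range fields {
		if strings.HasPrefix(name, "container_cpu_usage") {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Shortened event as exported with collection_interval set to 600s.
	broken := []byte(`{"Type":"Container","container_memory_rss":28655616}`)
	// Shortened event as exported with the default configuration.
	healthy := []byte(`{"Type":"Container","container_cpu_usage_total":2.9252373450029627,"container_memory_rss":26337280}`)

	for _, ev := range [][]byte{broken, healthy} {
		ok, err := hasCPUUsage(ev)
		if err != nil {
			fmt.Println("invalid event:", err)
			continue
		}
		fmt.Println("has CPU usage metrics:", ok)
	}
}
```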
