Closed as not planned
Labels
Stale, bug (Something isn't working), closed as inactive, needs triage (New item requiring triage), receiver/awscontainerinsight
Description
Component(s)
receiver/awscontainerinsight
What happened?
Description
We tried to increase the collection_interval parameter of the awscontainerinsightreceiver
component to reduce AWS CloudWatch costs.
I found that the problem is related to the TTL of the map used to store metric deltas: when the collection interval is longer than 5 minutes, delta calculation breaks because the previous values are evicted before the new ones arrive.
Increasing the cleanInterval
to 15 minutes helps.
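To illustrate the suspected mechanism, here is a minimal, self-contained sketch; the names (rateCalculator, sample, cleanInterval) and the 5-minute TTL are my assumptions for illustration, not the receiver's actual code. Because CPU usage is reported as a rate computed from the previous sample, evicting that sample before the next scrape means no rate can be emitted.

// Illustrative sketch only: it mimics a delta/rate cache with a periodic
// cleanup and shows why a collection interval longer than the cache TTL
// means every scrape looks like the first one, so rate metrics are dropped.
package main

import (
	"fmt"
	"time"
)

const cleanInterval = 5 * time.Minute // assumed TTL of the previous-sample cache

type sample struct {
	value float64
	ts    time.Time
}

type rateCalculator struct {
	prev map[string]sample
}

func newRateCalculator() *rateCalculator {
	return &rateCalculator{prev: map[string]sample{}}
}

// rate returns (delta/elapsed, true) when a previous sample exists, otherwise false.
func (c *rateCalculator) rate(key string, value float64, now time.Time) (float64, bool) {
	p, ok := c.prev[key]
	c.prev[key] = sample{value: value, ts: now}
	if !ok {
		return 0, false // first observation: no delta yet
	}
	return (value - p.value) / now.Sub(p.ts).Seconds(), true
}

// clean evicts entries older than cleanInterval, mimicking the periodic cleanup.
func (c *rateCalculator) clean(now time.Time) {
	for k, p := range c.prev {
		if now.Sub(p.ts) > cleanInterval {
			delete(c.prev, k)
		}
	}
}

func main() {
	start := time.Now()

	// collection_interval = 600s: the cleanup that runs between scrapes
	// (simulated once here) removes the previous sample, so no rate is emitted.
	c := newRateCalculator()
	c.rate("container_cpu_usage_total", 100, start)
	c.clean(start.Add(9 * time.Minute))
	if _, ok := c.rate("container_cpu_usage_total", 160, start.Add(10*time.Minute)); !ok {
		fmt.Println("600s interval: previous sample evicted, rate dropped")
	}

	// collection_interval = 60s: the previous sample survives the cleanup.
	c = newRateCalculator()
	c.rate("container_cpu_usage_total", 100, start)
	c.clean(start.Add(30 * time.Second))
	if r, ok := c.rate("container_cpu_usage_total", 160, start.Add(60*time.Second)); ok {
		fmt.Printf("60s interval: rate = %.2f/s\n", r)
	}
}

Running this prints a computed rate for the 60s case and a dropped rate for the 600s case, which matches the missing container_cpu_* fields in the log event shown under Additional context.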
Steps to Reproduce
- Create any EKS cluster
- Install the OpenTelemetry Collector to collect AWS Container Insights
- Set receivers.awscontainerinsightreceiver.collection_interval to 600s (the full configuration used is included below)
- Restart the DaemonSet
- Wait for 15-20 minutes
Expected Result
Log events in CloudWatch contain CPU usage metrics
Actual Result
Log events in CloudWatch do not contain CPU usage metrics
Collector version
0.41.1
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler (if manually compiled): (e.g., "go 14.2")
OpenTelemetry Collector configuration
extensions:
  health_check:
receivers:
  awscontainerinsightreceiver:
    collection_interval: 600s
processors:
  batch/metrics:
    timeout: 60s
exporters:
  awsemf:
    namespace: ContainerInsights
    log_group_name: '/aws/containerinsights/{ClusterName}/performance'
    log_stream_name: '{NodeName}'
    resource_to_telemetry_conversion:
      enabled: true
    dimension_rollup_option: NoDimensionRollup
    parse_json_encoded_attr_values: [Sources, kubernetes]
    metric_declarations:
      # cluster metrics
      - dimensions: [[ClusterName]]
        metric_name_selectors:
          - cluster_node_count
          - cluster_failed_node_count
service:
  pipelines:
    metrics:
      receivers: [awscontainerinsightreceiver]
      processors: [batch/metrics]
      exporters: [awsemf]
  extensions: [health_check]
Log output
No response
Additional context
Log event with collection_interval == 600s (note that all container_cpu_* metrics and the page-fault metrics present in the default-configuration event below are missing):
{
"AutoScalingGroupName": "eks-agent-ng-arm64-4ac815a7-3a71-20b4-a604-aa35acfabcd4",
"ClusterName": "cluster-with-agent",
"InstanceId": "i-019f99ea685e48c83",
"InstanceType": "t4g.medium",
"Namespace": "kube-system",
"NodeName": "ip-172-31-28-91.eu-north-1.compute.internal",
"PodName": "aws-node",
"Sources": [
"cadvisor",
"pod",
"calculated"
],
"Timestamp": "1730302312567",
"Type": "Container",
"Version": "0",
"container_memory_cache": 106377216,
"container_memory_failcnt": 0,
"container_memory_mapped_file": 811008,
"container_memory_max_usage": 160075776,
"container_memory_rss": 28655616,
"container_memory_swap": 0,
"container_memory_usage": 136433664,
"container_memory_utilization": 1.1755803143695827,
"container_memory_working_set": 47341568,
"container_status": "Running",
"kubernetes": {
"container_name": "aws-node",
"containerd": {
"container_id": "aabb7c4bea02cfe72371bb5a36bbcd23eff478078c6e920b77e1e9e0ade591b9"
},
"host": "ip-172-31-28-91.eu-north-1.compute.internal",
"labels": {
"app.kubernetes.io/instance": "aws-vpc-cni",
"app.kubernetes.io/name": "aws-node",
"controller-revision-hash": "588469c5c6",
"k8s-app": "aws-node",
"pod-template-generation": "2"
},
"namespace_name": "kube-system",
"pod_id": "c3476737-e9d4-44cb-a20f-dcb812ac9091",
"pod_name": "aws-node-wghkn",
"pod_owners": [
{
"owner_kind": "DaemonSet",
"owner_name": "aws-node"
}
]
},
"number_of_container_restarts": 0
}
Log event with the default configuration:
{
"AutoScalingGroupName": "eks-agent-ng-1ac79c42-2aa5-ff45-0c1e-b03d703c0d47",
"ClusterName": "cluster-with-agent",
"InstanceId": "i-0becbf3535f001cb4",
"InstanceType": "t3.medium",
"Namespace": "kube-system",
"NodeName": "ip-172-31-25-41.eu-north-1.compute.internal",
"PodName": "aws-node",
"Sources": [
"cadvisor",
"pod",
"calculated"
],
"Timestamp": "1730371819323",
"Type": "Container",
"Version": "0",
"container_cpu_request": 25,
"container_cpu_usage_system": 1.3264307613654849,
"container_cpu_usage_total": 2.9252373450029627,
"container_cpu_usage_user": 1.393591812573864,
"container_cpu_utilization": 0.14626186725014814,
"container_memory_cache": 24600576,
"container_memory_failcnt": 0,
"container_memory_hierarchical_pgfault": 267.61999880258816,
"container_memory_hierarchical_pgmajfault": 0,
"container_memory_mapped_file": 270336,
"container_memory_max_usage": 56954880,
"container_memory_pgfault": 267.61999880258816,
"container_memory_pgmajfault": 0,
"container_memory_rss": 26337280,
"container_memory_swap": 0,
"container_memory_usage": 52269056,
"container_memory_utilization": 1.1655047122298874,
"container_memory_working_set": 47063040,
"container_status": "Running",
"kubernetes": {
"container_name": "aws-node",
"containerd": {
"container_id": "b038c0f909602224fa9e1b1351379ff2dc48d0de3e96f720ed80316ada28aca2"
},
"host": "ip-172-31-25-41.eu-north-1.compute.internal",
"labels": {
"app.kubernetes.io/instance": "aws-vpc-cni",
"app.kubernetes.io/name": "aws-node",
"controller-revision-hash": "588469c5c6",
"k8s-app": "aws-node",
"pod-template-generation": "2"
},
"namespace_name": "kube-system",
"pod_id": "5e453328-d24c-45d8-9451-7274248cd447",
"pod_name": "aws-node-wt85g",
"pod_owners": [
{
"owner_kind": "DaemonSet",
"owner_name": "aws-node"
}
]
},
"number_of_container_restarts": 0
}