Component(s)
exporter/loadbalancingexporter
What happened?
Description
When configuring our load-balancing collector to target our backend collectors via the k8s resolver, we noticed that, although DNS resolution worked fine and the backend collectors received evenly distributed traffic, the load balancer would recycle its endpoints at a fixed cadence (roughly every 3 minutes), even though the endpoints themselves were unchanged.
We added some log statements to the k8s resolver/handler, and they revealed that the OnUpdate() function in the handler was being invoked. This would imply that some event was triggering the update, but running the following command manually returned no events for several hours:
kubectl get endpoints opentelemetry-global-gateway-collector --watch --output-watch-events=true
The net result was that, despite no actual changes to the service endpoints, the exporter would consistently dispose of its existing exporters and construct new ones.
Steps to Reproduce
Configure the k8s resolver to point to a service representing
Expected Result
The OnUpdate() call in the k8s handler only runs when updates occur in the service endpoints pointed to by the k8s resolver.
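To illustrate the expected behavior, here is a minimal sketch of how an update callback could decide whether anything actually changed before recycling exporters. It is not the exporter's real code: the names diffEndpoints and shouldRebuild are hypothetical, and plain string slices stand in for the addresses extracted from the Endpoints objects. In the sample logs below, the epRemove and epAdd lists are identical, which is the case this check would skip.

```go
package main

import "sort"

// toSet builds a set from a list of addresses, ignoring duplicates.
func toSet(addrs []string) map[string]struct{} {
	s := make(map[string]struct{}, len(addrs))
	for _, a := range addrs {
		s[a] = struct{}{}
	}
	return s
}

// diffEndpoints (hypothetical helper) returns the addresses that appear only
// in newAddrs (added) and only in oldAddrs (removed), each sorted.
func diffEndpoints(oldAddrs, newAddrs []string) (added, removed []string) {
	oldSet, newSet := toSet(oldAddrs), toSet(newAddrs)
	for a := range newSet {
		if _, ok := oldSet[a]; !ok {
			added = append(added, a)
		}
	}
	for a := range oldSet {
		if _, ok := newSet[a]; !ok {
			removed = append(removed, a)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

// shouldRebuild (hypothetical helper) reports whether the address set changed.
// An update callback could return early when this is false, so that an update
// carrying identical endpoints does not tear down and recreate exporters.
func shouldRebuild(oldAddrs, newAddrs []string) bool {
	added, removed := diffEndpoints(oldAddrs, newAddrs)
	return len(added) > 0 || len(removed) > 0
}
```

With the two addresses from the logs passed as both the old and new lists, shouldRebuild returns false regardless of ordering. Comparing sets rather than slices is a deliberate choice here, since the order of addresses is not assumed to be stable between callbacks.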
Actual Result
OnUpdate() is invoked roughly every 3 minutes, regardless of whether the service it points to has changed.
Collector version
v0.105.0
Environment information
Environment
OS: Ubuntu 22.04
Compiler: go1.22.6
OpenTelemetry Collector configuration
receivers:
  otlp:
    protocols:
      grpc: {}
      http: {}
processors:
  batch:
    timeout: 1s
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 20
exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true
        sending_queue:
          queue_size: 100000
          num_consumers: 25
    resolver:
      k8s:
        service: opentelemetry-global-gateway-collector-headless.opentelemetry-global-collector
extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  zpages:
    endpoint: 0.0.0.0:55679
  pprof:
    endpoint: localhost:1777
service:
  extensions: [health_check, zpages, pprof]
  telemetry:
    logs:
      level: info
      encoding: json
    metrics:
      address: 0.0.0.0:8888
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [loadbalancing]
Log output
Sample Log Output:
{"stream":"stderr","timestamp":1727987229349,"log":{"name":"loadbalancing","ts":1.7279872293494482E9,"data_type":"traces","oldEps":"&Endpoints{ObjectMeta:{opentelemetry-global-gateway-collector-headless opentelemetry-global-collector 6382211d-bb57-4141-8bed-165f8f002e94 2635273389 0 2024-07-18 16:38:27 +0000 UTC <nil> <nil> map[app.kubernetes.io/component:opentelemetry-collector app.kubernetes.io/instance:opentelemetry-global-collector.opentelemetry-global-gateway app.kubernetes.io/managed-by:opentelemetry-operator app.kubernetes.io/name:opentelemetry-global-gateway-collector app.kubernetes.io/part-of:opentelemetry app.kubernetes.io/version:0.105.0 operator.opentelemetry.io/collector-headless-service:Exists operator.opentelemetry.io/collector-service-type:headless service.kubernetes.io/headless:] map[endpoints.kubernetes.io/last-change-trigger-time:2024-10-02T13:28:28Z] [] [] [{kube-controller-manager Update v1 2024-10-02 13:28:28 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:endpoints.kubernetes.io/last-change-trigger-time\":{}},\"f:labels\":{\".\":{},\"f:app.kubernetes.io/component\":{},\"f:app.kubernetes.io/instance\":{},\"f:app.kubernetes.io/managed-by\":{},\"f:app.kubernetes.io/name\":{},\"f:app.kubernetes.io/part-of\":{},\"f:app.kubernetes.io/version\":{},\"f:operator.opentelemetry.io/collector-headless-service\":{},\"f:operator.opentelemetry.io/collector-service-type\":{},\"f:service.kubernetes.io/headless\":{}}},\"f:subsets\":{}} 
}]},Subsets:[]EndpointSubset{EndpointSubset{Addresses:[]EndpointAddress{EndpointAddress{IP:10.100.148.213,TargetRef:&ObjectReference{Kind:Pod,Namespace:opentelemetry-global-collector,Name:opentelemetry-global-gateway-collector-55695567c-rgz8b,UID:61cdd493-8900-408a-a0fd-8df916f790d7,APIVersion:,ResourceVersion:,FieldPath:,},Hostname:,NodeName:*ip-10-100-157-74.ec2.internal,},EndpointAddress{IP:10.100.181.244,TargetRef:&ObjectReference{Kind:Pod,Namespace:opentelemetry-global-collector,Name:opentelemetry-global-gateway-collector-55695567c-lk86p,UID:31c74954-6e14-4fe7-8a33-9ce1e48013e2,APIVersion:,ResourceVersion:,FieldPath:,},Hostname:,NodeName:*ip-10-100-187-149.ec2.internal,},},NotReadyAddresses:[]EndpointAddress{},Ports:[]EndpointPort{EndpointPort{Name:otlp-grpc,Port:4317,Protocol:TCP,AppProtocol:*grpc,},EndpointPort{Name:otlp-http,Port:4318,Protocol:TCP,AppProtocol:*http,},},},},}","resolver":"k8s service","msg":"OnUpDate: Old endpoints > 0, deleting them from endpoints. First callback to 'resolve' invoked.","kind":"exporter","caller":"loadbalancingexporter/resolver_k8s_handler.go:60","epRemove":["10.100.148.213","10.100.181.244"],"level":"info"}}
{"stream":"stderr","timestamp":1727987229349,"log":{"name":"loadbalancing","ts":1.7279872293496487E9,"epAdd":["10.100.148.213","10.100.181.244"],"data_type":"traces","newEps":"&Endpoints{ObjectMeta:{opentelemetry-global-gateway-collector-headless opentelemetry-global-collector 6382211d-bb57-4141-8bed-165f8f002e94 2635273389 0 2024-07-18 16:38:27 +0000 UTC <nil> <nil> map[app.kubernetes.io/component:opentelemetry-collector app.kubernetes.io/instance:opentelemetry-global-collector.opentelemetry-global-gateway app.kubernetes.io/managed-by:opentelemetry-operator app.kubernetes.io/name:opentelemetry-global-gateway-collector app.kubernetes.io/part-of:opentelemetry app.kubernetes.io/version:0.105.0 operator.opentelemetry.io/collector-headless-service:Exists operator.opentelemetry.io/collector-service-type:headless service.kubernetes.io/headless:] map[endpoints.kubernetes.io/last-change-trigger-time:2024-10-02T13:28:28Z] [] [] [{kube-controller-manager Update v1 2024-10-02 13:28:28 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:endpoints.kubernetes.io/last-change-trigger-time\":{}},\"f:labels\":{\".\":{},\"f:app.kubernetes.io/component\":{},\"f:app.kubernetes.io/instance\":{},\"f:app.kubernetes.io/managed-by\":{},\"f:app.kubernetes.io/name\":{},\"f:app.kubernetes.io/part-of\":{},\"f:app.kubernetes.io/version\":{},\"f:operator.opentelemetry.io/collector-headless-service\":{},\"f:operator.opentelemetry.io/collector-service-type\":{},\"f:service.kubernetes.io/headless\":{}}},\"f:subsets\":{}} 
}]},Subsets:[]EndpointSubset{EndpointSubset{Addresses:[]EndpointAddress{EndpointAddress{IP:10.100.148.213,TargetRef:&ObjectReference{Kind:Pod,Namespace:opentelemetry-global-collector,Name:opentelemetry-global-gateway-collector-55695567c-rgz8b,UID:61cdd493-8900-408a-a0fd-8df916f790d7,APIVersion:,ResourceVersion:,FieldPath:,},Hostname:,NodeName:*ip-10-100-157-74.ec2.internal,},EndpointAddress{IP:10.100.181.244,TargetRef:&ObjectReference{Kind:Pod,Namespace:opentelemetry-global-collector,Name:opentelemetry-global-gateway-collector-55695567c-lk86p,UID:31c74954-6e14-4fe7-8a33-9ce1e48013e2,APIVersion:,ResourceVersion:,FieldPath:,},Hostname:,NodeName:*ip-10-100-187-149.ec2.internal,},},NotReadyAddresses:[]EndpointAddress{},Ports:[]EndpointPort{EndpointPort{Name:otlp-grpc,Port:4317,Protocol:TCP,AppProtocol:*grpc,},EndpointPort{Name:otlp-http,Port:4318,Protocol:TCP,AppProtocol:*http,},},},},}","resolver":"k8s service","msg":"OnUpDate: endpoint changes detected, second callback to 'resolve' invoked.","kind":"exporter","caller":"loadbalancingexporter/resolver_k8s_handler.go:77","level":"info"}}
Additional context
No response