Component(s)
exporter/loadbalancing
What happened?
Description
We are running v0.94.0 in a number of Kubernetes clusters and are experiencing panics in the agent setup.
Steps to Reproduce
I don't have exact steps to reproduce, but this panic happens quite often across our clusters.
Expected Result
No panic
Actual Result
Panic :)
Collector version
v0.94.0
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")
OpenTelemetry Collector configuration
connectors: null
exporters:
  file/logs:
    path: /dev/null
  file/traces:
    path: /dev/null
  loadbalancing/traces:
    protocol:
      otlp:
        retry_on_failure:
          enabled: true
          max_elapsed_time: 30s
          max_interval: 5s
        sending_queue:
          enabled: true
          num_consumers: 20
          queue_size: 50000
        timeout: 20s
        tls:
          insecure: true
    resolver:
      k8s:
        service: opentelemetry-collector.default
extensions:
  health_check: {}
processors:
  batch:
    send_batch_max_size: 4096
    send_batch_size: 4096
    timeout: 100ms
  filter/fastpath:
    traces:
      span:
        - (end_time_unix_nano - start_time_unix_nano <= 1000000) and parent_span_id.string
          != ""
  k8sattributes:
    extract:
      annotations: null
      labels:
        - key: app
      metadata:
        - k8s.deployment.name
        - k8s.namespace.name
        - k8s.node.name
        - k8s.pod.name
        - k8s.pod.uid
        - container.id
        - container.image.name
        - container.image.tag
    filter:
      node_from_env_var: K8S_NODE_NAME
    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      - sources:
          - from: resource_attribute
            name: host.name
  memory_limiter:
    check_interval: 1s
    limit_percentage: 95
    spike_limit_percentage: 10
  resource:
    attributes:
      - action: insert
        key: k8s.node.name
        value: ${K8S_NODE_NAME}
  resource/add_agent_k8s:
    attributes:
      - action: insert
        key: k8s.pod.name
        value: ${K8S_POD_NAME}
      - action: insert
        key: k8s.pod.uid
        value: ${K8S_POD_UID}
      - action: insert
        key: k8s.namespace.name
        value: ${K8S_NAMESPACE}
  resource/add_cluster_name:
    attributes:
      - action: upsert
        key: k8s.cluster.name
        value: test-eu3
  resource/add_environment:
    attributes:
      - action: insert
        key: deployment.environment
        value: test
  resourcedetection:
    detectors:
      - env
      - eks
      - ec2
      - system
    override: false
    timeout: 10s
receivers:
  otlp:
    protocols:
      grpc: {}
      http: {}
  prometheus:
    config:
      scrape_configs:
        - job_name: opentelemetry-agent
          scrape_interval: 10s
          static_configs:
            - targets:
                - ${K8S_POD_IP}:9090
service:
  extensions:
    - health_check
  pipelines:
    traces:
      exporters:
        - loadbalancing/traces
      processors:
        - memory_limiter
        - filter/fastpath
        - k8sattributes
        - resource
        - resource/add_cluster_name
        - resource/add_environment
        - resource/add_agent_k8s
        - resourcedetection
      receivers:
        - otlp
  telemetry:
    logs:
      encoding: json
      initial_fields:
        service: opentelemetry-agent
      level: INFO
      sampling:
        enabled: true
        initial: 3
        thereafter: 0
        tick: 60s
    metrics:
      address: 0.0.0.0:9090
Log output
2024/02/26 12:34:34 http: panic serving 10.0.58.206:49066: send on closed channel
goroutine 419349 [running]:
net/http.(*conn).serve.func1()
    net/http/server.go:1868 +0xb0
panic({0x64d27e0?, 0x8591d40?})
    runtime/panic.go:920 +0x26c
go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End.func1()
    go.opentelemetry.io/otel/[email protected]/trace/span.go:405 +0x2c
go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End(0x4003f3c900, {0x0, 0x0, 0x286b4?})
    go.opentelemetry.io/otel/[email protected]/trace/span.go:437 +0x7f8
panic({0x64d27e0?, 0x8591d40?})
    runtime/panic.go:914 +0x218
go.opentelemetry.io/collector/exporter/internal/queue.(*boundedMemoryQueue[...]).Offer(...)
    go.opentelemetry.io/collector/[email protected]/internal/queue/bounded_memory_queue.go:43
go.opentelemetry.io/collector/exporter/exporterhelper.(*queueSender).send(0x400faaaf00, {0x862ab78?, 0x400521c690?}, {0x85dfa10, 0x400614f758})
    go.opentelemetry.io/collector/[email protected]/exporterhelper/queue_sender.go:154 +0xa8
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseExporter).send(0x401368f540, {0x862ab78?, 0x400521c690?}, {0x85dfa10?, 0x400614f758?})
    go.opentelemetry.io/collector/[email protected]/exporterhelper/common.go:199 +0x50
go.opentelemetry.io/collector/exporter/exporterhelper.NewTracesExporter.func1({0x862ab78, 0x400521c690}, {0x400614f3f8?, 0x4013439754?})
    go.opentelemetry.io/collector/[email protected]/exporterhelper/traces.go:99 +0xb4
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/loadbalancingexporter.(*traceExporterImp).consumeTrace(0x40055565b8?, {0x862ab78, 0x400521c690}, 0xa?)
    github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/trace_exporter.go:134 +0x16c
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/loadbalancingexporter.(*traceExporterImp).ConsumeTraces(0x4002c0f170, {0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/trace_exporter.go:121 +0x160
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/internal/fanoutconsumer.(*tracesConsumer).ConsumeTraces(0x4002c82c60, {0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/[email protected]/internal/fanoutconsumer/traces.go:60 +0x208
go.opentelemetry.io/collector/receiver/otlpreceiver/internal/trace.(*Receiver).Export(0x400212ca38, {0x862ab78, 0x400521c630}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/receiver/[email protected]/internal/trace/otlp.go:42 +0xa4
go.opentelemetry.io/collector/receiver/otlpreceiver.handleTraces({0x8620d30, 0x40070b4540}, 0x4005a3b000, 0x400521c4e0?)
    go.opentelemetry.io/collector/receiver/[email protected]/otlphttp.go:43 +0xb0
go.opentelemetry.io/collector/receiver/otlpreceiver.(*otlpReceiver).startHTTPServer.func1({0x8620d30?, 0x40070b4540?}, 0x6422920?)
    go.opentelemetry.io/collector/receiver/[email protected]/otlp.go:129 +0x28
net/http.HandlerFunc.ServeHTTP(0x4005a31398?, {0x8620d30?, 0x40070b4540?}, 0x0?)
    net/http/server.go:2136 +0x38
net/http.(*ServeMux).ServeHTTP(0x4001854080?, {0x8620d30, 0x40070b4540}, 0x4005a3b000)
    net/http/server.go:2514 +0x144
go.opentelemetry.io/collector/config/confighttp.(*decompressor).ServeHTTP(0x4001854080, {0x8620d30, 0x40070b4540}, 0x4005a3b000)
    go.opentelemetry.io/collector/config/[email protected]/compression.go:147 +0x150
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*middleware).serveHTTP(0x4002c0d110, {0x8608d30?, 0x40001dc380}, 0x4005a3af00, {0x85aa920, 0x4001854080})
    go.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/handler.go:225 +0xf44
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.NewMiddleware.func1.1({0x8608d30?, 0x40001dc380?}, 0x4005a31ad8?)
    go.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/handler.go:83 +0x40
net/http.HandlerFunc.ServeHTTP(0x4005a3ae00?, {0x8608d30?, 0x40001dc380?}, 0x4005a31af0?)
    net/http/server.go:2136 +0x38
go.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP(0x400212cd08, {0x8608d30, 0x40001dc380}, 0x4005a3ae00)
    go.opentelemetry.io/collector/config/[email protected]/clientinfohandler.go:26 +0x100
net/http.serverHandler.ServeHTTP({0x85dff10?}, {0x8608d30?, 0x40001dc380?}, 0x6?)
    net/http/server.go:2938 +0xbc
net/http.(*conn).serve(0x4004ab4000, {0x862ab78, 0x4001dcfaa0})
    net/http/server.go:2009 +0x518
created by net/http.(*Server).Serve in goroutine 745
    net/http/server.go:3086 +0x4cc
Additional context
My guess is that the k8s resolver doesn't shut down exporters properly?
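For context, "send on closed channel" is the panic Go raises when one goroutine sends on a channel that another goroutine has already closed. The sketch below is a minimal, standalone reproduction of that pattern under my assumption that an endpoint shutdown closes a queue channel while ConsumeTraces calls are still offering data to it; the boundedQueue type and all names here are hypothetical and not the collector's actual implementation.

// race_sketch.go: hypothetical reproduction of a "send on closed channel" panic.
package main

import (
	"fmt"
	"sync"
	"time"
)

// boundedQueue is a toy stand-in for an exporter's in-memory sending queue.
type boundedQueue struct {
	items chan int
}

// Offer enqueues an item; it panics if Shutdown has already closed the channel.
func (q *boundedQueue) Offer(item int) {
	q.items <- item
}

// Shutdown closes the channel, e.g. when a resolved backend endpoint disappears.
func (q *boundedQueue) Shutdown() {
	close(q.items)
}

func main() {
	q := &boundedQueue{items: make(chan int, 100)}

	// Consumer drains the queue so the producer never blocks on a full buffer.
	go func() {
		for range q.items {
		}
	}()

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		defer func() {
			if r := recover(); r != nil {
				fmt.Println("producer panicked:", r) // prints "send on closed channel"
			}
		}()
		// Producer keeps offering items, like ConsumeTraces calls still in flight.
		for i := 0; ; i++ {
			q.Offer(i)
		}
	}()

	// Shutting down concurrently while the producer is still running triggers the panic.
	time.Sleep(10 * time.Millisecond)
	q.Shutdown()
	wg.Wait()
}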