panic using the load balancing exporter #31410

Closed

Description

@grzn

Component(s)

exporter/loadbalancing

What happened?

Description

We are running v0.94.0 in a number of k8s clusters and are experiencing panics in the agent setup.

Steps to Reproduce

I don't have exact steps to reproduce, but this panic happens quite often across our clusters.

Expected Result

No panic

Actual Result

Panic :)

Collector version

v0.94.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

connectors: null
exporters:
  file/logs:
    path: /dev/null
  file/traces:
    path: /dev/null
  loadbalancing/traces:
    protocol:
      otlp:
        retry_on_failure:
          enabled: true
          max_elapsed_time: 30s
          max_interval: 5s
        sending_queue:
          enabled: true
          num_consumers: 20
          queue_size: 50000
        timeout: 20s
        tls:
          insecure: true
    resolver:
      k8s:
        service: opentelemetry-collector.default
extensions:
  health_check: {}
processors:
  batch:
    send_batch_max_size: 4096
    send_batch_size: 4096
    timeout: 100ms
  filter/fastpath:
    traces:
      span:
      - (end_time_unix_nano - start_time_unix_nano <= 1000000) and parent_span_id.string != ""
  k8sattributes:
    extract:
      annotations: null
      labels:
      - key: app
      metadata:
      - k8s.deployment.name
      - k8s.namespace.name
      - k8s.node.name
      - k8s.pod.name
      - k8s.pod.uid
      - container.id
      - container.image.name
      - container.image.tag
    filter:
      node_from_env_var: K8S_NODE_NAME
    pod_association:
    - sources:
      - from: resource_attribute
        name: k8s.pod.uid
    - sources:
      - from: resource_attribute
        name: k8s.pod.ip
    - sources:
      - from: resource_attribute
        name: host.name
  memory_limiter:
    check_interval: 1s
    limit_percentage: 95
    spike_limit_percentage: 10
  resource:
    attributes:
    - action: insert
      key: k8s.node.name
      value: ${K8S_NODE_NAME}
  resource/add_agent_k8s:
    attributes:
    - action: insert
      key: k8s.pod.name
      value: ${K8S_POD_NAME}
    - action: insert
      key: k8s.pod.uid
      value: ${K8S_POD_UID}
    - action: insert
      key: k8s.namespace.name
      value: ${K8S_NAMESPACE}
  resource/add_cluster_name:
    attributes:
    - action: upsert
      key: k8s.cluster.name
      value: test-eu3
  resource/add_environment:
    attributes:
    - action: insert
      key: deployment.environment
      value: test
  resourcedetection:
    detectors:
    - env
    - eks
    - ec2
    - system
    override: false
    timeout: 10s
receivers:
  otlp:
    protocols:
      grpc: {}
      http: {}
  prometheus:
    config:
      scrape_configs:
      - job_name: opentelemetry-agent
        scrape_interval: 10s
        static_configs:
        - targets:
          - ${K8S_POD_IP}:9090
service:
  extensions:
  - health_check
  pipelines:
    traces:
      exporters:
      - loadbalancing/traces
      processors:
      - memory_limiter
      - filter/fastpath
      - k8sattributes
      - resource
      - resource/add_cluster_name
      - resource/add_environment
      - resource/add_agent_k8s
      - resourcedetection
      receivers:
      - otlp
  telemetry:
    logs:
      encoding: json
      initial_fields:
        service: opentelemetry-agent
      level: INFO
      sampling:
        enabled: true
        initial: 3
        thereafter: 0
        tick: 60s
    metrics:
      address: 0.0.0.0:9090
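
For context on why the resolver matters here: the loadbalancing exporter routes each trace to a backend chosen by hashing the trace ID over the endpoints the k8s resolver currently reports, and it starts and stops a per-endpoint OTLP sub-exporter as that list changes. A rough sketch of the selection step, using plain modulo hashing instead of the exporter's actual consistent-hash ring (all names and addresses below are illustrative, not the exporter's real internals):

package main

import (
    "fmt"
    "hash/fnv"
)

// pickEndpoint sketches per-trace backend selection: hash the trace ID and
// map it onto whatever endpoints the resolver currently knows about. The
// real exporter uses a consistent-hash ring; modulo is used here only to
// keep the illustration short.
func pickEndpoint(traceID string, endpoints []string) string {
    h := fnv.New32a()
    h.Write([]byte(traceID))
    return endpoints[int(h.Sum32())%len(endpoints)]
}

func main() {
    endpoints := []string{"10.0.58.1:4317", "10.0.58.2:4317"}
    fmt.Println(pickEndpoint("4bf92f3577b34da6a3ce929d0e0e4736", endpoints))

    // When the k8s resolver drops an endpoint, traffic re-hashes onto the
    // survivors and the removed endpoint's OTLP sub-exporter is shut down.
    endpoints = endpoints[:1]
    fmt.Println(pickEndpoint("4bf92f3577b34da6a3ce929d0e0e4736", endpoints))
}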

Log output

2024/02/26 12:34:34 http: panic serving 10.0.58.206:49066: send on closed channel
goroutine 419349 [running]:
net/http.(*conn).serve.func1()
    net/http/server.go:1868 +0xb0
panic({0x64d27e0?, 0x8591d40?})
    runtime/panic.go:920 +0x26c
go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End.func1()
    go.opentelemetry.io/otel/[email protected]/trace/span.go:405 +0x2c
go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End(0x4003f3c900, {0x0, 0x0, 0x286b4?})
    go.opentelemetry.io/otel/[email protected]/trace/span.go:437 +0x7f8
panic({0x64d27e0?, 0x8591d40?})
    runtime/panic.go:914 +0x218
go.opentelemetry.io/collector/exporter/internal/queue.(*boundedMemoryQueue[...]).Offer(...)
    go.opentelemetry.io/collector/[email protected]/internal/queue/bounded_memory_queue.go:43
go.opentelemetry.io/collector/exporter/exporterhelper.(*queueSender).send(0x400faaaf00, {0x862ab78?, 0x400521c690?}, {0x85dfa10, 0x400614f758})
    go.opentelemetry.io/collector/[email protected]/exporterhelper/queue_sender.go:154 +0xa8
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseExporter).send(0x401368f540, {0x862ab78?, 0x400521c690?}, {0x85dfa10?, 0x400614f758?})
    go.opentelemetry.io/collector/[email protected]/exporterhelper/common.go:199 +0x50
go.opentelemetry.io/collector/exporter/exporterhelper.NewTracesExporter.func1({0x862ab78, 0x400521c690}, {0x400614f3f8?, 0x4013439754?})
    go.opentelemetry.io/collector/[email protected]/exporterhelper/traces.go:99 +0xb4
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/loadbalancingexporter.(*traceExporterImp).consumeTrace(0x40055565b8?, {0x862ab78, 0x400521c690}, 0xa?)
    github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/trace_exporter.go:134 +0x16c
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/loadbalancingexporter.(*traceExporterImp).ConsumeTraces(0x4002c0f170, {0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/trace_exporter.go:121 +0x160
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewTracesProcessor.func1({0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/[email protected]/processorhelper/traces.go:60 +0x1c0
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces(...)
    go.opentelemetry.io/collector/[email protected]/traces.go:25
go.opentelemetry.io/collector/internal/fanoutconsumer.(*tracesConsumer).ConsumeTraces(0x4002c82c60, {0x862ab78, 0x400521c690}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/[email protected]/internal/fanoutconsumer/traces.go:60 +0x208
go.opentelemetry.io/collector/receiver/otlpreceiver/internal/trace.(*Receiver).Export(0x400212ca38, {0x862ab78, 0x400521c630}, {0x400614e510?, 0x400ee82104?})
    go.opentelemetry.io/collector/receiver/[email protected]/internal/trace/otlp.go:42 +0xa4
go.opentelemetry.io/collector/receiver/otlpreceiver.handleTraces({0x8620d30, 0x40070b4540}, 0x4005a3b000, 0x400521c4e0?)
    go.opentelemetry.io/collector/receiver/[email protected]/otlphttp.go:43 +0xb0
go.opentelemetry.io/collector/receiver/otlpreceiver.(*otlpReceiver).startHTTPServer.func1({0x8620d30?, 0x40070b4540?}, 0x6422920?)
    go.opentelemetry.io/collector/receiver/[email protected]/otlp.go:129 +0x28
net/http.HandlerFunc.ServeHTTP(0x4005a31398?, {0x8620d30?, 0x40070b4540?}, 0x0?)
    net/http/server.go:2136 +0x38
net/http.(*ServeMux).ServeHTTP(0x4001854080?, {0x8620d30, 0x40070b4540}, 0x4005a3b000)
    net/http/server.go:2514 +0x144
go.opentelemetry.io/collector/config/confighttp.(*decompressor).ServeHTTP(0x4001854080, {0x8620d30, 0x40070b4540}, 0x4005a3b000)
    go.opentelemetry.io/collector/config/[email protected]/compression.go:147 +0x150
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*middleware).serveHTTP(0x4002c0d110, {0x8608d30?, 0x40001dc380}, 0x4005a3af00, {0x85aa920, 0x4001854080})
    go.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/handler.go:225 +0xf44
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.NewMiddleware.func1.1({0x8608d30?, 0x40001dc380?}, 0x4005a31ad8?)
    go.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/handler.go:83 +0x40
net/http.HandlerFunc.ServeHTTP(0x4005a3ae00?, {0x8608d30?, 0x40001dc380?}, 0x4005a31af0?)
    net/http/server.go:2136 +0x38
go.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP(0x400212cd08, {0x8608d30, 0x40001dc380}, 0x4005a3ae00)
    go.opentelemetry.io/collector/config/[email protected]/clientinfohandler.go:26 +0x100
net/http.serverHandler.ServeHTTP({0x85dff10?}, {0x8608d30?, 0x40001dc380?}, 0x6?)
    net/http/server.go:2938 +0xbc
net/http.(*conn).serve(0x4004ab4000, {0x862ab78, 0x4001dcfaa0})
    net/http/server.go:2009 +0x518
created by net/http.(*Server).Serve in goroutine 745
    net/http/server.go:3086 +0x4cc
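
For readers of the trace: the panicking frame is boundedMemoryQueue.Offer, and "send on closed channel" is the error the Go runtime raises when a producer writes to a channel that has already been closed. A minimal, self-contained sketch of that failure mode, assuming the queue's shutdown closes the channel its Offer sends on (toy names, not the collector's actual implementation):

package main

import "fmt"

// toyQueue mirrors the shape of a channel-backed bounded queue:
// Offer sends on a channel that Shutdown closes. Illustrative only.
type toyQueue struct {
    items chan int
}

func (q *toyQueue) Offer(v int) { q.items <- v } // panics once items is closed
func (q *toyQueue) Shutdown()   { close(q.items) }

func main() {
    q := &toyQueue{items: make(chan int, 1)}
    q.Shutdown() // queue is torn down first...

    defer func() {
        if r := recover(); r != nil {
            fmt.Println("recovered:", r) // prints: recovered: send on closed channel
        }
    }()
    q.Offer(42) // ...then a late producer hits the closed channel
}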

Additional context

My guess is that the k8s resolver doesn't shut down exporters properly?
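
If that guess is right, the interleaving would look roughly like this: the resolver notices an endpoint has dropped out of the k8s Endpoints list and shuts the corresponding sub-exporter down (closing its sending-queue channel) while a concurrent ConsumeTraces call that already picked that endpoint is still offering a batch. A toy Go program of that interleaving ends with the same panic as the log above; every name here is a hypothetical stand-in, not the exporter's real code:

package main

import "time"

// exporterStub stands in for a per-endpoint OTLP exporter whose sending
// queue is backed by a channel, as in boundedMemoryQueue. Illustrative only.
type exporterStub struct {
    queue chan string
}

func (e *exporterStub) consume(batch string) { e.queue <- batch }
func (e *exporterStub) shutdown()            { close(e.queue) }

func main() {
    exp := &exporterStub{queue: make(chan string, 100)}

    // ConsumeTraces path: the routing code already chose this backend
    // for the current trace ID and still holds a reference to it.
    go func() {
        time.Sleep(10 * time.Millisecond) // batch still in flight
        exp.consume("span batch")         // panic: send on closed channel
    }()

    // Resolver path: the endpoint vanished from the Endpoints list,
    // so its exporter is shut down immediately, closing the queue.
    exp.shutdown()

    time.Sleep(50 * time.Millisecond) // let the in-flight send hit the closed queue
}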
