[exporterhelper/batchsender] batchsender deadlock preventing shutdown #10255

Closed
@timannguyen

Describe the bug

The batch sender can deadlock during shutdown, which prevents the exporter from shutting down.

  1. The main goroutine begins shutdown.
  2. The main goroutine calls close(shutdownCh): https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/batch_sender.go#L217
  3. The ticker goroutine tries to acquire the lock: https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/batch_sender.go#L69
  4. A sending goroutine already holds the lock and is trying to flush: https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/batch_sender.go#L197
  5. Deadlock: the ticker goroutine can never acquire the lock, so it never receives from resetTimerCh, and the sending goroutine stays blocked on its push to resetTimerCh while still holding the lock.

Steps to reproduce

I have a unit test that reproduces the issue: timannguyen@adf2fb9

For the deadlock to occur, the mergeFunc called in sendMergeBatch needs to take some time while shutdown is in progress.

What did you expect to see?

Shutdown completes without a deadlock.

What did you see instead?

A deadlock: a sending goroutine holds the lock while the ticker goroutine is trying to acquire it. This prevents shutdown.

What version did you use?

pdata 1.7.0
otel 0.100.0

What config did you use?

Environment

macOS

Ubuntu

Additional context

Labels

bug (Something isn't working)
