[processor/batchprocessor] Improve batch processor edge case performance #13272
+146
−25
Description
As the batch size approaches one, the time complexity of the batch processor approaches O(n^2). This is caused by repeatedly counting the number of data points in a message. This commit introduces a "short-circuit" behavior that allows the core loop to exit quickly when the desired number of data points has been found.
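To make the cost concrete: if splitting n data points into batches of one recounts everything that remains before each batch is emitted, the total work is roughly n + (n-1) + ... + 1 = n(n+1)/2. The sketch below contrasts a full count with a count that stops once the limit is reached. It is only an illustration under simplifying assumptions, not the code from this PR: `metric`, `countAll`, and `countUpTo` are hypothetical names, and a plain struct stands in for the real metrics data.

```go
package main

// metric is a hypothetical stand-in for a metric and its data points.
type metric struct {
	name       string
	dataPoints int
}

// countAll visits every metric. Calling it once per emitted batch is the
// pattern that becomes quadratic when the batch size is very small.
func countAll(ms []metric) int {
	total := 0
	for _, m := range ms {
		total += m.dataPoints
	}
	return total
}

// countUpTo stops as soon as at least limit data points have been seen.
// It returns the count reached and the number of metrics it visited, so
// the caller can tell how much work was skipped.
func countUpTo(ms []metric, limit int) (count, visited int) {
	for _, m := range ms {
		if count >= limit {
			break // short-circuit: enough data points for this batch
		}
		count += m.dataPoints
		visited++
	}
	return count, visited
}
```

With a limit of one, `countUpTo` touches a single non-empty metric no matter how long the slice is, whereas `countAll` always walks the whole slice.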
Testing
Performance
I added a new benchmark for splitMetrics that tests performance with various numbers of metrics and a batch size of one. This is the before and after comparison produced by benchstat:
Correctness
I'm relying on the existing test suite to catch issues here.