Skip to content

Add both bytes and items sizes to the persistent metadata #13262

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 2, 2025

Conversation

bogdandrutu
Copy link
Member

No changelog since this is not released yet.

@bogdandrutu bogdandrutu requested review from dmitryax and a team as code owners June 24, 2025 16:26
@bogdandrutu bogdandrutu added the Skip Changelog PRs that do not require a CHANGELOG.md entry label Jun 24, 2025
Copy link

codecov bot commented Jun 24, 2025

Codecov Report

Attention: Patch coverage is 88.23529% with 14 lines in your changes missing coverage. Please review.

Project coverage is 91.59%. Comparing base (6408887) to head (1d561b3).
Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
.../exporterhelper/internal/queue/persistent_queue.go 91.37% 4 Missing and 1 partial ⚠️
exporter/exporterhelper/internal/queue/queue.go 70.58% 3 Missing and 2 partials ⚠️
.../exporterhelper/internal/queuebatch/queue_batch.go 75.00% 2 Missing and 2 partials ⚠️

❌ Your patch status has failed because the patch coverage (88.23%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13262      +/-   ##
==========================================
- Coverage   91.63%   91.59%   -0.04%     
==========================================
  Files         522      521       -1     
  Lines       29168    29240      +72     
==========================================
+ Hits        26727    26783      +56     
- Misses       1923     1938      +15     
- Partials      518      519       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@malus2077 malus2077 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation looks clear and solid—really appreciate the thoughtful design.

One remaining question: After a user updates the sizer configuration, how should the queue reliably determine the correct sizer to use for mixed entries? The current logic doesn’t seem to fully address this scenario.

Do you have any suggestions or guidance on how we might handle this cleanly? Would greatly appreciate your insights here. Thanks!

// Current total size of the queue (in bytes, items, or requests).
sfixed64 queue_size = 2;
// PersistentMetadata holds all persistent metadata for the queue.
// The items and bytes sizes are recorded explicitly, the requests size can be calculated as (write_index - read_index).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d lean toward keeping request_size here — once a request is Read() into currentDispatchItems but hasn’t hit Done() yet, queue_size is roughly writeIndex - readIndex + len(currentDispatchItems).

If we don’t stash request_size at that point, we’d need to do extra work during that in-between state. Holding onto it keeps things simple and lines up with how the other two sizer types are handled.

@bogdandrutu
Copy link
Member Author

One remaining question: After a user updates the sizer configuration, how should the queue reliably determine the correct sizer to use for mixed entries? The current logic doesn’t seem to fully address this scenario.

With this proposal we have all possible sizes recorded in the metadata. It does not matter what sizer user used before or they will use now, since we have all correct sizes.

@bogdandrutu bogdandrutu force-pushed the persistent-metadata branch from 57a96d4 to f4f149d Compare June 25, 2025 20:57
@malus2077
Copy link
Contributor

With this proposal we have all possible sizes recorded in the metadata. It does not matter what sizer user used before or they will use now, since we have all correct sizes.

I’ve completed the remaining implementation based on this PR—see #13274 for details. Please pay special attention to how legacy data sizes are handled in that PR. The current implementation is primarily a quick validation to align ideas, and doesn’t yet cover all finer details.

@malus2077
Copy link
Contributor

I'd love your input on a couple of things.

I look forward to your thoughts!

@bogdandrutu bogdandrutu force-pushed the persistent-metadata branch 2 times, most recently from 24a46d5 to f35f0d5 Compare June 28, 2025 08:11
@bogdandrutu
Copy link
Member Author

@malus2077 I want to land #13043 first, because we don't need to care about previous saved size, since we only allowed request sizer the backup size is useless and only complicates logic.

@malus2077
Copy link
Contributor

@malus2077 I want to land #13043 first, because we don't need to care about previous saved size, since we only allowed request sizer the backup size is useless and only complicates logic.

I’m concerned about a potential runtime issue after loading old data. Consider this scenario:

  1. An old item from the persistent queue is read and processed, but it lacks accurate itemsSize or bytesSize information.
  2. When processing completes, onDone() is called with itemsSize=0 and bytesSize=0.
  3. As a result, the queue’s internal metadata.ItemsSize and metadata.BytesSize counters are decremented incorrectly, which could eventually lead to an inconsistent state (such as a negative queue size).

What do you think about this potential issue?

@bogdandrutu bogdandrutu force-pushed the persistent-metadata branch 7 times, most recently from 1ab5468 to f9eca56 Compare July 2, 2025 16:11
@bogdandrutu
Copy link
Member Author

@malus2077 here is the complete implementation, still not saving the sizes, but you have a PR to do the conversion from old to new and save the new format.

Copy link
Member

@dmitryax dmitryax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach LGTM. The only concern is that it might be costly to calculate the bytes size for every request even if it's not being used. But I don't have any better solution in mind

@bogdandrutu
Copy link
Member Author

This approach LGTM. The only concern is that it might be costly to calculate the bytes size for every request even if it's not being used. But I don't have any better solution in mind

I have an optimization in mind, if items sizer is used to not call the bytes sizer, but to use len(byte[]) returned since we anyway serialize the value. Will do that in a followup.

@bogdandrutu bogdandrutu force-pushed the persistent-metadata branch 2 times, most recently from c725cdf to 8d9206e Compare July 2, 2025 20:41
@bogdandrutu bogdandrutu force-pushed the persistent-metadata branch from 8d9206e to e43a984 Compare July 2, 2025 20:52
@bogdandrutu bogdandrutu force-pushed the persistent-metadata branch from e43a984 to 1d561b3 Compare July 2, 2025 21:02
@bogdandrutu bogdandrutu enabled auto-merge July 2, 2025 21:12
@bogdandrutu bogdandrutu added this pull request to the merge queue Jul 2, 2025
Merged via the queue into open-telemetry:main with commit acb60bc Jul 2, 2025
55 of 56 checks passed
@bogdandrutu bogdandrutu deleted the persistent-metadata branch July 2, 2025 21:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Skip Changelog PRs that do not require a CHANGELOG.md entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants