Add both bytes and items sizes to the persistent metadata #13262
Codecov Report
❌ Your patch status has failed because the patch coverage (88.23%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage.
Coverage Diff
            main      #13262    +/-
Coverage    91.63%    91.59%    -0.04%
Files       522       521       -1
Lines       29168     29240     +72
Hits        26727     26783     +56
Misses      1923      1938      +15
Partials    518       519       +1
The current implementation looks clear and solid—really appreciate the thoughtful design.
One remaining question: After a user updates the sizer configuration, how should the queue reliably determine the correct sizer to use for mixed entries? The current logic doesn’t seem to fully address this scenario.
Do you have any suggestions or guidance on how we might handle this cleanly? Would greatly appreciate your insights here. Thanks!
// Current total size of the queue (in bytes, items, or requests).
sfixed64 queue_size = 2;
// PersistentMetadata holds all persistent metadata for the queue.
// The items and bytes sizes are recorded explicitly, the requests size can be calculated as (write_index - read_index).
I’d lean toward keeping request_size here: once a request is Read() into currentDispatchItems but hasn’t hit Done() yet, queue_size is roughly writeIndex - readIndex + len(currentDispatchItems).
If we don’t stash request_size at that point, we’d need to do extra work during that in-between state. Holding onto it keeps things simple and lines up with how the other two sizer types are handled.
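For illustration, the in-between state described in this comment could be sketched in Go as follows. This is a minimal sketch, not the PR's code: the queueState type and its requestsSize method are hypothetical, and only the names readIndex, writeIndex, and currentDispatchItems come from the comment above.

```go
package main

import "fmt"

// queueState is a hypothetical, simplified view of the queue's bookkeeping.
type queueState struct {
	readIndex            uint64   // index of the next request to read
	writeIndex           uint64   // index of the next request to write
	currentDispatchItems []uint64 // indices that were Read() but are not yet Done()
}

// requestsSize counts requests still waiting to be read plus requests that
// are in flight (read, but not yet acknowledged through Done()).
func (q queueState) requestsSize() int64 {
	return int64(q.writeIndex-q.readIndex) + int64(len(q.currentDispatchItems))
}

func main() {
	q := queueState{readIndex: 3, writeIndex: 10, currentDispatchItems: []uint64{1, 2}}
	fmt.Println(q.requestsSize()) // 7 unread + 2 in flight
}
```

Without a stored request size, this sum has to be recomputed whenever the size is needed while requests are in flight, which is the extra work the comment refers to.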
With this proposal, all possible sizes are recorded in the metadata. It does not matter which sizer the user used before or which one they will use now, since we have all the correct sizes.
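A rough Go sketch of that idea: if the metadata carries the items and bytes sizes and the requests size is derived from the indices, the queue can answer for any configured sizer. The persistentMetadata struct, the sizerType constants, and sizeFor are hypothetical names chosen for this example; they are not taken from the PR.

```go
package main

import "fmt"

// sizerType is a hypothetical enum for the three sizers discussed in this PR.
type sizerType int

const (
	sizerRequests sizerType = iota
	sizerItems
	sizerBytes
)

// persistentMetadata is a hypothetical Go mirror of the proposed metadata:
// items and bytes sizes are stored, the requests size is derived.
type persistentMetadata struct {
	readIndex  uint64
	writeIndex uint64
	itemsSize  int64
	bytesSize  int64
}

// sizeFor returns the queue size for whichever sizer is configured now,
// regardless of which sizer was configured when the data was written.
func (m persistentMetadata) sizeFor(s sizerType) (int64, error) {
	switch s {
	case sizerRequests:
		return int64(m.writeIndex - m.readIndex), nil
	case sizerItems:
		return m.itemsSize, nil
	case sizerBytes:
		return m.bytesSize, nil
	default:
		return 0, fmt.Errorf("unknown sizer type %d", s)
	}
}

func main() {
	m := persistentMetadata{readIndex: 2, writeIndex: 5, itemsSize: 40, bytesSize: 1024}
	for _, s := range []sizerType{sizerRequests, sizerItems, sizerBytes} {
		size, _ := m.sizeFor(s)
		fmt.Println(size)
	}
}
```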
57a96d4 to f4f149d
I’ve completed the remaining implementation based on this PR; see #13274 for details. Please pay special attention to how legacy data sizes are handled in that PR. The current implementation is primarily a quick validation to align ideas and doesn’t yet cover all the finer details.
I'd love your input on a couple of things.
I look forward to your thoughts!
24a46d5 to f35f0d5
@malus2077 I want to land #13043 first. That way we don't need to care about the previously saved size: since we only allowed the request sizer, the backup size is useless and only complicates the logic.
I’m concerned about a potential runtime issue after loading old data. Consider this scenario:
What do you think about this potential issue?
1ab5468 to f9eca56
@malus2077 here is the complete implementation. It still does not save the sizes, but you have a PR to do the conversion from the old format to the new one and save the new format.
This approach LGTM. The only concern is that it might be costly to calculate the bytes size for every request even if it's not being used, but I don't have a better solution in mind.
I have an optimization in mind: when the items sizer is used, do not call the bytes sizer but use the len() of the returned []byte, since we serialize the value anyway. Will do that in a follow-up.
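One way to picture this optimization, as a sketch under stated assumptions rather than the planned follow-up itself: the request is serialized once for storage, and the length of that buffer doubles as the bytes size. The request type, its toy marshal method, the sizes struct, and put are all hypothetical stand-ins invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// request is a hypothetical stand-in for a queued request.
type request struct {
	items []string
}

// marshal is a toy encoder (newline-joined items) standing in for the
// serialization the queue performs before writing to storage.
func (r request) marshal() []byte {
	return []byte(strings.Join(r.items, "\n"))
}

// sizes holds both sizes that would be recorded for one request.
type sizes struct {
	items int64
	bytes int64
}

// put serializes the request once and reuses len(buf) as the bytes size,
// so no separate bytes sizer has to walk the request again.
func put(r request) ([]byte, sizes) {
	buf := r.marshal()
	return buf, sizes{items: int64(len(r.items)), bytes: int64(len(buf))}
}

func main() {
	buf, s := put(request{items: []string{"a", "bb", "ccc"}})
	fmt.Printf("stored %d bytes, items=%d bytes=%d\n", len(buf), s.items, s.bytes)
}
```

The trade-off is that the recorded bytes size then reflects the stored encoding, which may differ from what a dedicated bytes sizer would report.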
c725cdf to 8d9206e
8d9206e to e43a984
Signed-off-by: Bogdan Drutu <[email protected]>
e43a984 to 1d561b3
acb60bc
No changelog since this is not released yet.