KAFKA-19463: nextFetchOffset does not take ongoing state transition into account #20080

Merged: 3 commits, Jul 2, 2025

adixitconfluent commented Jul 1, 2025

About

The `nextFetchOffset` function in `SharePartition` updates the fetch offset without considering batches or offsets that may be undergoing a state transition. As a result, the fetch offset can be moved to the wrong position.

Testing

The new code is covered by unit tests.

Reviewers: Apoorv Mittal [email protected]

github-actions bot added the triage (PRs from the community), core (Kafka Broker), and KIP-932 (Queues for Kafka) labels Jul 1, 2025
apoorvmittal10 added the ci-approved label and removed the triage (PRs from the community) label Jul 1, 2025

apoorvmittal10 left a comment

LGTM, minor comments.

Comment on lines 2091 to 2093
if (state.state != RecordState.ARCHIVED) {
    findNextFetchOffset.set(true);
}

This change makes sense, but do we also need to remove `findNextFetchOffset.set(true);` from the other places where the transition has only just started, i.e. in acknowledgement?

adixitconfluent (Contributor, Author) replied:

Yes, that makes sense. I have created https://issues.apache.org/jira/browse/KAFKA-19464 to handle it for the future.
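
For readers following this thread, here is a minimal sketch of the behaviour being discussed: the flag that tells `nextFetchOffset` to rescan is raised only once a state transition has completed, not when it starts. This is not the actual `SharePartition` code. `TransitionSketch`, `startStateTransition` and `pending` are hypothetical simplifications, the enum is redeclared locally, and the flag update is folded into `completeStateTransition` for brevity, whereas the diff places it in `rollbackOrProcessStateUpdates`.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified stand-in for per-offset state; not the real Kafka class.
class TransitionSketch {
    // Assumed set of record states, redeclared here so the sketch is self-contained.
    enum RecordState { AVAILABLE, ACQUIRED, ACKNOWLEDGED, ARCHIVED }

    RecordState state = RecordState.ACQUIRED;
    private RecordState pending; // target state while the persister write is in flight
    final AtomicBoolean findNextFetchOffset = new AtomicBoolean(false);

    // Begin a transition. The flag is deliberately left untouched here,
    // because the outcome of the write is not yet known.
    void startStateTransition(RecordState target) {
        pending = target;
    }

    // Called once the write has finished; commit == false means roll back.
    void completeStateTransition(boolean commit) {
        if (commit && pending != null) {
            state = pending;
            // Same condition as in the diff: only a non-ARCHIVED result
            // asks for the next fetch offset to be recomputed.
            if (state != RecordState.ARCHIVED) {
                findNextFetchOffset.set(true);
            }
        }
        pending = null;
    }
}
```

In this sketch a rolled-back transition leaves both `state` and the flag unchanged, so a fetch that arrives while the write is still in flight does not see a request to move the offset.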

@@ -2088,6 +2088,9 @@ void rollbackOrProcessStateUpdates(
 state.completeStateTransition(true);
 // Cancel the acquisition lock timeout task for the state since it is acknowledged/released successfully.
 state.cancelAndClearAcquisitionLockTimeoutTask();
+if (state.state != RecordState.ARCHIVED) {

Should it be:

Suggested change
- if (state.state != RecordState.ARCHIVED) {
+ if (state.state == AVAILABLE) {
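
To make the difference between the two candidate conditions concrete, the sketch below evaluates both for every state and lists where they disagree. It assumes `RecordState` has exactly the four values shown; the enum is redeclared locally, and `ConditionComparison`, `notArchived`, `isAvailable` and `disagreements` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper comparing the two conditions from the suggestion above.
class ConditionComparison {
    // Assumed set of record states; not imported from Kafka.
    enum RecordState { AVAILABLE, ACQUIRED, ACKNOWLEDGED, ARCHIVED }

    // Condition as written in the diff.
    static boolean notArchived(RecordState s) {
        return s != RecordState.ARCHIVED;
    }

    // Condition proposed in the suggested change.
    static boolean isAvailable(RecordState s) {
        return s == RecordState.AVAILABLE;
    }

    // States for which the two conditions give different answers.
    static List<RecordState> disagreements() {
        List<RecordState> out = new ArrayList<>();
        for (RecordState s : RecordState.values()) {
            if (notArchived(s) != isAvailable(s)) {
                out.add(s);
            }
        }
        return out;
    }
}
```

Under these assumptions the two conditions differ only for `ACQUIRED` and `ACKNOWLEDGED`; the stricter form sets the flag only when records have become fetchable again.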

Comment on lines 7391 to 7392
// Mocking the persister write state RPC to return future 1 when acknowledgement occurs for offsets 0-9.
// Mocking the persister write state RPC to return future 2 when acknowledgement occurs for offsets 10-19.

Suggested change
- // Mocking the persister write state RPC to return future 1 when acknowledgement occurs for offsets 0-9.
- // Mocking the persister write state RPC to return future 2 when acknowledgement occurs for offsets 10-19.
+ // Mocking the persister write state RPC to return future 1 and future 2 when acknowledgement occurs for offsets 0-9 and 10-19 respectively.

Comment on lines 7438 to 7439
// Mocking the persister write state RPC to return future 1 when acknowledgement occurs for offsets 5-9.
// Mocking the persister write state RPC to return future 2 when acknowledgement occurs for offsets 20-24.

Similar to above.

apoorvmittal10 left a comment

Thanks for the PR, LGTM!

apoorvmittal10 merged commit 7cb370b into apache:trunk Jul 2, 2025
20 checks passed
jiafu1115 pushed a commit to jiafu1115/kafka that referenced this pull request Jul 3, 2025
…nto account (apache#20080)
