Skip to content

RDR: avoid rebuilding dependent crates after comment changes #143249

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

yaahc
Copy link
Member

@yaahc yaahc commented Jun 30, 2025

an initial step of Relink don't Rebuild

The goal is to avoid needing to rebuild downstream crates just because an upstream crate changed a comment.

My current plan is to treat a hash of the rmeta as the indicator that "something that would require a rebuild has changed" and add an unstable compiler flag to start moving things that are "relink independent" (I don't know what to call this, "relink compatible" maybe? "relink safe"? /shrug) into a separate file alongside the rmeta.

Right now I've added a test to verify that changing comments changes the rmeta and now I'm trying to dig into the rmeta to understand precisely where that change is happening.

@rustbot rustbot added A-run-make Area: port run-make Makefiles to rmake.rs T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 30, 2025
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@bjorn3
Copy link
Member

bjorn3 commented Jul 1, 2025

The crate metadata contains a serialized version of the source map, which in turn contains hashes of all source files. In addition the crate hash also has all source files as one of the inputs.

@yaahc yaahc force-pushed the test-lib-reuse branch from 643a476 to 884e375 Compare July 3, 2025 00:07
@rust-log-analyzer

This comment has been minimized.

@yaahc yaahc force-pushed the test-lib-reuse branch 2 times, most recently from 860979e to 74d596f Compare July 3, 2025 00:23
@rust-log-analyzer

This comment has been minimized.

@yaahc yaahc force-pushed the test-lib-reuse branch from 74d596f to 289cbc4 Compare July 3, 2025 00:34
@rust-log-analyzer

This comment has been minimized.

@yaahc
Copy link
Member Author

yaahc commented Jul 3, 2025

okay so I have a pretty good understanding of why the rmeta is changing now. For the test I've added so far its because of the source_map's sourcefile hash. If the comments are of different lengths it also changes spans elsewhere in the rmeta but I'm gonna leave that aside for now and focus on the case where only characters change rather than being added or removed.

I'm trying to address this by simply moving the sourcemap to a separate file when encoding. My first step goal is to try to just have the encoder output two files, libfoo.rmeta and libfoo.rmeta-extras, rather than one. Then ill hash only the rmeta and show that changing comments doesn't impact that file's hash anymore. After that is done I'll move onto properly decoding from the extra file and making sure everything still else works properly.

Right now I'm running into issues with the EncodeContext. It and the FileEncoder both keep track of where they're at index wise to ensure that you're writing all the values and in the right order. I'm pretty sure when I try to write to a second FileEncoder it is giving a much lower position than the last place it wrote to in the first file encoder, and when I pass that into the encodecontext it fails an assert because its way lower than the last observed position.

My first idea for how to approach this is to have the encodecontext keep track of a second lazy_state when we're also encoding the second file and to somehow tag the positions we pass in with which FileEncoder they're associated with so the encode context can make sure to compare them against the right lazy_state.

@yaahc
Copy link
Member Author

yaahc commented Jul 3, 2025

one concern I have with my current approach is that it feels pretty magic and gross to break the 1-1 relationship between the struct being serialized CrateRoot and the files being output, such that serializing produces two outputs and deserialization requires two inputs.

Another approach could be to split the CrateRoot itself and just encode the main section separately from the extras section with it's own independent EncodeContext. The more I think about it the more maintainable this feels compared to the first approach, so I think I'm gonna go with this approach for now.

@yaahc
Copy link
Member Author

yaahc commented Jul 3, 2025

based on a comment from bjorn in https://rust-lang.zulipchat.com/#narrow/channel/182449-t-compiler.2Fhelp/topic/RDR.3A.20insensitivity.20to.20comment.20changes/with/527063126 I'm gonna go ahead and start with not even emitting a second file and just killing the sourcemap when the rdr flag is enabled

@yaahc yaahc force-pushed the test-lib-reuse branch 2 times, most recently from 1ab2d47 to c471f79 Compare July 3, 2025 19:45
@rust-log-analyzer

This comment has been minimized.

@yaahc yaahc force-pushed the test-lib-reuse branch from c471f79 to 9e3d79b Compare July 3, 2025 20:33
@rust-log-analyzer

This comment has been minimized.

@yaahc yaahc force-pushed the test-lib-reuse branch 3 times, most recently from 678e288 to 0a6146b Compare July 3, 2025 23:20
@yaahc yaahc force-pushed the test-lib-reuse branch from 0a6146b to f21539a Compare July 3, 2025 23:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-run-make Area: port run-make Makefiles to rmake.rs T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants