Fix and clarify CR LF normalization and CR in string literals #1944
Merged
This was slightly incorrect before. Relevant commits changing this:

- fa56fdb
- 27e1ec9

The normalization is not applied repeatedly, so CR LF pairs can still exist after it has run. Further, given that the normalization happens before lexing, the part "other than as part of such a string continuation escape" is not useful. Either the sequence was CR LF in the raw input and has already been transformed (so the lexical grammar never sees the CR), or a CR LF pair survives the normalization, in which case the CR is disallowed anyway.

Here are two test programs showing this behavior:

    printf 'fn main() { "a\r\r\n\nb"; }' | rustc -

Results in:

    error: bare CR not allowed in string, use `\r` instead
     --> <anon>:1:15
      |
    1 | fn main() { "a␍
      |               ^
      |
    help: escape the character
      |
    1 | fn main() { "a\r
      |               ++

And

    printf 'fn main() { "a\\\r\r\n\nb"; }' | rustc -

Results in:

    error: unknown character escape: `\r`
     --> <anon>:1:16
      |
    1 | fn main() { "a\␍
      |                ^ unknown character escape
      |
      = help: this is an isolated carriage return; consider checking your editor and version control settings
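The "not applied repeatedly" point can be illustrated with a small sketch. The helper `normalize_crlf` below is hypothetical and is not rustc's implementation; it only assumes that normalization is a single left-to-right pass that replaces each CR LF pair with LF, as described above. With the input CR CR LF, one pass consumes the second CR together with the LF, and the first CR is left directly in front of the resulting LF.

```rust
/// Hypothetical single-pass normalization: every CR LF pair becomes LF.
/// `str::replace` scans left to right and does not rescan its own output,
/// so it matches the "applied once" behavior described in the text.
fn normalize_crlf(src: &str) -> String {
    src.replace("\r\n", "\n")
}

fn main() {
    // String literal content from the first test program: a CR CR LF LF b
    let raw = "a\r\r\n\nb";
    let normalized = normalize_crlf(raw);

    // The second CR and the first LF were merged; the first CR survives.
    assert_eq!(normalized, "a\r\n\nb");

    // A CR LF pair therefore still exists after normalization.
    assert!(normalized.contains("\r\n"));

    // Only a second pass (which is not specified) would remove it.
    assert_eq!(normalize_crlf(&normalized), "a\n\nb");
}
```

The surviving CR is what the lexer then rejects, which is consistent with the two error messages shown above.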
ehuss
approved these changes
Jul 25, 2025
Thanks!
LukasKalbertodt
added a commit
to LukasKalbertodt/litrs
that referenced
this pull request
Jul 25, 2025
The specification now says that CR LF normalization is part of the pre-processing prior to tokenization. Since it's not part of the lexical grammar, this commit removes it from the parsing code as well. CRs are now disallowed entirely.

This is technically a breaking change, but unlikely to be noticed with any real-world input. In proc macros, all input has been normalized by the Rust compiler anyway, so only very weird input (CR CR LF) would have been accepted previously, but not after this commit.

See:

- rust-lang/reference#1944
- rust-lang/reference@fa56fdb
- rust-lang/reference@27e1ec9
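As a rough sketch of what "disallowing CR entirely" could look like in a literal parser: the function `first_bare_cr` below is hypothetical and is not part of the litrs API. It assumes the input has already been through CR LF normalization, so any remaining CR is an error regardless of what follows it.

```rust
/// Hypothetical check (not the litrs API): returns the byte offset of the
/// first CR in already-normalized string literal content, if any.
fn first_bare_cr(content: &str) -> Option<usize> {
    content.find('\r')
}

fn main() {
    // Ordinary normalized input contains no CR and is accepted.
    assert_eq!(first_bare_cr("a\n\nb"), None);

    // The "very weird" CR CR LF input leaves one CR behind after
    // normalization (a CR LF LF b), which is now rejected.
    assert_eq!(first_bare_cr("a\r\n\nb"), Some(1));
}
```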
tgross35
added a commit
to tgross35/rust
that referenced
this pull request
Aug 7, 2025
Update books

## rust-lang/book

5 commits in b2d1a0821e12a676b496d61891b8e3d374a8e832..3e9dc46aa563ca0c53ec826c41b05f10c5915925
2025-08-02 01:33:29 UTC to 2025-07-14 21:23:38 UTC

- Appendix B and Appendix D from tech review (rust-lang/book#4466)
- Chapter 21 from tech review (rust-lang/book#4464)
- Chapter 20 from tech review (rust-lang/book#4460)
- Chapter 19 from tech review (rust-lang/book#4446)
- Chapter 18 from tech review (rust-lang/book#4445)

## rust-lang/reference

12 commits in 1f45bd41fa6c17b7c048ed6bfe5f168c4311206a..1be151c051a082b542548c62cafbcb055fa8944f
2025-08-05 19:51:40 UTC to 2025-07-14 19:49:01 UTC

- Fix build output directory in README (rust-lang/reference#1950)
- Update `link_name` to use the attribute template (rust-lang/reference#1896)
- Update `no_link` to use the attribute template (rust-lang/reference#1898)
- Update `proc_macro_derive` to use the attribute template (rust-lang/reference#1888)
- Update `automatically_derived` to use the attribute template (rust-lang/reference#1884)
- Update `derive` to use the attribute template (rust-lang/reference#1883)
- Fix and clarify CR LF normalization and CR in string literals (rust-lang/reference#1944)
- glossary.md: tweak description of "dispatch" (rust-lang/reference#1938)
- add missing id, r[asm.operand-type.supported-operands.const] (rust-lang/reference#1939)
- &str and &[u8] have the same layout (rust-lang/reference#1848)
- Rename and rewrite the "question mark operator" (rust-lang/reference#1931)
- Change "allocated object" to "allocation". (rust-lang/reference#1930)

## rust-lang/rust-by-example

3 commits in e386be5f44af711854207c11fdd61bb576270b04..bd1279cdc9865bfff605e741fb76a0b2f07314a7
2025-08-04 13:41:04 UTC to 2025-08-02 15:41:59 UTC

- Improve the activity instructions in `print_display` (rust-lang/rust-by-example#1948)
- Minor fixes (whitespace, typo, i32->u32) (rust-lang/rust-by-example#1947)
- Document drawbacks of alternatives to match binding (rust-lang/rust-by-example#1946)
Zalathar
added a commit
to Zalathar/rust
that referenced
this pull request
Aug 7, 2025
Update books
rust-timer
added a commit
to rust-lang/rust
that referenced
this pull request
Aug 7, 2025
Rollup merge of #145026 - rustbot:docs-update, r=ehuss

Update books
github-actions bot
pushed a commit
to rust-lang/miri
that referenced
this pull request
Aug 8, 2025
Update books