[Strings] Add a string lowering pass using magic imports #6497
The latest idea for efficient string constants is to encode the constants in the import names of their globals and implement fast paths in the engines for materializing those constants at instantiation time without needing to parse anything in JS. This strategy only works for valid strings (i.e. strings without unpaired surrogates) because only valid strings can be used as import names in the WebAssembly syntax.

Add a new configuration of the StringLowering pass that encodes valid string contents in import names, falling back to the JSON custom section approach for invalid strings.

To test this change, update the printer to escape import and export names properly and update the legacy parser to parse escapes in import and export names properly. As a drive-by, remove the incorrect check in the parser that the import module and base names are non-empty.
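The split between the two encodings described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `isValidUTF16`, `LoweredStrings`, and `partitionStrings` are hypothetical names, and the real pass operates on module globals rather than on a plain vector of strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns true if the UTF-16 code units contain no
// unpaired surrogates, i.e. the string could be used as an import name.
bool isValidUTF16(const std::u16string& s) {
  for (std::size_t i = 0; i < s.size(); ++i) {
    char16_t u = s[i];
    if (0xD800 <= u && u < 0xDC00) {
      // High surrogate: must be immediately followed by a low surrogate.
      if (i + 1 == s.size() || s[i + 1] < 0xDC00 || s[i + 1] >= 0xE000) {
        return false;
      }
      ++i; // Skip the low surrogate we just checked.
    } else if (0xDC00 <= u && u < 0xE000) {
      // Low surrogate with no preceding high surrogate.
      return false;
    }
  }
  return true;
}

// Hypothetical result type: which strings go where.
struct LoweredStrings {
  std::vector<std::u16string> importNames;  // encoded directly in import names
  std::vector<std::u16string> jsonFallback; // left to the JSON custom section
};

// Route each string constant to one of the two encodings.
LoweredStrings partitionStrings(const std::vector<std::u16string>& strings) {
  LoweredStrings result;
  for (const auto& s : strings) {
    if (isValidUTF16(s)) {
      result.importNames.push_back(s);
    } else {
      result.jsonFallback.push_back(s);
    }
  }
  return result;
}
```

Note that the empty string has no surrogates at all, so under this sketch it takes the import-name path.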
lgtm % questions
@@ -516,5 +528,6 @@ struct StringLowering : public StringGathering {

Pass* createStringGatheringPass() { return new StringGathering(); }
Pass* createStringLoweringPass() { return new StringLowering(); }
Pass* createStringLoweringMagicImportPass() { return new StringLowering(true); }
- Pass* createStringLoweringMagicImportPass() { return new StringLowering(true); }
+ Pass* createMagicStringLoweringPass() { return new StringLowering(true); }
/jk
🪄 ✨
  }
} else if (!allowWTF && 0xDC00 <= *u && *u < 0xE000) {
  // Unpaired low surrogate.
  return std::nullopt;
Are these bugfixes?
No, this is newly necessary to catch strings that are valid WTF-16 but not valid UTF-16.
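To illustrate the distinction, here is a minimal standalone sketch of a WTF-16 to UTF-8 conversion with an `allowWTF` switch like the one in the hunk above. Only the flag name and the low-surrogate range check come from the diff; the function name `toUTF8`, its signature, and the rest of the body are assumptions and not the actual implementation in this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical converter: encodes UTF-16 code units as UTF-8. With
// allowWTF == false, any unpaired surrogate makes the conversion fail.
// With allowWTF == true, unpaired surrogates are encoded as three-byte
// sequences (generalized UTF-8, as in WTF-8).
std::optional<std::string> toUTF8(const std::u16string& in, bool allowWTF) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    std::uint32_t cp = in[i];
    if (0xD800 <= cp && cp < 0xDC00) {
      bool paired =
        i + 1 < in.size() && 0xDC00 <= in[i + 1] && in[i + 1] < 0xE000;
      if (paired) {
        // Combine the surrogate pair into one supplementary code point.
        cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
        ++i;
      } else if (!allowWTF) {
        // Unpaired high surrogate.
        return std::nullopt;
      }
    } else if (!allowWTF && 0xDC00 <= cp && cp < 0xE000) {
      // Unpaired low surrogate.
      return std::nullopt;
    }
    if (cp < 0x80) {
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

In this sketch a lone `0xDC00` converts successfully when `allowWTF` is true but yields `std::nullopt` when it is false, which is the signal a caller could use to fall back to the JSON custom section.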
(assert_invalid
  (module (import "" "" (table 10 funcref)) (table 10 funcref))
  "multiple tables"
)
What happened here?
We of course have supported multiple tables for a long time, and this test should have been removed when that support was introduced. But it was not noticed and continued failing successfully until now because of the empty import names.
Are empty import names not valid? I thought they were.
They are valid, but the legacy text parser was incorrectly rejecting them, which caused this assert_invalid check to pass. Now that I've fixed the legacy parser to allow empty names, this test started failing.