WIP vectorization for UTF16->UTF8 #83073
stdlib/public/core/UTF16.swift
Outdated
#endif
let mask = Word(truncatingIfNeeded: 0x80808080_80808080 as UInt64)

#if (arch(arm64) || arch(arm64_32))// && SWIFT_STDLIB_ENABLE_VECTOR_TYPES
We should do an x86 version of this, but it was giving me weird errors and I wanted to get the core thing working before dealing with them. TBH it's probably still faster on x86 than it was even this way though.
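For readers unfamiliar with the `0x80808080_80808080` constant in the snippet above: it is the usual ingredient of a word-at-a-time ASCII check, where eight bytes are tested at once by masking the high bit of each byte. The sketch below shows that general technique on a plain byte array. The helper name `allASCII` is hypothetical, and whether the PR applies its mask in exactly this way is an assumption, since the snippet does not show the use site.

```swift
/// Hypothetical helper, not taken from the PR.
/// Returns true if every byte in `bytes` is below 0x80.
func allASCII(_ bytes: [UInt8]) -> Bool {
  // One high bit per byte lane, the same constant as in the diff.
  let mask: UInt64 = 0x80808080_80808080
  return bytes.withUnsafeBytes { raw -> Bool in
    var i = 0
    // Word-at-a-time loop: eight bytes per iteration.
    // loadUnaligned requires Swift 5.7 or later.
    while i + 8 <= raw.count {
      let word = raw.loadUnaligned(fromByteOffset: i, as: UInt64.self)
      if word & mask != 0 { return false }
      i += 8
    }
    // Scalar tail for the remaining 0 to 7 bytes.
    while i < raw.count {
      if raw[i] & 0x80 != 0 { return false }
      i += 1
    }
    return true
  }
}
```

The check only inspects bit 7 of each byte, so it does not depend on byte order, which is one reason a portable version like this can run unchanged on both arm64 and x86.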
stdlib/public/core/UTF16.swift
Outdated
} else {
  isASCII = false
  var tmp: (
    UInt32, UInt32, UInt32, UInt32, UInt32, UInt32, UInt32, UInt32
Making a temporary buffer here is sort of awful and I want to improve it at some point, but it's also not really hurting anything and simplifies the rest of the code a lot
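As background on the idiom in the snippet above: a homogeneous tuple is a common way to get a small fixed-size scratch buffer in Swift without allocating an array. The sketch below shows one way to fill and read such a buffer through a typed pointer. The function name `sumViaTupleScratch` and the summing step are purely illustrative assumptions; the snippet does not show what the PR does with `tmp`.

```swift
/// Hypothetical example, not taken from the PR.
/// Widens eight UInt16 values into a tuple-backed scratch buffer and sums them.
func sumViaTupleScratch(_ input: [UInt16]) -> UInt32 {
  precondition(input.count == 8, "this sketch handles exactly eight elements")
  // Eight UInt32 slots held inline in the tuple, same shape as `tmp` in the diff.
  var tmp: (UInt32, UInt32, UInt32, UInt32, UInt32, UInt32, UInt32, UInt32)
    = (0, 0, 0, 0, 0, 0, 0, 0)
  return withUnsafeMutableBytes(of: &tmp) { raw -> UInt32 in
    // A homogeneous tuple is laid out contiguously, so it can be viewed
    // as a buffer of its element type.
    let buf = raw.bindMemory(to: UInt32.self)
    for i in 0..<8 {
      buf[i] = UInt32(input[i])
    }
    return buf.reduce(0, +)
  }
}
```

The trade-off matches the comment above: the extra copy through the tuple costs a little, but downstream code gets a uniform buffer to index.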
@swift-ci please test
@swift-ci please benchmark
@swift-ci please Apple Silicon benchmark
I'll take that. (The
Some of those failures do look real, so this'll stay as a draft for now
Somehow the x86 results look better despite not using the hand vectorized path? I guess I should try using the fallback path on arm64 and see if it does ok there 😂
@swift-ci please test
@swift-ci please Apple Silicon benchmark
@swift-ci please benchmark
Turns out not accidentally processing twice as much data improves the speedup!
@swift-ci please Apple Silicon benchmark
@swift-ci please benchmark
@swift-ci please Apple Silicon benchmark
Just as good as before, so I think that means I get to delete all the architecture-specific bits of the patch :)
… an unnecessary usableFromInline
@swift-ci please benchmark
@swift-ci please Apple Silicon benchmark
…, and special case empty buffers
No description provided.