Qualcomm AI Engine Direct - GA Static Phi-4-mini #13179
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13179
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 1 Cancelled Job as of commit d6dda6c with merge base 18098a4:
NEW FAILURE - The following job has failed:
CANCELLED JOB - The following job was cancelled. Please retry:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed d0fe026 to 76f4d50 (Compare)
Hi @cccclai Thanks!!
# TODO: Encountered the following error during runtime, so switched behavior for now.
# Error: libc++abi: terminating due to uncaught exception of type std::runtime_error: invert=true is not supported for Split PreTokenizer. Only invert=false is supported.
data["pre_tokenizer"]["pretokenizers"][-2]["invert"] = False
Hi @cccclai,
Here is a temporary workaround for the tokenizer error: invert=true is currently not supported for the Split PreTokenizer, and only invert=false is allowed. We would appreciate it if this could be fixed.
Thanks!
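For reference, a minimal standalone sketch of that workaround, patching the exported tokenizer config on disk before it reaches the runtime. The file path and the -2 index of the Split pre-tokenizer are assumptions carried over from the snippet above, not guaranteed for every tokenizer:

```python
import json

# Assumed path to the exported HuggingFace tokenizer config; adjust as needed.
TOKENIZER_JSON = "tokenizer.json"

with open(TOKENIZER_JSON, "r", encoding="utf-8") as f:
    data = json.load(f)

# Workaround: the Split pre-tokenizer carrying `invert: true` is assumed to be
# the second-to-last entry in the pre-tokenizer sequence, as in the PR snippet.
data["pre_tokenizer"]["pretokenizers"][-2]["invert"] = False

with open(TOKENIZER_JSON, "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False, indent=2)
```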
@DannyYuyang-quic is this specific to the phi tokenizer?
Thanks for the follow-up!
I'm not sure if tokenizers from other GA models would encounter the same issue, but based on what I've seen so far, models like Qwen2, Qwen3, and Gemma3 don't seem to have the invert kwarg in their tokenizers.
Force-pushed 76f4d50 to a14610e (Compare)
@pytorchbot label "release notes: qualcomm"
Force-pushed a14610e to 013d222 (Compare)
APPLY_ROPE_EMBEDDING_FUNCTIONS = {
    "phi_4_mini": apply_partial_rotary_emb_single,
Can this stay in this folder? https://github.com/pytorch/executorch/tree/main/examples/models/phi_4_mini/config or is it qnn specific?
It's not qnn-specific. apply_partial_rotary_emb_single (partial RoPE embedding) is required whenever partial_rotary_factor < 1.0. I've updated the code to explicitly check for partial_rotary_factor < 1.0, so partial RoPE is applied only when that condition is met.
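For readers unfamiliar with partial RoPE, here is a minimal sketch of the idea; the function name, shapes, and the half-split rotation convention are illustrative assumptions, not the exact signatures used in this PR. Only the first rotary_dim = head_dim * partial_rotary_factor channels of each head are rotated, and the remaining channels pass through unchanged:

```python
import torch

def partial_rotary_emb_sketch(
    x: torch.Tensor,          # [batch, seq, n_heads, head_dim]
    freqs_cos: torch.Tensor,  # [seq, rotary_dim // 2]
    freqs_sin: torch.Tensor,  # [seq, rotary_dim // 2]
) -> torch.Tensor:
    rotary_dim = freqs_cos.shape[-1] * 2
    # Split the head dimension: the first rotary_dim channels get rotated,
    # the rest are passed through untouched.
    x_rot, x_pass = x[..., :rotary_dim], x[..., rotary_dim:]
    x_r, x_i = x_rot[..., : rotary_dim // 2], x_rot[..., rotary_dim // 2:]
    cos = freqs_cos[None, :, None, :]  # broadcast over batch and heads
    sin = freqs_sin[None, :, None, :]
    out_r = x_r * cos - x_i * sin
    out_i = x_r * sin + x_i * cos
    return torch.cat([out_r, out_i, x_pass], dim=-1)
```

With partial_rotary_factor = 1.0 this degenerates to a full-head rotation, which is why the standard variant can remain the default branch.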
oh wow...you're still awake...
Can we refer to the config in this folder (#13086) instead of having phi-specific logic inside static llama?
Force-pushed 013d222 to 371c80a (Compare)
if config.partial_rotary_factor < 1:
    self.apply_rope_emb = apply_partial_rotary_emb_single
else:
    self.apply_rope_emb = apply_rotary_emb_single
Looks like there is a conflict, can you resolve it?
Summary:
- Support Phi-4-mini-instruct for the static llama path
- Add partial RoPE for Phi-4-mini
- Add EOS token for Phi-4-mini
Force-pushed 46f019a to d6dda6c (Compare)
Thank you for enabling phi4!
Summary
Test plan
cc: @haowhsu-quic, @shewu-quic, @winskuo-quic, @cccclai