Closed
Labels
A-codegen (Area: Code generation)
Description
Similar to the behavior observed in #32031, tuples of `f32` and `f64` seem to be passed to functions in GPRs.
The `f32` tuple takes an especially large hit, since the two `f32` values are passed inside a single 64-bit GPR and have to be extracted and compressed via `shift` and `or` instructions. Even with inlining turned on, this does not go away.
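To make the cost concrete, here is a minimal hand-written sketch of what packing two `f32` values into one 64-bit integer involves. The names `pack` and `unpack` and the low-half-first layout are assumptions for illustration only; this is not taken from the compiler's actual output.

```rust
/// Pack two f32 values into a single u64 (hypothetical layout:
/// `a` in the low 32 bits, `b` in the high 32 bits).
fn pack(a: f32, b: f32) -> u64 {
    // Reinterpret each float as its raw bits, then combine with shift + or.
    (a.to_bits() as u64) | ((b.to_bits() as u64) << 32)
}

/// Recover the two f32 values from the packed u64.
fn unpack(p: u64) -> (f32, f32) {
    // Truncate for the low half, shift then truncate for the high half.
    let lo = f32::from_bits(p as u32);
    let hi = f32::from_bits((p >> 32) as u32);
    (lo, hi)
}

fn main() {
    let p = pack(1.5, -2.25);
    let (a, b) = unpack(p);
    println!("{} {}", a, b);
}
```

Each round trip costs extra integer shifts and ors, plus moves between integer and floating-point registers, none of which would be needed if the values stayed in SIMD registers.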
The `f64` tuple is not as bad as the `f32` tuple. Without inlining, it does some moves to and from the SIMD registers; with inlining turned on, the tuple is kept in a SIMD register and the loop is vectorized and unrolled.
EDIT: Forgot to link to the code example on playpen.
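The playpen code itself is not included in this text. As a stand-in, the following is a hypothetical reproduction sketch: the function names (`add_f32`, `add_f64`, `sum_f32`, `sum_f64`) and the loop shape are assumptions, not the original example. Compiling it with `rustc -O --emit asm` should let you inspect how the tuples are passed on your target and compiler version; the results may differ from those described above.

```rust
// Hypothetical reproduction; not the original playpen example.

// #[inline(never)] forces a real call so the argument passing is visible.
#[inline(never)]
fn add_f32(acc: (f32, f32), x: (f32, f32)) -> (f32, f32) {
    (acc.0 + x.0, acc.1 + x.1)
}

#[inline(never)]
fn add_f64(acc: (f64, f64), x: (f64, f64)) -> (f64, f64) {
    (acc.0 + x.0, acc.1 + x.1)
}

// Accumulate n tuples through the non-inlined f32 helper.
fn sum_f32(n: u32) -> (f32, f32) {
    let mut acc = (0.0, 0.0);
    for i in 0..n {
        acc = add_f32(acc, (i as f32, 1.0));
    }
    acc
}

// Same loop for the f64 tuple.
fn sum_f64(n: u32) -> (f64, f64) {
    let mut acc = (0.0, 0.0);
    for i in 0..n {
        acc = add_f64(acc, (i as f64, 1.0));
    }
    acc
}

fn main() {
    println!("{:?}", sum_f32(1000));
    println!("{:?}", sum_f64(1000));
}
```

To compare against the inlined case, remove the `#[inline(never)]` attributes and check whether the loops are vectorized.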