Closed
Description
Steps to reproduce
- Clone https://github.com/hsivonen/encoding_rs
- Check out the `simd` branch
- Edit `src/mem.rs` to change `copy_ascii_to_ascii` to `inline(never)`
- Compile in release mode
- `objdump -d` the result
- Check out the `packed_simd` branch
- Edit `src/mem.rs` to change `copy_ascii_to_ascii` to `inline(never)`
- Compile in release mode
- `objdump -d` the result
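For illustration, the `inline(never)` edit in the steps above amounts to something like the sketch below. The signature is assumed from the public `encoding_rs::mem` API, and the scalar body is a hypothetical stand-in, not the crate's SIMD implementation. The point is only that the attribute keeps the function as a separate symbol, so it is easy to locate in the `objdump -d` output.

```rust
/// Hypothetical scalar stand-in for `copy_ascii_to_ascii` (assumed signature).
/// Copies the leading ASCII run of `src` into `dst` and returns its length.
#[inline(never)] // prevents inlining, so the function shows up on its own in the disassembly
pub fn copy_ascii_to_ascii(src: &[u8], dst: &mut [u8]) -> usize {
    assert!(
        dst.len() >= src.len(),
        "destination must not be shorter than source"
    );
    // Index of the first non-ASCII byte, or the whole input if it is all ASCII.
    let ascii_len = src
        .iter()
        .position(|&b| b >= 0x80)
        .unwrap_or(src.len());
    dst[..ascii_len].copy_from_slice(&src[..ascii_len]);
    ascii_len
}
```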
Actual results
With the old `simd` crate, boolean reductions on ARMv7 use `vpmax.u8` twice to fold the vector onto itself and then use `vmov.32` once. `packed_simd` instead uses `vmov.32` four times and then ORs the results together on the ALU.
`simd`:
6c17e: f921 0a0f vld1.8 {d0-d1}, [r1]
6c182: ef89 2050 vshr.s8 q1, q0, #7
6c186: ff02 2a03 vpmax.u8 d2, d2, d3
6c18a: ff02 2a00 vpmax.u8 d2, d2, d0
6c18e: ee12 1b10 vmov.32 r1, d2[0]
6c192: 2900 cmp r1, #0
`packed_simd`:
56334: f961 0a0f vld1.8 {d16-d17}, [r1]
56338: efc9 2070 vshr.s8 q9, q8, #7
5633c: ee33 1b90 vmov.32 r1, d19[1]
56340: ee32 4b90 vmov.32 r4, d18[1]
56344: ee13 5b90 vmov.32 r5, d19[0]
56348: ee12 6b90 vmov.32 r6, d18[0]
5634c: 4321 orrs r1, r4
5634e: ea46 0405 orr.w r4, r6, r5
56352: 4321 orrs r1, r4
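The two instruction sequences above can be modelled in portable Rust to show that they compute the same "any lane set" result by different routes. This is a scalar sketch for illustration only: the helper names (`sign_mask`, `vpmax_u8`, `any_fold`, `any_extract`) are hypothetical and are not taken from either crate.

```rust
/// Models `vshr.s8 q, q, #7`: 0xFF for bytes with the high bit set, 0x00 otherwise.
fn sign_mask(v: [u8; 16]) -> [u8; 16] {
    v.map(|b| ((b as i8) >> 7) as u8)
}

/// Models `vpmax.u8 d, a, b`: pairwise max of adjacent lanes of `a`, then of `b`.
fn vpmax_u8(a: [u8; 8], b: [u8; 8]) -> [u8; 8] {
    let mut out = [0u8; 8];
    for i in 0..4 {
        out[i] = a[2 * i].max(a[2 * i + 1]);
        out[4 + i] = b[2 * i].max(b[2 * i + 1]);
    }
    out
}

/// Strategy seen with `simd`: two pairwise-max folds, then one 32-bit lane read.
fn any_fold(mask: [u8; 16]) -> bool {
    let mut lo = [0u8; 8];
    let mut hi = [0u8; 8];
    lo.copy_from_slice(&mask[..8]);
    hi.copy_from_slice(&mask[8..]);
    let step1 = vpmax_u8(lo, hi);
    // The second operand is a don't-care: only the low four lanes are read below.
    let step2 = vpmax_u8(step1, [0u8; 8]);
    u32::from_le_bytes([step2[0], step2[1], step2[2], step2[3]]) != 0
}

/// Strategy seen with `packed_simd`: four 32-bit lane reads, ORed on the ALU.
fn any_extract(mask: [u8; 16]) -> bool {
    let lane = |i: usize| {
        u32::from_le_bytes([mask[4 * i], mask[4 * i + 1], mask[4 * i + 2], mask[4 * i + 3]])
    };
    ((lane(3) | lane(1)) | (lane(0) | lane(2))) != 0
}
```

Both functions agree on every input; the difference is that the fold needs a single transfer from the NEON register file to a general-purpose register, while the extract version needs four.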
Expected results
Expected `packed_simd` to implement horizontal boolean reductions on ARMv7 the same way as `simd`.