Fix for Type Check in Quantized CPU op_dequantize (#13174)
### Summary
When quantizing a model (without delegating to a specific backend), an
exported model relies on the operator library in
`kernels/quantized/cpu/`. Specifically, the essential operation performed by
`op_dequantize` is:
`out = (in - offset) * scale`
where the offset is an integer type. The offset is initially declared as a
`uint64_t` (see
[here](https://github.com/pytorch/executorch/blob/a44e4aca7cddf91e8ed7282a70d6c40493a50883/kernels/quantized/cpu/op_dequantize.cpp#L426)),
but when it is used to perform the operation above, it is cast down to a
`uint32_t` (see
[here](https://github.com/pytorch/executorch/blob/a44e4aca7cddf91e8ed7282a70d6c40493a50883/kernels/quantized/cpu/op_dequantize.cpp#L463)).
The implicit assumption appears to be that the quantization offset fits in a
`uint32_t`, and that the `uint64_t` declaration is simply future-proofing. In
any event, the type check for the offset should allow it to be either
`uint32_t` or `uint64_t`. This PR makes that change.
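
As an illustration of the direction of this change, below is a minimal,
self-contained sketch (not the actual ExecuTorch kernel code) of a per-tensor
dequantize routine whose type check accepts both a 32-bit and a 64-bit offset,
while the arithmetic itself narrows the offset to 32 bits as described above.
The names `OffsetDType`, `check_offset_dtype`, and
`dequantize_per_tensor_sketch` are hypothetical and exist only for this
example.

```cpp
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <vector>

// Hypothetical dtype tag standing in for the kernel's scalar-type enum.
enum class OffsetDType { UInt32, UInt64 };

// Widened check: accept either a 32-bit or a 64-bit offset dtype,
// mirroring the relaxation this PR describes.
inline void check_offset_dtype(OffsetDType dtype) {
  if (dtype != OffsetDType::UInt32 && dtype != OffsetDType::UInt64) {
    throw std::invalid_argument("offset must be uint32 or uint64");
  }
}

// Minimal per-tensor dequantize: out = (in - offset) * scale.
// The offset arrives as a 64-bit value but is narrowed to 32 bits
// before use, matching the cast noted in the summary.
std::vector<float> dequantize_per_tensor_sketch(const std::vector<uint8_t>& in,
                                                uint64_t offset, double scale,
                                                OffsetDType offset_dtype) {
  check_offset_dtype(offset_dtype);
  const uint32_t narrowed_offset = static_cast<uint32_t>(offset);
  std::vector<float> out(in.size());
  for (size_t i = 0; i < in.size(); ++i) {
    out[i] = static_cast<float>(
        (static_cast<int64_t>(in[i]) - static_cast<int64_t>(narrowed_offset)) *
        scale);
  }
  return out;
}

int main() {
  const std::vector<uint8_t> quantized = {0, 64, 128, 255};
  const auto result = dequantize_per_tensor_sketch(
      quantized, /*offset=*/128, /*scale=*/0.05, OffsetDType::UInt32);
  for (float v : result) {
    std::printf("%f\n", v);
  }
  return 0;
}
```

The point of the sketch is only that validation and computation agree: if the
computation narrows the offset to 32 bits anyway, rejecting a 32-bit offset at
the type-check stage serves no purpose.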
### Test plan
Tested with MobileNet V2 on the Arm backend. The quantized model runner
initially crashed due to this check only allowing the offset to be
`uint64_t`. When examining the values, none were larger than
`UINT32_MAX`, so it should be safe to permit the offset to have
`uint32_t` values. With this change applied, the MobileNet V2 runner
was able to complete.