Open
Labels: Reduction Operations (sum, mean, min, max, etc.), Refactor (internal refactoring of code)
Description
We have reductions implemented in nanops, _libs.groupby, and _libs.window.aggregations. We should refactor these with the following goals in mind:
- Have one/fewer distinct implementations
- Avoid copies, particularly in the nanops versions where we do something like values[notna(values)]
- Chunked-friendliness, so that we can re-write ArrowExtensionArray._groupby_op to operate chunk-by-chunk, avoiding a copy in multi-chunk cases. (This could also be useful for hypothetical distributed EAs)
- Avoid casting/inference in nanops
- Update: do axis=1 reductions without transposing/copying, inspired by "PERF: axis=1 reductions with EA dtypes" (#54341)
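To illustrate the copy-avoidance goal, here is a minimal sketch that assumes float64 input and NumPy's `where=` argument on reductions. The helper name `masked_sum` is hypothetical and is not an existing nanops function. It skips missing entries during the reduction rather than first materialising `values[notna(values)]`.

```python
import numpy as np


def masked_sum(values, mask=None):
    """Sum the non-missing entries of a float array.

    `mask` is True where a value is missing. No filtered copy of
    `values` is created; only a boolean mask is allocated.
    """
    if mask is None:
        mask = np.isnan(values)
    # `where=` makes the reduction skip masked positions in place of
    # indexing with a boolean array, which would copy the kept values.
    return np.sum(values, where=~mask)


values = np.array([1.0, np.nan, 2.5])
total = masked_sum(values)
```

The boolean mask still costs one byte per element, so this removes the copy of the values but not every allocation.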
The implementation of group_skew is derived from https://www.johndcook.com/blog/skewness_kurtosis/ which includes a method for "adding" multiple RunningStats instances. Something like that could be adapted for the third goal above (chunked-friendliness).
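A rough sketch of what "adding" per-chunk statistics might look like, using the standard pairwise formulas for combining central moments. The `Moments` class and its method names are hypothetical and are not pandas API. It computes the population (biased) skewness, not the bias-corrected value that pandas reports, and it ignores missing values and groups entirely.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Moments:
    """Count, mean, and 2nd/3rd central moment sums for one chunk."""

    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of (x - mean) ** 2
    m3: float = 0.0  # sum of (x - mean) ** 3

    @classmethod
    def from_chunk(cls, chunk):
        chunk = np.asarray(chunk, dtype="float64")
        if chunk.size == 0:
            return cls()
        mean = float(chunk.mean())
        dev = chunk - mean
        return cls(chunk.size, mean, float((dev**2).sum()), float((dev**3).sum()))

    def __add__(self, other):
        # Combine two partial results without revisiting the raw data.
        n = self.n + other.n
        if n == 0:
            return Moments()
        delta = other.mean - self.mean
        mean = (self.n * self.mean + other.n * other.mean) / n
        m2 = self.m2 + other.m2 + delta**2 * self.n * other.n / n
        m3 = (
            self.m3
            + other.m3
            + delta**3 * self.n * other.n * (self.n - other.n) / n**2
            + 3.0 * delta * (self.n * other.m2 - other.n * self.m2) / n
        )
        return Moments(n, mean, m2, m3)

    def skew(self):
        # Population skewness; undefined for empty or constant data.
        if self.n == 0 or self.m2 == 0:
            return float("nan")
        return float(np.sqrt(self.n) * self.m3 / self.m2**1.5)


chunks = [np.array([1.0, 2.0, 10.0]), np.array([3.0, 4.0]), np.array([0.5, 7.0, 7.0, 8.0])]
combined = Moments()
for chunk in chunks:
    combined = combined + Moments.from_chunk(chunk)
```

Because each chunk is reduced independently and only four numbers per chunk are merged, no concatenated copy of a multi-chunk array is needed.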