Open
Labels
Enhancement
ExtensionArray: Extending pandas with custom dtypes or arrays.
Missing-data: np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
NA - MaskedArrays: Related to pd.NA and nullable extension arrays
Performance: Memory or execution speed performance
Description
Our nullable, mask-based ExtensionArrays (currently integer and boolean, inheriting from MaskedArray) store two numpy arrays under the hood, `_data` and `_mask`. So we use a numpy boolean array as the mask (8 bits per element), even when there are no missing values.
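To make the cost concrete, here is a rough back-of-the-envelope sketch using plain numpy arrays in place of the internal `_data` and `_mask` buffers. The array length and the `int64` dtype are assumptions chosen for illustration; nothing here touches pandas itself.

```python
import numpy as np

n = 1_000_000  # illustrative length, not taken from the issue

data = np.arange(n, dtype="int64")  # values: 8 bytes per element
mask = np.zeros(n, dtype=bool)      # mask: 1 byte per element, all False here

# Fraction of extra memory spent on a mask that marks nothing as missing.
overhead = mask.nbytes / data.nbytes
print(data.nbytes, mask.nbytes, overhead)
```

For 8-byte integers the all-False mask adds one eighth on top of the data; for a boolean data array, where each value is also one byte, the mask would be as large as the data.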
One relatively easy memory and performance improvement would be to allow the mask to be None when there are no missing values. Since the mask is completely internal to the Array implementations, this should be possible to do.
(To be checked: how involved the ops code would become if it has to handle the mask being optional.)
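As a rough illustration of what an optional mask might mean for the ops code, here is a minimal toy sketch. `OptionalMaskArray` and all of its methods are hypothetical names invented for this example; this is not the pandas MaskedArray implementation, only one possible shape of the idea under the assumption that `None` stands for "no missing values".

```python
from typing import Optional

import numpy as np


class OptionalMaskArray:
    """Toy stand-in for a masked array whose mask may be None (hypothetical, not pandas API)."""

    def __init__(self, data: np.ndarray, mask: Optional[np.ndarray] = None):
        # Normalize an all-False mask to None so "no missing values" has one representation.
        if mask is not None and not mask.any():
            mask = None
        self._data = data
        self._mask = mask

    def isna(self) -> np.ndarray:
        # Materialize a boolean array only when a caller actually asks for one.
        if self._mask is None:
            return np.zeros(len(self._data), dtype=bool)
        return self._mask.copy()

    def __add__(self, other: "OptionalMaskArray") -> "OptionalMaskArray":
        data = self._data + other._data
        # Every binary op needs this kind of branching once the mask is optional.
        # For simplicity the toy shares an existing mask instead of copying it.
        if self._mask is None:
            mask = other._mask
        elif other._mask is None:
            mask = self._mask
        else:
            mask = self._mask | other._mask
        return OptionalMaskArray(data, mask)

    def sum(self):
        # Fast path: no mask means no filtering before the reduction.
        if self._mask is None:
            return self._data.sum()
        return self._data[~self._mask].sum()

    @property
    def nbytes(self) -> int:
        mask_bytes = 0 if self._mask is None else self._mask.nbytes
        return self._data.nbytes + mask_bytes


a = OptionalMaskArray(np.array([1, 2, 3], dtype="int64"))
b = OptionalMaskArray(
    np.array([10, 20, 30], dtype="int64"),
    np.array([False, True, False]),
)
c = a + b
print(a.nbytes, b.nbytes, c.isna(), c.sum())
```

The branching in `__add__` and `sum` hints at the open question: each operation that currently reads the mask would need a `None` check, though the no-missing-values path can then skip the mask work entirely.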