Closed
Labels
API Design, ExtensionArray (extending pandas with custom dtypes or arrays), Sparse (sparse data type)
Description
We should make SparseArray a proper ExtensionArray.
It seems like this will be somewhat difficult to do properly while SparseArray subclasses ndarray. Basic things like `np.asarray(sparse_array)` don't match the required ExtensionArray API (#14167). Fixing this while we still subclass ndarray is going to be hard, because I can't override the behavior of `np.asarray(sparse_array)` in Python.
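To illustrate the problem, here is a minimal sketch using a hypothetical `SubclassedSparse` class (not the real pandas SparseArray; the `positions` and `length` attributes are made up for the example). It defines `__array__` to return the dense form, but as far as I can tell NumPy never consults `__array__` for ndarray subclasses, so `np.asarray` just hands back the underlying buffer of stored values.

```python
import numpy as np

calls = []  # records whether __array__ was ever invoked


class SubclassedSparse(np.ndarray):
    """Hypothetical stand-in for a SparseArray that subclasses ndarray."""

    def __new__(cls, sp_values, positions, length):
        # The ndarray buffer holds only the stored (non-fill) values.
        obj = np.asarray(sp_values).view(cls)
        obj.positions = np.asarray(positions)
        obj.length = length
        return obj

    def __array__(self, dtype=None, copy=None):
        # What we would like np.asarray to return: the dense array.
        calls.append("__array__")
        dense = np.zeros(self.length, dtype=self.dtype)
        dense[self.positions] = self.view(np.ndarray)
        return dense


arr = SubclassedSparse([1.0, 2.0], positions=[1, 3], length=5)
result = np.asarray(arr)

# The override is bypassed: we get the two stored values, not five dense ones.
print(calls, result.shape)
```
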
So, some questions:
- Do people rely on SparseArray being an ndarray subclass?
- Do we want to make a clean break, or introduce deprecations for things that will need changing (but with no clear upgrade path)?
My current preference is to just break things, but I don't use sparse. SparseArray would compose an ndarray of dense values and a `SparseIndex`, but it would no longer subclass ndarray.
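A rough sketch of what composition could look like, under stated assumptions: `ComposedSparse` is a hypothetical class, a plain integer array of positions stands in for `SparseIndex`, and none of the names below are the actual pandas implementation. The point is only that once the class no longer subclasses ndarray, `np.asarray` goes through `__array__`, so the conversion is under our control.

```python
import numpy as np


class ComposedSparse:
    """Hypothetical sketch: wraps an ndarray of values instead of subclassing ndarray."""

    def __init__(self, sp_values, positions, length, fill_value=0.0):
        self.sp_values = np.asarray(sp_values)
        # Plain integer positions stand in for a SparseIndex here.
        self.positions = np.asarray(positions, dtype=np.intp)
        self.length = length
        self.fill_value = fill_value

    def __len__(self):
        return self.length

    def __array__(self, dtype=None, copy=None):
        # Build the dense representation: fill everywhere, then scatter
        # the stored values into their positions.
        dense = np.full(self.length, self.fill_value, dtype=self.sp_values.dtype)
        dense[self.positions] = self.sp_values
        return dense if dtype is None else dense.astype(dtype)


sparse = ComposedSparse([1.0, 2.0], positions=[1, 3], length=5)
dense = np.asarray(sparse)
print(dense)
```

The trade-off is the one raised above: code that relies on `isinstance(sparse_array, np.ndarray)` or on inherited ndarray methods would break with this design.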
CCing some people who seem to use pandas' sparse: @hexgnu @kernc @Licht-T