Commit 97bba51

Dr-Irv authored and jreback committed
CLN: Deprecate pandas.SparseArray for pandas.arrays.SparseArray (#30656)
1 parent 6f96331 commit 97bba51

31 files changed: +156 -142 lines changed

doc/source/development/contributing_docstring.rst

Lines changed: 1 addition & 1 deletion
@@ -399,7 +399,7 @@ DataFrame:
 * DataFrame
 * pandas.Index
 * pandas.Categorical
-* pandas.SparseArray
+* pandas.arrays.SparseArray
 
 If the exact type is not relevant, but must be compatible with a numpy
 array, array-like can be specified. If Any type that can be iterated is

doc/source/getting_started/basics.rst

Lines changed: 1 addition & 1 deletion
@@ -1951,7 +1951,7 @@ documentation sections for more on each type.
 | period            | :class:`PeriodDtype`      | :class:`Period`    | :class:`arrays.PeriodArray`   | ``'period[<freq>]'``,                   | :ref:`timeseries.periods`     |
 | (time spans)      |                           |                    |                               | ``'Period[<freq>]'``                    |                               |
 +-------------------+---------------------------+--------------------+-------------------------------+-----------------------------------------+-------------------------------+
-| sparse            | :class:`SparseDtype`      | (none)             | :class:`SparseArray`          | ``'Sparse'``, ``'Sparse[int]'``,        | :ref:`sparse`                 |
+| sparse            | :class:`SparseDtype`      | (none)             | :class:`arrays.SparseArray`   | ``'Sparse'``, ``'Sparse[int]'``,        | :ref:`sparse`                 |
 |                   |                           |                    |                               | ``'Sparse[float]'``                     |                               |
 +-------------------+---------------------------+--------------------+-------------------------------+-----------------------------------------+-------------------------------+
 | intervals         | :class:`IntervalDtype`    | :class:`Interval`  | :class:`arrays.IntervalArray` | ``'interval'``, ``'Interval'``,         | :ref:`advanced.intervalindex` |

doc/source/getting_started/dsintro.rst

Lines changed: 1 addition & 1 deletion
@@ -741,7 +741,7 @@ implementation takes precedence and a Series is returned.
    np.maximum(ser, idx)
 
 NumPy ufuncs are safe to apply to :class:`Series` backed by non-ndarray arrays,
-for example :class:`SparseArray` (see :ref:`sparse.calculation`). If possible,
+for example :class:`arrays.SparseArray` (see :ref:`sparse.calculation`). If possible,
 the ufunc is applied without converting the underlying data to an ndarray.
 
 Console display

doc/source/reference/arrays.rst

Lines changed: 2 additions & 2 deletions
@@ -444,13 +444,13 @@ Sparse data
 -----------
 
 Data where a single value is repeated many times (e.g. ``0`` or ``NaN``) may
-be stored efficiently as a :class:`SparseArray`.
+be stored efficiently as a :class:`arrays.SparseArray`.
 
 .. autosummary::
    :toctree: api/
    :template: autosummary/class_without_autosummary.rst
 
-   SparseArray
+   arrays.SparseArray
 
 .. autosummary::
    :toctree: api/

doc/source/user_guide/sparse.rst

Lines changed: 8 additions & 8 deletions
@@ -15,7 +15,7 @@ can be chosen, including 0) is omitted. The compressed values are not actually s
 
    arr = np.random.randn(10)
    arr[2:-2] = np.nan
-   ts = pd.Series(pd.SparseArray(arr))
+   ts = pd.Series(pd.arrays.SparseArray(arr))
    ts
 
 Notice the dtype, ``Sparse[float64, nan]``. The ``nan`` means that elements in the
@@ -51,7 +51,7 @@ identical to their dense counterparts.
 SparseArray
 -----------
 
-:class:`SparseArray` is a :class:`~pandas.api.extensions.ExtensionArray`
+:class:`arrays.SparseArray` is a :class:`~pandas.api.extensions.ExtensionArray`
 for storing an array of sparse values (see :ref:`basics.dtypes` for more
 on extension arrays). It is a 1-dimensional ndarray-like object storing
 only values distinct from the ``fill_value``:
@@ -61,7 +61,7 @@ only values distinct from the ``fill_value``:
    arr = np.random.randn(10)
    arr[2:5] = np.nan
    arr[7:8] = np.nan
-   sparr = pd.SparseArray(arr)
+   sparr = pd.arrays.SparseArray(arr)
    sparr
 
 A sparse array can be converted to a regular (dense) ndarray with :meth:`numpy.asarray`
@@ -144,7 +144,7 @@ to ``SparseArray`` and get a ``SparseArray`` as a result.
 
 .. ipython:: python
 
-   arr = pd.SparseArray([1., np.nan, np.nan, -2., np.nan])
+   arr = pd.arrays.SparseArray([1., np.nan, np.nan, -2., np.nan])
    np.abs(arr)
 
 
@@ -153,7 +153,7 @@ the correct dense result.
 
 .. ipython:: python
 
-   arr = pd.SparseArray([1., -1, -1, -2., -1], fill_value=-1)
+   arr = pd.arrays.SparseArray([1., -1, -1, -2., -1], fill_value=-1)
    np.abs(arr)
    np.abs(arr).to_dense()
 
@@ -194,7 +194,7 @@ From an array-like, use the regular :class:`Series` or
 .. ipython:: python
 
    # New way
-   pd.DataFrame({"A": pd.SparseArray([0, 1])})
+   pd.DataFrame({"A": pd.arrays.SparseArray([0, 1])})
 
 From a SciPy sparse matrix, use :meth:`DataFrame.sparse.from_spmatrix`,
 
@@ -256,10 +256,10 @@ Instead, you'll need to ensure that the values being assigned are sparse
 
 .. ipython:: python
 
-   df = pd.DataFrame({"A": pd.SparseArray([0, 1])})
+   df = pd.DataFrame({"A": pd.arrays.SparseArray([0, 1])})
    df['B'] = [0, 0]  # remains dense
    df['B'].dtype
-   df['B'] = pd.SparseArray([0, 0])
+   df['B'] = pd.arrays.SparseArray([0, 0])
    df['B'].dtype
 
 The ``SparseDataFrame.default_kind`` and ``SparseDataFrame.default_fill_value`` attributes
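
Reviewer note on the sparse.rst text above: the guide describes a sparse array as storing "only values distinct from the ``fill_value``". The idea can be sketched with the standard library alone. The `TinySparse` class below is hypothetical and only illustrates the storage scheme; it is not how `pandas.arrays.SparseArray` is implemented.

```python
import math
from dataclasses import dataclass


@dataclass
class TinySparse:
    """Toy container: keeps only the entries that differ from fill_value."""

    length: int
    fill_value: float
    indices: list  # positions of the stored (non-fill) entries
    values: list   # the stored entries themselves

    @classmethod
    def from_dense(cls, data, fill_value=math.nan):
        def is_fill(x):
            # NaN never compares equal to itself, so test it explicitly.
            if isinstance(fill_value, float) and math.isnan(fill_value):
                return isinstance(x, float) and math.isnan(x)
            return x == fill_value

        kept = [(i, x) for i, x in enumerate(data) if not is_fill(x)]
        return cls(
            length=len(data),
            fill_value=fill_value,
            indices=[i for i, _ in kept],
            values=[x for _, x in kept],
        )

    def to_dense(self):
        # Start from an all-fill list, then restore the stored entries.
        out = [self.fill_value] * self.length
        for i, x in zip(self.indices, self.values):
            out[i] = x
        return out


sp = TinySparse.from_dense([1.0, -1, -1, -2.0, -1], fill_value=-1)
print(sp.indices, sp.values)
print(sp.to_dense())
```

Only two of the five inputs are stored here; the other three are implied by `fill_value`, which is where the memory saving comes from when one value dominates.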

doc/source/whatsnew/v0.19.0.rst

Lines changed: 1 addition & 0 deletions
@@ -1225,6 +1225,7 @@ Previously, sparse data were ``float64`` dtype by default, even if all inputs we
 As of v0.19.0, sparse data keeps the input dtype, and uses more appropriate ``fill_value`` defaults (``0`` for ``int64`` dtype, ``False`` for ``bool`` dtype).
 
 .. ipython:: python
+   :okwarning:
 
    pd.SparseArray([1, 2, 0, 0], dtype=np.int64)
    pd.SparseArray([True, False, False, False])

doc/source/whatsnew/v0.25.0.rst

Lines changed: 2 additions & 0 deletions
@@ -354,6 +354,7 @@ When passed DataFrames whose values are sparse, :func:`concat` will now return a
 :class:`Series` or :class:`DataFrame` with sparse values, rather than a :class:`SparseDataFrame` (:issue:`25702`).
 
 .. ipython:: python
+   :okwarning:
 
    df = pd.DataFrame({"A": pd.SparseArray([0, 1])})
 
@@ -910,6 +911,7 @@ by a ``Series`` or ``DataFrame`` with sparse values.
 **New way**
 
 .. ipython:: python
+   :okwarning:
 
    df = pd.DataFrame({"A": pd.SparseArray([0, 0, 1, 2])})
    df.dtypes

doc/source/whatsnew/v1.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -578,6 +578,7 @@ Deprecations
 - :meth:`DataFrame.to_stata`, :meth:`DataFrame.to_feather`, and :meth:`DataFrame.to_parquet` argument "fname" is deprecated, use "path" instead (:issue:`23574`)
 - The deprecated internal attributes ``_start``, ``_stop`` and ``_step`` of :class:`RangeIndex` now raise a ``FutureWarning`` instead of a ``DeprecationWarning`` (:issue:`26581`)
 - The ``pandas.util.testing`` module has been deprecated. Use the public API in ``pandas.testing`` documented at :ref:`api.general.testing` (:issue:`16232`).
+- ``pandas.SparseArray`` has been deprecated. Use ``pandas.arrays.SparseArray`` (:class:`arrays.SparseArray`) instead. (:issue:`30642`)
 
 **Selecting Columns from a Grouped DataFrame**
 

pandas/__init__.py

Lines changed: 17 additions & 1 deletion
@@ -115,7 +115,7 @@
     DataFrame,
 )
 
-from pandas.core.arrays.sparse import SparseArray, SparseDtype
+from pandas.core.arrays.sparse import SparseDtype
 
 from pandas.tseries.api import infer_freq
 from pandas.tseries import offsets
@@ -246,6 +246,19 @@ class Panel:
 
             return type(name, (), {})
 
+        elif name == "SparseArray":
+
+            warnings.warn(
+                "The pandas.SparseArray class is deprecated "
+                "and will be removed from pandas in a future version. "
+                "Use pandas.arrays.SparseArray instead.",
+                FutureWarning,
+                stacklevel=2,
+            )
+            from pandas.core.arrays.sparse import SparseArray as _SparseArray
+
+            return _SparseArray
+
         raise AttributeError(f"module 'pandas' has no attribute '{name}'")
 
 
@@ -308,6 +321,9 @@ def __getattr__(self, item):
 
     datetime = __Datetime().datetime
 
+    class SparseArray:
+        pass
+
 
 # module level doc-string
 __doc__ = """
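
Reviewer note on the pandas/__init__.py hunks above: the new `elif name == "SparseArray"` branch sits inside a module-level `__getattr__`, a hook (PEP 562, Python 3.7+) that Python calls only when normal attribute lookup on the module fails. Below is a minimal standard-library sketch of that deprecation pattern. The module name `fakepkg` and the stand-in `SparseArray` class are hypothetical, and the code is an illustration under those assumptions rather than the pandas implementation.

```python
import types
import warnings


class SparseArray:
    """Stand-in for the real class that lives at the new location."""


def _module_getattr(name):
    # Reached only when 'name' is not found in the module's namespace.
    if name == "SparseArray":
        warnings.warn(
            "fakepkg.SparseArray is deprecated; "
            "use fakepkg.arrays.SparseArray instead.",
            FutureWarning,
            stacklevel=2,
        )
        return SparseArray
    raise AttributeError(f"module 'fakepkg' has no attribute '{name}'")


# Build a throwaway package in memory ('fakepkg' is a hypothetical name).
fakepkg = types.ModuleType("fakepkg")
fakepkg.arrays = types.ModuleType("fakepkg.arrays")
fakepkg.arrays.SparseArray = SparseArray
fakepkg.__getattr__ = _module_getattr  # stored in the module's __dict__

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old = fakepkg.SparseArray         # deprecated path: goes through the hook
    new = fakepkg.arrays.SparseArray  # new path: ordinary attribute lookup

print(old is new, [w.category.__name__ for w in caught])
```

Because the old name is no longer bound in the module namespace, every access through it reaches the hook and warns, while the new path resolves normally and stays silent.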

pandas/_testing.py

Lines changed: 1 addition & 1 deletion
@@ -1492,7 +1492,7 @@ def assert_sp_array_equal(
         block indices.
     """
 
-    _check_isinstance(left, right, pd.SparseArray)
+    _check_isinstance(left, right, pd.arrays.SparseArray)
 
     assert_numpy_array_equal(left.sp_values, right.sp_values, check_dtype=check_dtype)
 
