fix: pd.Series in pandas>=3 does not preserve object dtype metadata #10564

Merged
merged 39 commits into from
Jul 30, 2025
39 commits
946dd78
fix: keep dtype as `object` for `pd.StringDtype` in `safe_cast_to_index`
ilan-gold Jul 22, 2025
147e3a7
chore: comment
ilan-gold Jul 22, 2025
3cf92dd
fix: broader fix
ilan-gold Jul 23, 2025
287e4be
feat: ban use of `pd.api.types.is_extension_array_dtype`
ilan-gold Jul 23, 2025
ffc905d
fix: type ignore
ilan-gold Jul 23, 2025
aa68745
fix: `pd.Series` in `pandas>=3` does not preserve object dtype metada…
ilan-gold Jul 24, 2025
b97a64e
Merge branch 'main' into ig/fix_string_dtype
ilan-gold Jul 24, 2025
cae22f7
Merge branch 'main' into ig/fix_series_cast
ilan-gold Jul 24, 2025
29c5224
fix: bytes for catgeorical repr [test-upstream]
ilan-gold Jul 24, 2025
903ddcd
Merge branch 'ig/fix_series_cast' of github.com:ilan-gold/xarray into…
ilan-gold Jul 24, 2025
ed86c09
Update xarray/core/extension_array.py
ilan-gold Jul 24, 2025
f603aa0
fix: repr
ilan-gold Jul 24, 2025
02f2496
Update variable.py
ilan-gold Jul 25, 2025
9ea9d8d
? mypy
ilan-gold Jul 25, 2025
d131442
Merge branch 'ig/fix_string_dtype' of github.com:ilan-gold/xarray int…
ilan-gold Jul 25, 2025
14869fb
Merge branch 'ig/fix_string_dtype' into ig/fix_series_cast
ilan-gold Jul 25, 2025
3185a18
Merge branch 'ig/fix_series_cast' of github.com:ilan-gold/xarray into…
ilan-gold Jul 25, 2025
7dc9662
fix: remove comment
ilan-gold Jul 25, 2025
6d845b8
Merge branch 'ig/fix_string_dtype' into ig/fix_series_cast
ilan-gold Jul 25, 2025
f05e703
try blanket ignore
ilan-gold Jul 28, 2025
7343f6d
try blanket ignore
ilan-gold Jul 28, 2025
cc1776f
try blanket ignore again
ilan-gold Jul 28, 2025
2e8ed67
fix: mypy
ilan-gold Jul 28, 2025
ac77713
use Any
ilan-gold Jul 28, 2025
3ecc62f
Merge branch 'main' into ig/fix_string_dtype
ilan-gold Jul 29, 2025
c0b7b4c
Merge branch 'main' into ig/fix_string_dtype
ilan-gold Jul 29, 2025
bfbe244
chore: add comment
ilan-gold Jul 29, 2025
594c164
Update xarray/core/utils.py
ilan-gold Jul 29, 2025
0c6ba04
Merge branch 'ig/fix_string_dtype' of github.com:ilan-gold/xarray int…
ilan-gold Jul 29, 2025
a436d78
refactor: use is_allowed_extension_array more
ilan-gold Jul 29, 2025
5e603a7
fix: remove one of the loops
ilan-gold Jul 29, 2025
a8d1aa1
Update xarray/computation/ops.py
ilan-gold Jul 29, 2025
bc27b15
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 29, 2025
6b90606
Merge branch 'main' into ig/fix_series_cast
ilan-gold Jul 29, 2025
d5c6c3b
Merge branch 'ig/fix_string_dtype' into ig/fix_series_cast
ilan-gold Jul 29, 2025
f914dc7
fix: result handling
ilan-gold Jul 29, 2025
9c57cd0
Merge branch 'main' into ig/fix_series_cast
dcherian Jul 29, 2025
8c2f73e
fix: avoid extra copy + comment
ilan-gold Jul 30, 2025
3cf3b88
Apply suggestions from code review
dcherian Jul 30, 2025
17 changes: 16 additions & 1 deletion xarray/core/variable.py
@@ -13,6 +13,7 @@
 import numpy as np
 import pandas as pd
 from numpy.typing import ArrayLike
+from packaging.version import Version

 import xarray as xr  # only for Dataset and DataArray
 from xarray.compat.array_api_compat import to_like_array
@@ -208,7 +209,10 @@ def _maybe_wrap_data(data):

 def _possibly_convert_objects(values):
     """Convert object arrays into datetime64 and timedelta64 according
-    to the pandas convention.
+    to the pandas convention. For backwards compat, as of 3.0.0 pandas,
+    object dtype inputs are cast to strings by `pandas.Series`
+    but we output them as object dtype with the input metadata preserved as well.
+

     * datetime.datetime
     * datetime.timedelta
@@ -223,6 +227,17 @@ def _possibly_convert_objects(values):
             result.flags.writeable = True
         except ValueError:
             result = result.copy()
+    # For why we need this behavior: https://github.com/pandas-dev/pandas/issues/61938
+    # Object datatype inputs that are strings
+    # will be converted to strings by `pandas.Series`, and as of 3.0.0, lose
+    # `dtype.metadata`. If the roundtrip back to numpy in this function yields an
+    # object array again, the dtype.metadata will be preserved.
+    if (
+        result.dtype.kind == "O"
+        and values.dtype.kind == "O"
+        and Version(pd.__version__) >= Version("3.0.0dev0")
+    ):
+        result.dtype = values.dtype
     return result
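
The dtype reassignment in this change can be tried in isolation. The sketch below is not the xarray code: it uses only NumPy and simulates the metadata loss by rebuilding the array from a Python list instead of round-tripping through `pandas.Series`. The helper name `restore_object_metadata` and the metadata key are hypothetical, chosen for illustration.

```python
import numpy as np


def restore_object_metadata(result: np.ndarray, values: np.ndarray) -> np.ndarray:
    """Hypothetical helper mirroring the dtype reassignment in the diff.

    If both arrays have object dtype, give ``result`` the dtype of ``values``
    (including ``dtype.metadata``) without copying the data.
    """
    if result.dtype.kind == "O" and values.dtype.kind == "O":
        # Both dtypes are object dtypes that differ only in metadata, so the
        # in-place reassignment does not reinterpret the underlying memory.
        result.dtype = values.dtype
    return result


# Object dtype carrying illustrative metadata (the key is made up).
tagged = np.dtype("O", metadata={"origin": "example"})
values = np.empty(3, dtype=tagged)
values[:] = ["a", "b", "c"]

# Stand-in for a round trip that drops metadata: rebuild from a Python list.
result = np.array(values.tolist(), dtype=object)
lost = result.dtype.metadata  # no metadata survives the rebuild

restored = restore_object_metadata(result, values)
```

Unlike this sketch, the diff additionally gates the reassignment on `Version(pd.__version__) >= Version("3.0.0dev0")`, so older pandas versions keep their existing behavior.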


3 changes: 2 additions & 1 deletion xarray/tests/test_dataset.py
@@ -13,6 +13,7 @@
 import numpy as np
 import pandas as pd
 import pytest
+from packaging.version import Version
 from pandas.core.indexes.datetimes import DatetimeIndex

 # remove once numpy 2.0 is the oldest supported version
@@ -299,7 +300,7 @@ def test_repr(self) -> None:
 var1 (dim1, dim2) float64 576B -0.9891 -0.3678 1.288 ... -0.2116 0.364
 var2 (dim1, dim2) float64 576B 0.953 1.52 1.704 ... 0.1347 -0.6423
 var3 (dim3, dim1) float64 640B 0.4107 0.9941 0.1665 ... 0.716 1.555
-var4 (dim1) category 32B b c b a c a c a{var5}
+var4 (dim1) category 3{6 if Version(pd.__version__) >= Version("3.0.0dev0") else 2}B b c b a c a c a{var5}
 Attributes:
 foo: bar"""
 )
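
The version gate in the expected repr can be checked on its own. The sketch below assumes only the `packaging` library; the helper name `expected_var4_size` and the sample version strings are hypothetical. It shows how the f-string expression from the test resolves to `32B` or `36B`, and that development builds of pandas 3 already satisfy the `3.0.0dev0` threshold.

```python
from packaging.version import Version

# Threshold used in the diff: the first development release of pandas 3.0.0.
PANDAS_3_DEV = Version("3.0.0dev0")


def expected_var4_size(pandas_version: str) -> str:
    """Hypothetical helper: size string the repr test expects for ``var4``."""
    return f"3{6 if Version(pandas_version) >= PANDAS_3_DEV else 2}B"


# Sample version strings, made up for illustration.
size_on_2x = expected_var4_size("2.3.1")
size_on_3_dev = expected_var4_size("3.0.0.dev0+100.gabcdef0")
size_on_3_final = expected_var4_size("3.0.0")
```

Comparing against `3.0.0dev0` rather than `3.0.0` matters because development and pre-release versions sort before the final release, so a plain `>= Version("3.0.0")` check would not match pandas nightly builds.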