Open
Labels
API - Consistency (Internal Consistency of API/Behavior), Bug, Indexing (Related to indexing on series/frames, not to indexes themselves)
Description
Follow up on #37427
Currently, there is an inconsistency between plain [] assignment (__setitem__) and loc in whether the RHS value of an assignment gets realigned.
In this example, the values of the RHS Series are assigned as-is, ignoring its index (the only requirement is that its length matches the number of elements selected by the indexer):
>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(range(5))
>>> idx = np.array([1, 4])
>>> s[idx] = pd.Series([10, 11])
>>> s
0     0
1    10
2     2
3     3
4    11
dtype: int64
However, with loc we get different behaviour: the RHS Series is first aligned with s, then this realigned Series is indexed with idx, and those values get set. Label 1 exists in the RHS (value 11), but label 4 does not, so it becomes NaN and the result is upcast to float64:
>>> s = pd.Series(range(5))
>>> idx = np.array([1, 4])
>>> s.loc[idx] = pd.Series([10, 11])
>>> s
0     0.0
1    11.0
2     2.0
3     3.0
4     NaN
dtype: float64
What basically happens could be written more explicitly as:
s.loc[idx] = pd.Series([10, 11]).reindex(s.index)[idx]
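The alignment step can be checked in isolation. The following is only a sketch, not a description of pandas internals: the names rhs, aligned, via_loc and explicit are made up for illustration, and the target Series is created as float64 (an assumption that differs from the example above) so that the NaN fits without any dtype upcasting.

```python
import numpy as np
import pandas as pd

# Assumption: float64 target, so NaN can be stored without changing dtype.
s = pd.Series(range(5), dtype="float64")
idx = np.array([1, 4])
rhs = pd.Series([10, 11])  # default index: labels 0 and 1

# Look up the RHS by the target labels: label 1 -> 11, label 4 -> NaN.
aligned = rhs.reindex(idx)

# Assigning the Series through loc lets pandas align on labels.
via_loc = s.copy()
via_loc.loc[idx] = rhs

# Assigning the already-aligned raw values gives the same result.
explicit = s.copy()
explicit.loc[idx] = aligned.to_numpy()

print(via_loc.equals(explicit))
```

Note that the value 10 (label 0 in rhs) is never used, because label 0 is not part of idx.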
To get the same behaviour as plain [] assignment, you need to assign an array instead of a Series:
>>> s = pd.Series(range(5))
>>> idx = np.array([1, 4])
>>> s.loc[idx] = pd.Series([10, 11]).values
>>> s
0     0
1    10
2     2
3     3
4    11
dtype: int64
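If this positional behaviour is wanted in several places, the workaround could be wrapped in a small helper. This is a sketch only: assign_by_position is a hypothetical name, not a pandas API, and the explicit length check is an added assumption that mirrors the "length must match" requirement described earlier.

```python
import numpy as np
import pandas as pd


def assign_by_position(target, indexer, values):
    """Hypothetical helper: set target.loc[indexer] from values, ignoring any index on values."""
    raw = np.asarray(values)  # drops the index if values is a Series
    if raw.shape[0] != len(indexer):
        raise ValueError("number of values must match the number of selected labels")
    target.loc[indexer] = raw  # a plain array is assigned in order, no alignment
    return target


s = pd.Series(range(5))
idx = np.array([1, 4])
result = assign_by_position(s, idx, pd.Series([10, 11]))
print(result.tolist())
```

Stripping the index up front makes the intent explicit at the call site, regardless of which indexer is used for the assignment.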