Open
Labels
Enhancement, Indexing (related to indexing on series/frames, not to indexes themselves), MultiIndex
Description
Best shown with an example.
import numpy as np
import pandas as pd
# Build a list rather than a lazy map object so this also runs on Python 3
timestamps = [pd.Timestamp(d) for d in ['2014-01-01', '2014-02-01']]
categories = ['A', 'B', 'C', 'D']
df = pd.DataFrame(index=pd.MultiIndex.from_product([timestamps, categories], names=['ts', 'cat']),
                  columns=['Col1', 'Col2'])
>>> df
                Col1 Col2
ts         cat
2014-01-01 A     NaN  NaN
           B     NaN  NaN
           C     NaN  NaN
           D     NaN  NaN
2014-02-01 A     NaN  NaN
           B     NaN  NaN
           C     NaN  NaN
           D     NaN  NaN
I want to set the values for all categories in a single month. These examples work just fine.
df.loc['2014-01-01', 'Col1'] = 5
df.loc['2014-01-01', 'Col2'] = [1,2,3,4]
>>> df
                Col1 Col2
ts         cat
2014-01-01 A       5    1
           B       5    2
           C       5    3
           D       5    4
2014-02-01 A     NaN  NaN
           B     NaN  NaN
           C     NaN  NaN
           D     NaN  NaN
These examples don't work.
df.loc['2014-01-01', 'Col1'] += 1
df.loc['2014-02-01', 'Col2'] = df.loc['2014-01-01', 'Col2']
>>> df
                Col1 Col2
ts         cat
2014-01-01 A     NaN    1
           B     NaN    2
           C     NaN    3
           D     NaN    4
2014-02-01 A     NaN  NaN
           B     NaN  NaN
           C     NaN  NaN
           D     NaN  NaN
This doesn't seem to be a "setting a value on a copy" issue. Instead, pandas is actively writing the NaNs.
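One plausible reading of the NaNs is index alignment: the right-hand side is a Series indexed only by `cat`, and it may be getting aligned against the full `(ts, cat)` index, where nothing matches. This is an assumption, not something confirmed here. Under that assumption, a minimal sketch of a possible workaround is to pass a plain array with `.to_numpy()`, so there is no index to align and the values are written positionally. The names `t1` and `t2` and the `dtype=float` are illustrative choices, not part of the original example.

```python
import pandas as pd

# Same frame as in the report; t1/t2 are illustrative names for the two months
t1, t2 = pd.Timestamp('2014-01-01'), pd.Timestamp('2014-02-01')
index = pd.MultiIndex.from_product([[t1, t2], ['A', 'B', 'C', 'D']],
                                   names=['ts', 'cat'])
df = pd.DataFrame(index=index, columns=['Col1', 'Col2'], dtype=float)

# The two assignments that already work
df.loc[t1, 'Col1'] = 5
df.loc[t1, 'Col2'] = [1, 2, 3, 4]

# Pass plain arrays on the right-hand side so values are written by
# position instead of being matched up by index label
df.loc[t1, 'Col1'] = df.loc[t1, 'Col1'].to_numpy() + 1
df.loc[t2, 'Col2'] = df.loc[t1, 'Col2'].to_numpy()
```

Because the array carries no labels, this relies on both months listing the categories in the same order.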
My current workaround is to unstack each column into a DataFrame with simple indexes. This works, but I have lots of columns to work with. One dataframe is much easier to work with than a pile of dataframes.
The computations for each month depend on the values computed in the previous month, which is why they can't be fully vectorized over an entire column.
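For the month-by-month dependency, a rough sketch of how the loop might look on the single MultiIndex frame, without unstacking, is below. The recurrence (doubling the previous month) is purely hypothetical and stands in for the real computation. The third month and the `dtype=float` are also added only for illustration.

```python
import pandas as pd

timestamps = pd.to_datetime(['2014-01-01', '2014-02-01', '2014-03-01'])
categories = ['A', 'B', 'C', 'D']
index = pd.MultiIndex.from_product([timestamps, categories],
                                   names=['ts', 'cat'])
df = pd.DataFrame(index=index, columns=['Col1'], dtype=float)

# Seed the first month
df.loc[timestamps[0], 'Col1'] = [1.0, 2.0, 3.0, 4.0]

# Walk through consecutive months; each month is derived from the previous one
for prev_ts, ts in zip(timestamps[:-1], timestamps[1:]):
    # Read the previous month as a plain array (no index to align on)
    prev = df.loc[prev_ts, 'Col1'].to_numpy()
    # Hypothetical update rule: double the previous month's values
    df.loc[ts, 'Col1'] = prev * 2

march = df.loc[timestamps[2], 'Col1'].tolist()
```

This keeps every column in one DataFrame. It still assumes each month has the same categories in the same order.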