Closed
Labels
Bug; Constructors (Series/DataFrame/Index/pd.array Constructors); Deprecate (Functionality to remove in pandas); Dtype Conversions (Unexpected or buggy dtype conversions); Needs Discussion (Requires discussion from core team before further action)
Description
In most contexts, Series is strict about dtype: it either returns the given dtype or raises. DataFrame is the opposite, often silently ignoring dtype (xref #24435), I think on the theory that dtype may be intended to apply to some columns but not others.
With floating data and integer dtypes, the roles are reversed: Series silently keeps the float values, while DataFrame truncates them to the requested integer dtype:
>>> arr = np.random.randn(5)
>>> pd.Series(arr, dtype="int16")
0    1.002695
1    0.259332
2   -1.111468
3   -0.680714
4   -0.008943
>>> pd.DataFrame(arr, dtype="int16")
   0
0  1
1  0
2 -1
3  0
4  0
We have exactly one test that breaks if we change the latter behavior, and that is mostly by coincidence. There are other bugs (e.g. Series(bigints, dtype="int8") silently overflowing) that would be easier to fix if maintaining this behavior weren't a consideration.
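For illustration only, here is a rough sketch of what a "return the given dtype or raise" cast could look like for integer targets. `safe_int_cast` is a hypothetical helper written in plain NumPy, not a pandas function and not a proposal for the actual implementation; it just makes the two lossy cases above (fractional floats, out-of-range ints) explicit.

```python
import numpy as np


def safe_int_cast(values, dtype):
    """Cast to an integer dtype, raising instead of silently losing data.

    Hypothetical helper for illustration; not part of pandas.
    """
    arr = np.asarray(values)
    target = np.dtype(dtype)
    if target.kind not in "iu":
        raise TypeError(f"expected an integer dtype, got {target}")

    if arr.dtype.kind == "f":
        # NaN/inf have no integer representation.
        if not np.isfinite(arr).all():
            raise ValueError("cannot cast NaN or inf to an integer dtype")
        # Reject floats with a fractional part (the DataFrame case above).
        if not (arr == np.trunc(arr)).all():
            raise ValueError("cast would truncate fractional values")

    # Reject values outside the target range (the int8 overflow case above).
    info = np.iinfo(target)
    if arr.size and (arr.min() < info.min or arr.max() > info.max):
        raise ValueError(f"values out of range for {target}")

    return arr.astype(target)
```

With this, `safe_int_cast([1.0, 2.0], "int16")` returns an int16 array, while `safe_int_cast(np.random.randn(5), "int16")` and `safe_int_cast([1000], "int8")` both raise `ValueError` rather than returning floats, truncating, or wrapping around.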
Is this intentional? cc @jreback @jorisvandenbossche @TomAugspurger