Open
Labels
bug (Something isn't working)
Description
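When a DataFrame with date columns is written to a table with pandas and then read back, the UTC timezone is added to the dtype of those columns.
This is confusing, because the original columns were timezone-naive.
Is this the expected behavior or a bug?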
Package Version
crate 2.0.0
pandas 2.2.3
SQLAlchemy 2.0.39
sqlalchemy-cratedb 0.42.0.dev0
test
import sqlalchemy as sa
import pandas as pd
data = {
    "date_1": ["2020-01-01", "2021-01-01", "2022-01-01", "2023-01-01", "2027-12-30"],
    "date_2": ["2020-09-24", "2020-10-24", "2020-11-24", "2020-12-24", "2027-09-24"],
}
df_data = pd.DataFrame.from_dict(data, dtype="datetime64[ns]")
print(df_data.dtypes)
print(df_data.sort_values(by="date_1").reset_index(drop=True))
dburi = "crate://panduser:[email protected]:4200?ssl=false"
engine = sa.create_engine(dburi, echo=False)
conn = engine.connect()
df_data.to_sql(
    "test_date",
    conn,
    if_exists="replace",
    index=False,
)
conn.exec_driver_sql("REFRESH TABLE test_date;")
df_load = pd.read_sql_table("test_date", conn)
print("\ndataframe after loading")
df_load = df_load.sort_values(by="date_1").reset_index(drop=True)
print(df_load.dtypes)
print(df_load)
Output:
date_1 datetime64[ns]
date_2 datetime64[ns]
dtype: object
date_1 date_2
0 2020-01-01 2020-09-24
1 2021-01-01 2020-10-24
2 2022-01-01 2020-11-24
3 2023-01-01 2020-12-24
4 2027-12-30 2027-09-24
dataframe after loading
date_1 datetime64[ns, UTC]
date_2 datetime64[ns, UTC]
dtype: object
date_1 date_2
0 2020-01-01 00:00:00+00:00 2020-09-24 00:00:00+00:00
1 2021-01-01 00:00:00+00:00 2020-10-24 00:00:00+00:00
2 2022-01-01 00:00:00+00:00 2020-11-24 00:00:00+00:00
3 2023-01-01 00:00:00+00:00 2020-12-24 00:00:00+00:00
4 2027-12-30 00:00:00+00:00 2027-09-24 00:00:00+00:00
After loading, I remove the time zone like this:
df2 = df_load.select_dtypes("datetimetz")
df_load[df2.columns] = df2.apply(lambda x: x.dt.tz_convert(None))
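For anyone who wants to try the workaround without a database, here is a minimal sketch of the same idea wrapped in a helper. The function name `drop_utc` and the simulated `df_load` frame are assumptions made for illustration; they are not part of pandas or sqlalchemy-cratedb, and the frame only imitates the timezone-aware columns shown in the output above.

```python
import pandas as pd


def drop_utc(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose timezone-aware columns are made naive (hypothetical helper)."""
    out = df.copy()
    for name in out.select_dtypes("datetimetz").columns:
        # tz_convert(None) converts the values to UTC and then drops the timezone.
        out[name] = out[name].dt.tz_convert(None)
    return out


# Simulated stand-in for the frame returned by read_sql_table in the report.
df_load = pd.DataFrame(
    {
        "date_1": pd.to_datetime(["2020-01-01", "2021-01-01"], utc=True),
        "date_2": pd.to_datetime(["2020-09-24", "2020-10-24"], utc=True),
    }
)

df_naive = drop_utc(df_load)
print(df_naive.dtypes)
```

Since the loaded values are already in UTC, the wall-clock times stay the same and only the timezone marker is removed. Columns that are not timezone-aware are left untouched.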