Closed
Labels
Datetime (Datetime data dtype), Performance (Memory or execution speed performance)
Description
xref PR #17077
Now that a `cache` keyword has been added to `to_datetime`, ideally the default should be set to `cache='infer'`, which would inspect the input data to determine whether converting with a cache would be more efficient.
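For context, a minimal sketch of how the existing `cache` keyword is toggled by hand today. The sample data is made up for illustration, and `utc=True` is used only to keep the result dtype uniform; `cache='infer'` itself is the proposal here and is not used in the snippet.

```python
import pandas as pd

# Illustrative data: many repeated date strings with a timezone offset,
# the kind of input where a cache of parsed dates is expected to help.
dates = pd.Series(
    ["2017-09-01 00:00:00+0100", "2017-09-02 00:00:00+0100"] * 5000
)

# The caller currently has to choose whether to use the cache.
with_cache = pd.to_datetime(dates, utc=True, cache=True)
without_cache = pd.to_datetime(dates, utc=True, cache=False)

# Caching should only affect speed, never the converted values.
assert with_cache.equals(without_cache)
```

Timing the two calls (for example with `%timeit`) on representative data is one way to see where the cache starts to pay off.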
From some research (here and here), date strings, especially ones with timezone offsets, can benefit from conversion with a cache of dates. The rules of thumb for whether to convert with a cache should be based on a combination of input data type, proportion of duplicate values, and number of dates to convert.
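A rough sketch of what such an inference step could look like, combining the three criteria above. The helper names `should_cache` and `to_datetime_infer` and all thresholds (`min_size`, `max_unique_ratio`, `sample_size`) are hypothetical placeholders, not measured values or existing pandas API; real cutoffs would need to come from benchmarks.

```python
import pandas as pd


def should_cache(values, min_size=50, max_unique_ratio=0.7, sample_size=1000):
    """Hypothetical heuristic: guess whether a date cache is worthwhile.

    All thresholds are illustrative assumptions, not benchmarked values.
    """
    values = pd.Series(values)

    # Number of dates: tiny inputs will not recoup the cost of building a cache.
    if len(values) < min_size:
        return False

    # Input data type: only string input is considered here, since string
    # parsing is the expensive path that a cache can skip.
    if pd.api.types.infer_dtype(values, skipna=True) != "string":
        return False

    # Proportion of duplicates: estimate it from a leading sample so the
    # check stays cheap relative to the conversion itself.
    sample = values.iloc[:sample_size]
    unique_ratio = sample.nunique() / len(sample)
    return unique_ratio <= max_unique_ratio


def to_datetime_infer(values, **kwargs):
    """Hypothetical wrapper emulating the proposed cache='infer' behaviour."""
    return pd.to_datetime(values, cache=should_cache(values), **kwargs)
```

Sampling only the first rows keeps the check cheap but can misjudge sorted or clustered data; a random sample would be a reasonable alternative.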
Additionally, it'd be nice to resolve existing `to_datetime` performance issues (e.g. #17410) first, so that the rules of thumb informing the inference step are not skewed by those issues.