Closed
Labels
API Design, Enhancement, IO CSV (read_csv, to_csv), IO Data (IO issues that don't fit into a more specific label), IO HDF5 (read_hdf, HDFStore), Ideas (Long-Term Enhancement Discussions)
Description
discussed in #4698
convert_data
would return an appropriate Converter object to do various types of conversions. Mainly useful when you have to chunk both sides of a conversion (the read and the write), e.g.:
csv -> hdf
sql -> hdf
hdf -> csv
c = pd.convert_data('csv','hdf', chunksize=....)
c.read_csv(input_path, ......)
c.to_hdf(output_path, key.....)
c.execute()
pretty simple under the hood...
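To make "under the hood" a bit more concrete, here is a rough sketch of what such a converter might look like. The class name `ChunkedConverter` and its internals are hypothetical and not part of pandas; only `pd.read_csv(..., chunksize=...)`, `DataFrame.to_hdf(..., format="table", append=True)` and `DataFrame.to_csv` are existing calls. The `to_hdf` writer assumes PyTables is installed, and the `to_csv` writer assumes an already-open text buffer or file handle.

```python
import io

import pandas as pd


class ChunkedConverter:
    """Hypothetical helper (not a pandas API): stream chunks from a reader to a writer."""

    def __init__(self, chunksize):
        self.chunksize = chunksize
        self._read = None   # callable returning an iterable of DataFrame chunks
        self._write = None  # callable taking (chunk, is_first)

    def read_csv(self, path_or_buf, **kwargs):
        # Defer the actual read until execute(); just remember how to do it.
        self._read = lambda: pd.read_csv(
            path_or_buf, chunksize=self.chunksize, **kwargs
        )
        return self

    def to_hdf(self, path, key, **kwargs):
        # Append every chunk to one table-format node (needs PyTables).
        def write(chunk, is_first):
            chunk.to_hdf(path, key=key, format="table", append=True, **kwargs)

        self._write = write
        return self

    def to_csv(self, handle, **kwargs):
        # Write to an open text handle; emit the header only for the first chunk.
        def write(chunk, is_first):
            chunk.to_csv(handle, header=is_first, index=False, **kwargs)

        self._write = write
        return self

    def execute(self):
        # Pull one chunk at a time and hand it straight to the writer.
        rows = 0
        for i, chunk in enumerate(self._read()):
            self._write(chunk, i == 0)
            rows += len(chunk)
        return rows


# Small in-memory run (csv -> csv) so the sketch works without PyTables.
src = io.StringIO("a,b\n1,x\n2,y\n3,z\n")
out = io.StringIO()
rows_written = ChunkedConverter(chunksize=2).read_csv(src).to_csv(out).execute()
```

Only one chunk is held in memory at a time, which is the whole point of chunking both sides. This sketch does nothing about dtypes changing between chunks.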
biggest issue is that it MAY need to read the input file twice (for read_csv): if ints are detected in the first input chunk, the dtype MAY change later (e.g. a NaN in a later chunk turns the column into floats)
useful?
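The dtype problem and the two-pass workaround can be illustrated with a small sketch. The sample data and the helper names `chunk_dtypes` and `combined_dtypes` are made up for illustration; the combining rule (widen numeric dtypes with `np.result_type`, fall back to `object` otherwise) is just one possible choice, not something the proposal specifies.

```python
import io

import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Made-up sample data: column "a" has a missing value in the second chunk.
CSV = "a,b\n1,10\n2,20\n,30\n4,40\n"


def chunk_dtypes(text, chunksize, **kwargs):
    """Return the column dtypes of every chunk, in order."""
    reader = pd.read_csv(io.StringIO(text), chunksize=chunksize, **kwargs)
    return [chunk.dtypes.to_dict() for chunk in reader]


def combined_dtypes(per_chunk):
    """First pass: fold the per-chunk dtypes into one dtype per column."""
    combined = {}
    for dtypes in per_chunk:
        for col, dt in dtypes.items():
            prev = combined.get(col)
            if prev is None or prev == dt:
                combined[col] = dt
            elif is_numeric_dtype(prev) and is_numeric_dtype(dt):
                # e.g. int64 in one chunk, float64 (because of NaN) in another
                combined[col] = np.result_type(prev, dt)
            else:
                # Assumption: anything else is stored as generic objects.
                combined[col] = np.dtype(object)
    return combined


# Single pass: each chunk is inferred on its own, so "a" changes dtype.
naive = chunk_dtypes(CSV, chunksize=2)

# Two passes: infer the combined dtypes first, then re-read with them fixed.
dtypes = combined_dtypes(naive)
stable = chunk_dtypes(CSV, chunksize=2, dtype=dtypes)
```

With the dtypes pinned on the second read, every chunk of "a" comes back as float, so appending to a fixed-schema target should no longer hit a type mismatch. The cost is exactly the second read of the input mentioned above.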