`read_gbq(table_id, columns=[list of columns])` should actually filter the amount of columns downloaded from the API #872

 **Is your feature request related to a problem? Please describe.**

Currently, the `columns` parameter is only used to re-order the columns, and it has to exactly match the columns returned by the query or table. See this TODO:

https://github.com/googleapis/python-bigquery-pandas/blob/912b615b6d8d0ff11451c247fb65e9a293b06490/pandas_gbq/gbq.py#L939-L944
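
To make the difference concrete, here is a minimal sketch of the current check next to a relaxed subset check. It works on plain lists of column names rather than a DataFrame, and `matches_exactly` and `is_subset` are hypothetical helper names used for illustration, not part of pandas-gbq.

```python
def matches_exactly(columns, available):
    """Current behaviour: same names as the table, in any order."""
    return sorted(columns) == sorted(available)


def is_subset(columns, available):
    """Proposed behaviour: every requested name exists in the table."""
    return set(columns).issubset(available)


available = ["name", "age", "city"]

print(matches_exactly(["city", "name", "age"], available))  # True
print(matches_exactly(["name"], available))  # False
print(is_subset(["name"], available))  # True
```

With the strict check, asking for only `name` is rejected even though the column exists; the subset check accepts it.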

 **Describe the solution you'd like**

Only download the selected columns if the user passes a list of columns to `read_gbq`.

*For queries:*

Maybe these still need the columns to match exactly, since one can already choose the columns in the SQL itself? I don't see a `selected_fields` option in https://cloud.google.com/python/docs/reference/bigquery/latest/google.cloud.bigquery.client.Client#google_cloud_bigquery_client_Client_query_and_wait

*For table IDs:*

Pass the list of columns through as `selected_fields` to https://cloud.google.com/python/docs/reference/bigquery/latest/google.cloud.bigquery.client.Client#google_cloud_bigquery_client_Client_list_rows

Starting here: https://github.com/googleapis/python-bigquery-pandas/blob/912b615b6d8d0ff11451c247fb65e9a293b06490/pandas_gbq/gbq.py#L914-L919 going through to https://github.com/googleapis/python-bigquery-pandas/blob/912b615b6d8d0ff11451c247fb65e9a293b06490/pandas_gbq/gbq.py#L396
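
A rough sketch of how the column names could be turned into the `selected_fields` argument, assuming the table schema is already available. `select_fields` is a hypothetical helper, and `Field` is a stand-in for `google.cloud.bigquery.SchemaField` so the example runs without the BigQuery client installed; only the `.name` attribute is relied on.

```python
from collections import namedtuple

# Stand-in for google.cloud.bigquery.SchemaField (assumption for this sketch).
Field = namedtuple("Field", ["name", "field_type"])


def select_fields(schema, columns):
    """Return the schema fields named in `columns`, in the requested order."""
    by_name = {field.name: field for field in schema}
    missing = [name for name in columns if name not in by_name]
    if missing:
        raise ValueError(f"Columns not found in table: {missing}")
    return [by_name[name] for name in columns]


schema = [
    Field("name", "STRING"),
    Field("age", "INTEGER"),
    Field("city", "STRING"),
]

selected = select_fields(schema, ["city", "name"])
print([field.name for field in selected])  # ['city', 'name']

# With a real client, the result might be passed along roughly like this:
#   table = client.get_table(table_id)
#   rows = client.list_rows(
#       table, selected_fields=select_fields(table.schema, columns)
#   )
```

Raising on unknown names keeps a typo in `columns` from silently returning fewer columns than requested; whether pandas-gbq should reuse `InvalidColumnOrder` for that case is left open here.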

 **Additional context**

Aside: https://googleapis.dev/python/pandas-gbq/latest/reading.html has no mention that a table ID is supported. We should add a sample there.


	# TODO(kiraksi): allow columns to be a subset of all columns in the table, with follow up PR
	if columns is not None:
	    if sorted(columns) == sorted(final_df.columns):
	        final_df = final_df[columns]
	    else:
	        raise InvalidColumnOrder("Column order does not match this DataFrame.")

	final_df = connector.download_table(
	    query_or_table,
	    max_results=max_results,
	    progress_bar_type=progress_bar_type,
	    dtypes=dtypes,
	)
