Currently the runners require the data to be provided line by line.
This costs us efficiency whenever the runner could load the data faster in bulk (for example, reading CSV with pandas or dask).
It would be nice if a runner could opt in to receiving the data stream instead of taking the data row by row, falling back to the row-by-row implementation if the stream isn't available.
Maybe we add `csv_stream`, `sql_stream`, etc. to the `DataConnection`, which will raise `NotImplementedError` by default. Alternatively, a connector could provide a data type and a stream object.
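To make the first option concrete, here is a rough sketch of how the opt-in and fallback could look. Everything besides the `DataConnection` and `csv_stream` names is an assumption made up for illustration: the `rows()` method, the two example connections, and the `load()` helper are hypothetical and not taken from the current codebase.

```python
import csv
import io
from typing import Dict, Iterator, List, TextIO, Tuple


class DataConnection:
    """Hypothetical base class: row access is required, streams are opt-in."""

    def rows(self) -> Iterator[Dict[str, str]]:
        raise NotImplementedError

    def csv_stream(self) -> TextIO:
        # Default: no bulk stream, so runners fall back to rows().
        raise NotImplementedError("csv_stream is not supported by this connection")


class InMemoryCsvConnection(DataConnection):
    """Example connection that can hand out the raw CSV text as a stream."""

    def __init__(self, text: str) -> None:
        self._text = text

    def rows(self) -> Iterator[Dict[str, str]]:
        yield from csv.DictReader(io.StringIO(self._text))

    def csv_stream(self) -> TextIO:
        return io.StringIO(self._text)


class RowOnlyConnection(DataConnection):
    """Example connection that only supports row-by-row access."""

    def __init__(self, records: List[Dict[str, str]]) -> None:
        self._records = records

    def rows(self) -> Iterator[Dict[str, str]]:
        yield from self._records


def load(connection: DataConnection) -> Tuple[str, List[Dict[str, str]]]:
    """Prefer the CSV stream; fall back to rows() if it is not implemented."""
    try:
        stream = connection.csv_stream()
    except NotImplementedError:
        return "rows", list(connection.rows())
    # A pandas-based runner could pass `stream` to pandas.read_csv here instead.
    return "stream", list(csv.DictReader(stream))
```

With this shape, existing connections keep working unchanged, and a runner only pays for the `try`/`except` when it wants the bulk path.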