This repository has been archived by the owner on Jan 13, 2022. It is now read-only.

Support for compressed CSV files #23

Open
Enchufa2 opened this issue Jun 6, 2018 · 4 comments

Comments

@Enchufa2

Enchufa2 commented Jun 6, 2018

Most CSV readers (notably, base R readers) support compressed (gz, bzip2...) CSV files transparently. It would be a nice addition to MonetDBLite::monetdb.read.csv, because big CSVs are commonly gzip'ed.

@hannes
Contributor

hannes commented Jun 6, 2018

Yes, we removed this feature because it added a dependency, which is a pain, especially on Windows.

@Enchufa2
Author

Enchufa2 commented Jun 6, 2018

And what about allowing an external command to be specified? Would that be possible? data.table::fread is an example: you can read a compressed CSV as follows:

data.table::fread("zcat somefile.csv.gz")

zcat is invoked and its output feeds the reader.
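
To make the idea concrete, here is a minimal sketch of the same pattern in base R: run an external decompressor, capture its standard output in a temporary file, and hand that file to a CSV reader. The helper name `decompress_with_cmd` is hypothetical and not part of MonetDBLite, and the sketch assumes a `gzip` binary is available on the PATH (which is exactly the portability concern on Windows).

```r
# Hypothetical helper (not part of MonetDBLite): run an external command
# and capture its standard output in a temporary CSV file.
decompress_with_cmd <- function(cmd, args) {
  out <- tempfile(fileext = ".csv")
  # With stdout set to a file name, system2() returns the exit status.
  status <- system2(cmd, args = args, stdout = out)
  if (status != 0) stop("command failed with exit status ", status)
  out
}

# Build a small gzip'ed CSV to try it on.
gz_path <- tempfile(fileext = ".csv.gz")
con <- gzfile(gz_path, "w")
write.csv(data.frame(id = 1:3, value = c("a", "b", "c")), con,
          row.names = FALSE)
close(con)

# Assumption: `gzip` is on the PATH; `gzip -dc` writes the decompressed
# data to standard output, like zcat.
csv_path <- decompress_with_cmd("gzip", c("-dc", shQuote(gz_path)))
df <- read.csv(csv_path)
```

The resulting plain CSV path could then be passed to any reader that only accepts uncompressed files.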

@hannes
Contributor

hannes commented Jun 7, 2018

That sounds pretty good. I am unlikely to implement this at the moment. Happy to review a PR though.

@Enchufa2
Author

Enchufa2 commented Jun 7, 2018

Another quick and portable option would be to rely on the R.utils package, which implements gunzip and the like on top of base R (efficiently copying from a gzfile, bzfile... connection to a file connection). But this would mean adding R.utils as a dependency (or porting just the relevant code to MonetDBLite). What do you think?
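
For illustration, the connection-to-connection copy could look roughly like the sketch below, using only base R. This is not the R.utils code, just the general technique; the helper name `decompress_base` and the chunk size are assumptions.

```r
# Hypothetical helper (not R.utils, not MonetDBLite): decompress a file
# by copying from a gzfile() connection to a plain file connection.
decompress_base <- function(src, dest = tempfile(fileext = ".csv"),
                            chunk_size = 1e6) {
  # Per R's connections documentation, gzfile() opened for reading also
  # handles uncompressed, bzip2, xz, and lzma files.
  src_con <- gzfile(src, open = "rb")
  on.exit(close(src_con), add = TRUE)
  dest_con <- file(dest, open = "wb")
  on.exit(close(dest_con), add = TRUE)
  repeat {
    # Read raw bytes in chunks so large files never sit fully in memory.
    chunk <- readBin(src_con, what = "raw", n = chunk_size)
    if (length(chunk) == 0) break
    writeBin(chunk, dest_con)
  }
  dest
}

# Build a small gzip'ed CSV to try it on.
gz_path <- tempfile(fileext = ".csv.gz")
con <- gzfile(gz_path, "w")
write.csv(data.frame(id = 1:3, value = c("a", "b", "c")), con,
          row.names = FALSE)
close(con)

csv_path <- decompress_base(gz_path)
df <- read.csv(csv_path)
```

This avoids both an external binary and a new package dependency, at the cost of writing a temporary uncompressed copy to disk.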

@hannes hannes changed the title [request] support for compressed CSV files Support for compressed CSV files Jun 15, 2018