can one iterate over files in a zip file? #82

chananshgong opened this issue Nov 22, 2017 · 1 comment

Comments

@chananshgong

Suppose I have one big zip file containing 10,000 files. Can ruffus iterate over the files within the zip file (that is, build the pipeline with each file inside the archive as an atomic operation), or do I need to unzip first and only then iterate?

@jbarlow83
Contributor

I did something similar. Use a split task that scans the container file and creates an empty placeholder for every compressed file to extract, followed by a transform task that takes the name of a placeholder and actually does the unzipping. That will parallelize decompression.

But if that's all you want, ruffus is overkill. concurrent.futures would be easier.
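
A minimal sketch of the concurrent.futures route, using only the standard library. The function names `process_member` and `process_zip`, the `max_workers` default, and the byte-count "work" are all placeholders for illustration, not anything from ruffus or from the original setup. Each task opens its own handle on the archive so that no file object is shared between threads.

```python
import zipfile
from concurrent.futures import ThreadPoolExecutor


def process_member(zip_path, member_name):
    """Read one member of the archive and return (name, size in bytes).

    Hypothetical per-file operation: replace the len() call with real work.
    """
    # Open a separate handle per task so threads never share a file object.
    with zipfile.ZipFile(zip_path) as archive:
        data = archive.read(member_name)
    return member_name, len(data)


def process_zip(zip_path, max_workers=4):
    """Run process_member over every file in the archive in parallel."""
    # List the members once, skipping directory entries.
    with zipfile.ZipFile(zip_path) as archive:
        names = [info.filename for info in archive.infolist() if not info.is_dir()]

    # Threads are assumed to be enough here; for CPU-heavy per-file work,
    # a ProcessPoolExecutor with a module-level worker may suit better.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda name: process_member(zip_path, name), names)
        return dict(results)
```

Calling `process_zip("big.zip")` (the path is made up) would return a dict mapping each member name to its uncompressed size. Nothing is unzipped to disk first; each member is read straight out of the archive.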
