Allow reading genotype data in compressed archives #237

nevrome · 2023-03-15T12:38:19Z

Maybe this could be implemented with something like pipes-zlib. It would allow for even smaller file sizes, which in turn would simplify and speed up many of our operations.

Ideally poseidon-hs should recognize .[bed|bim|geno|snp].gz suffixes in file names and stream the respective files accordingly when reading a package.

I suggest we play around with this here to see if it's possible and feasible. Later we could consider adding it to the standard.
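The suffix recognition suggested above could be sketched roughly as follows. This is only an illustration, not existing poseidon-hs code: the names `GenoFileKind`, `Compression`, `splitGz` and `classifyGenoFile` are hypothetical, and the sketch assumes a plain ".gz" suffix is the only compression marker.

```haskell
import Data.List (isSuffixOf)

-- Hypothetical types for this sketch; not part of poseidon-hs.
data GenoFileKind = Bed | Bim | Geno | Snp deriving (Show, Eq)
data Compression = Plain | Gzipped deriving (Show, Eq)

-- | Strip a trailing ".gz" (if present) and report whether it was there.
splitGz :: FilePath -> (FilePath, Compression)
splitGz path
    | ".gz" `isSuffixOf` path = (take (length path - 3) path, Gzipped)
    | otherwise               = (path, Plain)

-- | Classify a file name by its genotype suffix, with or without ".gz".
-- Returns Nothing for file names with an unknown suffix.
classifyGenoFile :: FilePath -> Maybe (GenoFileKind, Compression)
classifyGenoFile path = do
    let (base, compression) = splitGz path
    kind <- lookupKind base
    return (kind, compression)
  where
    lookupKind base =
        case [k | (ext, k) <- knownExtensions, ext `isSuffixOf` base] of
            (k:_) -> Just k
            []    -> Nothing
    knownExtensions =
        [(".bed", Bed), (".bim", Bim), (".geno", Geno), (".snp", Snp)]
```

A reader could then dispatch on the `Compression` value to decide whether to decompress the stream before parsing it.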

stschiff · 2023-03-15T13:44:44Z

Yes. Note that the last time I tried it, pipes-zlib sadly suffered from this bug: k0001/pipes-zlib#16, which was actually a bug in another library upstream. I ended up decompressing directly from a lazy ByteString (https://hackage.haskell.org/package/zlib-0.6.3.0/docs/Codec-Compression-Zlib.html) and then piping the result through a suitable Pipes.Parser. So it is definitely possible, but it also definitely requires some playing around.
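For reference, the lazy ByteString approach could look roughly like the sketch below. It is a minimal illustration under a few assumptions: it uses `Codec.Compression.GZip` from the same zlib package (rather than the `Codec.Compression.Zlib` module linked above), because .gz files carry the gzip framing, and the helper names `readGzLines` and `decompressedChunks` are made up for this example. The Pipes.Parser step is left out.

```haskell
import qualified Codec.Compression.GZip as GZip
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC

-- | Read a gzip-compressed text file (e.g. a .snp.gz file) and return its lines.
-- Both the file read and the decompression are lazy, so the input is
-- consumed chunk by chunk as the lines are demanded.
-- (Hypothetical helper for this sketch.)
readGzLines :: FilePath -> IO [BL.ByteString]
readGzLines path = BLC.lines . GZip.decompress <$> BL.readFile path

-- | Decompress a gzip stream into strict chunks, which is the shape of
-- data a streaming producer would typically yield downstream.
-- (Hypothetical helper for this sketch.)
decompressedChunks :: BL.ByteString -> [B.ByteString]
decompressedChunks = BL.toChunks . GZip.decompress
```

The strict chunks from `decompressedChunks` could then be fed into a Pipes producer and parsed from there, which avoids pipes-zlib altogether.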

nevrome added the "enhancement" (New feature or request) and "for the future" labels on Mar 21, 2023

nevrome mentioned this issue Jun 28, 2024

Add support for PACKEDANCESTRYMAP format #303

Open
