Benchmark biopython gff, pysam fastx and fasta #35

ambrosejcarr · 2018-04-14T17:31:53Z

Biopython and pysam have iterators for some of the objects implemented in SC Tools.

See if those tools are more efficient than the ones implemented here, and if so, determine how difficult it would be to extend their tools with single-cell functionality.
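A minimal sketch of how such a comparison might be set up, using only the standard library. The helper names (`write_fastq`, `naive_fastq_reader`, `benchmark`) are hypothetical and are not part of SC Tools. The baseline reader is a stand-in for the readers implemented here. Wrappers around `pysam.FastxFile` or `Bio.SeqIO.parse` could be registered in the same `readers` dict, assuming those packages are installed.

```python
import os
import tempfile
import time
from typing import Callable, Dict, Iterable, Tuple


def write_fastq(path: str, n_records: int) -> None:
    """Write a small synthetic FASTQ file (four lines per record)."""
    with open(path, "w") as f:
        for i in range(n_records):
            f.write(f"@read{i}\nACGTACGTAC\n+\nIIIIIIIIII\n")


def naive_fastq_reader(path: str) -> Iterable[Tuple[str, str, str]]:
    """Hypothetical baseline reader: yields (name, sequence, quality)."""
    with open(path) as f:
        while True:
            name = f.readline().rstrip("\n")
            if not name:  # end of file
                return
            seq = f.readline().rstrip("\n")
            f.readline()  # skip the '+' separator line
            qual = f.readline().rstrip("\n")
            yield name[1:], seq, qual  # drop the leading '@'


def benchmark(
    readers: Dict[str, Callable[[str], Iterable]],
    path: str,
    repeats: int = 3,
) -> Dict[str, dict]:
    """Fully iterate each reader `repeats` times and keep the best wall time."""
    results = {}
    for label, reader in readers.items():
        best = float("inf")
        count = 0
        for _ in range(repeats):
            start = time.perf_counter()
            count = sum(1 for _ in reader(path))
            best = min(best, time.perf_counter() - start)
        results[label] = {"records": count, "best_seconds": best}
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fastq_path = os.path.join(tmp, "synthetic.fastq")
        write_fastq(fastq_path, 10_000)
        # Other candidates (e.g. a pysam- or Biopython-based reader) would be
        # added here as further entries mapping a label to a callable.
        for label, stats in benchmark({"naive": naive_fastq_reader}, fastq_path).items():
            print(label, stats)
```

Checking that every reader reports the same record count guards against comparing timings for readers that silently parse different amounts of data. Synthetic input of this size only gives a rough signal, so real sequencing files would be needed for meaningful numbers.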

heuermh · 2018-05-22T19:18:00Z

Please also consider the Python APIs to ADAM, https://pypi.org/project/bdgenomics.adam/.

Since the meeting in April I've been thinking mostly about representations for the matrix service, but if I could help with benchmarking feature and sequence readers and perhaps also implementing some of the tools here or in https://github.com/HumanCellAtlas/fastq_utils in a scalable fashion on Spark+ADAM, please let me know.

ambrosejcarr · 2018-07-19T14:05:11Z

Great ideas! We'll definitely include these in any benchmarking. I think @mbabadi is considering doing some of this work either this quarter or next, and our team's engineers @dshiga may also work on getting these tools running a bit more performantly.

When we start doing this, we'll sync you in and see how best to include you, @heuermh :-)
