Index-structure-and-mapping

Small project for INF589: Computational analysis of high-throughput sequencing data. Here we are dealing with index structure / mapping with Burrows-Wheeler transform (BWT)

Instructions:

Index structures/Mapping Implement a tool that constructs a BWT index and maps reads using the index.

Use additional tables (suffix array, count tables). Take care of the size of your index in main memory - can you implement the ‘tricks’ like storing only every K-th row (e.g. K=100) and recompute in-between to save main memory?
Test on RNA viruses (HIV, SARS-Cov2, . . . ) and map simulated reads without errors. Report performance (depending on genome size, read lengths; use sufficiently many reads for a good estimation) and space consumption.
Can you show memory / speed trade-offs (e.g. by using count tables vs. no count tables)? Take care to get an efficient implementation of search - note that full count tables would allow linear search or at least O(n log n).

Project alternative: Use enhanced suffix array in place of BWT. Implement linear search using LCP and child tables; compare against search without such extra-tables. (Abouelhoda, et al. 2004)

Remark: No need for very efficient construction of the data structure; rather focus on search efficiency (due to additional tables) and space consumption.

Ideas:

Memory consumption: use https://pypi.org/project/memory-profiler/
Time consumption: use https://docs.python.org/fr/3/library/time.html git

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
__pycache__		__pycache__
data		data
0_shotgun-sequencing.ipynb		0_shotgun-sequencing.ipynb
1_IndexStructures.ipynb		1_IndexStructures.ipynb
2_Mapping.ipynb		2_Mapping.ipynb
BWT.py		BWT.py
EXAM-Genome mapping.pptx		EXAM-Genome mapping.pptx
README.md		README.md
Time_cumsumption.png		Time_cumsumption.png
Time_cumsumption_total.png		Time_cumsumption_total.png
Trees.py		Trees.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Index-structure-and-mapping

Instructions:

About

Releases

Packages

Contributors 2

Languages

Coohrentiin/Index-structure-and-mapping

Folders and files

Latest commit

History

Repository files navigation

Index-structure-and-mapping

Instructions:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages