[DUMMY] Add comments to bin/CountSNPASE.py #12

Open
wants to merge 1 commit into
base: master
Conversation

carloartieri
Collaborator

Created a dummy branch to add some comments to CountSNPASE.py. In particular, I've indicated:

  • How pandas can be used to read in tables with more clarity.
  • How the latest versions of pysam let you parse variants from a read much more efficiently than the code I wrote last year. It's important that you're using htslib 1.3 or later here, as a bug in the original implementation of this code led to errors.
  • How pysam allows indexed retrieval of sequence from FASTA files, which makes the whole fasta_to_dict() construction unnecessary.

Unfortunately I don't have time to formally implement this stuff and test it, but hopefully it will lead to some improvements in code speed, reliability, and maintainability. This isn't the deepest dive back into the code, but I'm happy to check in periodically and discuss tweaks and improvements.

@carloartieri
Collaborator Author

Also, there may be a few syntax errors in there (I think I added too many brackets around the pandas.DataFrame.columns call), so consider it pseudocode for testing ⚠️
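
For reference, a minimal sketch of the pandas pattern with the brackets in the right place: `DataFrame.columns` is assigned a single flat list. The column names and the four-column, tab-separated layout are assumptions for illustration only, not the actual input format of CountSNPASE.py, and `read_snp_table` is a hypothetical helper.

```python
import io

import pandas as pd


def read_snp_table(handle):
    """Read a headerless, tab-separated SNP table into a DataFrame."""
    df = pd.read_csv(handle, sep="\t", header=None, comment="#")
    # One flat list, one pair of brackets (column names are hypothetical).
    df.columns = ["chrom", "pos", "ref", "alt"]
    return df


# In-memory stand-in for a SNP file (hypothetical data).
example = io.StringIO("chr2L\t1042\tA\tG\nchr2L\t2077\tC\tT\n")
snps = read_snp_table(example)
```

Passing `names=[...]` straight to `pd.read_csv` would do the same thing in one step; the separate assignment is shown only because it is the call mentioned above.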
