some sequences are missing in pyfastx.Fasta object #41
Comments
Thank you for reporting this issue. I will check that. A new version will be released soon.
Any updates on this? I'm getting the same error. I'm loading a large FASTA file (~59M entries), and for some entries, whether accessed by string key or by integer index, I get a "key does not exist" error. Reloading the file fixes the problem for those keys but shifts it to others.
Thanks. Could you share your code and an HTTPS link to your data?
I'm using the unzipped version of this file: https://stringdb-downloads.org/download/protein.sequences.v12.0.fa.gz
import pyfastx
Maybe it has to do with multiple workers accessing the same FASTA file? I'm afraid I cannot post the actual code I'm using at this point.
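One way to test the multiple-workers guess is to make sure every worker process opens its own handle instead of reusing one created in the parent. The sketch below is only an illustration of that idea: `get_handle` and the `opener` argument are hypothetical names, and whether a shared handle is actually the cause of the error has not been confirmed in this thread.

```python
import os

# Cache of opened handles, keyed by (process id, file path).
# Keying on the pid means a forked worker never reuses the parent's handle.
_handles = {}


def get_handle(path, opener):
    """Return a handle for `path` that was opened in the current process.

    `opener` is any callable that takes a path and returns an opened
    object (for example the pyfastx.Fasta class). It is passed in so that
    this helper does not assume anything about the library itself.
    """
    key = (os.getpid(), path)
    if key not in _handles:
        _handles[key] = opener(path)
    return _handles[key]
```

Inside a worker this would be called as `get_handle("protein.sequences.v12.0.fa", pyfastx.Fasta)`. If the missing-key errors disappear with one handle per process, that points at shared access; if they persist, the cause lies elsewhere.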
I loaded a FASTA file containing 4542 sequences with an average length of 2.5 kb; however, only 4539 sequences were in the pyfastx.Fasta object.
Besides, I could access a sequence, e.g. fa['contig_999'], the first time, but when I tried to access it again I got a KeyError.
The version of pyfastx I used is 0.8.4, with Python 3.7.
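To narrow down a mismatch like 4542 versus 4539, it can help to count the header lines independently of pyfastx and see which names the index lacks. The following is a minimal sketch using only the standard library. The helper names (`fasta_record_names`, `summarize_names`, `missing_from`) are made up for this example, and the check for duplicated identifiers is just one thing worth ruling out, not a confirmed cause.

```python
from collections import Counter


def fasta_record_names(path):
    """Yield the first whitespace-delimited token of every '>' header line."""
    with open(path, "r") as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                yield fields[0] if fields else ""


def summarize_names(names):
    """Return (total records, unique names, {name: count} for duplicates)."""
    counts = Counter(names)
    duplicates = {name: n for name, n in counts.items() if n > 1}
    return sum(counts.values()), len(counts), duplicates


def missing_from(names, index):
    """Return the names for which `name in index` is false.

    `index` can be any object supporting the `in` operator, such as a
    dict, a set, or (assumed here) a pyfastx.Fasta object.
    """
    return [name for name in names if name not in index]
```

With the file from this report, `summarize_names(fasta_record_names(path))` gives the raw header count to compare against `len(fa)`, and `missing_from(set(fasta_record_names(path)), fa)` lists the identifiers that the Fasta object does not report as present.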