modified RNA residues not working correctly #9
Comments
Thanks! So, I'm torn on this. Allowing non-ASCII characters introduces a lot of complexity - in particular when it comes to figuring out the text width, and hence how to display it, but also for coloring and sequence validation. So my response is: No, this is too complicated, sorry. I'll keep this issue open though to think more about it. The error messages could be improved, though, to make it clearer what is happening.
Sorry I put multiple things in one issue; if you want, I can open two separate issues. Disregarding the unicode question, my suggestion would be:
Re: autodetection / nonstandard ASCII nucleobases
Even after removing/modifying sequences with unicode, I wasn't able to view the alignment:
This is pure ASCII and fails with the error message:
Real-world RNA (and DNA to a lesser extent) has quite a few modified bases, and modern sequencing techniques such as nanopore sequencing are starting to detect them routinely, so this is something that is going to become more common.
Re: unicode
The problem is that there aren't enough ASCII characters to represent all modified RNA residues with one ASCII letter. Not sure what the pragmatic solution is. I agree all of Unicode is overkill. Maybe any printable grapheme that can be displayed with the same width as an ASCII character. That's what makes sense in the context of multiple sequence alignments IMHO. There is a rust crate that can answer this question:
Re: FASTA format
In my very humble opinion, the most liberal specification of the FASTA format seems the most useful to me:
In any case, an alignment viewer that can deal with "strange" FASTA files will always be more useful than a viewer that can't.
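To make the "same width as an ASCII character" criterion from the unicode point concrete, here is a minimal sketch in Python using only the standard library. It is not the viewer's code and not the Rust crate mentioned above; the function name `is_single_column` is made up for illustration. It only looks at single code points (not multi-code-point graphemes), and it assumes that characters with "ambiguous" East Asian width, such as Greek letters, are drawn one column wide, which depends on the terminal.

```python
import unicodedata


def is_single_column(ch: str) -> bool:
    """Guess whether `ch` is one code point that occupies one terminal column.

    Hypothetical helper for illustration only.
    """
    if len(ch) != 1:
        # Multi-code-point graphemes are out of scope for this sketch.
        return False
    if not ch.isprintable() or ch.isspace():
        # Control characters, format characters and whitespace are rejected.
        return False
    if unicodedata.combining(ch) != 0:
        # Combining marks attach to the previous character (zero width).
        return False
    if unicodedata.category(ch) in ("Mn", "Me"):
        # Non-spacing and enclosing marks also take no column of their own.
        return False
    # "W" (wide) and "F" (fullwidth) characters take two columns.
    # Assumption: "A" (ambiguous) is treated as narrow here.
    return unicodedata.east_asian_width(ch) not in ("W", "F")


if __name__ == "__main__":
    for ch in ["A", "\u03c8", "\u6f22", "\u0301"]:
        print(repr(ch), is_single_column(ch))
```

A check like this could be applied to every residue character when a file is loaded, so that sequences containing double-width or zero-width characters are rejected with a clear message while single-column symbols are accepted.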
Reasonable suggestions. So:
Hi,
I wanted to say thank you for making this alignment viewer, I really like it!
I have had some problems viewing MSAs of RNAs with modified residues.
The problem seems to be that with all the modified RNA residues that exist, the sequences contain a lot of unusual characters.
A database of modified RNA residues can be found here:
https://iimcb.genesilico.pl/modomics/modifications
There seem to be two problems as far as I can see:
°
Error message:
Error message:
My suggestion would be to have additional command-line options to
Not sure how to deal with it in visualisation. I guess one color for any nonstandard RNA base would be ok.
What do you think? I could try and make the changes if you agree
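As a rough illustration of the "one color for any nonstandard RNA base" idea, here is a small Python sketch. It is not taken from the viewer; the palette, the gap characters and the function name `residue_color` are all assumptions made up for this example. Case is deliberately not folded, on the assumption that a lowercase letter may denote a different (modified) residue than its uppercase counterpart.

```python
from typing import Optional

# Hypothetical palette for the four standard RNA bases (assumption).
STANDARD_RNA_COLORS = {
    "A": "red",
    "C": "blue",
    "G": "yellow",
    "U": "green",
}

# Characters treated as alignment gaps in this sketch (assumption).
GAP_CHARS = "-."


def residue_color(residue: str, fallback: str = "magenta") -> Optional[str]:
    """Return a color name for one residue character.

    Gaps get no color, standard bases get their palette color, and
    every other character shares the single fallback color.
    """
    if residue in GAP_CHARS:
        return None
    return STANDARD_RNA_COLORS.get(residue, fallback)


if __name__ == "__main__":
    # "#" stands in for an arbitrary nonstandard residue symbol.
    for ch in "ACGU-#":
        print(ch, residue_color(ch))
```

With a lookup like this, the set of accepted characters and the coloring stay independent: any character the parser lets through is displayable, and only the four standard bases need explicit entries.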
Data source:
Data was downloaded from tRNAdb:
http://trna.bioinf.uni-leipzig.de/DataOutput/Search
On the "Search Database" page, choose "tRNA sequences" and then press the "search database" button. This will return RNA sequences with modified residues. You can save them by selecting the "select all sequences of search" checkbox on the search results page, and then choosing "Download alignment" in the drop-down button next to the checkbox.