Question on the randomer dataset for the RNA inosine_m6A model #195

Open
ChaseWLW opened this issue Nov 8, 2024 · 1 comment

Comments

@ChaseWLW

ChaseWLW commented Nov 8, 2024

Hi,

I've been using the inosine_m6A model with some RNA004 data. Out of technical curiosity, I would like to ask whether the randomer dataset for training this model includes RNAs that are co-modified with both of these modifications? Or are the sequences either m6A or inosine modified (i.e. not both on the same RNA)?

Thanks!

@marcus1487
Collaborator

The current generation of models is trained on each modified base in isolation. We are looking into adding neighboring modified bases to training and have done this to a limited extent in DNA models, including multiple 5mC sites in randomers. We are working on ways to scale up this modified-neighbors effort in training. Note that we are interested in covering every modified base adjacent to every other modified base. For example, including A mods (e.g. m6A) next to C mods (e.g. m5C) is on our roadmap to improve the robustness of our models, but it is not in the current generation.
