including invalid transcripts in training data #461

Open
sparthib opened this issue Jan 27, 2025 · 1 comment

sparthib commented Jan 27, 2025

Hi, I have a question about how XGBoost is used to train TPS prediction for read classes. I can see how true transcripts are readily available from a reference annotation (y_ij = 1), but it is unclear to me how invalid transcript observations are supplied as training input (y_ij = 0).

Thanks,
Sowmya

Collaborator

cying111 commented Feb 3, 2025

Hi @sparthib,

As described in our paper, TPS prediction for all read classes (RCs) is performed using a supervised machine learning algorithm. During training, the label for each RC is determined by whether its intron junctions match those of an annotated transcript exactly. In other words, only annotated reference transcripts are used for labeling during training. For more details, please check out our paper here.
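To make the labeling rule concrete, here is a minimal Python sketch of one way to read it. It is an illustration only, not the package's actual code: the function name `label_read_classes`, the tuple representation of introns, and the choice to skip read classes without introns are all assumptions. It also assumes that a read class whose intron chain matches no annotated transcript receives the label 0, which is how the y_ij = 0 observations would arise under this rule.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: an intron is (chrom, strand, start, end),
# and an intron chain is the ordered tuple of a transcript's introns.
Intron = Tuple[str, str, int, int]
IntronChain = Tuple[Intron, ...]


def label_read_classes(
    read_classes: Dict[str, IntronChain],
    annotated_chains: Set[IntronChain],
) -> Dict[str, int]:
    """Assign 1 to read classes whose intron chain exactly matches an
    annotated transcript, and 0 otherwise (assumed negative label)."""
    labels: Dict[str, int] = {}
    for rc_id, chain in read_classes.items():
        if not chain:
            # Assumption: read classes without introns are left unlabeled
            # here, since the rule above is stated in terms of junctions.
            continue
        labels[rc_id] = int(chain in annotated_chains)
    return labels


# Toy annotation with a single two-intron transcript.
annotated = {
    (("chr1", "+", 100, 200), ("chr1", "+", 300, 400)),
}

# Toy read classes: one exact match, one with a shifted junction.
read_classes = {
    "rc_match": (("chr1", "+", 100, 200), ("chr1", "+", 300, 400)),
    "rc_shifted": (("chr1", "+", 100, 205), ("chr1", "+", 300, 400)),
}

labels = label_read_classes(read_classes, annotated)
```

Under these assumptions, `rc_match` is labeled 1 and `rc_shifted` is labeled 0, so both classes of training labels come from the annotation alone.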

I hope this answers your question! Let me know if you need any further clarification.

Thank you
Warm regards,
Ying
