feat!: change arriba fusion detection algorithm input parameters #204

Merged
merged 3 commits into from
Nov 25, 2024

Conversation

jarbesfeld
Contributor

Closes #203

Notes: The Arriba output includes a variable that tells us whether a transcript segment starts or ends at the breakpoint, so new logic was added to the translator to take this into account.

From Arriba:
direction1 and direction2 : These columns indicate the orientation of the fusion breakpoints. A value of downstream means that the fusion partner is fused downstream of the breakpoint, i.e., at a coordinate higher than the breakpoint. In other words, the supporting split reads are right-clipped. A value of upstream means the partner is fused at a coordinate lower than the breakpoint. In other words, the supporting split reads are left-clipped. When the prediction of the strands or of the 5' gene fails, this information gives insight into which parts of the fused genes are retained in the fusion.

This required re-adding the strand variable, since it indicates the direction of transcription and, together with the direction value, determines whether seg_start_genomic or seg_end_genomic should be used for a breakpoint.
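To make the reasoning above concrete, here is a minimal sketch of how direction and strand could be combined to pick between seg_start_genomic and seg_end_genomic. It is derived only from the Arriba description quoted above and is not necessarily the code in this PR. The function names `breakpoint_is_segment_start` and `breakpoint_argument` are hypothetical, and the strand is assumed to have already been reduced to a plain "+" or "-".

```python
from typing import Literal

Direction = Literal["upstream", "downstream"]
Strand = Literal["+", "-"]


def breakpoint_is_segment_start(direction: Direction, strand: Strand) -> bool:
    """Return True if the breakpoint is where the retained segment starts.

    Hypothetical helper for illustration; not part of fusor's API.
    """
    if direction not in ("upstream", "downstream"):
        raise ValueError(f"unexpected direction: {direction!r}")
    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand: {strand!r}")

    # "upstream": the partner is fused at a coordinate lower than the
    # breakpoint, so the retained part of this gene lies at higher coordinates.
    retained_is_above = direction == "upstream"

    # On the "+" strand transcription runs toward higher coordinates, so a
    # retained segment lying above the breakpoint starts at the breakpoint.
    # On the "-" strand transcription runs the other way, so the answer flips.
    return retained_is_above if strand == "+" else not retained_is_above


def breakpoint_argument(direction: Direction, strand: Strand) -> str:
    """Name the genomic-coordinate argument the breakpoint would be passed as."""
    if breakpoint_is_segment_start(direction, strand):
        return "seg_start_genomic"
    return "seg_end_genomic"
```

Under these assumptions, a gene on the "+" strand with direction "downstream" keeps the sequence before the breakpoint, so the breakpoint would be passed as seg_end_genomic; the same direction on the "-" strand would use seg_start_genomic.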

@jarbesfeld jarbesfeld added enhancement New feature or request priority:medium Medium priority labels Nov 22, 2024
@jarbesfeld jarbesfeld self-assigned this Nov 22, 2024
@jarbesfeld jarbesfeld marked this pull request as ready for review November 22, 2024 16:45
@jarbesfeld
Contributor Author

test_arriba fails for me when it is run:

pydantic_core._pydantic_core.ValidationError: 1 validation error for AssayedFusion
E         Value error, 5' TranscriptSegmentElement fusion partner must contain ending exon position
E       3' fusion partner junction must include starting position [type=value_error, input_value={'structure': [Transcript...ngFramePreserved': True}, input_type=dict]
E           For further information visit https://errors.pydantic.dev/2.4/v/value_error

I think this test should be changed, as I don't believe the ending and starting positions are needed to validate the fusion.
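For readers unfamiliar with the check behind this error, the rule stated in the message can be sketched in plain Python as follows. This is an illustration only: `SegmentSketch`, its `exon_start` and `exon_end` fields, and `junction_errors` are hypothetical stand-ins, not fusor's actual model or validator.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SegmentSketch:
    """Hypothetical stand-in for a transcript segment; not fusor's model."""

    exon_start: Optional[int] = None
    exon_end: Optional[int] = None


def junction_errors(five_prime: SegmentSketch, three_prime: SegmentSketch) -> List[str]:
    """Collect the junction problems described in the error message above."""
    errors = []
    # The 5' partner has to say where its retained sequence ends.
    if five_prime.exon_end is None:
        errors.append("5' partner must contain ending exon position")
    # The 3' partner has to say where its retained sequence starts.
    if three_prime.exon_start is None:
        errors.append("3' partner must include starting position")
    return errors
```

With this sketch, a 5' segment that only carries a start position and a 3' segment that only carries an end position would produce both messages, which matches the shape of the failure reported above.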

Member

@korikuzma korikuzma left a comment

I think we should talk about some cleanup work that could be done in the future. The code for these fusion callers is very similar and could be refactored to make maintenance easier.

src/fusor/translator.py (outdated, resolved)
@jarbesfeld jarbesfeld changed the title from "feat!: Reorganize Arriba translator" to "feat!: change fusion catcher detection algorithm input parameters #201" Nov 22, 2024
@jarbesfeld jarbesfeld changed the title from "feat!: change fusion catcher detection algorithm input parameters #201" to "feat!: change arriba fusion detection algorithm input parameters" Nov 22, 2024
@jarbesfeld jarbesfeld merged commit 1b067f0 into main Nov 25, 2024
10 checks passed
@jarbesfeld jarbesfeld deleted the issue-203 branch November 25, 2024 13:39
Development

Successfully merging this pull request may close these issues.

Add translator for Arriba
2 participants