gtf2gtf --method=exons2introns: Documentation and method do not match #362

Open
IanSudbery opened this issue Oct 9, 2017 · 3 comments

@IanSudbery
Member

The documentation for gtf2gtf --method=exons2introns says:

exons2introns
Merges overlapping introns for all transcripts of a gene,
outputting the merged introns.

But this is not what exons2introns does. Take the following example with two transcripts:

Input:
   |>>>>>|----------|>>>>>>|       
   |>>>>>|------------|>>>>|         
         |          | |
Output:  |          | |
         |>>>>>>>>>>| |    Actual Output
         |>>>>>>>>>>>>|    Described Output

As described, I would expect the output to be what is labelled "Described Output", but that isn't what you get. You get what is labelled "Actual Output", which is the result of intersecting the introns, not merging them.
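
To make the difference concrete, here is a minimal sketch of the two operations on the example above. It is not the gtf2gtf code: the helper names and the coordinates are made up for illustration, and intervals are assumed to be half-open (start, end) pairs.

```python
def merge_intervals(intervals):
    """Union of overlapping half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect_intervals(a, b):
    """Intersection of two sorted, non-overlapping interval lists."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical coordinates mirroring the diagram: both introns start at
# the same position, the second transcript's intron is 2 bases longer.
introns_t1 = [(106, 116)]
introns_t2 = [(106, 118)]

described = merge_intervals(introns_t1 + introns_t2)
actual = intersect_intervals(introns_t1, introns_t2)
print(described)  # [(106, 118)]  "Described Output" (merge)
print(actual)     # [(106, 116)]  "Actual Output" (intersection)
```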

Does anyone know of anywhere this is used? Should we change the documentation to match what actually happens, or change what happens to match the documentation?

@Acribbs
Member

Acribbs commented Oct 10, 2017

Hi Ian,

It seems as though this is used in the buildIntronGeneModels function of pipeline_mapping. I can't see it being used anywhere else.

The documentation is as follows:

Retain the protein coding genes from the input gene set and
    convert the exonic sequences to intronic sequences. 10 bp is
    truncated on either end of an intron and need to have a minimum
    length of 100. Introns from nested genes might overlap, but all
    exons are removed.
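
Read literally, the trimming and length filter in that description amount to something like the sketch below. This is only an illustration, not the pipeline's code: the function name and coordinates are hypothetical, and the docstring does not say whether the minimum length applies before or after trimming, so the sketch assumes it applies after.

```python
def trim_and_filter_introns(introns, trim=10, min_length=100):
    """Trim `trim` bases from each end of every intron and keep only
    those that are still at least `min_length` long.

    Assumption: the length check is applied to the trimmed intron.
    """
    kept = []
    for start, end in introns:
        new_start, new_end = start + trim, end - trim
        if new_end - new_start >= min_length:
            kept.append((new_start, new_end))
    return kept


# Hypothetical half-open (start, end) introns: 500 bp and 110 bp long.
introns = [(1000, 1500), (2000, 2110)]
trimmed = trim_and_filter_introns(introns)
print(trimmed)  # [(1010, 1490)]; the 110 bp intron is 90 bp after trimming
```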

The only issue I can see is that our pipeline_mapping tests may fail if we make changes to the code. I will try to have a look at what tests are implemented today if I have time, and get back to you on this.

Since the buildIntronLevelReadCounts function is only used to count the number of reads that overlap introns and doesn't really get used in the downstream analysis of our data, my feeling would be to change the documentation to reflect the code. However, @sebastian-luna-valero and @AndreasHeger may have differing opinions.

@sebastian-luna-valero
Member

Hi Ian,

Thanks for reporting the issue.

If you provide some toy input data to reproduce the issue in gtf2gtf --method=exons2introns, I am happy to have a look and update either the code or the docs depending on what we decide, and provide a test to capture this scenario.

Best regards,
Sebastian

@Acribbs
Member

Acribbs commented Nov 17, 2017

Hi @IanSudbery, was this issue resolved?
