This project provides a Python package of utility functions for working with annotation files.
The script create_focused_filtered_annotation.py creates a focused annotation without overlaps from a non-focused annotation. This is how myUTRs_filtered_May_16_sorted_300_focus_December_23.gtf was created, starting from myUTRs_filtered_May_16_sorted.gtf. Overlaps are removed starting from the shortest one.
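The length-ordered overlap removal described above can be sketched as a greedy pass. This is only an illustration, not the code of create_focused_filtered_annotation.py: the `Feature` record, the `overlaps` and `remove_overlaps` helpers, and the `shortest_first` flag are hypothetical names. It also assumes that "starting from the shortest one" means the shortest features are considered first and kept, and that overlaps are only checked on the same chromosome and strand.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    # Hypothetical minimal record; a real GTF line carries more columns.
    chrom: str
    start: int   # 1-based, inclusive, as in GTF
    end: int     # 1-based, inclusive, as in GTF
    strand: str
    name: str


def overlaps(a: Feature, b: Feature) -> bool:
    # Assumption: two features overlap only on the same chromosome and strand.
    return (
        a.chrom == b.chrom
        and a.strand == b.strand
        and a.start <= b.end
        and b.start <= a.end
    )


def remove_overlaps(features: List[Feature], shortest_first: bool = True) -> List[Feature]:
    # Visit features in length order and keep one only if it does not
    # overlap a feature that has already been kept.
    ordered = sorted(
        features,
        key=lambda f: f.end - f.start + 1,
        reverse=not shortest_first,
    )
    kept: List[Feature] = []
    for feature in ordered:
        if not any(overlaps(feature, k) for k in kept):
            kept.append(feature)
    # Return the survivors in genomic order.
    return sorted(kept, key=lambda f: (f.chrom, f.start, f.end))


features = [
    Feature("chr1", 100, 400, "+", "a"),
    Feature("chr1", 350, 450, "+", "b"),
    Feature("chr1", 500, 600, "+", "c"),
]
kept_names = [f.name for f in remove_overlaps(features)]
```

Passing `shortest_first=False` visits the longest features first instead, which is one way to read the "starting from the longest one" rule used by the other script.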
The script create_ground_truth_no_overlaps.py creates an annotation without overlaps from an annotation that is already focused but still contains overlaps. Overlaps are removed based on the RefSeq and Ensembl annotations: when known annotations overlap, the Ensembl one is kept, and when Ensembl annotations overlap each other, the longest one is kept. The remaining overlaps are removed starting from the longest one.
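One way to read these precedence rules is as a sort key that decides which of several overlapping features wins. The sketch below is an interpretation, not the code of create_ground_truth_no_overlaps.py: the `Candidate` record, the `SOURCE_RANK` table, the `priority` function, and the source labels are all hypothetical. It assumes that known annotations (Ensembl, then RefSeq) outrank anything else, and that within the same source the longest feature comes first.

```python
from typing import NamedTuple, Tuple


class Candidate(NamedTuple):
    # Hypothetical record for a set of mutually overlapping features.
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    source: str  # hypothetical labels, e.g. "Ensembl", "RefSeq", or other


# Lower rank wins: Ensembl before RefSeq, and both before anything else.
SOURCE_RANK = {"ensembl": 0, "refseq": 1}


def priority(c: Candidate) -> Tuple[int, int]:
    rank = SOURCE_RANK.get(c.source.lower(), 2)
    length = c.end - c.start + 1
    # Negative length so that, within one source, longer features sort first.
    return (rank, -length)


candidates = [
    Candidate("novel_long", 1, 1000, "predicted"),
    Candidate("ens_short", 1, 100, "Ensembl"),
    Candidate("ref", 1, 500, "RefSeq"),
    Candidate("ens_long", 1, 300, "Ensembl"),
]
ranked_names = [c.name for c in sorted(candidates, key=priority)]
```

A greedy pass that visits features in this order and drops any feature overlapping one already kept would then retain `ens_long` from this group.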
To install this library in another Python project, simply run:
pip install git+https://github.com/idiap/genomemanager.git