GitHub release file details
We distribute pre-built .json.gz files of transcripts via GitHub releases.
The main files contain every historical transcript, so they give maximum compatibility when you want to resolve as many HGVS strings as possible.
The gene symbol associated with a gene (and thus with a transcript version) can change between GTFs, i.e. it is not part of what is frozen in a transcript version.
When generating these historical files, we use the gene/transcript version from the most recent GTF.
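The "most recent GTF wins" rule can be pictured as a dictionary merge over releases ordered oldest to newest. This is only a minimal sketch of the idea, not cdot's actual generation code; the function name `merge_historical`, the transcript IDs, and the gene symbols are all hypothetical placeholders.

```python
def merge_historical(gtf_releases):
    """Merge per-release annotations, letting later releases override earlier ones.

    gtf_releases: list of dicts ordered oldest to newest, each mapping a
    versioned transcript ID to the gene symbol recorded in that release.
    (Hypothetical data shape, chosen for illustration only.)
    """
    merged = {}
    for release in gtf_releases:
        # dict.update overwrites existing keys, so the newest release wins
        merged.update(release)
    return merged


# Hypothetical placeholder data: the same transcript version appears in both
# releases, but its gene symbol was renamed in the newer one.
older = {"NM_HYPOTHETICAL.1": "OLDSYM", "NM_RETIRED.2": "GENEA"}
newer = {"NM_HYPOTHETICAL.1": "NEWSYM", "NM_HYPOTHETICAL.2": "NEWSYM"}

merged = merge_historical([older, newer])
print(merged)
```

The merged result keeps transcripts that only appear in older releases, but reports the newer gene symbol wherever a transcript version appears in both.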
Annotation Consortium | Build | Example File |
---|---|---|
Ensembl | GRCh37 | cdot-0.2.14.ensembl.grch37.json.gz |
Ensembl | GRCh38 | cdot-0.2.14.ensembl.grch38.json.gz |
Ensembl | GRCh37/GRCh38 | cdot-0.2.14.ensembl.grch37_grch38.json.gz |
RefSeq | GRCh37 | cdot-0.2.14.refseq.grch37.json.gz |
RefSeq | GRCh38 | cdot-0.2.14.refseq.grch38.json.gz |
RefSeq | GRCh37/GRCh38 | cdot-0.2.14.refseq.grch37_grch38.json.gz |
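The file names in the table follow a visible pattern (version, annotation consortium, build or builds joined by an underscore). As a rough sketch, assuming that pattern holds for other versions too, the name can be composed and a downloaded file read with the Python standard library alone. The helper names `release_filename` and `load_json_gz` are hypothetical, and nothing here assumes anything about the layout of the JSON inside the file.

```python
import gzip
import json


def release_filename(version, consortium, builds):
    """Compose a file name following the pattern seen in the table above.

    Assumption: the pattern is cdot-<version>.<consortium>.<builds>.json.gz,
    with multiple builds joined by an underscore, all lower case.
    """
    build_part = "_".join(build.lower() for build in builds)
    return f"cdot-{version}.{consortium.lower()}.{build_part}.json.gz"


def load_json_gz(path):
    """Decompress and parse a gzip-compressed JSON file in one step."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


print(release_filename("0.2.14", "RefSeq", ["GRCh37", "GRCh38"]))
```

After downloading a file from the releases page, `load_json_gz(path)` returns the parsed content so its top-level structure can be inspected. The combined historical files can be large, so expect loading to take noticeable time and memory.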
Because the main files take gene and transcript information from the most recent GTF, they may not match any single release. If you want the transcript versions and gene symbols to match exactly what was in a release, you need to use .json.gz files produced from a single GTF release.
We host some of the versions used in VariantGrid, where they match the release version used in an Ensembl VEP annotation, e.g.:
Annotation Consortium | Build | Example File | VEP release |
---|---|---|---|
RefSeq | GRCh37 | cdot-0.2.14.GCF_000001405.25_GRCh37.p13_genomic.105.20201022.gff.json.gz | 100-109 |
RefSeq | GRCh38 | cdot-0.2.14.GCF_000001405.39_GRCh38.p13_genomic.109.20211119.gff.json.gz | -108 |