Skip to content

Commit

Permalink
Merge pull request #94 from genome-nexus/revert-92-mutation-assessor-v4
Browse files Browse the repository at this point in the history
Revert "Mutation assessor v4"
  • Loading branch information
leexgh authored Aug 28, 2024
2 parents cf2941c + 84efa73 commit 34d4053
Show file tree
Hide file tree
Showing 4 changed files with 17 additions and 28 deletions.
2 changes: 1 addition & 1 deletion data/grch37_ensembl92/export/annotation_version.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,6 @@ HGNC 2023-10 mirrored hgnc The resource for approved human gene nomenclature. Ge
Cancer Hotspots v2 mirrored cancer_hotspots A resource for statistically significant mutations in cancer https://www.cancerhotspots.org
3D Hotspots v2 mirrored 3d_hotspots A resource for statistically significant mutations clustering in 3d protein structures in cancer https://www.3dhotspots.org/
reVUE https://github.com/knowledgesystems/reVUE-data/blob/main/VUEs.json mirrored revue A Repository for Variants with Unexpected Effects (VUE) in Cancer https://www.cancerrevue.org/
Mutation Assessor v4 mirrored mutation_assessor Mutation Assessor predicts the functional impact of amino-acid substitutions in proteins, such as mutations discovered in cancer or missense polymorphisms. http://mutationassessor.org/r3/
Mutation Assessor v3 mirrored mutation_assessor Mutation Assessor predicts the functional impact of amino-acid substitutions in proteins, such as mutations discovered in cancer or missense polymorphisms. http://mutationassessor.org/r3/
My Variant Info Includes many annotation sources, see https://docs.myvariant.info/en/latest/doc/data.html external my_variant_info MyVariant.info provides simple-to-use REST web services to query/retrieve variant annotation data, aggregated from many popular data resources. https://myvariant.info
ClinVar 20230722 mirrored clinvar ClinVar aggregates information about genomic variation and its relationship to human health. https://www.ncbi.nlm.nih.gov/clinvar/
8 changes: 0 additions & 8 deletions data/grch38_ensembl92/input/annotation_version.txt

This file was deleted.

1 change: 0 additions & 1 deletion data/grch38_ensembl95/export/annotation_version.txt
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,5 @@ VEP grch38 mirrored vep VEP determines the effect of your variants(SNPs, inserti
HGNC 2023-10 mirrored hgnc The resource for approved human gene nomenclature. Genome Nexus uses HGNC gene symbols in annotation http://ftp.ebi.ac.uk/pub/databases/genenames/hgnc/archive/monthly/tsv/
Cancer Hotspots v2 mirrored cancer_hotspots A resource for statistically significant mutations in cancer https://www.cancerhotspots.org
3D Hotspots v2 mirrored 3d_hotspots A resource for statistically significant mutations clustering in 3d protein structures in cancer https://www.3dhotspots.org/
Mutation Assessor v4 mirrored mutation_assessor Mutation Assessor predicts the functional impact of amino-acid substitutions in proteins, such as mutations discovered in cancer or missense polymorphisms. http://mutationassessor.org/r3/
My Variant Info Includes many annotation sources, see https://docs.myvariant.info/en/latest/doc/data.html external my_variant_info MyVariant.info provides simple-to-use REST web services to query/retrieve variant annotation data, aggregated from many popular data resources. https://myvariant.info
ClinVar 20230722 mirrored clinvar ClinVar aggregates information about genomic variation and its relationship to human health. https://www.ncbi.nlm.nih.gov/clinvar/
34 changes: 16 additions & 18 deletions scripts/import_mongo.sh
Original file line number Diff line number Diff line change
Expand Up @@ -52,24 +52,22 @@ elif [[ ${REF_ENSEMBL_VERSION} == *"grch38"* ]]; then
fi

# import mutation assessor
echo "Downloading Mutation assessor data"
# Data source: https://drive.google.com/file/d/1V6r65xJFF5fJ7b9JHwqkvCe8wWDrIBhd/view.
# The copy is stored in S3 bucket: ttps://genome-nexus-static-data.s3.amazonaws.com/mutationassessor4_for_genome_nexus.tsv.xz
curl https://genome-nexus-static-data.s3.amazonaws.com/mutationassessor4_for_genome_nexus.tsv.xz -o ${DIR}/../data/common_input/mutationassessor4_for_genome_nexus.tsv.xz
echo "Download completed."

echo "Extracting Mutation assessor data"
unxz -k ${DIR}/../data/common_input/mutationassessor4_for_genome_nexus.tsv.xz

echo "Transforming Mutation assessor data"
sed -i '' 's/uniprotId\tSV\thgvspShort\tF_score\tF_impact\tMSA\tMAV/uniprotId\tsv\thgvspShort\tf_score\tf_impact\tmsa\tmav/' ${DIR}/../data/common_input/mutationassessor4_for_genome_nexus.tsv
awk -F'\t' 'BEGIN{OFS="\t"} NR==1{print "_id",$0; next} {print $1","$3,$0}' ${DIR}/../data/common_input/mutationassessor4_for_genome_nexus.tsv > processed_mutaiton_assessor_tsv_file.tsv
rm ${DIR}/../data/common_input/mutationassessor4_for_genome_nexus.tsv.xz
rm ${DIR}/../data/common_input/mutationassessor4_for_genome_nexus.tsv

echo "Importing Mutation assessor data"
import mutation_assessor.annotation processed_mutaiton_assessor_tsv_file.tsv "--type tsv --headerline"
rm processed_mutaiton_assessor_tsv_file.tsv
if [[ ${REF_ENSEMBL_VERSION} == *"grch37"* && ${MUTATIONASSESSOR} == true ]]; then
echo "Downloading Mutation assessor data"

curl http://mutationassessor.org/r3/MA_scores_rel3_hg19_full.tar.gz -o ${DIR}/../data/common_input/MA_scores_rel3_hg19_full.tar.gz

echo "Extracting Mutation assessor data"
tar -xvf ${DIR}/../data/common_input/MA_scores_rel3_hg19_full.tar.gz
rm ${DIR}/../data/common_input/MA_scores_rel3_hg19_full.tar.gz

echo "Transforming Mutation assessor data"
sed -i -e 's/"Mutation","RefGenome variant","Gene","Uniprot","Info","Uniprot variant","Func. Impact","FI score"/_id,rgaa,gene,uprot,info,var,F_impact,F_score/g' MA_scores_rel3_hg19_full/MA_scores_rel3_hg19_chr*
sed -i -e 's/hg19,//g' MA_scores_rel3_hg19_full/MA_scores_rel3_hg19_chr*

echo "Importing Mutation assessor data"
for filename in MA_scores_rel3_hg19_full/*.csv; do import mutation_assessor.annotation $filename '--type csv --headerline' && rm $filename; done
fi

# import annotation sources version
import version ${DIR}/../data/${REF_ENSEMBL_VERSION}/export/annotation_version.txt '--drop --type tsv --headerline'

0 comments on commit 34d4053

Please sign in to comment.