The PhyloProfileData package provides a collection of datasets to accompany the R package PhyloProfile (Tran et al. 2018), where they are used to illustrate how to run PhyloProfile and analyse its results. Briefly, it contains the phylogenetic profiles, the FASTA sequences and the domain annotations for two experimental data sets:
- 147 human proteins in the AMPK-TOR pathway across 83 species, and
- 1011 BUSCO arthropoda ortholog groups across 88 species in the three domains of life.
To install the package from Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("PhyloProfileData")
The data are stored in the ExperimentHub of Bioconductor and can be accessed using the following R commands:
# Load the data of the PhyloProfileData package
library(ExperimentHub)
eh <- ExperimentHub()
myData <- query(eh, "PhyloProfileData")
# View the metadata of this data package
myData
ExperimentHub with 6 records
# snapshotDate(): 2019-05-29
# $dataprovider: Applied Bioinformatics Dept., Goethe University Frankfurt
# $species: NA
# $rdataclass: data.frame, AAStringSet
# additional mcols(): taxonomyid, genome, description, coordinate_1_based,
# maintainer, rdatadateadded, preparerclass, tags, rdatapath, sourceurl,
# sourcetype
# retrieve records with, e.g., 'object[["EH2544"]]'
title
EH2544 | Phylogenetic profiles of human AMPK-TOR pathway
EH2545 | FASTA sequences for proteins in the phylogenetic profiles of human AMPK-TOR...
EH2546 | Domain annotations for proteins in the phylogenetic profiles of human AMPK-...
EH2547 | Phylogenetic profiles of BUSCO arthropoda proteins
EH2548 | FASTA sequences for proteins in the phylogenetic profiles of BUSCO arthropo...
EH2549 | Domain annotations for proteins in the phylogenetic profiles of BUSCO arthr...
Each data set contains three files (objects) corresponding to the phylogenetic profiles, the FASTA sequences and the protein domain annotations. A particular data object can be retrieved using its ID, for example:
# Retrieve FASTA sequences for proteins in the phylogenetic profiles of the
# human AMPK-TOR pathway
ampkTorFasta <- myData[["EH2545"]]
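Once retrieved, an object can be inspected with standard R tools. The following is a minimal sketch, assuming only what the metadata above reports (records of class data.frame or AAStringSet). The helper name `describeObject` is hypothetical and not part of PhyloProfileData, and it deliberately makes no assumption about column names.

```r
# Hypothetical helper (not part of PhyloProfileData): report the class and
# size of a retrieved object without assuming anything about its columns.
describeObject <- function(x) {
    list(
        class = class(x)[1],
        # data.frame: number of rows; other objects (e.g. AAStringSet): length
        size = if (is.data.frame(x)) nrow(x) else length(x)
    )
}

# Usage with the records from the query above. ExperimentHub downloads and
# caches the files, so this part needs internet access on first use:
# library(ExperimentHub)
# eh <- ExperimentHub()
# myData <- query(eh, "PhyloProfileData")
# describeObject(myData[["EH2544"]])  # phylogenetic profiles
# describeObject(myData[["EH2545"]])  # FASTA sequences
```

Checking the class first is a cheap way to confirm which kind of record was returned before passing it on to other functions.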
For a detailed description of each data set and its data objects, please see the PhyloProfileData vignette:
library(PhyloProfileData)
browseVignettes("PhyloProfileData")
Bug reports, comments and suggestions are highly appreciated. Please open an issue on GitHub or get in touch via email.
This data package is released under the MIT license.
To cite PhyloProfile, please use:
Ngoc-Vinh Tran, Bastian Greshake Tzovaras, Ingo Ebersberger; PhyloProfile: Dynamic visualization and exploration of multi-layered phylogenetic profiles, Bioinformatics, bty225, https://doi.org/10.1093/bioinformatics/bty225
Alternatively, use the citation function in R to obtain the reference directly in BibTeX or LaTeX format:
citation("PhyloProfileData")
Vinh Tran [email protected]