This repository holds presentation materials used for seminars about FacileData.
The FacileData ecosystem (aka the "facileverse") is a set of R packages built to make the analysis and exploration of large, consortia-scale genomic datasets more fruitful for computational and non-computational biologists alike.
This is accomplished by providing an ecosystem of packages that includes:
- A fluent data query and retrieval API for multi-omics datasets, such as the diverse data generated by the TCGA.
- An efficient (in memory and speed) reference implementation of a data container that implements the FacileData API (a FacileDataSet). It can easily store the entirety of the high-throughput genomics data (RNA-seq, microarray, CNV, etc.) generated from the ~11,000 samples in the TCGA.
- A set of common analyses over these data (PCA, differential expression, GSEA, etc.) implemented using the FacileData API (FacileAnalyses).
- A set of modularized, interactive (shiny) components that can query, retrieve, and display data from a container that implements the FacileData API.
- A set of interactive (shiny) components that can drive analyses and present their results interactively. Results can be displayed within the context of a shiny app, or as an interactive htmlwidget in an Rmarkdown report.
Importantly, the interactive components built here can all be woven together into a shiny application that is driven entirely by a point-and-click interface, or they can be invoked individually as shiny gadgets by an analyst working in an interactive R session.
We plan to open source these tools in the second half of 2019.
This talk was the public debut of FacileData, given on May 3, 2017 at PLOTCON. We outlined the need for tools that enable better collaboration between biologists and computational biologists, and showcased how the FacileDataSet and a FacileExplorer shiny application serve this purpose.
The video of this talk has been posted to YouTube.