This repo is intended to be a collection of general-purpose software, scripts, and functions written by members of the Fraser Lab at Stanford University.
The code deposited here is diverse, ranging from bioinformatic tools to generic plotting scripts.
We aim to make our workflows more efficient by reducing the time we spend re-solving common dry-lab problems, since several people have often worked on similar coding problems in the past.
We also aim to transparently share our strategies for common tasks in our lab, e.g. end-to-end processing of gene expression data, allele-specific expression data, etc.
We organize our code by topic, one folder per topic. Each folder contains its own readme and may have its own contribution guidelines.
We currently cover the following topics:
- compbio_snippets — miscellaneous code used to solve common bioinformatic issues (e.g. dealing with BAM files).
- config_files — config files for common software, e.g. vim, bash, etc.
- gtex — code to process and access GTEx data.
- plots — whether it's R or Python, we have a solution for your visualization task.
- tcga - code to process and access TCGA data.
- workflows — a collection of our pipelines in snakemake or nextflow.
We ask you to fork the repo to your own account, make your changes, and then submit a pull request.
Members of the Fraser Lab have direct write access to the repo, so they do not need to fork it or open a pull request.
Please read the guidelines below before submitting a pull request.
To add a file to an existing folder:
- Fork this repo to your GitHub account.
- Clone to your laptop or a server.
- Create or copy a file in the appropriate folder.
- Update the lookup list of the corresponding readme.
- Push to your fork.
- Submit a pull request to this repo.
To add a new folder:
- Fork this repo to your GitHub account.
- Clone to your laptop or a server.
- Create a new folder.
- [Optional] Create or copy a file in the new folder.
- Create a readme and a lookup list for the folder.
- Push to your fork.
- Submit a pull request to this repo.
To add a workflow:
- The workflow should be hosted in its own repo.
- Add the link to that repo in the workflow's readme.
While we provide general guidelines that all code should follow, each folder may have its own specific guidelines, so please make sure to review the corresponding readme.
- One file per issue and, if possible, one function per file. Each file should only contain code to solve a specific task. Multiple functions are allowed only if they all contribute to the same issue. Please avoid submitting a "collection" of functions.
- Each file and its description has to be included in the lookup list of the corresponding folder's readme.
- Workflows should be written for a workflow manager such as nextflow or snakemake. We only include links to their corresponding repos (i.e. your pipeline repo).
- All functions/scripts must contain:
  - A brief description of what they do.
  - Explanation of inputs, outputs, and arguments.
  - List of dependencies.
  - An example, if possible.
- Follow these conventions:
  - Python
    - Docstrings (any style is ok).
    - Type hints are not required but highly recommended.
  - R
  - shell/bash
    - (suggestions?).
  - Workflows
    - Include a description, along with a brief summary of the important steps in the pipeline that may be useful to other people (e.g. mapping reads).
    - See the corresponding folder for more details.
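
As an illustration of the Python conventions above, here is a minimal sketch of what a single-purpose file might look like. The file name `gc_content.py`, the function `gc_content`, and the choice to ignore non-ACGT characters are hypothetical and not taken from this repo; the docstring layout is only one of many acceptable styles.

```python
"""gc_content.py (hypothetical example file).

Description:
    Compute the GC content of a DNA sequence.
Inputs:
    sequence: DNA string; case-insensitive.
Outputs:
    Fraction of G and C bases among the A/C/G/T bases, as a float in [0, 1].
Dependencies:
    Python standard library only.
Example:
    >>> gc_content("ACGT")
    0.5
"""


def gc_content(sequence: str) -> float:
    """Return the fraction of G/C bases in a DNA sequence.

    Args:
        sequence: DNA sequence; case-insensitive. Characters other than
            A, C, G, and T (e.g. N) are ignored (an assumption made for
            this sketch).

    Returns:
        GC fraction in the range [0, 1].

    Raises:
        ValueError: If the sequence contains no A, C, G, or T bases.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return gc / total
```

The file holds one function for one task, states its inputs, outputs, and dependencies up front, and includes a short example, so its entry in the folder's lookup list can be a one-line summary of the module docstring.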
- Python