YeastCodonAnalysis

Analyses performed on a yeast codon dataset: taxonomy cleaning, PCA/sPCA, and classification.

Dataset

Dataset created by Pascal Durrens (CNRS) in 2020 (see codons_2020 for details about the construction of the dataset).

Taxonomy

taxonomy_cleaning.py :

Taxonomy cleaning is used to make a usable .csv file from the taxonomy file in codons_2020 (codons_2020/CDS/MEASURES/SPECIES/genomes-species.csv).

usage: reglog.py [-h] Inputdir Outputdir

Extract taxonomic subset.

positional arguments:
  Inputdir    directory containing the whole dataset
  Outputdir   Path to output directory (will be created if it doesn't exist)

optional arguments:
  -h, --help  show this help message and exit

PCA

metrics_average.py :

usage: extract_subset [-h] Workingdir Outputdir Taxopath

Used to prepare the dataset (codons_2020/CDS/MEASURES/SEQUENCES/ALL_MEASURES) for PCA/sPCA analysis by computing the median of each measure for each species. The taxonomy file is the one generated by taxonomy_cleaning.py.

positional arguments:
  Workingdir  directory containing the whole dataset
  Outputdir   Path to output directory
  Taxopath    Path to taxonomy file

optional arguments:
  -h, --help  show this help message and exit
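The per-species median step described above can be sketched as follows. This is a minimal illustration using only the standard library, not the code of metrics_average.py: the column names (`species`, `gc3`, `cai`) and the sample values are assumptions made up for the example, and the real measure files may be laid out differently.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def median_by_species(rows, species_key="species"):
    """Return {species: {measure: median}} over all non-species columns."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        species = row[species_key]
        for name, value in row.items():
            if name == species_key:
                continue
            # Every remaining column is assumed to hold a numeric measure.
            grouped[species][name].append(float(value))
    return {
        species: {name: median(values) for name, values in measures.items()}
        for species, measures in grouped.items()
    }


# Hypothetical per-sequence measures; real column names will differ.
SAMPLE = """species,gc3,cai
Sp_A,0.40,0.60
Sp_A,0.50,0.70
Sp_A,0.90,0.80
Sp_B,0.30,0.50
Sp_B,0.50,0.70
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
medians = median_by_species(rows)
for species, measures in sorted(medians.items()):
    print(species, measures)
```

The median is less sensitive than the mean to a few atypical sequences, which is one plausible reason to use it when reducing many sequences to one row per species.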

PCA.R :

usage: extract_subset [-h] Workingdir Outputdir Taxopath Subset_level Subset_name groups

Used to perform PCA and sPCA on a file and save the result in PDF format.

positional arguments:
  Workingdir    directory containing the whole dataset
  Outputdir     Path to output directory
  Taxopath      Path to taxonomy file
  Subset_level  Name of the taxon level you want to keep (phylum, class, order,...)
  Subset_name   Name of the taxon you want to keep at the chosen level
  groups        Name of the taxon level you want to highlight

optional arguments:
  -h, --help    show this help message and exit
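To make the roles of Subset_level, Subset_name and groups concrete, here is a small Python sketch of the selection logic they suggest. It is an illustration only and does not reproduce PCA.R: the taxonomy columns (`species`, `phylum`, `class`) are an assumed layout for the cleaned taxonomy file, and both helper functions are hypothetical.

```python
import csv
import io


def select_subset(taxonomy_rows, subset_level, subset_name):
    """Keep the rows whose value at `subset_level` equals `subset_name`."""
    return [row for row in taxonomy_rows if row.get(subset_level) == subset_name]


def group_labels(rows, groups):
    """Map each species to its taxon at the `groups` level (used for highlighting)."""
    return {row["species"]: row[groups] for row in rows}


# Assumed layout of the cleaned taxonomy file; the real columns may differ.
TAXONOMY = """species,phylum,class
Saccharomyces cerevisiae,Ascomycota,Saccharomycetes
Schizosaccharomyces pombe,Ascomycota,Schizosaccharomycetes
Cryptococcus neoformans,Basidiomycota,Tremellomycetes
"""

taxonomy_rows = list(csv.DictReader(io.StringIO(TAXONOMY)))

# Subset_level="phylum", Subset_name="Ascomycota", groups="class"
subset = select_subset(taxonomy_rows, "phylum", "Ascomycota")
labels = group_labels(subset, "class")
for species, taxon in labels.items():
    print(f"{species}: {taxon}")
```

In this reading, Subset_level and Subset_name decide which species enter the analysis, while groups only decides how the retained species are labelled on the plot.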

PCA_CANDIDA.R :

usage: extract_subset [-h] Workingdir Outputdir Candida

Used to perform PCA and sPCA analysis on Candida yeast species only. It relies on the Candida_data.csv file, which provides specific data about Candida species.

positional arguments:
  Workingdir  directory containing the whole dataset
  Outputdir   Path to output directory
  Candida     Path to candida data file

optional arguments:
  -h, --help  show this help message and exit

Classification

prepare_reg_log.py :

usage: prepare log reg [-h] [-f [FAMILY ...]] [-s [SPECIES ...]] [-gc [GENECODETYPE ...]] Workingdir Outputdir Taxopath

Prepares the data for the logistic regression model by selecting the subset of species to use for the classification.

positional arguments:
  Workingdir            directory containing the whole dataset
  Outputdir             Path to output directory (will be created if it doesn't exist)
  Taxopath              Path to taxonomy file

optional arguments:
  -h, --help            show this help message and exit
  -f [FAMILY ...], --family [FAMILY ...]
                        families of interest
  -s [SPECIES ...], --species [SPECIES ...]
                        list of species to keep
  -gc [GENECODETYPE ...], --genecodetype [GENECODETYPE ...]
                        Choose which genecode type you want to include (default = W)
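For readers who want to see how such an interface behaves, the sketch below rebuilds a parser from the usage text above with Python's argparse. It is a reconstruction from the help output, not the actual source of prepare_reg_log.py; the example paths (`codons_2020`, `out`, `taxonomy.csv`) and the family name are placeholders.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the usage text shown above."""
    parser = argparse.ArgumentParser(
        prog="prepare log reg",
        description="Prepare the data for logistic regression model by "
        "selecting the subset of species to use for the classification.",
    )
    parser.add_argument("Workingdir", help="directory containing the whole dataset")
    parser.add_argument(
        "Outputdir",
        help="Path to output directory (will be created if it doesn't exist)",
    )
    parser.add_argument("Taxopath", help="Path to taxonomy file")
    # nargs="*" lets each option take zero or more values.
    parser.add_argument("-f", "--family", nargs="*", help="families of interest")
    parser.add_argument("-s", "--species", nargs="*", help="list of species to keep")
    parser.add_argument(
        "-gc",
        "--genecodetype",
        nargs="*",
        default=["W"],
        help="Choose which genecode type you want to include (default = W)",
    )
    return parser


# Placeholder arguments: positionals first, multi-value options afterwards.
args = build_parser().parse_args(
    ["codons_2020", "out", "taxonomy.csv", "-f", "Saccharomycetaceae"]
)
print(args.Workingdir, args.family, args.genecodetype)
```

Because -f, -s and -gc each accept several values, placing them after the three positional arguments avoids the option swallowing the paths as extra values.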

reglog.py :

usage: reglog.py [-h] Inputdir Outputdir

Performs classification on the dataset under multiple conditions. The input files are those created by prepare_reg_log.py; if there are several, they are processed one at a time.

positional arguments:
  Inputdir    directory containing the whole dataset
  Outputdir   Path to output directory (will be created if it doesn't exist)

optional arguments:
  -h, --help  show this help message and exit
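The "one file at a time" behaviour and the automatic creation of the output directory can be sketched like this. It is an assumption-laden outline, not the code of reglog.py: the `*.csv` pattern, the `_result.txt` naming and the `count_rows` handler (a stand-in for the actual model fitting) are all hypothetical.

```python
import tempfile
from pathlib import Path


def process_directory(input_dir, output_dir, handle_file):
    """Apply `handle_file` to each CSV in `input_dir`, one file at a time."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    # The output directory is created if it doesn't exist.
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for dataset in sorted(input_dir.glob("*.csv")):
        target = output_dir / f"{dataset.stem}_result.txt"
        target.write_text(handle_file(dataset))
        written.append(target)
    return written


def count_rows(dataset):
    """Placeholder for the real classification step: report the row count."""
    with dataset.open() as handle:
        n_rows = sum(1 for _ in handle) - 1  # minus the header line
    return f"{dataset.name}: {n_rows} rows\n"


with tempfile.TemporaryDirectory() as tmp:
    inputs = Path(tmp) / "in"
    inputs.mkdir()
    (inputs / "subset_a.csv").write_text("species,label\nSp_A,0\nSp_B,1\n")
    (inputs / "subset_b.csv").write_text("species,label\nSp_C,1\n")
    results = process_directory(inputs, Path(tmp) / "out" / "run1", count_rows)
    report = [path.read_text() for path in results]
    result_names = [path.name for path in results]

print("".join(report), end="")
```

Writing one result file per input keeps the conditions separate, so a failure on one subset does not affect the outputs already written for the others.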

model_pathos.py :

usage: model_pathos.py [-h] File Outputdir

Performs classification on the dataset under multiple conditions. The input files are those created by prepare_model_pathos.py; if there are several, they are processed one at a time.

positional arguments:
  File        File with dataset
  Outputdir   Path to output directory

optional arguments:
  -h, --help  show this help message and exit
