diff --git a/docs/data_preparation.md b/docs/data_preparation.md
index 658c019..6f3f9f8 100755
--- a/docs/data_preparation.md
+++ b/docs/data_preparation.md
@@ -92,7 +92,7 @@ The **clustersToIdentities** points to a tab-separated file containing the cell
 The **TF_list** is a list of TFs which is used in the GRNBoost2 run.
-The [TF_list.txt](/example/INPUTS/TF_list.tsv) contained in the [INPUTS](/example/INPUTS) folder contains 1879 TFs collected from [PlantRegMap/PlantTFDB v5.0](http://planttfdb.gao-lab.org/), [PlnTFDB v3.0](http://plntfdb.bio.uni-potsdam.de/v3.0/) and [TF2Network](http://bioinformatics.psb.ugent.be/webtools/TF2Network/) for which either direct TF-motif information was available or motif information related to the TF family.
+The [TF_list.tsv](/example/INPUTS/TF_list.tsv) contained in the [INPUTS](/example/INPUTS) folder contains 1879 TFs collected from [PlantRegMap/PlantTFDB v5.0](http://planttfdb.gao-lab.org/), [PlnTFDB v3.0](http://plntfdb.bio.uni-potsdam.de/v3.0/) and [TF2Network](http://bioinformatics.psb.ugent.be/webtools/TF2Network/) for which either direct TF-motif information was available or motif information related to the TF family.
 ```
 AT1G18790
diff --git a/example/README.md b/example/README.md
index ee36d4a..0760607 100755
--- a/example/README.md
+++ b/example/README.md
@@ -26,7 +26,7 @@ The [OUTPUTS folder](OUTPUTS/) contains four sub-folders + the log file:
 - miniexExample_rankedRegulons.xlsx (also available in tsv format), an excel file containing metadata for each of the inferred regulons. The different columns are explained below:
 - TF: TF gene name (i.e. AT1G71930)
 - alias: TF alias (i.e. VND7)
- - hasTFrelevantGOterm: 'relevant_known_TF' if the TF is associated to a relevant GO term (relative to [GOsIwant.txt](https://github.com/VIB-PSB/MINI-EX/tree/main/example/INPUTS/GOsIwant.txt)), 'known_TF' if the TF is associated to another experimentally validated and/or manually curated GO term, 'unknown_TF' when the TF is uncharacterized
+ - hasTFrelevantGOterm: 'relevant_known_TF' if the TF is associated to a relevant GO term (relative to [GOsIwant.tsv](https://github.com/VIB-PSB/MINI-EX/tree/main/example/INPUTS/GOsIwant.tsv)), 'known_TF' if the TF is associated to another experimentally validated and/or manually curated GO term, 'unknown_TF' when the TF is uncharacterized
 - GO: the GO term(s) the TF is associated with
 - GOdescription: the description of the GO term(s) associated with the TF
 - cluster: the cell cluster the TF acts in
@@ -38,7 +38,7 @@ The [OUTPUTS folder](OUTPUTS/) contains four sub-folders + the log file:
 - closeness: closeness-centrality
 - betweenness: betweenness-centrality
 - GO_enrich_qval (only present if 'termsOfInterest' is not null): FDR-corrected p-value of functional enrichment of the regulon's TGs (functional specificity - reporting only the lowest p-value among the relevant terms)
- - GO_enrich_term (only present if 'termsOfInterest' is not null): the relevant GO term (relative to [GOsIwant.txt](https://github.com/VIB-PSB/MINI-EX/tree/main/example/INPUTS/GOsIwant.txt)) for which the regulon's TGs showed the most significant enrichment
+ - GO_enrich_term (only present if 'termsOfInterest' is not null): the relevant GO term (relative to [GOsIwant.tsv](https://github.com/VIB-PSB/MINI-EX/tree/main/example/INPUTS/GOsIwant.tsv)) for which the regulon's TGs showed the most significant enrichment
 - GO_enrich_desc (only present if 'termsOfInterest' is not null): description of the GO term for which the regulon's TGs showed the most significant enrichment
 - #TGs_withGO (only present if 'termsOfInterest' is not null): number of TGs enriched for the most significant relevant GO term
 - borda_rank: global Borda ranking of the regulon
@@ -54,6 +54,6 @@ The [OUTPUTS folder](OUTPUTS/) contains four sub-folders + the log file:
 - miniexExample_heatmapDEcalls, a clustermap reporting whether the top 150 TFs are upregulated (blue) or just expressed (by at least 10% of the cells within the cell cluster - white) in the cell cluster they act. Each TF is color coded accoring to the GO terms associated to it, in green for 'relevant_known_TF', in yellow for 'known_TF', and gray for 'unknown_TF'
 ![miniexExample_heatmapDEcalls.svg](OUTPUTS/figures/miniexExample_heatmapDEcalls.svg)
- - miniexExample_regmap_x, a heatmap reporting top x TFs (default x range: 10, 25, 50, 100, "topRegs", all) per cluster, showing the maximum expression (mean of top 3 cells) per cluster and the respective ranking (borda_clusterRank). Clusters (i.e. columns) are sorted based on the index given in `miniexExample_identities_with_idx.txt`, allowing to track predicted regulators over predefined lineage. If `miniexExample_identities.txt` is provided instead, the clusters will be ordered as provided in this input file. These figures can be easily produced for additional thresholds/settings using the additionally provided `regmap.sh` script. Call `python3 bin/MINIEX_regmap.py -h` for additional parameters. Note: only regmaps having less than 40 TFs are generated in the PNG format
+ - miniexExample_regmap_x, a heatmap reporting top x TFs (default x range: 10, 25, 50, 100, "topRegs", all) per cluster, showing the maximum expression (mean of top 3 cells) per cluster and the respective ranking (borda_clusterRank). Clusters (i.e. columns) are sorted based on the index given in `miniexExample_identities_with_idx.tsv`, allowing to track predicted regulators over predefined lineage. If `miniexExample_identities.tsv` is provided instead, the clusters will be ordered as provided in this input file. These figures can be easily produced for additional thresholds/settings using the additionally provided `regmap.sh` script. Call `python3 bin/MINIEX_regmap.py -h` for additional parameters. Note: only regmaps having less than 40 TFs are generated in the PNG format
 ![miniexExample_regmap_25.svg](OUTPUTS/figures/miniexExample_regmap_8.svg)