small fixes

DendrouLab · Mar 13, 2024 · 4261378
1 parent 31eb8c6
commit 4261378

Showing 2 changed files with 13 additions and 10 deletions.
diff --git a/docs/usage/gene_list_format.md b/docs/usage/gene_list_format.md
@@ -43,7 +43,7 @@ For a typical usecase, we provide example lists on our [github page](https://git
 The human-only cellcycle genes used in [scanpy.score_genes_cell_cycle](https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.score_genes_cell_cycle.html) 
 are stored in [resources/cell_cycle_genes.csv](https://github.com/DendrouLab/panpipes/blob/main/panpipes/resources/cell_cycle_genes.tsv)
 
-However, if the data is mouse only then the cellcycle gene list can be found in [resources/mouse_cell_cycle_genes.tsv](https://github.com/DendrouLab/panpipes/blob/mouse_cell_cycle/panpipes/resources/mouse_cell_cycle_genes.tsv)
+However, if you are working with mouse data, we supply an alternative cellcycle gene list with murine genes, which  can be found in [resources/mouse_cell_cycle_genes.tsv](https://github.com/DendrouLab/panpipes/blob/mouse_cell_cycle/panpipes/resources/mouse_cell_cycle_genes.tsv)
 
 Differently from the other custom gene file, the cell cycle file should be a **tab separated file with two columns**:
 
@@ -105,6 +105,7 @@ However, if the input is from mouse data then, the custom genelist file can be s
 
     ```yaml
     exclude_file: resources/qc_gene_list_mouse.csv
+    ```
 ### Explaining custom gene lists actions
 
 1. **Ingest workflow** (pipeline_ingest.py)

diff --git a/docs/yaml_docs/pipeline_ingestion_yml.md b/docs/yaml_docs/pipeline_ingestion_yml.md
@@ -201,21 +201,23 @@ In the ingestion workflow we compute cell and genes QC metrics (such as % of mit
 Feel free to leave options blank to run with default parameters.
 
 #### Providing a gene list
-To calculate RNA QC metrics, we need to define a gene list providing additional information on the genes in the data.
+To calculate RNA QC metrics based on custom genes annotations, we need to use a gene list providing additional information on the genes expressed in the data.
 Additionally, we can specify what actions we want to apply to the genes, such as what metrics to calculate.
 
-<span class="parameter">custom_genes_file</span>`String`, Default: resources/qc_genelist_1.0.csv<br>
+Please visit our documentation section on [creating and using custom genes lists](../usage/gene_list_format.md) to perform quality control and visualization. 
+<span class="parameter">custom_genes_file</span>`String`, Mandatory parameter, Default: resources/qc_genelist_1.0.csv<br>
     Path to the file containing the entire human gene list. Panpipes provides such a file with standard genes, and the path to this file is set as default.
 
-However, if the input is from mouse data then the user must provide the mouse gene list as shown here: 
+##### Working with different species than human
+*If working with a different species, the user must provide the appropriate gene list. For example, we offer a precompiled version of the qc gene list for mouse, the user can supply the list by specifying the path to the file as shown here:*
 
- <span class="parameter">custom_genes_file</span>`String`, Default: qc_gene_list_mouse.csv<br>
+ `custom_genes_file:  qc_gene_list_mouse.csv`
 
-This mouse gene list can be found in the panpipes [resources](https://github.com/DendrouLab/panpipes/blob/mouse_gene_list_upload/panpipes/resources/qc_gene_list_mouse.csv)
+*Find the mouse gene list in our [resources](https://github.com/DendrouLab/panpipes/blob/mouse_gene_list_upload/panpipes/resources/qc_gene_list_mouse.csv)*
 
-Usually, it's convenient to rely on known gene lists, as this simplifies various downstream tasks, such as evaluating the percentage of mitochondrial genes in the data, identify ribosomal genes, or excluding IGG genes from HVG selection.
-For the ingestion workflow, we retrieved the cell cycle genes used in `scanpy.score_genes_cell_cycle` [Satija et al. (2015), Nature Biotechnology](https://www.nature.com/articles/nbt.3192) and stored them in a file: panpipes/resources/cell_cicle_genes.tsv.
-Additionally, we also provide an example for an entire gene list: panpipes/resources/qc_genelist_1.0.csv 
+
+It's convenient to rely on known gene lists, as this simplifies various downstream tasks, such as evaluating the percentage of mitochondrial genes in the data, identify ribosomal genes, or excluding IGG genes from HVG selection.
+For the ingestion workflow, we retrieved the cell cycle genes used in `scanpy.score_genes_cell_cycle` [Satija et al. (2015), Nature Biotechnology](https://www.nature.com/articles/nbt.3192) 
 
 | mod | feature | group  |
 |-----|---------|--------|
@@ -228,7 +230,7 @@ Additionally, we also provide an example for an entire gene list: panpipes/resou
 Next, we define "actions" on the genes as follows:
 
 In the group column, specify what actions you want to apply to that specific gene.
-For instance: calc_proportion: mt will calculate proportion of reads mapping to the genes whose group is "mt".
+For instance: `calc_proportion: mt` will calculate proportion of reads mapping to the genes whose group is "mt" in the custom genes file.
 
 (for pipeline_ingest.py)
 calc_proportions: calculate proportion of reads mapping to X genes over total number of reads, per cell
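
Note on the `calc_proportions` action described in the hunk above: the sketch below illustrates the arithmetic only, using the Python standard library. It is not the panpipes implementation. The gene list rows, the gene names, the `counts` dictionary, and the helper names `genes_in_group` and `proportion_per_cell` are all hypothetical; only the `mod`, `feature`, `group` column layout and the "mt" group label come from the documentation being changed.

```python
import csv
import io

# Hypothetical gene list in the mod/feature/group layout shown in the docs.
# The rows are illustrative and not copied from qc_genelist_1.0.csv.
GENE_LIST_CSV = """mod,feature,group
RNA,MT-ND1,mt
RNA,MT-CO1,mt
RNA,RPL3,rp
"""


def genes_in_group(csv_text, group):
    """Return the set of features whose group column equals `group`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["feature"] for row in reader if row["group"] == group}


def proportion_per_cell(counts, genes):
    """Per cell: reads mapping to `genes` divided by total reads in the cell.

    `counts` maps cell id -> {gene name -> read count} (assumed structure).
    Cells with zero total reads get a proportion of 0.0.
    """
    props = {}
    for cell, gene_counts in counts.items():
        total = sum(gene_counts.values())
        hits = sum(n for gene, n in gene_counts.items() if gene in genes)
        props[cell] = hits / total if total else 0.0
    return props


# Toy count data, made up for illustration.
counts = {
    "cell_1": {"MT-ND1": 5, "MT-CO1": 5, "RPL3": 10, "ACTB": 80},
    "cell_2": {"MT-ND1": 0, "RPL3": 20, "ACTB": 20},
}

mt_genes = genes_in_group(GENE_LIST_CSV, "mt")
mt_prop = proportion_per_cell(counts, mt_genes)
```

Here `mt_prop` holds one value per cell: the fraction of that cell's reads that map to genes whose group is "mt" in the gene list. Any other group label in the file could be passed in the same way.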