From 0f926fe476e1a9a6de69921d17601d950ce1c0f7 Mon Sep 17 00:00:00 2001
From: giuliaelgarcia <147185635+giuliaelgarcia@users.noreply.github.com>
Date: Tue, 19 Mar 2024 17:48:30 +0000
Subject: [PATCH 1/3] Update troubleshooting.md

---
 docs/troubleshooting.md | 88 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)

diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index 1ba11de1..70c94068 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -18,3 +18,91 @@ You can fix the issue by ...
 Second: Before re-running panpipes, we recommend deleting any intermediate files that were created in the previous run which broke halfway through.
+
+
+### Error in Plot 10x metric task in ingestion when not starting from CellRanger outputs
+**NoneType error**
+
+First: check the log files to see what went wrong.
+- In this case the pipeline failed at:
+
+```
+TypeError: argument of type 'NoneType' is not iterable \
+```
+- After noting the error, find which log file to inspect from the Job line:
+```
+Task = def pipeline_ingest.aggregate_tenx_metrics_multi(...): \
+Job = [None -> logs/tenx_metrics_multi_aggregate.log] \
+```
+
+- Inspect the log file for this process in `logs/tenx_metrics_multi_aggregate.log`
+- Check which Python script failed to identify the failing task. In this case, the error was in:
+
+  ```
+  File "../panpipes/panpipes/python_scripts/aggregate_cellranger_summary_metrics.py"
+  ```
+- This indicates that a parameter involving the `10x metric` task was set incorrectly in `pipeline.yml`. The beginning of the `pipeline.log` file shows that `plot_10X_metrics` was set to `True` although the data did not come from CellRanger. Because the input data was an AnnData object, `plot_10X_metrics` should be set to `False`.
+- Setting `plot_10X_metrics` to `False` in `pipeline.yml` fixes the issue
+- To rerun, first delete the `logs/tenx_metrics_multi_aggregate.log` file
+- Then run `panpipes ingest make full`
+
+
+
+### Error in Plot 10x metric task in ingestion when starting from CellRanger outputs using 10X.h5 format
+**KeyError**
+
+First: check the log files to see what went wrong.
+- In this case the pipeline failed at:
+
+```
+raise KeyError(key) from err \
+KeyError: 'Median UMI counts per cell' \
+```
+
+- After noting the error, check which task ran last:
+
+```
+Traceback (most recent call last): \
+File "/data/leuven/344/vsc34406/Miniconda3/envs/Panpipes/lib/python3.9/site-packages/panpipes/python_scripts/aggregate_cellranger_summary_metrics.py", line 268, in \
+```
+
+- In this case, the error occurred while running the Python script `aggregate_cellranger_summary_metrics.py`
+- This means that a particular key could not be found. In this case, it was `Median UMI counts per cell`
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+### Wrong cell_cycle gene list inputted
+**No valid genes were passed for scoring**
+
+First: check the log files to see what went wrong.
+- In this case the pipeline failed at:
+```
+ValueError: No valid genes were passed for scoring. \
+```
+
+- After noting the error, find which log file to inspect from the Job line:
+```
+ Task = def pipeline_ingest.run_rna_qc(...): \
+ Job = [None -> logs/run_scanpy_qc_rna.log, mouse_integrated_cell_metadata.tsv, mouse_integrated_unfilt.h5mu] \
+```
+- In this case, also check the WARNING:
+
+```
+ WARNING: genes are not in var_names and ignored: ['MCM5', 'PCNA', 'TYMS', 'FEN1', 'MCM2', 'MCM4', 'RRM1', 'UNG', 'GINS2', 'MCM6', 'CDCA7', 'DTL', 'PRIM1', 'UHRF1', 'MLF1IP', 'HELLS', 'RFC2', 'RPA2', 'NASP', 'RAD51AP1', 'GMNN', 'WDR76', 'SLBP', 'CCNE2', 'UBR7', 'POLD3', 'MSH2', 'ATAD2', 'RAD51', 'RRM2', 'CDC45', 'CDC6', 'EXO1', 'TIPIN', 'DSCC1', 'BLM', 'CASP8AP2', 'USP1', 'CLSPN', 'POLA1', 'CHAF1B', 'BRIP1', 'E2F8'] \
+```
+- This shows that the provided gene list does not match the input data, since **none** of the genes can be found. In this example, the gene list is human-specific but the input data came from a mouse model, which requires a mouse-specific gene list
+- To fix the issue, set `custom_genes_file` in the preprocess `pipeline.yml` to a mouse gene list provided in `panpipes/resources`
+- Delete the log folders and any temp files
+- Run `panpipes ingest make full`
+

From 6a8499cd4b578e9d0306cef1d6c7be0a542f2877 Mon Sep 17 00:00:00 2001
From: giuliaelgarcia <147185635+giuliaelgarcia@users.noreply.github.com>
Date: Tue, 19 Mar 2024 17:50:27 +0000
Subject: [PATCH 2/3] Update troubleshooting.md

---
 docs/troubleshooting.md | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index 70c94068..26716834 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -74,14 +74,6 @@ File "/data/leuven/344/vsc34406/Miniconda3/envs/Panpipes/lib/python3.9/site-pack
 
 
 
-
-
-
-
-
-
-
-
 ### Wrong cell_cycle gene list inputted
 **No valid genes were passed for scoring**
 

From 811f0087b0c9e4b5acddf90c992f2ee6ea589511 Mon Sep 17 00:00:00 2001
From: Giulia Garcia <147185635+giuliaelgarcia@users.noreply.github.com>
Date: Fri, 22 Mar 2024 13:22:57 +0000
Subject: [PATCH 3/3] Update troubleshooting.md

---
 docs/troubleshooting.md | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index 26716834..af4f9b1e 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -98,3 +98,22 @@ ValueError: No valid genes were passed for scoring. \
 - Delete the log folders and any temp files
 - Run `panpipes ingest make full`
 
+
+### PlotQC error when running preprocess with RNA-only data
+**No such file or directory**
+First: check the log files to see what went wrong.
+- In this case the pipeline failed at:
+```
+FileNotFoundError: [Errno 2] No such file or directory:
+```
+- After noting the error, find which log file to inspect from the Job line:
+```
+ Task = def pipeline_preprocess.filter_mudata(...): \
+ Job = [None -> humanised_preprocessed.h5mu] \
+```
+- Before proceeding, double-check the `pipeline.yml` to confirm that the correct path was provided for the MuData object.
+- Once you are sure the path is correct, check the modalities in the `pipeline.yml`
+- In this case, the MuData object contains only RNA, and RNA was the only modality set to `True`. However, the `plotqc` settings still list metrics under `prot_metrics` even though there is no protein in the data.
+- To fix the issue, clear the metrics in `prot_metrics`
+- Delete the log folder and any temp files
+- Re-run `panpipes preprocess make full`