- adapt example cell type classification genes for ovarian cancer and melanoma samples.
- update cellranger rules
  - have new `cellranger_count_8` rule that includes syntax changes of cellranger v8. The new rule is the default; if an older version of cellranger should be used with the rule `cellranger_count`, the `ruleorder: cellranger_count > cellranger_count_8` in the Snakefile must be adapted.
  - Also, add new rule `gunzip_and_link_cellranger` and separate these steps from the cellranger rule.
- update and describe in main README how cellranger expects to find the raw FASTQ files.
- update conda environments
  - `celltyping.yaml`
  - `identify_doublets.yaml`
  - `sctransform_preprocessing.yaml`
- update `identify_doublets` output
  - Have results of `identify_doublets` rule in own subdirectory instead of the `counts_filtered` directory.
- update `generate_qc_plots_*`
  - have resulting QC plots written in own subdirectory instead of same directory as count files.
  - Improve memory usage.
  - Clean up script.
- update `filter_genes_and_cells.R`
  - implement iterative filtering to make sure the selected thresholds for genes and cells apply to all genes/cells of the downstream analyses
  - Clean up script.
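
The idea behind iterative filtering can be illustrated with a minimal Python sketch. It is not the code in `filter_genes_and_cells.R`; the function name, the two thresholds (`min_genes_per_cell`, `min_cells_per_gene`) and the toy matrix are assumptions chosen for illustration. Removing genes can push a cell below its threshold and vice versa, so both filters are repeated until nothing changes.

```python
def iterative_filter(counts, min_genes_per_cell, min_cells_per_gene):
    """Filter a genes x cells count matrix until both thresholds hold.

    counts: list of rows (one row per gene, one column per cell).
    Returns (kept_gene_indices, kept_cell_indices).
    Hypothetical helper for illustration only.
    """
    genes = list(range(len(counts)))
    cells = list(range(len(counts[0]))) if counts else []
    while True:
        # Keep cells that express enough of the genes still in play.
        kept_cells = [
            c for c in cells
            if sum(1 for g in genes if counts[g][c] > 0) >= min_genes_per_cell
        ]
        # Keep genes detected in enough of the cells that survived.
        kept_genes = [
            g for g in genes
            if sum(1 for c in kept_cells if counts[g][c] > 0) >= min_cells_per_gene
        ]
        # Stop once a full pass removes nothing.
        if kept_cells == cells and kept_genes == genes:
            return genes, cells
        genes, cells = kept_genes, kept_cells


# Toy matrix: rows are genes g0..g3, columns are cells c0..c3.
counts = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
kept_genes, kept_cells = iterative_filter(counts, 2, 2)
```

In this toy example a single pass would only drop gene g3. The second pass then drops cell c3 and gene g2, which no longer meet the thresholds once g3 is gone.
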
- update `plotting.R`
  - add more colours for clusters. Make sure enough colours are provided even with a high number of clusters.
  - make sure all cell types that are not found in a sample are still shown in the legend (with `show.legend = T`, adapt to new ggplot2 default settings)
  - Clean up script.
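
One way to guarantee enough cluster colours is to extend a fixed palette with generated hues. The following Python sketch only illustrates that approach and is not the logic used in `plotting.R`; the function name, the base palette and the saturation/brightness values are assumptions.

```python
import colorsys


def cluster_colours(n_clusters,
                    base_palette=("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3")):
    """Return exactly n_clusters hex colours.

    The base palette is used first; any further colours are taken from
    evenly spaced hues. Hypothetical helper; generated colours are not
    checked for similarity to the base palette.
    """
    colours = list(base_palette[:n_clusters])
    n_extra = n_clusters - len(colours)
    for i in range(n_extra):
        # Spread the additional hues evenly around the colour wheel.
        r, g, b = colorsys.hsv_to_rgb(i / n_extra, 0.65, 0.85)
        colours.append("#{:02X}{:02X}{:02X}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colours
```

Calling `cluster_colours(30)` returns 30 colour codes, so the number of clusters never exceeds the number of available colours.
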
- update `create_hdf5.py`
  - make sure the script can work with Human and also Mouse data. Mouse Ensembl gene IDs are longer than 16 characters and cannot be stored as type `dtype='S16'`.
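
The length problem can be illustrated with a short Python sketch. It is not taken from `create_hdf5.py`; the helper name is hypothetical and the 16-byte truncation is emulated with slicing rather than with an actual fixed-width array. The sketch derives the field width from the data instead of hard-coding it.

```python
def fixed_width_bytes_dtype(gene_ids):
    """Return a dtype string such as 'S18' wide enough for all gene IDs.

    Hypothetical helper for illustration only.
    """
    width = max(len(g.encode("ascii")) for g in gene_ids)
    return "S{}".format(width)


# Example Ensembl gene IDs: human IDs have 15 characters, mouse IDs 18.
human_ids = ["ENSG00000141510", "ENSG00000012048"]
mouse_ids = ["ENSMUSG00000000001", "ENSMUSG00000000003"]

human_dtype = fixed_width_bytes_dtype(human_ids)
mouse_dtype = fixed_width_bytes_dtype(mouse_ids)

# Emulate a 16-byte field: only the first 16 bytes of each ID are kept,
# so distinct mouse IDs become indistinguishable.
truncated = {g.encode("ascii")[:16] for g in mouse_ids}
```

Here `human_dtype` is `'S15'` and `mouse_dtype` is `'S18'`, while `truncated` contains a single value because both mouse IDs share their first 16 characters.
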
- fix `sctransform_preprocessing.R`
  - Filtering of raw input files was not applied to row and column names. This issue should have had no effect as long as filtered input data was provided (with a minimum of QC on genes and cells).
  - Changed to a check that stops the script if unfiltered input is detected.
  - Script linting.
- specify which library should be used for the function `ggsave` to avoid conflict between the R packages `ggplot2` and `cowplot`
- adapt script `query_civic_expr.py` to changed syntax in python package `civicpy` version 3.0. The script no longer works as is with previous versions of the package.
- adapt installation instructions for `civicpy` to require version 3.0
- adapt script `query_civic_expr.py` to changed syntax in python package `civicpy` version 2.0. The script no longer works as is with previous versions of the package.
- adapt installation instructions for `civicpy` to require version 2.0
- adapt R scripts using ggplot2 for plotting (`filter_genes_and_cells.R`) to changes introduced with ggplot2 v3.4.0. With new defaults in `scale_*_manual` only factor levels found in the data are shown in the legend.
- in rule `cellranger_count` consistently use the full path given in `config[inputOutput][input_fastqs]`. Update respective README sections.
- fix relative path to conda env yaml file in rule inheritance of `create_hdf5_starsolo`.
- fix bug in script `filter_genes_and_cells.R` that resulted in colour discrepancy between legend and plot in `{sample}.visualize_filtered_cells.png` in rare cases.
- specify which library should be used for the function ggsave to avoid conflict between ggplot2 and cowplot
- change run time keyword in rules and config from "time_min" or "time" to "runtime" as is recommended.
- have template memory values per job and not per thread
- use "mem_mb" consistently in rule files and config files
- Update example Ovarian cancer cell types to include Mesothelial cells and pDCs
- Fix empty output plot from rule `plot_upsetr`
The scAmpi pipeline framework was fully revised to follow the current Snakemake best practices. No changes were made regarding the analysis steps.
- the Snakemake pipeline framework was updated to follow current best practices
- using Snakemake checkpoints the scAmpi basic and clinical parts were joined together to be one workflow
- several samples can be run in parallel (previously that was only possible for scAmpi_basic)
First full, publicly available version of the scAmpi single-cell RNA analysis pipeline.