Error in rule polish_clusters (both with the test data and with real data) #13
Comments
I'm having the same problem.
I went through the other posted issues and found that pip install numpy==1.19.5 fixed the problem for me.
Hi @samir-watson, thank you for pointing this out. I also found the issue you are probably referring to (#11). Unfortunately, installing numpy 1.19.5 did not solve the problem for me. I followed the procedure in #11 but still get the same error.
Update: I installed the pipeline on my laptop as well and hit the same issue. There, installing numpy 1.19.5 did fix the problem, but it did not on our data analysis system.
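Since the downgrade helped on one machine but not on another, it may be worth confirming which numpy the active environment actually resolves. The sketch below is only an illustration: the pin 1.19.5 comes from this thread, while the helper names (parse_version, is_newer_than_pin, installed_numpy) are hypothetical, and nothing in this thread establishes that a different active environment is the cause on the analysis system.

```python
import sys
from importlib import metadata
from itertools import takewhile

PINNED = "1.19.5"  # version reported in this thread to fix the error


def parse_version(text):
    """Turn a string such as '1.21.2' into a tuple such as (1, 21, 2)."""
    parts = []
    for piece in text.split(".")[:3]:
        # Keep only the leading digits, so '0rc1' is read as 0.
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_newer_than_pin(installed, pinned=PINNED):
    """Return True when the installed version is newer than the pinned one."""
    return parse_version(installed) > parse_version(pinned)


def installed_numpy():
    """Return the numpy version visible to this interpreter, or None."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_numpy()
    print("interpreter:", sys.executable)
    if found is None:
        print("numpy is not installed in this environment")
    elif is_newer_than_pin(found):
        print("numpy", found, "is newer than the pinned", PINNED)
    else:
        print("numpy", found, "is not newer than the pinned", PINNED)
```

Running this with the same interpreter that Snakemake uses shows both the interpreter path and the numpy version it sees, which helps rule out a pin that was applied to a different environment.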
Numpy error: #13. Solution (downgrade numpy): #13 (comment). With reference to the above, I propose pinning numpy==1.19.5 in the install instructions; otherwise, following the provided conda instructions installs numpy-base-1.21.2 (4.8 MB) by default, and that version triggers the error.
Hello,
I installed the pipeline, which seems to have worked fine, but when I run the pipeline test as recommended:
# To test if the installation was successful run
$ snakemake -j 1 -pr --configfile config.yml
I get an error at step 6 (possibly the same as in #5). The same error also occurs with real data. I would be very grateful for any help. Thanks!
$ snakemake -j 1 -pr --configfile config.yml
Targets: EGFR_917
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job count min threads max threads
cluster 1 1 1
cluster_consensus 1 1 1
copy_bed 1 1 1
detect_umi_consensus_fasta 1 1 1
detect_umi_fasta 1 1 1
map_1d 1 1 1
map_consensus 2 1 1
polish_clusters 1 1 1
reads 1 1 1
reformat_consensus_clusters 1 1 1
reformat_filter_clusters 1 1 1
seqkit_bam_acc_tsv 1 1 1
split_reads 1 1 1
total 14 1 1
Select jobs to execute...
[Mon Oct 25 09:11:15 2021]
rule copy_bed:
input: data/example_egfr_amplicon.bed
output: example_egfr_single_read_run/targets.bed
jobid: 1
reason: Missing output files: example_egfr_single_read_run/targets.bed
wildcards: name=example_egfr_single_read_run
resources: tmpdir=/tmp
cp data/example_egfr_amplicon.bed example_egfr_single_read_run/targets.bed
[Mon Oct 25 09:11:15 2021]
Finished job 1.
1 of 14 steps (7%) done
Select jobs to execute...
[Mon Oct 25 09:11:15 2021]
rule map_1d:
input: data/example_egfr_single_cluster.fastq, data/example_egfr_reference.fasta
output: example_egfr_single_read_run/align/1d.bam, example_egfr_single_read_run/align/1d.bam.bai
jobid: 11
reason: Missing output files: example_egfr_single_read_run/align/1d.bam
wildcards: name=example_egfr_single_read_run
resources: tmpdir=/tmp
catfishq --max_n 0 data/example_egfr_single_cluster.fastq | minimap2 -ax map-ont -k 13 -t 1 data/example_egfr_reference.fasta - | samtools sort -@ 5 -o example_egfr_single_read_run/align/1d.bam - && samtools index -@ 1 example_egfr_single_read_run/align/1d.bam
[M::mm_idx_gen::0.006*1.09] collected minimizers
[M::mm_idx_gen::0.013*1.04] sorted minimizers
[M::main::0.013*1.04] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.013*1.04] mid_occ = 13
[M::mm_idx_stat] kmer size: 13; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.014*1.04] distinct minimizers: 35388 (98.51% are singletons); average occurrences: 1.028; average spacing: 5.326; total length: 193845
[M::worker_pipeline::0.086*0.42] mapped 50 sequences
[M::main] Version: 2.22-r1101
[M::main] CMD: minimap2 -ax map-ont -k 13 -t 1 data/example_egfr_reference.fasta -
[M::main] Real time: 0.088 sec; CPU: 0.039 sec; Peak RSS: 0.008 GB
[Mon Oct 25 09:11:15 2021]
Finished job 11.
2 of 14 steps (14%) done
Select jobs to execute...
[Mon Oct 25 09:11:15 2021]
rule split_reads:
input: example_egfr_single_read_run/align/1d.bam
output: example_egfr_single_read_run/fasta_filtered, example_egfr_single_read_run/stats/umi_filter_reads_stats.txt
jobid: 10
reason: Missing output files: example_egfr_single_read_run/fasta_filtered; Input files updated by another job: example_egfr_single_read_run/align/1d.bam
wildcards: name=example_egfr_single_read_run
resources: tmpdir=/tmp
Reads found: 50
Reads unmapped: 0 (0%)
EGFR_917
Reads found: 50
On target: 50 (100%)
0 concatamers - 0%
0 short - 0%
[Mon Oct 25 09:11:15 2021]
Finished job 10.
3 of 14 steps (21%) done
Select jobs to execute...
[Mon Oct 25 09:11:15 2021]
rule detect_umi_fasta:
input: example_egfr_single_read_run/fasta_filtered
output: example_egfr_single_read_run/fasta_umi/EGFR_917_detected_umis.fasta
jobid: 9
reason: Missing output files: example_egfr_single_read_run/fasta_umi/EGFR_917_detected_umis.fasta; Input files updated by another job: example_egfr_single_read_run/fasta_filtered
wildcards: name=example_egfr_single_read_run, target=EGFR_917
resources: tmpdir=/tmp
Counting reads in example_egfr_single_read_run/fasta_filtered/EGFR_917.fastq
100%|█████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 17395.09it/s]
Found 25 fwd and 25 rev reads (ratio: 1.0)
100.0% of reads contained both UMIs with max 3 mismatches
[Mon Oct 25 09:11:15 2021]
Finished job 9.
4 of 14 steps (29%) done
Select jobs to execute...
[Mon Oct 25 09:11:15 2021]
rule cluster:
input: example_egfr_single_read_run/fasta_umi/EGFR_917_detected_umis.fasta
output: example_egfr_single_read_run/clustering/EGFR_917/clusters_centroid.fasta, example_egfr_single_read_run/clustering/EGFR_917/clusters_consensus.fasta, example_egfr_single_read_run/clustering/EGFR_917/vsearch_clusters
jobid: 8
reason: Missing output files: example_egfr_single_read_run/clustering/EGFR_917/clusters_consensus.fasta, example_egfr_single_read_run/clustering/EGFR_917/vsearch_clusters; Input files updated by another job: example_egfr_single_read_run/fasta_umi/EGFR_917_detected_umis.fasta
wildcards: name=example_egfr_single_read_run, target=EGFR_917
resources: tmpdir=/tmp
mkdir -p example_egfr_single_read_run/clustering/EGFR_917/vsearch_clusters && vsearch --clusterout_id --clusters example_egfr_single_read_run/clustering/EGFR_917/vsearch_clusters/test --centroids example_egfr_single_read_run/clustering/EGFR_917/clusters_centroid.fasta --consout example_egfr_single_read_run/clustering/EGFR_917/clusters_consensus.fasta --minseqlength 40 --maxseqlength 60 --qmask none --threads 1 --cluster_fast example_egfr_single_read_run/fasta_umi/EGFR_917_detected_umis.fasta --clusterout_sort --gapopen 0E/5I --gapext 0E/2I --mismatch -8 --match 6 --iddef 0 --minwordmatches 0 --qmask none -id 0.85
vsearch v2.18.0_linux_x86_64, 62.7GB RAM, 32 cores
https://github.com/torognes/vsearch
Reading file example_egfr_single_read_run/fasta_umi/EGFR_917_detected_umis.fasta 100%
2802 nt in 50 seqs, min 54, max 58, avg 56
Sorting by length 100%
Counting k-mers 100%
Clustering 100%
Sorting clusters 100%
Writing clusters 100%
Clusters: 1 Size min 50, max 50, avg 50.0
Singletons: 0, 0.0% of seqs, 0.0% of clusters
Multiple alignments 100%
[Mon Oct 25 09:11:16 2021]
Finished job 8.
5 of 14 steps (36%) done
Select jobs to execute...
[Mon Oct 25 09:11:16 2021]
rule reformat_filter_clusters:
input: example_egfr_single_read_run/clustering/EGFR_917/clusters_consensus.fasta, example_egfr_single_read_run/clustering/EGFR_917/vsearch_clusters
output: example_egfr_single_read_run/clustering/EGFR_917/clusters_fa, example_egfr_single_read_run/stats/EGFR_917_vsearch_cluster_stats.tsv, example_egfr_single_read_run/clustering/EGFR_917/smolecule_clusters.fa
jobid: 7
reason: Missing output files: example_egfr_single_read_run/clustering/EGFR_917/clusters_fa, example_egfr_single_read_run/stats/EGFR_917_vsearch_cluster_stats.tsv, example_egfr_single_read_run/clustering/EGFR_917/smolecule_clusters.fa; Input files updated by another job: example_egfr_single_read_run/clustering/EGFR_917/clusters_consensus.fasta, example_egfr_single_read_run/clustering/EGFR_917/vsearch_clusters
wildcards: name=example_egfr_single_read_run, target=EGFR_917
resources: tmpdir=/tmp
umi_parse_clusters --smolecule_out example_egfr_single_read_run/clustering/EGFR_917/smolecule_clusters.fa --balance_strands --min_reads_per_clusters 20 --max_reads_per_clusters 60 --stats_out example_egfr_single_read_run/stats/EGFR_917_vsearch_cluster_stats.tsv -o example_egfr_single_read_run/clustering/EGFR_917/clusters_fa example_egfr_single_read_run/clustering/EGFR_917/clusters_consensus.fasta example_egfr_single_read_run/clustering/EGFR_917/vsearch_clusters
100%|████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1599.05it/s]
Clusters: 100% written (1)
Reads: 50 found
Reads: 0 removed (0.0%)
Reads: 100% written
Reads: 100% in written clusters
[Mon Oct 25 09:11:16 2021]
Finished job 7.
6 of 14 steps (43%) done
Select jobs to execute...
[Mon Oct 25 09:11:16 2021]
rule polish_clusters:
input: example_egfr_single_read_run/clustering/EGFR_917/clusters_fa, example_egfr_single_read_run/clustering/EGFR_917/smolecule_clusters.fa
output: example_egfr_single_read_run/fasta/EGFR_917_consensus_tmp, example_egfr_single_read_run/fasta/EGFR_917_consensus.bam, example_egfr_single_read_run/fasta/EGFR_917_consensus.fasta
jobid: 6
reason: Missing output files: example_egfr_single_read_run/fasta/EGFR_917_consensus.fasta; Input files updated by another job: example_egfr_single_read_run/clustering/EGFR_917/clusters_fa, example_egfr_single_read_run/clustering/EGFR_917/smolecule_clusters.fa
wildcards: name=example_egfr_single_read_run, target=EGFR_917
resources: tmpdir=/tmp
[Mon Oct 25 09:11:19 2021]
Error in rule polish_clusters:
jobid: 6
output: example_egfr_single_read_run/fasta/EGFR_917_consensus_tmp, example_egfr_single_read_run/fasta/EGFR_917_consensus.bam, example_egfr_single_read_run/fasta/EGFR_917_consensus.fasta
shell:
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: pipeline-umi-amplicon/.snakemake/log/2021-10-25T091115.498024.snakemake.log
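The log as pasted shows which rule failed but not the underlying error message, so the complete log file named above is the next place to look. As a rough sketch, the failing rules can be pulled out of a Snakemake log by matching lines of the form "Error in rule <name>:". That line format is taken from the output above; the function name failed_rules is hypothetical, and other Snakemake versions may word these lines differently.

```python
import re

# Matches lines such as "Error in rule polish_clusters:" as seen in the log above.
ERROR_RE = re.compile(r"^Error in rule (?P<rule>\w+):$")


def failed_rules(log_text):
    """Return the names of rules reported as failed, in order of appearance."""
    rules = []
    for line in log_text.splitlines():
        match = ERROR_RE.match(line.strip())
        if match:
            rules.append(match.group("rule"))
    return rules


if __name__ == "__main__":
    sample = "\n".join([
        "[Mon Oct 25 09:11:19 2021]",
        "Error in rule polish_clusters:",
        "jobid: 6",
        "Exiting because a job execution failed. Look above for error message",
    ])
    print(failed_rules(sample))
```

To apply it to a real run, read the file given on the "Complete log:" line and pass its contents to failed_rules.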