I am trying to train ChromBPNet on Arabidopsis ATAC-seq data, but run into the error "Input file shifts inconsistent", as in #153, #174 and #176. Having read through these other issues, I was able to generate the PWM for my BAM file:

The raw FASTQ data was processed using the nf-core ATAC-seq pipeline with Bowtie2 as the chosen aligner. To train ChromBPNet, I took the merged BAM file containing the alignments of all replicates (found under bowtie2/merged_replicate/ among the output of the Nextflow pipeline).

My command used for training (I am running on a computing cluster with Apptainer):
apptainer exec --nv -e \
  --no-mount /scratch \
  --bind data:/data \
  --bind bias_model:/bias_model \
  --bind potter-kc:/potter \
  --bind out:/output \
  chrombpnet.sif \
  chrombpnet pipeline \
  -ibam /potter/CONTROL.mRp.clN.sorted.bam \
  -d ATAC \
  -g /potter/ath.fasta \
  -c /potter/ath.chrom.sizes.txt \
  -p /data/peaks_no_blacklist.bed \
  -n /data/background_negatives.bed \
  -fl /data/train1-3_val4_test5.json \
  -b /bias_model/ENCSR868FGK_bias_fold_0.h5 \
  -o /output
This worked fine when using the data from the tutorial, but yields the inconsistent shift error on my own data. To my knowledge, the nf-core ATAC-seq pipeline does not apply any shifts; at least, I did not find any mention of shifting in the documentation.
The commands I used to generate the above PWM from my BAM file:
Do you have any suggestions on how to proceed? I don't have sufficient background on Tn5 bias to judge if the PWM looks as expected. Are there any references you suggest to learn more about Tn5 bias? I am wondering if the issue could be due to specific bias in Arabidopsis or plants in general, compared to the human genome.
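
For anyone reading along who is unsure what such a PWM summarizes, here is a minimal sketch of the idea in plain Python. It is not the code ChromBPNet uses and not the commands I ran. It tallies base frequencies in a small window around the 5' end of each read, reverse-complementing the window for minus-strand reads so that both strands are viewed in the same orientation. The function name `cut_site_frequencies`, the toy genome and the toy reads are made up for illustration; a real check would take the 5' ends from the BAM file and the sequence from the FASTA file.

```python
from collections import Counter

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def cut_site_frequencies(reads, genome, flank=3):
    """Tally base frequencies around read 5' ends.

    reads:  iterable of (chrom, five_prime_pos, strand), 0-based positions
            (hypothetical input format chosen for this sketch).
    genome: dict mapping chromosome name to its sequence.
    Returns a list with one dict per window position, mapping base -> frequency.
    """
    width = 2 * flank + 1
    counts = [Counter() for _ in range(width)]
    for chrom, pos, strand in reads:
        seq = genome[chrom]
        start, end = pos - flank, pos + flank + 1
        if start < 0 or end > len(seq):
            continue  # skip windows that run off the chromosome
        window = seq[start:end].upper()
        if strand == "-":
            window = reverse_complement(window)
        for i, base in enumerate(window):
            if base in "ACGT":  # ignore N and other ambiguity codes
                counts[i][base] += 1
    freqs = []
    for c in counts:
        total = sum(c.values())
        freqs.append({b: (c[b] / total if total else 0.0) for b in "ACGT"})
    return freqs


# Toy data, invented for illustration only.
toy_genome = {"chr1": "ACGTACGTACGTACGT"}
toy_reads = [("chr1", 4, "+"), ("chr1", 8, "+"), ("chr1", 11, "-")]
pfm = cut_site_frequencies(toy_reads, toy_genome, flank=2)
for offset, column in zip(range(-2, 3), pfm):
    print(offset, column)
```

If the read ends in a BAM file had been moved by a constant offset, I would expect the enriched pattern in such a matrix to sit at a correspondingly different position relative to the window center, which is my understanding of why the PWM is useful for diagnosing shifts.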