BUG: Workflow not running #681
Comments
Is it possible there is a shell message being output? Those are not captured in stdout/stderr, and won't end up in the log but will be printed in your terminal. Segfaults and memory issues are examples of this. Maybe setting |
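As a rough illustration of why such failures can be invisible in a log: on a POSIX system, a child process that is killed by a signal (a segfault, or an out-of-memory kill) may write nothing to stdout/stderr, but its return code is negative. The sketch below uses only the Python standard library; the helper name `describe_exit` is made up for this example and is not part of seq2science or Snakemake.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable message."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a return code of -N means the child was killed by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"failed with exit code {returncode}"


if __name__ == "__main__":
    # Hypothetical child that kills itself, standing in for a crashing tool.
    child = "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"
    result = subprocess.run(
        [sys.executable, "-c", child], capture_output=True, text=True
    )
    print(describe_exit(result.returncode))
    print("captured stderr:", repr(result.stderr))
```

Checking the return code this way can reveal a crash even when the captured output is empty.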
The cluster was a bit busy today, I hope it will run during the night |
Hi Maarten, I have good news: I tried the atac-seq workflow as well yesterday and it worked just fine. I'll try it today with my own samples. Concerning the chip-seq workflow, I tried the modification you suggested; the genome is now
The file Maybe the problem comes from this:
I tried to dig into the rules to find how hg38.fa.sizes is created from hg38.fa, but I can't find it in the Python script. Do you have any idea what the problem could be? Here are the attachments:
|
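For reference, a .fa.sizes file is conventionally a two-column, tab-separated table of sequence name and sequence length. The sketch below shows one way such a table could be derived from a FASTA file in plain Python. It is not taken from seq2science or genomepy, which may build the file differently, and the function names are hypothetical.

```python
from pathlib import Path


def fasta_sizes(fasta_path):
    """Return a list of (sequence_name, length) tuples in file order."""
    sizes = []
    name, length = None, 0
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    sizes.append((name, length))
                # Keep only the identifier: the header text up to the first whitespace.
                fields = line[1:].split()
                name, length = (fields[0] if fields else ""), 0
            elif name is not None:
                length += len(line)
    if name is not None:
        sizes.append((name, length))
    return sizes


def write_sizes(fasta_path, sizes_path):
    """Write a name<TAB>length table; refuse to write an empty one."""
    sizes = fasta_sizes(fasta_path)
    if not sizes:
        raise ValueError(f"no sequences found in {fasta_path}")
    Path(sizes_path).write_text("".join(f"{n}\t{l}\n" for n, l in sizes))
    return sizes
```

Raising on an empty result is a deliberate choice here: an empty sizes file is exactly the symptom discussed in this thread, so failing loudly seems preferable to writing it.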
Good news, I am happy at least something is working for you! We just got a "freshly" installed server this morning, and I, unfortunately, cannot reproduce this error there 😞 ... The warning is indeed suspicious, however it also happened on my successful run. I made an issue for this (#682), but I don't think it's causing the problem. One thing I noticed in the terminal output is the line:
as stdout/stderr that is not captured by our rule. However, that also just seems to indicate that the .fa.sizes file is empty. The ATAC-seq workflow is practically a copy of the chip-seq workflow, except that some defaults are set differently, so this is quite surprising to me. 🤔 @siebrenf I remember we had some file-latency-ish error in the past with genomepy. The rule was registered as finished successfully, but it was still running in the background somehow. Are we sure this was "solved"? Perhaps @JihedC you could try adding a long sleep (e.g. 1 minute) at the end of this script? https://github.com/vanheeringen-lab/seq2science/blob/master/seq2science/scripts/genome_support.py. Maybe the cluster somehow needs some time to sync updates to files? |
p.s. depending on whether or not you are used to conda/python packaging, adding the sleep might be extremely trivial, or quite complicated. Let me know if you don't know how to do it, I can type it out for you 😄 |
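In case it helps, here is a minimal Python sketch of two ways to wait for a slow filesystem: a fixed sleep, as suggested above, and a polling variant that returns as soon as a file is non-empty. The function names and the 60-second defaults are illustrative assumptions, not code from genome_support.py.

```python
import os
import time


def sleep_before_exit(seconds=60.0):
    """Fixed pause, giving a shared filesystem time to sync before the script ends."""
    time.sleep(seconds)


def wait_until_nonempty(path, timeout=60.0, poll_interval=1.0):
    """Poll until `path` exists and is larger than zero bytes.

    Returns True as soon as the file is non-empty, or False once `timeout`
    seconds have passed without that happening.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if os.path.getsize(path) > 0:
                return True
        except OSError:
            # File does not exist (yet) or is not visible on this node.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

The polling variant has the advantage that it also reports when the file never shows up, instead of silently continuing after the pause.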
Hi Maarten, I don't know how to do it, could you help me? I could find, I think
|
It should be in add
at the bottom |
Ok great thanks for the quick reply! |
It was a latency/communication issue with scripts in general, and it sounds like a plausible cause for this error! |
So I tried two things:
And I have got the same issue with the
I have also got a similar error with the zebrafish genome. You said it worked fine on your computer? Maybe there is something wrong with our cluster computer when downloading this file? |
Yeah I honestly don't know what is going on, and it would be best if this can be fixed somehow... One thing to try is to download the genome directly through genomepy, and see if you can use that .fa.sizes. Genomepy comes with seq2science, so you do not have to install anything
Let's hope you can just copy the freshly downloaded .fa.sizes from to the corrupt seq2science one, and you can at least just run the workflows from there... |
Let me know if you get it working (or not) |
Yes I will update you as soon as I can. I had a little issue with memory space which slowed me a bit. Now it should be okay. |
Hi Maarten, Here is what I did to try to make the chip-seq workflow run.
Since the mm10.fa.sizes file was empty, I downloaded mm10 with
Note that mm10.gaps.bed is empty. I ran the workflow on Slurm and got the following problem:
It was stuck at job 8 for 2 days and then stopped due to the time limit I set for the Slurm job. The problem was that the Bowtie 2 index files were incomplete, and for some reason this was not reported to me:
I am going to try with bwa again. |
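A quick sanity check for this situation, as a sketch: a Bowtie 2 index normally consists of six files sharing one prefix, with the suffixes .1.bt2, .2.bt2, .3.bt2, .4.bt2, .rev.1.bt2 and .rev.2.bt2 (large indexes use .bt2l instead of .bt2). The helper below only checks that these files exist and are non-empty; it cannot tell whether their contents are valid or fully written. The function name is hypothetical.

```python
import os

# The six parts that make up one Bowtie 2 index.
BT2_PARTS = ("1", "2", "3", "4", "rev.1", "rev.2")


def missing_bowtie2_index_files(prefix):
    """Return the index files that are absent or empty for `prefix`.

    An empty list means a complete small (.bt2) or large (.bt2l) index was
    found. Otherwise the shorter of the two "missing" lists is returned.
    """
    best = None
    for ext in ("bt2", "bt2l"):
        expected = [f"{prefix}.{part}.{ext}" for part in BT2_PARTS]
        missing = [
            path for path in expected
            if not os.path.isfile(path) or os.path.getsize(path) == 0
        ]
        if not missing:
            return []
        if best is None or len(missing) < len(best):
            best = missing
    return best
```

Running a check like this before the alignment step starts would surface an incomplete index immediately, rather than after a job has been stuck for days.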
The issue is still the same:
Do you maybe have this file, or an example of its format? |
I made an mm10 folder for you: http://ocimum.science.ru.nl/mm10/ When running seq2science the first time with these files I think you need to use something like:
This is necessary because the timestamps will be messed up from downloading the file, and otherwise snakemake/seq2science will try to re-create these files |
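To illustrate the timestamp point: build tools such as Snakemake look at file modification times, among other things, when deciding whether an output is out of date. The sketch below is a simplified stand-in for that idea, not Snakemake's actual logic, and the function names are made up. It shows why a downloaded output that appears older than its input would be rebuilt, and how refreshing its modification time avoids that.

```python
import os


def is_stale(output, inputs):
    """True if `output` is missing or older than any file in `inputs`."""
    if not os.path.exists(output):
        return True
    out_mtime = os.path.getmtime(output)
    return any(os.path.getmtime(path) > out_mtime for path in inputs)


def touch(path):
    """Set the access and modification time of an existing file to now."""
    os.utime(path, None)
```

After `touch(output)`, a check such as `is_stale(output, inputs)` no longer flags the file, as long as the inputs were not modified afterwards.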
Thank you so much for these files! I have added them to my Unfortunately it still does not work. Here are the log and the Slurm output: seq2science.2021-04-21T100027.917233.log The run goes so fast that I doubt it's doing anything. Here is the content of the bwa-index:
So it created it, but the results folder is almost empty:
Do you think it is because I am using an sbatch command to distribute the job on the cluster? Then snakemake doesn't actually know which jobs are done or not? |
I have honestly no clue what is going on here.. Sorry, I don't think I can help you 😭 |
If the ATAC-seq run did work, but you get timeouts and unexplained errors, then maybe the issue is the server occupancy/load? |
No worries @Maarten-vd-Sande I will try that @siebrenf |
Hi,
I have been trying to run the chip-seq workflow of seq2science. It starts but stops when 7% of the jobs are done.
To Reproduce
Please include your config.yaml, your samples.tsv, and the complete/relevant output.
Both config.yaml and samples.tsv were generated from seq2science init chip-seq
I get several error messages, I include the complete log file:
seq2science.2021-04-13T103059.065792.log
The log file in seq2science/results/log/bwa-mem2_index/hg38.log:
Those are the files I got in the genome folder:
Do you think the problem comes from there?
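One quick way to narrow this down, sketched in Python: list every zero-byte file in the genome folder, since an empty .fa.sizes or index file is easy to overlook among many files. The function name is a placeholder and the folder to scan is whatever genome directory the workflow was configured with.

```python
from pathlib import Path


def empty_files(folder):
    """Return the sorted paths of all zero-byte regular files below `folder`."""
    return sorted(
        path for path in Path(folder).rglob("*")
        if path.is_file() and path.stat().st_size == 0
    )
```

Any path this returns is a candidate for a step that was registered as finished without actually producing output.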