rules checkm_... assume bins have been found? #44

Closed
julianzaugg opened this issue May 10, 2022 · 3 comments

@julianzaugg
Contributor

julianzaugg commented May 10, 2022

A researcher I am assisting hit the following error when running the latest aviary (0.3.3):

[Tue May 10 13:17:25 2022]
rule checkm_rosella:
    input: data/rosella_bins/done
    output: data/rosella_bins/checkm.out
    jobid: 17
    threads: 80
    resources: tmpdir=/tmp

Activating conda environment: ../../../../../../home/user/.conda/envs/7f69cd9282402138a5a52ef44587d7df
[2022-05-10 13:17:32] INFO: CheckM v1.1.3
[2022-05-10 13:17:32] INFO: checkm lineage_wf -t 80 --pplacer_threads 48 -x fna data/rosella_bins/ data/rosella_bins//checkm --tab_table -f data/rosella_bins/checkm.out
[2022-05-10 13:17:32] INFO: [CheckM - tree] Placing bins in reference genome tree.
[2022-05-10 13:17:32] ERROR: No bins found. Check the extension (-x) used to identify bins.

  Controlled exit resulting from an unrecoverable error or warning.
Waiting at most 5 seconds for missing files.
MissingOutputException in line 313 of /srv/home/user/temp/aviary/aviary/modules/binning/binning.smk:
Job Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
data/rosella_bins/checkm.out completed successfully, but some output files are missing. 17
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-05-10T131724.751376.snakemake.log
An error occurred
05/10/2022 01:17:38 PM CRITICAL: Command 'snakemake --snakefile /srv/home/user/temp/aviary/aviary/modules/Snakefile --directory /srv/projects3/IMOS/analysis/20220411_imos_db_binning/aviary_output_dirs/aviaryRecover_broomfield_1_singleSiteBinning --jobs 80 --rerun-incomplete --configfile '/srv/projects3/IMOS/analysis/20220411_imos_db_binning/aviary_output_dirs/aviaryRecover_broomfield_1_singleSiteBinning/config.yaml' --nolock  --conda-frontend mamba --use-conda --conda-prefix /srv/home/user/.conda/envs/  recover_mags' returned non-zero exit status 1.

Based on a quick look at the output, the error, and the code, it appears that the rules checkm_rosella, checkm_metabat2, etc. assume that their corresponding binning rules produced bin files. However, CheckM fails if no bin files are present, which causes the pipeline to exit.

Is this correct?
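
To make the question concrete, here is a minimal sketch of the kind of guard that seems to be missing. It is not aviary's actual code: `find_bins`, `run_checkm_if_bins`, and the `runner` parameter are hypothetical names, and writing an empty `checkm.out` when no bins exist is only one possible way to keep the declared output present. The CheckM arguments are copied from the log above.

```python
import subprocess
from pathlib import Path


def find_bins(bin_dir, extension="fna"):
    """Return the sorted list of files in bin_dir that end in .<extension>."""
    return sorted(Path(bin_dir).glob(f"*.{extension}"))


def run_checkm_if_bins(bin_dir, out_file, threads=80, pplacer_threads=48,
                       extension="fna", runner=subprocess.run):
    """Run CheckM only when bins exist; otherwise write an empty output file.

    Returns True if CheckM was invoked and False if it was skipped.
    `runner` is a hypothetical hook so the call can be replaced in tests.
    """
    out_path = Path(out_file)
    if not find_bins(bin_dir, extension):
        # No bins: create an empty placeholder so the declared output exists
        # and the workflow can continue (assumed behaviour, not aviary's).
        out_path.parent.mkdir(parents=True, exist_ok=True)
        out_path.touch()
        return False

    # Same arguments as the checkm lineage_wf call shown in the log.
    cmd = [
        "checkm", "lineage_wf",
        "-t", str(threads),
        "--pplacer_threads", str(pplacer_threads),
        "-x", extension,
        str(bin_dir), str(Path(bin_dir) / "checkm"),
        "--tab_table", "-f", str(out_path),
    ]
    runner(cmd, check=True)
    return True
```

Called from the rule in place of the bare `checkm` command, something like `run_checkm_if_bins("data/rosella_bins", "data/rosella_bins/checkm.out")` would skip CheckM for an empty bin directory instead of exiting with "No bins found".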

julianzaugg changed the title from "rule checkm_... assume bins have been found?" to "rules checkm_... assume bins have been found?" on May 10, 2022
@rhysnewell
Owner

You're right, there should be a check here to prevent auto failure. Just curious, were any MAGs recovered by other tools?

@julianzaugg
Contributor Author

julianzaugg commented May 11, 2022

Yes. Metabat_sspec, for example, produced ~360 bins.

I do wonder if this is related to our ongoing issue #43 (comment)

@rhysnewell
Owner

rhysnewell commented May 11, 2022

If that issue (#43 (comment)) is still occurring, it would cause this crash as well. I can add in a buffer so aviary doesn't crash at these steps, but it is helpful to know when one of the binners crashes.
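
One way to keep that visibility while no longer crashing is to warn about every binner that left no bins. The following is a rough sketch under assumptions, not aviary's implementation: `report_empty_binners`, the logger name, and the `fna` default extension are hypothetical, and the binner-to-directory mapping would need to match the real output layout.

```python
import logging
from pathlib import Path

logger = logging.getLogger("binning_check")  # hypothetical logger name


def report_empty_binners(bin_dirs, extension="fna"):
    """Warn about each binner whose output directory holds no bin files.

    bin_dirs maps a binner name to its output directory. A missing
    directory is treated the same as an empty one.
    Returns the sorted list of binner names with zero bins.
    """
    empty = []
    for name, directory in bin_dirs.items():
        path = Path(directory)
        n_bins = len(list(path.glob(f"*.{extension}"))) if path.is_dir() else 0
        if n_bins == 0:
            # Surface the problem in the log instead of failing the run.
            logger.warning("%s produced no bins in %s", name, directory)
            empty.append(name)
    return sorted(empty)
```

The returned list could then decide which CheckM steps to skip, while the warnings still show that a binner may have crashed upstream.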
