Input & help files #36

Open
cintiaoi opened this issue Apr 3, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@cintiaoi

cintiaoi commented Apr 3, 2024

Description of feature

@muffato @gq1 @ksenia-krasheninnikova Hi, I want to run the pipeline on some insect genomes. Are there any example input or help files I can start with? Thanks

@cintiaoi cintiaoi added the enhancement New feature or request label Apr 3, 2024
@ksenia-krasheninnikova
Contributor

ksenia-krasheninnikova commented Apr 3, 2024

Hi @cintiaoi
Have you had a look here?

https://github.com/sanger-tol/genomeassembly/blob/main/docs/usage.md
https://github.com/sanger-tol/genomeassembly/blob/main/docs/output.md

There are some example YAML files in the /assets folder in the repo.

@cintiaoi
Author

cintiaoi commented Apr 3, 2024

Hi @ksenia-krasheninnikova, thanks for your fast reply. I've checked assets/test.yaml and the other YAML files, but they contain full paths on a system we don't have access to. To be able to run the pipeline, could you let me know what those files look like?

@ksenia-krasheninnikova
Contributor

Have a look here:

https://darwin.cog.sanger.ac.uk/genomeassembly_test_data.tar.gz

This dataset corresponds to assets/test_github.yaml

@gq1
Member

gq1 commented Apr 3, 2024

Here are the instructions for running the test locally:
https://github.com/sanger-tol/genomeassembly/blob/main/docs/usage.md#local-testing

@cintiaoi
Author

cintiaoi commented Apr 3, 2024

Thanks! So can we run the pipeline without 10x data? It looks that way from main.nf. We have PacBio and Hi-C data.
Also, there is a mito.fam file, and we are not exactly sure what it is. Is there an example? Thanks again

@ksenia-krasheninnikova
Contributor

If you keep the polishing step switched off in the config file with polishing_on = false (like here), then the 10X data is not needed. You don't need the .fam file to run the pipeline from the main branch now; this feature will be available in the next release.

Hope this helps!

@cintiaoi
Author

Just a follow-up on the things I changed to run my own data.

In juicer_tools_pre.nf, I had to change the Java version:

  • 'quay.io/biocontainers/java-jdk:8.0.92--1' }"

In merquryfk/main.nf:

process MERQURYFK_MERQURYFK {
tag "$meta.id"
label 'process_medium'

  • label 'process_high_memory'

In the Nextflow config, I had to change the memory and time limits to run on my server, including:

apptainer.registry = 'quay.io'

  • max_memory = '128.GB'
  • max_memory = '30.GB'
    max_cpus = 16
  • max_time = '240.h'
  • max_time = '48.h'

For my genome I only had a SAM file, so I created a CRAM file from it. I still got errors because a .crai index file was missing, so I also created that using samtools. After that, it worked.

I created a fork on my GitHub account, and the modified files are there.
Thanks for your help!
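
For anyone in the same situation, the SAM to CRAM conversion and the .crai indexing mentioned above could be sketched roughly as follows. This is not the exact procedure used in the thread, only an illustration under assumptions: the file names (`hic_reads.sam`, `reference.fa`) and the helper names `build_cram_commands` and `run` are hypothetical, and the reference FASTA is optional because whether it is needed depends on whether the reads in the SAM file are aligned. The script only prints the commands unless `dry_run=False` is passed.

```python
import shlex
import subprocess
from pathlib import Path


def build_cram_commands(sam, reference=None, cram=None):
    """Build the samtools command lines for SAM -> CRAM plus indexing.

    `reference` is an optional FASTA passed via -T; it is assumed to be
    needed only when the SAM file contains aligned reads.
    """
    sam = Path(sam)
    cram = Path(cram) if cram else sam.with_suffix(".cram")

    convert = ["samtools", "view", "-C"]  # -C selects CRAM output
    if reference is not None:
        convert += ["-T", str(reference)]
    convert += ["-o", str(cram), str(sam)]

    # samtools index writes <name>.cram.crai next to the CRAM file
    index = ["samtools", "index", str(cram)]
    return [convert, index]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names; replace with your own paths.
    run(build_cram_commands("hic_reads.sam", reference="reference.fa"))
```

Once both files exist, the CRAM path would go into the input YAML in place of the example paths, following the layout of assets/test_github.yaml.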
