Input & help files #36

cintiaoi · 2024-04-03T15:00:02Z

Description of feature

@muffato @gq1 @ksenia-krasheninnikova Hi, I want to run the pipeline on some insect genomes. I was wondering if there are any input or help files I can start with. Thanks

ksenia-krasheninnikova · 2024-04-03T15:25:32Z

Hi @cintiaoi
Have you had a look here?

https://github.com/sanger-tol/genomeassembly/blob/main/docs/usage.md
https://github.com/sanger-tol/genomeassembly/blob/main/docs/output.md

There are some example YAML files in the /assets folder in the repo.

cintiaoi · 2024-04-03T15:46:38Z

Hi @ksenia-krasheninnikova, thanks for your fast reply. I've checked assets/test.yaml and the other YAML files, but they contain full paths on a system we don't have access to. To be able to run the pipeline, could you let me know what those files look like?

ksenia-krasheninnikova · 2024-04-03T15:55:09Z

Have a look here:

https://darwin.cog.sanger.ac.uk/genomeassembly_test_data.tar.gz

This dataset corresponds to assets/test_github.yaml
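
For anyone following along, fetching and unpacking the archive could look roughly like the sketch below. The function names `run` and `fetch_test_data`, the `DRY_RUN` switch, and the destination directory are made up for illustration; only the URL comes from this thread. The layout inside the archive is not described here, so check that the paths in assets/test_github.yaml match wherever you extract it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: download an archive into a directory and unpack it there.
# Arguments: archive URL, destination directory.
fetch_test_data() {
    local url="$1" dest="$2"
    mkdir -p "$dest"
    local archive
    archive="$dest/$(basename "$url")"
    run curl -L -o "$archive" "$url"      # -L follows redirects
    run tar -xzf "$archive" -C "$dest"    # unpack next to the archive
}

# Example call (the destination directory is an assumption):
#   fetch_test_data https://darwin.cog.sanger.ac.uk/genomeassembly_test_data.tar.gz ./test_data
```

Running it once with `DRY_RUN=1` prints the two commands without downloading anything, which is a cheap way to confirm the target paths first.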

gq1 · 2024-04-03T16:57:09Z

Here are the instructions for running the test locally:
https://github.com/sanger-tol/genomeassembly/blob/main/docs/usage.md#local-testing

cintiaoi · 2024-04-03T17:37:07Z

Thanks! So can we run the pipeline without 10X data? It looks that way in main.nf. We have PacBio and Hi-C data.
Also, there is a mito.fam file and we are not exactly sure what it is. Is there an example? Thanks again

ksenia-krasheninnikova · 2024-04-04T08:16:08Z

If you keep the polishing step switched off in the config file with polishing_on = false (like here), then the 10X data is not needed. You don't need the .fam file to run the pipeline from the main branch at the moment; this feature will be available in the next release.

Hope this helps!

cintiaoi · 2024-06-10T17:09:38Z

Just a follow-up on the things I changed to run my own data.

In juicer_tools_pre.nf, I had to change the Java version in the container line:

   'quay.io/biocontainers/java-jdk:8.0.92--1' }"

In merquryfk/main.nf, I changed the process label from 'process_medium' to 'process_high_memory':

process MERQURYFK_MERQURYFK {
tag "$meta.id"
label 'process_high_memory'

In the Nextflow config, I had to lower the memory and time limits to run on my server (max_memory from '128.GB' to '30.GB', max_time from '240.h' to '48.h'), including:

apptainer.registry = 'quay.io'

max_memory = '30.GB'
max_cpus = 16

max_time = '48.h'
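
As an alternative to editing the pipeline's own config, the same limits could probably be kept in a separate file and passed to Nextflow with its `-c` option. The sketch below only writes such a file. It assumes that max_memory, max_cpus and max_time are pipeline params, as in the nf-core template (not confirmed in this thread); the function name `write_local_config` and the file name are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write a small Nextflow config holding site-specific limits.
# Assumption: max_memory, max_cpus and max_time live in the params scope.
write_local_config() {
    local out="$1"
    cat > "$out" <<'EOF'
// Pull container images from quay.io when running with Apptainer
apptainer.registry = 'quay.io'

params {
    max_memory = '30.GB'
    max_cpus   = 16
    max_time   = '48.h'
}
EOF
}

# Example call (hypothetical file name):
#   write_local_config local.config
```

Keeping the overrides in their own file means the fork does not have to diverge from upstream just for server limits.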

For my genome I only had a SAM file, so I converted it to a CRAM file. I still got errors because the .crai index file was missing, so I created that with samtools as well. Now it works.
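
The conversion and indexing steps can be sketched as below. This is not necessarily the exact set of commands used here: the function names `run` and `sam_to_cram`, the `DRY_RUN` switch, and the file names are made up for illustration, and passing a reference with `-T` is an assumption that matters mainly when the SAM contains aligned records.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: convert a SAM file to CRAM and build its .crai index.
# Arguments: input SAM, reference FASTA, output CRAM.
sam_to_cram() {
    local sam="$1" ref="$2" cram="$3"
    run samtools view -C -T "$ref" -o "$cram" "$sam"   # -C selects CRAM output
    run samtools index "$cram"                          # writes "$cram.crai"
}

# Example call (hypothetical file names):
#   sam_to_cram hic_reads.sam reference.fa hic_reads.cram
```

With `DRY_RUN=1` the function only prints the two samtools commands, so the paths can be checked before anything is written.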

I created a fork on my github account and the modified files are there.
Thanks for your help!

cintiaoi added the enhancement New feature or request label Apr 3, 2024
