Two short (1000 bp) contigs are generated by YaHS #76

Open
spoonbender76 opened this issue Nov 27, 2023 · 1 comment

@spoonbender76

Hi,

Thank you for creating and maintaining YaHS. I installed YaHS 1.2a.1 with `mamba install yahs -c bioconda`.

~100x Hi-C data was mapped with Chromap to the assembly (after purge_dups), following the pipeline described here: https://github.com/WarrenLab/hic-scaffolding-nf/blob/main/main.nf

I ran YaHS with and without -e GATC, leaving all other options at their defaults. YaHS generated two short 1000 bp contigs, which are much shorter than the minimum contig length in the hifiasm p_ctg assembly (15508 bp) and in the assembly after purge_dups (31739 bp).

With --no-contig-ec, YaHS did not generate the 1000 bp contigs, but that result contains more errors and is less suitable than the version with contig error correction.

These two 1000 bp contigs cannot be visualized or manually curated in Juicebox. Could you advise on how to handle them, or how to prevent YaHS from generating them?

@c-zhou
Owner

c-zhou commented Nov 29, 2023

Hello @spoonbender76,

These small pieces are debris generated while correcting the contigs. They are unavoidable if you want to do assembly error correction. They are very likely junk sequences. You can either leave them in your assembly or simply remove them.

Best,
Chenxi
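
For anyone who chooses the "simply remove them" option, here is a minimal sketch of a length filter for a FASTA file. It is not part of YaHS. The script name, the file names in the usage line, and the default cutoff of 1001 bp (chosen only because the debris reported above is exactly 1000 bp and the shortest real contig is 31739 bp) are all assumptions to adapt to your own data.

```python
import sys
from typing import Iterator, List, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA stream."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta(src: TextIO, dst: TextIO, min_len: int = 1001,
                 width: int = 60) -> List[str]:
    """Copy records with at least min_len bases; return dropped headers."""
    dropped = []
    for header, seq in read_fasta(src):
        if len(seq) < min_len:
            dropped.append(header)  # remember what was removed
            continue
        dst.write(f">{header}\n")
        # Re-wrap the sequence at a fixed line width.
        for i in range(0, len(seq), width):
            dst.write(seq[i:i + width] + "\n")
    return dropped


# Hypothetical usage: python drop_short.py in.fa out.fa [min_len]
if __name__ == "__main__" and len(sys.argv) >= 3:
    cutoff = int(sys.argv[3]) if len(sys.argv) > 3 else 1001
    with open(sys.argv[1]) as fin, open(sys.argv[2], "w") as fout:
        removed = filter_fasta(fin, fout, min_len=cutoff)
    print(f"removed {len(removed)} record(s): {removed}", file=sys.stderr)
```

The script prints the names of the removed records to stderr so you can confirm that only the two debris sequences were dropped. Note that it edits the FASTA only; any accompanying file that still lists those sequences (for example an AGP) would need the same entries removed to stay consistent.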
