Add Pipistrellus hanaki #84
Hi @SMangenot, thanks for sending the EAR of Pipistrellus hanaki. |
ok |
Hi @talioto, do you agree to review this assembly? |
@SMangenot Could you please add BUSCO scores from a more appropriate lineage, for example mammalia or laurasiatheria? I am concerned about the number of duplicated genes and that they didn't seem to decrease even though you removed quite a number of sequences during the curation |
Hi @tbrown91, here's the new EAR report with the BUSCO scores from laurasiatheria lineage
The researcher has updated the EAR PDF. Please review the assembly @tbrown91. |
Yes
On 3 Oct 2024, at 10:06, erga-ear-bot[bot] wrote:
Hi @talioto, do you agree to review this assembly?
Please reply to this message only with Yes or No by 10-Oct-2024 at 09:05 CET
|
Thanks for agreeing! |
Hi @talioto Have you had a chance to look through the assembly? We need to get this submitted by the end of the month unless there are major issues |
I was not quite done, but here are my notes so far:
Contact density is in general very low, making decisions somewhat difficult. It's too bad the library was not sequenced to higher depth.
Telomeric and subtelomeric sequence is often incorporated but there is a lot left in the chaff/shrapnel at the end with non-specific signal. I don't know how much to trust the placement of this stuff. This is why my review has taken some time.
SUPER_1: some faint telomeric sequence in the middle around 91.1-91.8 Mb. However, it seems to lie within a contig, and the contacts support this being a relic signal from a chromosome fusion. Perhaps nothing to do here.
Join SUPER_7 and SUPER_5? The signal is similar to that seen in SUPER_1. There is a telomeric repeat region, though.
SUPER_8: 6.5-7 Mb contig repeat that is maybe misplaced. Better to unloc it.
SUPER_14: 0-3.8 Mb. I don't think the signal is specific enough to keep this attached to this chromosome. Based on the pattern of contacts to other subtelomeric regions, it seems like this is the odd one out. I would place it in the chaff.
SUPER_15: beginning subtelomeric region. I'm not sure I trust YaHS in putting this together.
SUPER_17: interior telomeric region around 32-26 Mb. Not sure how to handle.
"Y" should probably be X. It is a male specimen (https://www.ebi.ac.uk/biosamples/samples/SAMEA115120470), so I assume XY. Other species in the genus have a 104 Mb X and 4 Mb Y, or a 106 Mb X and 6 Mb Y.
"Y": interior telomeric region 54.3-61.8 Mb. I don't think there's enough support to keep it in the middle of the X. In fact I think this is the actual Y sequence. Keep the part next to it that has contacts with the X and is higher coverage. This is likely part of the PAR. See my savestate.
Perhaps minimap2 alignments to other species in the genus would help sort some of this out.
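As a rough illustration of the minimap2 suggestion above, here is a minimal Python sketch that summarises a PAF file (minimap2's default output) by counting how many query bases of each curated scaffold align to each chromosome of a related Pipistrellus assembly. It relies only on the standard PAF columns (query name, query start/end, target name, mapping quality). The function names, the `min_mapq` cutoff, and the `asm20` preset mentioned in the comment are assumptions for illustration, not something used in this review.

```python
from collections import defaultdict

# Assumed upstream command (file names are hypothetical):
#   minimap2 -x asm20 related_species.fa hanaki_curated.fa > aln.paf

def summarise_paf(lines, min_mapq=20):
    """Sum aligned query bases per (query, target) pair from PAF records."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or empty lines
        qname, qstart, qend = fields[0], int(fields[2]), int(fields[3])
        tname, mapq = fields[5], int(fields[11])
        if mapq < min_mapq:
            continue  # ignore low-confidence (often repetitive) alignments
        totals[(qname, tname)] += qend - qstart
    return dict(totals)

def best_target(totals):
    """For each query scaffold, return the target with the most aligned bases."""
    best = {}
    for (qname, tname), bases in totals.items():
        if qname not in best or bases > best[qname][1]:
            best[qname] = (tname, bases)
    return best
```

If, say, SUPER_5 and SUPER_7 both map mostly to the same chromosome in a related species, that would be one more piece of evidence for a join; scattered hits would argue against it. Overlapping alignments are counted twice here, so the totals are only a coarse guide.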
Thank you for the review @talioto! @SMangenot can you please look through Tyler's suggested changes and see if they make sense in the context of the Hi-C map? I wonder if looking at synteny to other Pipistrellus genomes would also help, e.g. these 4 are all in chromosomes, including P. kuhlii, which is listed as scaffold: https://www.ncbi.nlm.nih.gov/datasets/genome/?taxon=27671&reference_only=true |
It would be good to know if there's been any progress on this. We're in a time crunch here. Could use this genome span. |
Hi @talioto, I'll reach out to Sophie for an update, but she's still working on the map. We should have news soon, and we'll be able to submit it before the October 31st deadline. Thanks! Lola |
A new update for Pipistrellus hanaki: I made the corrections according to your remarks, but after comparing with the reference genome I did not join SUPER_7 and SUPER_5, and SUPER_14 and SUPER_15.
The researcher has updated the EAR PDF. Please review the assembly @talioto. |
Hi everyone, Given tomorrow's deadline, it won’t be possible to submit this genome on time. Sophie and I are awaiting Tyler's feedback for a final review, and we will submit it once that's completed 👍 . Thank you for your understanding. Best, Lola |
Man, this one is not easy, but I assume you did some alignments to other Pipistrellus bats. Is the X colinear? Are there any Y's to align to? I'm not sure the piece labeled Y is the Y. I broke off the little bit that matches the X and placed it in the gap in the middle of X. Here's a link to the folder where I have the pretextmap and a savestate. |
Yes, it's a hard genome, the message was mostly for Tom to keep him informed. Sophie will answer your question. Thanks @talioto. |
My 2 cents on the Y chromosome: looking at the map, I would suggest that scaffold_32, scaffold_37, scaffold_58, scaffold_34 (in that order) are it. Hi-C coverage is really not high enough, but that is for another discussion. |
Attention @talioto, the EAR PDF was updated. |
Hi @SMangenot Thank you for the new EAR. Could you please detail here the changes that you have made? I'm finding it a little difficult to go through the conversation here and find everything. Thanks |
The Y chromosome was wrong; I made a mistake in my last map. |
I wouldn't necessarily expect to see alignment between Y chromosomes, especially from different species. |
A new map with the Y chromosome tagged
Attention @talioto, the EAR PDF was updated. |
Thanks @SMangenot @talioto @additive3 let's try to get this one finalised. I don't see much more room for improvement |
Go ahead. Not much else to improve. We really need higher coverage of Hi-C from Genoscope to better scaffold and curate and spend less time doing it. The agreed-on target is 50x coverage minimum. For bad libraries we go to 100x.
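To put the 50x and 100x targets in concrete terms, here is a small back-of-the-envelope sketch in Python. It assumes coverage is counted as raw sequenced bases divided by genome size, with paired-end reads of 150 bp; both are assumptions (some groups count only deduplicated valid pairs), and the 2 Gb genome size is a hypothetical placeholder, not the size of this assembly.

```python
def hic_coverage(read_pairs, genome_size_bp, read_len=150):
    """Nominal coverage from paired-end reads: 2 reads per pair."""
    return read_pairs * 2 * read_len / genome_size_bp

def required_read_pairs(genome_size_bp, target_cov, read_len=150):
    """Smallest number of read pairs that reaches target_cov (ceiling division)."""
    needed_bases = genome_size_bp * target_cov
    bases_per_pair = 2 * read_len
    return -(-needed_bases // bases_per_pair)

# Hypothetical 2 Gb genome, used only to show the arithmetic.
GENOME_SIZE = 2_000_000_000
pairs_50x = required_read_pairs(GENOME_SIZE, 50)
pairs_100x = required_read_pairs(GENOME_SIZE, 100)
```

Under these assumptions, 50x of a 2 Gb genome works out to roughly 333 million read pairs, and 100x to roughly 667 million.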
Thanks @talioto for the review. Congrats on the assembly @SMangenot! After @tbrown91's confirmation, you can start with the assembly submission to save time. |
Hi @SMangenot, out of curiosity, do you know why Hi-C throughput was so low? |
Assembly review request