rNatMau1 EAR #121
Hi @gitcruz, thanks for sending the EAR of Natrix maura. |
ok |
Hi @ldemirdj, do you agree to review this assembly? |
Yes |
Thanks for agreeing! |
Hi @gitcruz thanks for sending the EAR for review. Could you briefly describe the savestates that are included in the download link? Thank you |
Hi Tom,
As I wrote in the notes, there are two:
1. rNatMau1.curated_primary.mt.scrubbed_mq40_nosort.extensions.pretext.savestate_1
is a savestate I produced immediately after remapping the Hi-C to the
curated assembly. It can serve as "homebase" during review.
2. rNatMau1.curated_primary.mt.scrubbed_mq40_nosort.extensions.pretext.savestate_W
is the same, but places SUPER_W and all its unlocs right after SUPER_Z. It is
a matter of taste; we like to do this when curating both sex chromosomes, as
one can often see some interactions, etc.
Best regards,
Fernando
|
perfect, thank you |
Ping @tbrown91, |
Hi @gitcruz, Sorry for the delay, and thank you for sharing the EAR report. I have reviewed it, and the metrics for this assembly look good. I appreciate your efforts on this genome. I noticed that you relocated a large number of contigs tagged as unloc, and I agree with this conservative approach. However, I made some adjustments by repositioning a few smaller contigs to the ends of their corresponding scaffolds, which could also be tagged as unloc. This mainly affects SUPER_1, 2, 4, Z, 5, 6, and W. These constitute the majority of my edits, but I can send you my save_state file if that would be helpful. I agree with your identification of scaffolds Z and W, but I observed a strange pattern in scaffold Z with a spike in coverage. Do you have any insights into this? Additionally, it seems that the scaffolds from SUPER_8 to SUPER_16 interact with other scaffolds. Would you have an explanation for this? I am still reviewing the map, but these are my initial remarks. Looking forward to your feedback and @tbrown91’s as well! Best regards, Lola |
Hi @ldemirdj, Thanks for your response. Please see my replies below. Yes, please share the savestate with the small unlocs you’ve found. That will be very useful. I think you should be able to place it inside the folder I shared with you. Please let me know if that works. The region you showed in the SUPER_Z snapshot has a mappability problem around a gap (no mappings at mq 40). It is a repetitive region rich in heterochromatin (quite frequent in snake W chromosomes and some parts of the Z). In fact, 9,094 bp are represented by tandem repeats. In addition, it is actually bridging 2 contigs with autosomal coverage, similar to the PAR region in mammals. With regard to repeats in sex chromosomes, I can refer you to this paper: https://pmc.ncbi.nlm.nih.gov/articles/PMC5793158/ SUPER_8 to SUPER_16: These superscaffolds clearly correspond to the expected number of microchromosomes for this species. Microchromosomes exhibit high degrees of interchromosomal interaction, particularly with other microchromosomes (as we see in the rNatMau1 assembly and saw before in other snake assemblies such as rHemHip1.1). See this reference: https://pmc.ncbi.nlm.nih.gov/articles/PMC7947875/ I hope this answers your questions. Best regards, |
Hi Fernando, Thank you for your detailed responses and the references. I have uploaded the savestate with the small unlocs I found to the folder you shared. Please let me know if you encounter any issues accessing it. Regarding SUPER_Z and the mappability issue, the information about the repetitive heterochromatin region is very useful. I’ll review the referenced paper to better understand its implications for sex chromosome organization. For the scaffolds (SUPER_8 to SUPER_16), I’ll also dive into the reference you shared to explore the patterns observed in other snake assemblies. Thank you again for the clear explanations. I’ll follow up if I have additional questions after going through the papers. Best regards, Lola |
Hi Lola, Thanks for the savestate with your review!!!! Here are my comments:
Whereas if you leave it as it was, it looks like this: In this case I'd opt for the conservative choice of leaving it as unloc. I’d like to know @tbrown91's opinion on these edits and his own input on the rNatMau1 assembly shared by CNAG. Again, thanks a lot for your review, |
Dear @ldemirdj, Please forget my previous comments on the edits. I realized that they were simply automatic scaffold renamings in the AGP. Just reply to my two comments above. Thanks!!!! |
Thanks for your review. Regarding the reorganization of the SUPER_W unlocs, I understand the effort but I am not sure what the intention is. The original W (5.3 Mb in size) is now towards the end. Is the idea to paint everything together, or should the previous unlocs remain as they were before? I'd like to know before tagging and painting the assembly again. I am not sure if you suggest painting the block around this original W and leaving the rest as unlocs. I agree that the little scaffold_136 is part of the W. It interacts strongly with SUPER_W_unloc_12, so I moved it there for now. Good catch! Good to know we agree on leaving the SUPER_1 unlocs as they were originally. Tom, we are not using HiGlass frequently. But I suspect that the W chromosome would have been better assembled with HiFi reads than with ONT... Finally, I am a bit lost trying to keep track of the edits: Lola made 43 and Tom 84. @tbrown91, were yours done on top of Lola's save_state??? I personally like the assembly as it is now. But I do have doubts about what the suggestion is for painting the W and which scaffolds we could leave as unloc. Best, |
he he he Yes, I worked on Lola's savestate, but moved the unloc from super_1 back into the unloc region. It's my feeling that the W can be painted into one chromosome - what do you think? |
Can I chip in? All this unlocalized stuff on super 1 and 5 was introduced after rescuing W sequence from purge_dups. It's repetitive or haplotypic. I've started identifying haplotigs but it's a pain. I would have left it out from the assembly to begin with and just kept the W sequence. Perhaps we can sort through it and keep some, but to me, I think it's more work than it's worth. As far as localizing all the W sequence into a superscaffold, I would still be a little conservative. Maybe we can put together something, but I am not convinced we can localize it all. |
ok, sounds to me like things are moving in the right direction @gitcruz can you generate a new image, stats and EAR once you have finished with the W chromosome? I think after one round we should be good to go |
Hi Fernando, Tom, and Tyler, Thanks for all the feedback and discussion—it’s really helpful to get everyone’s perspectives. I understand the challenges of localizing SUPER_unloc_1, and I agree that in this case, a conservative approach (leaving it as unlocalized) is likely the best option. Fernando, I see your concerns about the reorganization of the SUPER_W unlocs. If the goal is to paint the W as a single block, it’s important to clarify which scaffolds should be included and which might remain unlocalized. That said, I share Tyler’s perspective: while it might be possible to consolidate parts of the W, I believe we should proceed conservatively. We might manage to piece some sequences together, but I’m not confident we can localize everything with certainty. I’ll wait for your new image, stats and EAR, Fernando, for a final review. Thanks again, everyone! Best, Lola |
Dear all, SUPER_W: I have been reorganizing the W scaffolds a bit, and we think we have a section of 16 Mb with solid diagonal contacts that can be painted together, plus 36 Mb made up of unlocs (hard to place after the main block, or with discontinuous contacts along the diagonal with respect to it). SUPER_1 and SUPER_5: we distinguished several haplotigs from the other unlocs based on coverage. The reviewed contact map has been painted and tagged accordingly. @ldemirdj and @tbrown91, could you have a look at "rNatMau1.reviewed.savestate_1" before I generate the final EAR? If it's ok with you, we could get the approval and start uploading the assembly to ENA as we did for other genomes. Of course we can wait for the final EAR to merge the PR. Thanks, |
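For readers following along, here is a minimal sketch of what a coverage-based split between haplotigs and other unlocs could look like. The thread does not state the criterion actually used, so everything here is an assumption: the function name, the scaffold names, the coverage values, and the 0.35 to 0.65 window around half of the reference coverage are all hypothetical and chosen only for illustration.

```python
from statistics import median


def flag_haplotig_candidates(unloc_cov, reference_cov, low=0.35, high=0.65):
    """Return the sorted names of unlocs whose mean coverage falls in
    [low, high] times the median reference coverage (roughly half coverage).

    The window is an assumed heuristic, not the criterion used in this PR.
    """
    ref = median(reference_cov)
    return sorted(
        name for name, cov in unloc_cov.items()
        if low * ref <= cov <= high * ref
    )


# Hypothetical per-scaffold mean coverages, for illustration only.
reference = [60.0, 62.0, 58.0, 61.0]
unlocs = {"unloc_a": 29.0, "unloc_b": 59.0, "unloc_c": 33.0, "unloc_d": 120.0}

candidates = flag_haplotig_candidates(unlocs, reference)
print(candidates)
```

Scaffolds near full coverage (or well above it, as collapsed repeats often are) are left out of the candidate list; in practice any such flag would still need to be checked against the contact map.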
Thanks @gitcruz, I'll take a look tomorrow. |
Thanks both |
W is looking much nicer. That PAR-like region is really causing some difficulties with some of the scaffolds. I know there are a lot of unlocs, but I think what you've produced is very good. @ldemirdj if you are happy please "Approve" the PR |
Thanks @ldemirdj for the review. Congrats on the assembly @gitcruz! After @tbrown91's confirmation, you can start with the assembly submission to save time. |
Hi @tbrown91, I uploaded the assembly to ENA. I just realized that for this genome we did not use Omni-C but Arima High Coverage HiC. I used the right label for the upload, "Arima v2", and I will fix the final EAR report with the correct information (i.e. Arima-HiC). I will try to upload it before the Christmas break if the mappings finish... but it will take time to map 900M pairs. Please ping me after the holidays in case I forget. Best regards, |
Merry Christmas |
Thanks Lola! I fixed the tag in my greetings! I was tagging another reviewer before, sorry. |
Ping @tbrown91, |
Happy new year all. A small nudge to @gitcruz to see how many of us are back to work at this point |
Attention @tbrown91, the EAR PDF was updated. |
Happy new year @tbrown91! I uploaded the final EAR report, which contains the final pretext map link for rNatMau1.1. Please merge this branch. Thanks, |
Ace, thanks @gitcruz happy new year! |
Assembly review request