abPOA user specifiable seeds #37
Comments
Yes, theoretically, abPOA could take any type of seeding and chaining result to guide the POA process. I think adding an option to take MUM seed/anchor as input is much easier than implementing it inside abPOA directly.
Hi Yan,
That is great news. As a strawman, I'd suggest using PAF format to take a
set of pairwise anchors? Or do you prefer the anchors to be across multiple
sequences?
On Fri, Mar 25, 2022 at 2:27 AM Yan Gao wrote:
Yes, theoretically, abPOA could take any type of seeding and chaining
result to guide the POA process.
I chose the minimizer simply for speed.
Using a more mature seeding method (MUM) is definitely preferable for
divergent sequences.
I think adding an option to take MUM seed/anchor as input is much easier
than implementing it inside abPOA directly.
The only concern is that we need a well-defined input format.
Benedict
PAF format is nice. To feed abPOA, we only need to record which anchor comes from which sequence in the PAF file.
Yes, if you can create a function for this, then we can definitely
use it. If you prefer to create some kind of object to define the seeds,
we can also work with that. Thanks,
Benedict
On Sun, Mar 27, 2022 at 8:39 PM Yan Gao wrote:
PAF format is nice. To feed abPOA, we only need to record which anchor
comes from which sequence in the PAF file.
Across multiple sequences may be too stringent, could lead to too few
seeds.
I think pairwise should be just fine. Specifically, we just need the
anchors between every two adjacent sequences.
The order could be the input order or the order determined by a
progressive guide tree (you already knew this).
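As a rough illustration of the PAF-based idea above, here is a minimal sketch that reads one PAF record into an anchor struct, recording which sequence each side of the anchor comes from. The struct `paf_anchor_t` and the function `parse_paf_anchor` are hypothetical names made up for this example and are not part of abPOA; only the column order (query name, length, start, end, strand, target name, length, start, end) follows the PAF format.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical anchor record: one PAF line describes one anchor between
 * two sequences. Not an abPOA type. */
typedef struct {
    char query_name[64];
    char target_name[64];
    long query_start, query_end;    /* 0-based, end-exclusive (PAF convention) */
    long target_start, target_end;
    char strand;                    /* '+' or '-' */
} paf_anchor_t;

/* Parse the first nine mandatory PAF columns of `line` into `out`.
 * Returns 1 on success, 0 if the line is malformed. */
int parse_paf_anchor(const char *line, paf_anchor_t *out) {
    long query_len, target_len;
    int n;
    assert(line != NULL && out != NULL);
    /* Whitespace in the format string also matches the tabs used by PAF. */
    n = sscanf(line, "%63s %ld %ld %ld %c %63s %ld %ld %ld",
               out->query_name, &query_len, &out->query_start, &out->query_end,
               &out->strand, out->target_name, &target_len,
               &out->target_start, &out->target_end);
    if (n != 9) return 0;
    if (out->strand != '+' && out->strand != '-') return 0;
    /* Reject intervals that are empty or fall outside the sequence. */
    if (out->query_start < 0 || out->query_start >= out->query_end ||
        out->query_end > query_len) return 0;
    if (out->target_start < 0 || out->target_start >= out->target_end ||
        out->target_end > target_len) return 0;
    return 1;
}
```

Sequence names longer than 63 characters would be truncated here; a real implementation would likely size the buffers dynamically.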
I think for Cactus, it's important to have an API to pass the anchors in via a struct (as opposed to FILE*). Whether that struct is PAF-based or not is less important. Also, if we are going to keep using abPOA's progressive ordering, then we'd need an API to get that (if it's not already there) before computing the MUM anchors. Something like
[abpoa] get_progressive_order(sequences)
[cactus] compute_mum_anchors(sequences, order)
[abpoa] get_msa(sequences, anchors)
thanks!
Yes, totally agree, Glenn.
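To make the struct-based hand-off a little more concrete, here is a minimal sketch of what an in-memory anchor set might look like, together with a check that every anchor links two sequences that are adjacent in the chosen order (the pairwise scheme suggested earlier in the thread). Every name here (`seed_anchor_t`, `anchor_set_t`, `toy_progressive_order`, `anchors_follow_order`) is hypothetical and does not correspond to any existing abPOA or Cactus function; the ordering function is only a placeholder that returns the input order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical anchor between two sequences, identified by index. */
typedef struct {
    int seq_a, seq_b;   /* indices of the two anchored sequences */
    int pos_a, pos_b;   /* 0-based start of the match in each sequence */
    int len;            /* length of the exact match */
} seed_anchor_t;

/* Hypothetical container passed across the API instead of a FILE*. */
typedef struct {
    const seed_anchor_t *anchors;
    size_t n;
} anchor_set_t;

/* Placeholder for step 1: simply returns the input order 0..n_seq-1.
 * A real implementation would derive the order from a guide tree. */
void toy_progressive_order(int n_seq, int *order) {
    int i;
    assert(n_seq >= 0 && (n_seq == 0 || order != NULL));
    for (i = 0; i < n_seq; ++i) order[i] = i;
}

/* Returns 1 if every anchor connects two sequences that are adjacent
 * in `order`, 0 otherwise. */
int anchors_follow_order(const anchor_set_t *set, const int *order, int n_seq) {
    size_t k;
    assert(set != NULL && order != NULL);
    for (k = 0; k < set->n; ++k) {
        const seed_anchor_t *s = &set->anchors[k];
        int ok = 0, i;
        for (i = 0; i + 1 < n_seq; ++i) {
            int a = order[i], b = order[i + 1];
            if ((s->seq_a == a && s->seq_b == b) ||
                (s->seq_a == b && s->seq_b == a)) { ok = 1; break; }
        }
        if (!ok) return 0;
    }
    return 1;
}
```

A caller in the role of Cactus would obtain the order first, compute anchors only for adjacent pairs, and could run a check like this before handing the set to the aligner.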
Benedict
Hi @yangao07, I've been experimenting a little with the seeding in abPOA and am wondering if it would be possible to add an option for users to provide alignment seeds? My issue is that for more divergent sequences, minimizers are not ideal for anchoring. I have had more luck using maximal unique matches (MUMs), with a chaining process more like that in the original MUMmer program. Looking forward, I also see a time when we will want to anchor the alignments on unique markers in order to facilitate the alignment of highly repetitive sequences (e.g. satellite arrays). Interested in your perspective on this.
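For readers unfamiliar with chaining, here is a minimal sketch of colinear chaining over pairwise anchors: it picks the set of non-overlapping anchors whose positions increase in both sequences and whose total length is largest. This is a simple quadratic-time dynamic program written for illustration only; it is not the algorithm used by MUMmer or abPOA, and the names `mum_t` and `best_chain_score` are hypothetical. It assumes every anchor has a positive length.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical exact-match anchor between two sequences. */
typedef struct { int pos_a, pos_b, len; } mum_t;

/* Order anchors by their start position in the first sequence. */
static int cmp_mum(const void *x, const void *y) {
    const mum_t *a = (const mum_t *)x;
    const mum_t *b = (const mum_t *)y;
    return (a->pos_a > b->pos_a) - (a->pos_a < b->pos_a);
}

/* Returns the best colinear chain score (sum of anchor lengths),
 * 0 for an empty input, or -1 if memory allocation fails.
 * Sorts `m` in place. */
int best_chain_score(mum_t *m, int n) {
    int *score;
    int best = 0, i, j;
    assert(n >= 0 && (n == 0 || m != NULL));
    if (n == 0) return 0;
    score = (int *)malloc((size_t)n * sizeof *score);
    if (score == NULL) return -1;
    qsort(m, (size_t)n, sizeof *m, cmp_mum);
    for (i = 0; i < n; ++i) {
        score[i] = m[i].len;              /* chain consisting of anchor i alone */
        for (j = 0; j < i; ++j) {
            /* Anchor j may precede i only if it ends before i starts
             * in both sequences. */
            if (m[j].pos_a + m[j].len <= m[i].pos_a &&
                m[j].pos_b + m[j].len <= m[i].pos_b &&
                score[j] + m[i].len > score[i]) {
                score[i] = score[j] + m[i].len;
            }
        }
        if (score[i] > best) best = score[i];
    }
    free(score);
    return best;
}
```

An off-diagonal anchor (for example a spurious repeat match) is left out of the best chain whenever the colinear anchors together score higher, which is the property that makes chained anchors useful as alignment guides.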