Skip to content
This repository has been archived by the owner on Sep 22, 2019. It is now read-only.

have ability to add non-taxon node labels to newicks #205

Open
josephwb opened this issue Feb 16, 2016 · 31 comments
Open

have ability to add non-taxon node labels to newicks #205

josephwb opened this issue Feb 16, 2016 · 31 comments
Assignees

Comments

@josephwb
Copy link
Member

Currently only taxon labels are applied to newick nodes.
e.g.

"newick" : "(Gavia_stellata_ott1057044,((Gavia_arctica_ott1085739,Gavia_pacifica_ott651474),(Gavia_immer_ott1057518,Gavia_adamsii_ott90560)))Gavia_ott803675;"
@josephwb josephwb self-assigned this Feb 16, 2016
@josephwb
Copy link
Member Author

I guess I should wait for @jar398 to give me the go ahead on this, since people have expressed that they want this, but none of the myriad docs explicitly discusses it.

@jar398
Copy link
Member

jar398 commented Feb 17, 2016

I think it's a good idea

-- apologies for brevity / using handheld gizmo --

On Feb 16, 2016, at 20:41, "Joseph W. Brown" [email protected]
wrote:

I guess I should wait for @jar398 https://github.com/jar398 to give me
the go ahead on this, since people have expressed that they want this, but
none of the myriad docs explicitly discusses it.


Reply to this email directly or view it on GitHub
#205 (comment)
.

@kcranston
Copy link
Member

What methods does this affect?

@josephwb
Copy link
Member Author

"subtree" and "induced_subtree" (only the latter is implemented at the moment; I can turn it off if you like).

@josephwb
Copy link
Member Author

Old:

curl -X POST http://localhost:7474/db/data/ext/tree_of_life_v3/graphdb/induced_subtree -H "content-type:application/json" -d '{"node_ids":["ott501678","ott666104","ott316878"], "ott_ids":[102710,536234,810751,81461,501678]}'
{
  "node_ids_not_in_tree" : [ "ott501678" ],
  "ott_ids_not_in_tree" : [ 501678 ],
  "newick" : "(((Clangula_ott316878,Perdix_ott102710)Galloanserae_ott5839486,((Selasphorus_calliope_ott536234)Trochilidae_ott810751,Setophaga_ott666104))Neognathae_ott241846)Aves_ott81461;",
  "synth_id" : "opentree4.1"
}

New:

curl -X POST http://localhost:7474/db/data/ext/tree_of_life_v3/graphdb/induced_subtree -H "content-type:application/json" -d '{"node_ids":["ott501678","ott666104","ott316878"], "ott_ids":[102710,536234,810751,81461,501678]}'
{
  "node_ids_not_in_tree" : [ "ott501678" ],
  "ott_ids_not_in_tree" : [ 501678 ],
  "newick" : "(((Clangula_ott316878,Perdix_ott102710)Galloanserae_ott5839486,((Selasphorus_calliope_ott536234)Trochilidae_ott810751,Setophaga_ott666104)mrcaott246ott5481)Neognathae_ott241846)Aves_ott81461;",
  "synth_id" : "opentree4.1"
}

@jimallman
Copy link
Member

Thanks for the "before" and "after" examples. I was unclear on the consequences of this change. Do I understand correctly that this isn't just a matter of labeling, but of including entirely new nodes (e.g. mrcaott246ott5481 above)?

@jar398
Copy link
Member

jar398 commented Feb 17, 2016

we could make it an option if there are worries about clutter or space or
compatibility.
{"include_nontaxon_labels" : true , ...}
not sure how to decide this.

On Wed, Feb 17, 2016 at 4:51 PM, Joseph W. Brown [email protected]
wrote:

"subtree" and "induced_subtree" (only the latter is implemented at the
moment; I can turn it off if you like).


Reply to this email directly or view it on GitHub
#205 (comment)
.

@josephwb
Copy link
Member Author

@jimallman the trees are the exact same except for labelling.

@kcranston
Copy link
Member

+1 for including them
Ambivalent about option to exclude. Hold off until it becomes a problem? Would be a compatible change.

@josephwb
Copy link
Member Author

"nontaxon" nodes could be part of the query; a reason for them to be returned.

@jimallman
Copy link
Member

the trees are the exact same except for labelling

thanks, my Newick-fu is lacking :bowtie:

@josephwb
Copy link
Member Author

Here is subtree. Before:

curl -X POST http://localhost:7474/db/data/ext/tree_of_life_v3/graphdb/subtree -H "content-type:application/json" -d '{"node_id":"ott803675"}'
{
  "newick" : "(Gavia_stellata_ott1057044,((Gavia_arctica_ott1085739,Gavia_pacifica_ott651474),(Gavia_immer_ott1057518,Gavia_adamsii_ott90560)))Gavia_ott803675;",
  "synth_id" : "opentree4.1"
}

After:

curl -X POST http://localhost:7474/db/data/ext/tree_of_life_v3/graphdb/subtree -H "content-type:application/json" -d '{"node_id":"ott803675"}'
{
  "newick" : "(Gavia_stellata_ott1057044,((Gavia_arctica_ott1085739,Gavia_pacifica_ott651474)mrcaott651474ott1085739,(Gavia_immer_ott1057518,Gavia_adamsii_ott90560)mrcaott90560ott1057518)mrcaott90560ott651474)Gavia_ott803675;",
  "synth_id" : "opentree4.1"
}

@josephwb
Copy link
Member Author

Deployed on test for testing. Let me know if you want me to revert things.

@jar398
Copy link
Member

jar398 commented Feb 17, 2016

It would be compatible in the sense of having a /v3 URL instead of a /v2
URL. But it's easy to imagine clients used to the v2 Newicks barfing on
the v3 Newicks. That's OK with me, just cautioning on "would be a
compatible change"

But I have no strong feelings.

On Wed, Feb 17, 2016 at 5:02 PM, Joseph W. Brown [email protected]
wrote:

"nontaxon" nodes could be part of the query; a reason for them to be
returned.


Reply to this email directly or view it on GitHub
#205 (comment)
.

@josephwb
Copy link
Member Author

Please formulate some definite feelings, strong or no; otherwise how can this possibly move forward (or not)?

opentreeapi pushed a commit to OpenTreeOfLife/phylesystem-0 that referenced this issue Feb 17, 2016
See examples here: OpenTreeOfLife/treemachine#205 (comment)

(Update document 'tt_35' via OpenTree API)
@jimallman
Copy link
Member

Just FYI, I've imported both the old and new Newick strings from this comment above, on devtree:

https://devtree.opentreeoflife.org/curator/study/edit/tt_35/?tab=trees

@jar398
Copy link
Member

jar398 commented Feb 17, 2016

Look above for the strongest uncontradicted opinion. (It's "+1 for
including them
Ambivalent about option to exclude. Hold off [on making them optional]
until it becomes a problem".)

A different, stronger, better supported opinion might come along later,
e.g. from Mark or one of our users. So just write the code so it's easy to
turn on or off. No big deal.

Getting this right immediately is not as important as getting things like
'tax_sources' right immediately because we don't have code inside the
project that depends on it. That's why my attitude is different on this
issue compared to #207.

@josephwb
Copy link
Member Author

It is a big deal when I get lambasted for not doing exactly what is expected. The views expressed above are nowhere near the strong gatekeeper consensus you expressed during the hangout. I'll not be making any more decisions; tell me what you want, and I'll do it.

Be explicit:

  1. Do this or not?
  2. If yes, should there be a user argument to turn on/off?
  3. If yes, what is the exact name of the argument?
  4. If yes, what is the default value?

@jar398
Copy link
Member

jar398 commented Feb 18, 2016 via email

@josephwb
Copy link
Member Author

Where does the spec come from? Me? Does 1-4 above, together withe examples, suffice? If not, what? How is the spec reviewed?

@jar398 jar398 assigned jar398 and unassigned josephwb Feb 18, 2016
@jar398
Copy link
Member

jar398 commented Feb 18, 2016 via email

@jar398
Copy link
Member

jar398 commented Mar 24, 2016

Current behavior:
curl -X POST https://devapi.opentreeoflife.org/v3/tree_of_life/subtree -H "content-type:application/json" -d '{"node_id":"ott803675"}'
{ "newick" : "(Gavia_stellata_ott1057044,((Gavia_arctica_ott1085739,Gavia_pacifica_ott651474)mrcaott651474ott1085739,(Gavia_immer_ott1057518,Gavia_adamsii_ott90560)mrcaott90560ott1057518)mrcaott90560ott651474)Gavia_ott803675;"}

The answer to #1 is yes. #2 is a good question. @kcranston what do you think, should we have a flag that controls whether to get v2-like behavior (i.e. suppress those long and mostly useless internal node labels)? Having seen some of these trees I think maybe we should.

@kcranston
Copy link
Member

In response to @josephwb questions:

  1. Do this or not? yes
  2. If yes, should there be a user argument to turn on/off? yes, there should be an option
  3. If yes, what is the exact name of the argument? include_all_node_labels (if true, then ott and mrca labels, if false, then ott labels only)
  4. If yes, what is the default value? false (i.e. the default is the v2 behaviour)

(edited to change default from true to false and clarification on what the flag does)

@jar398
Copy link
Member

jar398 commented Mar 24, 2016 via email

@josephwb
Copy link
Member Author

Is the goal to get this implemented for the current release? Seems low priority, as anyone getting trees is likely to throw out any internal node label. Plus, it feels awkward to have 2 label args:

  1. label_format
  2. include_all_node_labels

I'm inclined to just return all the labels and let the user dispose of them if they wish. Getting rid of the mrcaottXXXotYYY labels but not the ottWWW label isn't going to help anyone get anything done.

@jar398
Copy link
Member

jar398 commented Mar 24, 2016 via email

@jar398
Copy link
Member

jar398 commented Mar 28, 2016

include_all_node_labels is implemented (v3 branch on devapi) as @kcranston requested above.

I'm not too happy about the flag name.
I'm groping for something short along the lines of 'include_id_when_unnamed' or 'use_id_for_label_when_node_is_unnamed' since the flag says whether the id should be used as the label when the node has no name (OTT id), instead of nothing at all. (For named nodes the label_format tells what label to use.) It's possible to put a spin on 'include_all_node_labels' that makes it a correct description, but it's hard: you have to say that the a priori label is according to the format, if named, or is the id, if unnamed; then you have to say "all labels" means use the a priori label in every case, while "not all labels" means use the a priori label only when the node is named.

There are 6 combinations of format + this flag, and they all make sense, but a couple of them are rather odd.

@jar398
Copy link
Member

jar398 commented May 10, 2016

@kcranston This issue just needs documentation and test cases and it will be done. Shall we (I) proceed?

@jimallman
Copy link
Member

Do we want to expose this option in the webapps, perhaps as a checkbox? Or a drop-down, if there are more than two meaningful choices.

@jar398
Copy link
Member

jar398 commented May 10, 2016 via email

@jimallman
Copy link
Member

Sounds good to me.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants