Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

synth node mrca labels not returned #111

Open
snacktavish opened this issue Jan 13, 2022 · 18 comments
Open

synth node mrca labels not returned #111

snacktavish opened this issue Jan 13, 2022 · 18 comments

Comments

@snacktavish
Copy link
Member

https://github.com/OpenTreeOfLife/germinator/wiki/Synthetic-tree-API-v3#subtree-tree-of-life

The expected return of this call
curl -X POST https://api.opentreeoflife.org/v3/tree_of_life/subtree -H "content-type:application/json" -d '{"node_id":"ott803675"}'

shows the synth mrca node labels - but those don't seem to be returned anymore.

 "newick": "(((Gavia_adamsii_ott90560,Gavia_immer_ott1057518),(Gavia_pacifica_ott651474,Gavia_arctica_ott1085739)),Gavia_stellata_ott1057044,Gavia_egeriana_ott3595544,Gavia_fortis_ott3595545,Gavia_concinna_ott3595546,Gavia_howardae_ott3595548,Gavia_brodkorbi_ott6151844,Gavia_pusilla_ott6151845,Gavia_roseiventris_ott7659869)Gavia_ott803675;",
 "supporting_studies": [
  "ot_521@tree1",
  "ot_809@tree2",
  "ot_104@tree1"
 ]

I was planning to use them to link dates from studies to synth subtrees.

@mtholder
Copy link
Member

I seem to remember this bug showing up once (at least) before. @bredelings do you recall that?

@mtholder
Copy link
Member

I just logged on to the server and checked the raw otcetera call and the call through ws_wrapper. The bug is present in both, so I think this is an otcetera issue not a ws_wrapper issue.

@mtholder
Copy link
Member

or perhaps those labels are from propinquity. That is another possibility...

@bredelings
Copy link
Contributor

Or possibly those labels are only added to the tree if you ask for arguson?

@mtholder
Copy link
Member

They are in the newick example of the docs https://github.com/OpenTreeOfLife/germinator/wiki/Synthetic-tree-API-v3#subtree_tree

@bredelings
Copy link
Contributor

bredelings commented Jan 13, 2022

Looking at the source code, this functionality is disabled by default but you can get the internal node labels by adding "include_all_node_labels" : true:

curl -X POST https://api.opentreeoflife.org/v3/tree_of_life/subtree -H "content-type:application/json" -d '{"node_id":"ott803675","include_all_node_labels":true}'

See

bool all_node_labels = extract_argument_or_default<bool>(parsedargs, "include_all_node_labels", false);

Also, we have this fun comment:

    // FIXME: According to treemachine/ws-tests/tests.subtree, there is an "include_all_node_labels"
    //        argument.  Unless this is explicitly set to true, we are supposed to not write node labels
    //        for non-ottids.  At least in Newick.

@bredelings
Copy link
Contributor

Perhaps the least intrusive change would be to document the "include_all_node_labels" argument and update the sample output?

@bredelings
Copy link
Contributor

bredelings commented Jan 13, 2022

Here's a previous discussion of this: OpenTreeOfLife/treemachine#205
It looks like the design consensus was that we should have the current behavior (default to no labels, but add them given a flag) to preserve compatibility with v2.

@bredelings
Copy link
Contributor

bredelings commented Jan 13, 2022

And here is the code in ws-tests that checks that we do NOT include internal node labels by default.

https://github.com/OpenTreeOfLife/treemachine/blob/master/ws-tests/test_subtree.py

@snacktavish
Copy link
Member Author

Ah super helpful! Thanks. Should we change the default to True, or update the docs to show argument and default is False and correct the example output?
I added it as an option in the branch I'm working on in python-opentree, but haven't PR'd.
OpenTreeOfLife/python-opentree@b2a6935

@mtholder
Copy link
Member

hmm. I'd say that we should definitely pass the "include_all_node_labels": true from the client when we want the labels (in case we change our mind in a future version of the API).
My instinct is that we should make the implementation match the docs, but the fact that we have a previous issue indicating that we purposefully chose to make ""include_all_node_labels": false the default, but then forgot to document that is troubling.
I'm not sure whether we should fix the docs or the impl.
I'd vote updating the docs, but will defer if others have a strong preference.

@snacktavish
Copy link
Member Author

I have a guess that it was changed to not include internal node labels because when label_format = 'name' a bunch of internal node labels have parens in them. Which, as discussed in various other issues I can't find right now, is legal newick, but that many treeviewers cannot cope with. (e.g. https://tree.opentreeoflife.org/taxonomy/browse?id=996421 for a label with several characters that can cause issues) .There are a few tips with that issue as well, but way fewer.

@snacktavish
Copy link
Member Author

Which is to say, I think setting the default to False and updating the docs makes sense. I have been contemplating suggesting that we don't include parens in taxon labels, but that seems like a whole PITA. Any they should be allowed anyways! It is the tree viewers who are wrong!

@snacktavish
Copy link
Member Author

snacktavish commented Jan 14, 2022

I think my theory is wrong anyhow - because even when include_all_node_labels = False, named taxa are still included in internal node labels, just the mrca labels are not
e.g.
curl -X POST https://api.opentreeoflife.org/v3/tree_of_life/subtree -H "content-type:application/json" -d '{"node_id":"ott524059", "include_all_node_labels":false, "label_format":"name"}'

So the label string problems are not solved anyhow.

Edited to add: Example from user issue https://gitter.im/OpenTreeOfLife/public?at=61ca3aafe1a1264f0a2adde9

@bredelings
Copy link
Contributor

We could also switch the default for v4. It seems like switching the default for v3 basically depends on whether we think any existing software that isn't expecting internal node names would break if we added them.

I have no reason to think there are newick parsers that can't handle internal node names... and yet the badness of newick parsers seems to be unbounded below. It kind of seems like a judgement call.

@snacktavish
Copy link
Member Author

I put it on the agenda to discuss at our next dev meeting!

@snacktavish
Copy link
Member Author

I will update documentation to reflect current behavior.

@bredelings
Copy link
Contributor

OK, new branch with 1-line fix: default-include-node-labels

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants