Corpus size for taxa and genes not matching KB UI? #200

Open
wdahdul opened this issue May 7, 2021 · 4 comments

@wdahdul
Contributor

wdahdul commented May 7, 2021

Question about the corpus numbers returned for taxa and genes. Shouldn't these match the KB UI numbers, which are 6184 taxa and 18537 genes?

Here's the output from rphenoscape:

> corpus_size("taxa")
[1] 785
> corpus_size("genes")
[1] 18662
@balhoff
Member

balhoff commented May 7, 2021

At least for taxa, these are counting different things. The similarity corpus is counting variation profiles resulting from our ancestral state procedure.

@hlapp
Copy link
Member

hlapp commented Sep 3, 2022

Indeed, they are counting different things. However, the corpus size for annotated-taxa is now reported as

> corpus_size("annotated-taxa")
[1] 6533

whereas the KB website says 6540. The KB info query also returns 6540:

> get_KBinfo()
Annotated taxa: 6540
Annotated characters: 14555
Annotated matrices: 256
Annotated states: 28461
Build time: 2022-07-07 01:24:38 EDT

@balhoff any suggestions as to why there is a discrepancy?

And I'm not sure how to get the UI to display a corpus size for genes. Where did you find that, @wdahdul?

@balhoff
Member

balhoff commented Sep 6, 2022

@hlapp at the moment the databases for the website and for the latest service API are different. This is because I have some pending work to do on the web UI to handle semantic similarity service changes.

hlapp added this to the pre-2023-TraitFest milestone Nov 4, 2022

@hlapp
Member

hlapp commented Jan 12, 2023

@balhoff any update here prior to the TraitFest?
