Extracting table & analysis-level metadata #28
Comments
Hi!
For the table caption: good idea. We actually already extract it, so it
is just a question of writing it in a more visible place and documenting
it. ATM it is buried in the 'articles' directory, e.g. in
query_a64755ef68b219b22aec44cd9fecdb07/articles/d6e/pmcid_9812244/tables/table_000_info.json
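To make the location above concrete, here is a minimal sketch that gathers every table info file under a pubget query directory. The directory layout is inferred only from the single example path above, the helper name `collect_table_info` is hypothetical, and no particular JSON key is assumed, since the thread does not say which field holds the caption.

```python
import json
from pathlib import Path


def collect_table_info(query_dir):
    """Load every table_*_info.json found under <query_dir>/articles.

    Assumed layout (inferred from one example path, not from pubget docs):
    articles/<bucket>/pmcid_<id>/tables/table_<n>_info.json
    """
    records = []
    pattern = "articles/*/pmcid_*/tables/table_*_info.json"
    for info_path in sorted(Path(query_dir).glob(pattern)):
        info = json.loads(info_path.read_text(encoding="utf-8"))
        # The article directory is named like "pmcid_9812244".
        pmcid = info_path.parent.parent.name.split("_", 1)[1]
        records.append({"pmcid": pmcid, "file": info_path.name, "info": info})
    return records
```

Calling `collect_table_info("query_a64755ef68b219b22aec44cd9fecdb07")` would return one record per table. Inspecting `record["info"].keys()` on a real file is the safest way to find the field that holds the caption.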
For p-value, region, and contrast name: ATM pubget does not have the
ability to extract them. I think it would probably be useful, but like
extracting coordinates, it would require a good amount of trial and
error, and then some work to estimate how accurate the extraction
is. I would be inclined to wait and see whether, and how many, users ask
for it once the tools have been advertised a bit more. However, I think
it is well within the scope of the project, so if someone wants to tackle
it (including the validation), that would be a welcome contribution.
Do you know if the accuracy of ACE's extraction of p-value, region, and
contrast name has been evaluated? Also, AFAIK neurosynth doesn't use them in
any way, is that correct?
|
That makes sense. Indeed, ACE has a lot of trial and error in the heuristics it uses to get those fields (although p-value is actually rather easy; it's contrast name and region that are a bit harder). I actually have an undergrad working on doing some QA on newly extracted ACE data, so I can have him take a look at that. Neurosynth doesn't use this data, nor is it in the official neurosynth data output. I'm working on changing that, so that at least neurostore can ingest it. |
That's great. The QA work, combined with manually curated coordinates
from neurostore, will produce some good validation data for future
improvements to table processing, both in ACE and in pubget.
|
opened #38 to deal with this |
Hi @jeromedockes
I know that pubget extracts the table_id and table_label alongside the coordinates. Does pubget have the ability to extract more metadata, such as p-value, region, or contrast name, or is there interest in expanding it to do so?
At the level of the table, we could also try to extract the table caption.
I'm asking because it seems ACE can do this, and it may be helpful metadata for neurosynth-compose users.