Metadata Indexing DCP Workshop Followup #11

Open
david4096 opened this issue Jul 5, 2018 · 0 comments

david4096 commented Jul 5, 2018

At the DCP workshop one of the breakout groups discussed metadata modeling, and we would like to continue that conversation with a little more code.

Please add your notes on what you expect from this followup, and any links that would help structure conversation.

Add yourself to this poll (and make sure I can get your contact info) if you're interested!

https://doodle.com/poll/4mtaeps9kp2pnqzh

I sent this note to the group that showed interest:


Hi All!

You're receiving this email because you either tagged yourself or were tagged by someone else at the DCP Workshop last week. We are hoping to make practical progress on making the best use of the DATS format, and on finding places to share code for metadata indexing.

Are you available at the normal KC7 time (Thursday 9-10am Pacific) next week? I think it's a natural time, but here is a Doodle poll to fill out in case we can't find consensus on it.

https://doodle.com/poll/4mtaeps9kp2pnqzh

If someone is a Zoom master and wants to take over hosting the actual conference (and recording it for others), I'd appreciate it. Otherwise, we'll meet on Hangouts.

And don't worry if you can't make it, we'll make the notes and anything useful learned available to the rest of KC7!

I'm basically looking for guidance from the folks who have used DATS and JSON-LD on how we can easily write and share code for denormalizing DATS into documents for indexing: #3
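To make "denormalizing into documents" concrete, here is a minimal sketch of the kind of code I have in mind. It flattens a nested, DATS-like JSON record into dotted-path keys that a search index can ingest. The record and its field names are hypothetical and have not been checked against the DATS schema, and skipping the JSON-LD `@` keywords is just one possible choice.

```python
from typing import Any, Dict, List


def denormalize(node: Any, prefix: str = "") -> Dict[str, List[Any]]:
    """Flatten nested dicts/lists into dotted-path keys mapping to lists of scalars."""
    flat: Dict[str, List[Any]] = {}
    if isinstance(node, dict):
        for key, value in node.items():
            # Assumption: JSON-LD keywords (@context, @type, @id) are not indexed.
            if key.startswith("@"):
                continue
            path = f"{prefix}.{key}" if prefix else key
            for k, v in denormalize(value, path).items():
                flat.setdefault(k, []).extend(v)
    elif isinstance(node, list):
        # List items share the parent's path, so repeated values are merged.
        for item in node:
            for k, v in denormalize(item, prefix).items():
                flat.setdefault(k, []).extend(v)
    elif node is not None:
        flat.setdefault(prefix, []).append(node)
    return flat


# Hypothetical DATS-like record; field names are illustrative only.
dataset = {
    "@type": "Dataset",
    "title": "Example study",
    "creators": [{"name": "Lab A"}, {"name": "Lab B"}],
    "distributions": [{"access": {"landingPage": "https://example.org/data"}}],
}

doc = denormalize(dataset)
```

Merging list items under one path loses the link between sibling fields of the same item, which is one of the trade-offs I'd like to discuss.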

To that end, if folks could come with:

  • DATS related code concepts/use cases
  • SPARQL queries against DATS
  • Suggestions for ETL processes (triple stores?)
  • dbGaP scraping/export techniques
  • ElasticSearch mappers/analyzers

I believe everyone on this list had something to say about one of these things. Thanks for taking the extra time to help us figure out places to share code!
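As a starting point for the mappers/analyzers item, here is a rough sketch of an index body with one custom analyzer, written as a plain Python dict so it can be passed to whatever client you use. The field names and the analyzer name `dats_text` are made up for illustration, and the typeless `mappings` layout is an assumption: some Elasticsearch versions expect a document type level under `mappings`.

```python
import json
from typing import Any, Dict


def build_index_body() -> Dict[str, Any]:
    """Return a hypothetical index body: one custom analyzer plus field mappings."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # Hypothetical analyzer name; lowercases and folds accents.
                    "dats_text": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "asciifolding"],
                    }
                }
            }
        },
        "mappings": {
            "properties": {
                # Free-text search on the title.
                "title": {"type": "text", "analyzer": "dats_text"},
                # Exact-match subfield for faceting, analyzed text for search.
                "creator_name": {
                    "type": "text",
                    "analyzer": "dats_text",
                    "fields": {"raw": {"type": "keyword"}},
                },
                # Identifiers are matched exactly, never tokenized.
                "identifier": {"type": "keyword"},
            }
        },
    }


body = build_index_body()
print(json.dumps(body, indent=2))
```

The `keyword` versus `text` split is where the "granularity" question shows up in practice: identifiers need exact matching, while descriptive fields need analysis.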

I'll offer a brief case study of how we use ElasticSearch to present detailed information about a project. One of the conceptual blockers that I would like the community's help formulating properly for the KC2 and KC7 groups is the "resolution" or "granularity" of metadata and unique identifiers. Should the DATS model be expected to contain information on how to resolve individual data objects per dataset, or is it a metadata representation of the project at a higher level? What is expected in terms of harmonization of variables, versus column names?

Best!
David
