Vocabulary support #21

Open
lgessler opened this issue Nov 15, 2023 · 2 comments
Assignees
Labels
enhancement A development proposal that extends functionality

Comments

@lgessler
Owner

Implement Vocabularies, as specified in the documentation.

@lgessler lgessler added the enhancement A development proposal that extends functionality label Nov 15, 2023
@lgessler
Owner Author

@michaelpginn makes the good point that given their different behaviors, the version of this used for tokens should probably be different from the one used for spans and relations.

@lgessler lgessler assigned lgessler and rsquinlan and unassigned lgessler Apr 16, 2024
@lgessler
Owner Author

  • See data model design doc for details
  • We're going to use the full many-to-many-capable approach designed for tokens for spans and relations as well. It's unlikely that a many-to-many structure would be meaningful for spans and relations, but reusing the same Vocabulary mechanism for everything simplifies development
  • Core structures:
    • vocab-layer (global): has a name, also layer-type (token/span/relation)
    • vocab-item (global): has a form (for tokens, this is a reference form; for non-token applications, this is just the value, e.g. a POS tag); has properties (an arbitrary map of key-value pairs for now; this could be made structured in the future, e.g. to provide validation)
    • vocab-map (document-level): maps between tokens/spans/relations/... and a vocabulary item. Has members: collection of IDs of linked items
  • CRUD: vocab-layer
    • Delete: delete all vocab-items and vocab-maps
  • CRUD: vocab-item
    • Delete: delete all vocab-maps matching this vocabulary item
  • CRUD: vocab-map
    • Deletion of a token/span/relation: do the same thing we did for spans: remove it from the members of every vocab-map, then delete any vocab-map whose members collection is now empty (deletion on the token/span/relation layer should already trigger this automatically)
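
For illustration only, here is a minimal in-memory sketch of the three structures and the delete cascades listed above. It is written in Python for readability and is not the project's implementation. The `VocabStore` class, its method names, the `layer_id` back-reference on vocab-items, and the `document_id` field on vocab-maps are all assumptions introduced for this example; the plan above names only the structures, their fields, and the cascade rules.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class VocabLayer:
    id: str
    name: str
    layer_type: str  # "token", "span", or "relation"


@dataclass
class VocabItem:
    id: str
    layer_id: str  # assumption: each item belongs to exactly one vocab-layer
    form: str      # reference form for tokens, plain value (e.g. a POS tag) otherwise
    properties: Dict[str, str] = field(default_factory=dict)  # arbitrary key-value pairs


@dataclass
class VocabMap:
    id: str
    item_id: str      # the vocab-item this map links to
    document_id: str  # assumption: how "document-level" scoping is recorded
    members: Set[str] = field(default_factory=set)  # IDs of linked tokens/spans/relations


class VocabStore:
    """Hypothetical in-memory store, used only to show the cascade rules."""

    def __init__(self) -> None:
        self.layers: Dict[str, VocabLayer] = {}
        self.items: Dict[str, VocabItem] = {}
        self.maps: Dict[str, VocabMap] = {}

    def delete_item(self, item_id: str) -> None:
        # Deleting a vocab-item deletes every vocab-map that points at it.
        for map_id in [m.id for m in self.maps.values() if m.item_id == item_id]:
            del self.maps[map_id]
        self.items.pop(item_id, None)

    def delete_layer(self, layer_id: str) -> None:
        # Deleting a vocab-layer deletes its items, which in turn deletes their maps.
        for item_id in [i.id for i in self.items.values() if i.layer_id == layer_id]:
            self.delete_item(item_id)
        self.layers.pop(layer_id, None)

    def on_annotation_deleted(self, annotation_id: str) -> None:
        # Remove the deleted token/span/relation from every map that lists it,
        # then drop any map left with no members.
        for vocab_map in list(self.maps.values()):
            if annotation_id in vocab_map.members:
                vocab_map.members.remove(annotation_id)
                if not vocab_map.members:
                    del self.maps[vocab_map.id]
```

Because `members` is a set on each vocab-map and nothing stops the same annotation ID from appearing in several maps, this shape stays many-to-many-capable for tokens while also covering the simpler span and relation cases.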


2 participants