Match only noun phrases #1

Open
JDziurlaj opened this issue Dec 28, 2018 · 0 comments

Use the following Tregex pattern:

( NP [<NN | <NNS])

This selects every noun phrase (NP) that directly dominates a common noun (NN or NNS), i.e. the lowest-level noun phrases, whose nouns are direct children rather than being nested inside other noun or verb phrases.
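To make the intent of the pattern concrete, here is a minimal Python sketch of the same selection over a hand-built parse tree. It is not Tregex itself: the nested-tuple tree format, the sample tree, and the names `is_preterminal`, `leaves`, and `match_noun_phrases` are all assumptions made for illustration.

```python
# Illustrative stand-in for the Tregex pattern ( NP [<NN | <NNS]).
# Assumed tree format: (label, child, child, ...) for phrases and
# (tag, word) for preterminal (part-of-speech) nodes.

def is_preterminal(node):
    """A preterminal node is a (tag, word) pair whose second item is a string."""
    return len(node) == 2 and isinstance(node[1], str)

def leaves(node):
    """Return the words under a node, left to right."""
    if is_preterminal(node):
        return [node[1]]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words

def match_noun_phrases(tree):
    """Yield every NP subtree that has an NN or NNS node as a direct child."""
    if is_preterminal(tree):
        return
    label, *children = tree
    if label == "NP" and any(child[0] in ("NN", "NNS") for child in children):
        yield tree
    for child in children:
        yield from match_noun_phrases(child)

# Hypothetical parse of "the list of contests changes".
TREE = (
    "S",
    ("NP",
        ("NP", ("DT", "the"), ("NN", "list")),
        ("PP", ("IN", "of"), ("NP", ("NNS", "contests")))),
    ("VP", ("VBZ", "changes")),
)

MATCHES = [" ".join(leaves(np)) for np in match_noun_phrases(TREE)]
```

The outer NP is skipped because its direct children are an NP and a PP, not a noun; only the two inner noun phrases are selected.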

Any non-noun tokens inside a matched noun phrase (e.g. determiners and adjectives) then need to be removed.
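The clean-up step could look roughly like the following Python sketch. It assumes the noun phrase is already available as `(word, tag)` pairs with Penn Treebank tags, and it keeps every tag that starts with `NN` (which also lets proper nouns through); both the input format and the name `keep_nouns` are assumptions for illustration.

```python
def keep_nouns(tagged_tokens):
    """Keep only tokens whose POS tag starts with NN (NN, NNS, NNP, NNPS).

    Assumption: tokens arrive as (word, tag) pairs using Penn Treebank tags.
    """
    return [word for word, tag in tagged_tokens if tag.startswith("NN")]

# Hypothetical tagged noun phrase.
PHRASE = [("the", "DT"), ("provisional", "JJ"), ("ballot", "NN"), ("styles", "NNS")]
NOUNS = keep_nouns(PHRASE)
```

If proper nouns should be excluded, the check can be narrowed to `tag in ("NN", "NNS")`, which mirrors the tags named in the pattern.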

NLP Approach

Glossary terms should be handled as named entities, so they need to be added to the gazetteer. This could be accomplished with the RegexNER annotator.

Each term will have an entry in a mapping file, containing a pattern such as:

( [{pos:/NN/;word:/style(s)?/}] )

Each pattern maps to a GLOSSARY_ENTRY tag whose normalized value is the term itself. Conversion to GitHub linking conventions can be done downstream.
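As a rough sketch of how the mapping file might be generated, the Python below builds one tab-separated line per single-word glossary term, reusing the pattern shape shown above. The three-column layout (pattern, tag, normalized term) is an assumption and would have to match whatever header the annotator is configured with; the helper names `token_pattern` and `mapping_line` and the sample terms are hypothetical.

```python
import re

TAG = "GLOSSARY_ENTRY"

def token_pattern(term):
    """Build a pattern for a single-word term, with an optional plural 's'."""
    return "( [{pos:/NN/;word:/" + re.escape(term) + "(s)?/}] )"

def mapping_line(term):
    """One mapping entry: pattern, tag, normalized term (assumed column order)."""
    return "\t".join([token_pattern(term), TAG, term])

# Hypothetical single-word glossary terms.
TERMS = ["style", "precinct"]
MAPPING = "\n".join(mapping_line(term) for term in TERMS)
```

One point worth checking against the annotator's documentation: if attribute regexes are matched against the whole tag, `pos:/NN/` would not cover plural forms tagged NNS, and `pos:/NNS?/` may be what is intended.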

JDziurlaj transferred this issue from HiltonRoscoe/GlossaryMD on Jan 24, 2019