This repository has been archived by the owner on May 22, 2019. It is now read-only.
Have you done any work on training, adding entities and such?
I can help, I just need the base structure of it in place, as my C++ is a little poor, hah.
I can do the stemmers and such. I was also thinking about how the training instances would look, and it might be possible to do entity tagging within the string, something like:
const instance = 'This library by {bhelx}=PERSON is cool';
Or some other syntactic sugar that finds the tagged instances in the string and adds them as entities automatically, rather than manually providing the stemmed entity word positions one by one. Having options for both would be good too.
Thoughts?
I haven't investigated custom training yet, but it is on my TODO list. I assume we'd want to follow the way the underlying C library does it, so I think it makes sense to just pass in the token locations the way the C API does. Here is a Python example:
Copying that Python API would probably be the most straightforward way. We probably wouldn't want to alter the source text, because the trainer needs to know the token locations, but I'd need to know more about how the parser would work.
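For discussion, here is a rough sketch of how such a parser might work. It is only an illustration under assumptions: `parseTaggedInstance` is a hypothetical helper, not part of any existing API, tokenization is plain whitespace splitting, and the `{start, length, label}` shape is a guess at what a trainer would want. The idea is that the tags are stripped before tokenizing, so the trainer only ever sees untagged tokens plus token positions.

```javascript
// Hypothetical helper, not part of any existing library API.
// Parses inline tags of the form {some words}=LABEL and returns the
// whitespace tokens of the untagged text plus a token range per entity.
function parseTaggedInstance(tagged) {
  const tagPattern = /\{([^{}]+)\}=([A-Za-z_]+)/g;
  const tokens = [];
  const entities = [];
  let cursor = 0;
  let match;

  // Split a chunk of text on whitespace, append the words to `tokens`,
  // and report how many words were appended.
  const pushWords = (text) => {
    const words = text.split(/\s+/).filter(Boolean);
    tokens.push(...words);
    return words.length;
  };

  while ((match = tagPattern.exec(tagged)) !== null) {
    // Plain text between the previous tag and this one.
    pushWords(tagged.slice(cursor, match.index));
    // The tagged words become ordinary tokens; remember where they sit.
    const start = tokens.length;
    const length = pushWords(match[1]);
    if (length > 0) {
      entities.push({ start, length, label: match[2] });
    }
    cursor = match.index + match[0].length;
  }
  // Remaining plain text after the last tag.
  pushWords(tagged.slice(cursor));

  return { tokens, entities };
}

const instance = 'This library by {bhelx}=PERSON is cool';
const parsed = parseTaggedInstance(instance);
console.log(parsed.tokens);   // tokens with the tag markup removed
console.log(parsed.entities); // token ranges and labels for each tag
```

With this shape, the explicit "pass token locations" style and the tagged-string style could share one code path: the tagged form would just be a convenience that produces the same tokens and ranges a caller could otherwise supply by hand.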