Request: Training #2

Salakar · 2016-01-10T20:23:46Z

Hello!

Have you done any work on training, adding entities and such?

I can help; I just need the base structure in place first, as my C++ is a little poor, hah.

I can do the stemmers and such. I was also thinking about how the training instances would look. It might be possible to tag entities within the string itself, something like:

const instance = 'This library by {bhelx}=PERSON is cool';

Or some other syntactic sugar that finds the entities in the string and adds them automatically, rather than manually providing the stemmed entity word positions one by one. Having options for both would be good too.
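
To make the idea concrete, here is a minimal sketch of how that inline syntax could be turned into plain tokens plus entity positions. Everything in it is an assumption for illustration: `parseTaggedInstance` is a hypothetical helper, not something this library or MITIE provides, and the whitespace split stands in for whatever tokenizer the trainer would really use.

```javascript
// Hypothetical helper, not part of this library or of MITIE.
// Parses inline markup such as '{bhelx}=PERSON' into plain tokens plus
// entity ranges expressed as token offsets (start, length, label).
function parseTaggedInstance(tagged) {
  const tokens = [];
  const entities = [];
  const pattern = /\{([^{}]+)\}=([A-Za-z_]+)/g;
  let cursor = 0;
  let match;

  // Naive whitespace tokenizer (an assumption made for this sketch).
  // Appends the tokens found in `text` and returns how many were added.
  const pushTokens = (text) => {
    const parts = text.split(/\s+/).filter(Boolean);
    tokens.push(...parts);
    return parts.length;
  };

  while ((match = pattern.exec(tagged)) !== null) {
    // Untagged text between the previous match and this one.
    pushTokens(tagged.slice(cursor, match.index));

    // The tagged span: record where it starts and how many tokens it covers.
    const start = tokens.length;
    const length = pushTokens(match[1]);
    if (length > 0) {
      entities.push({ start, length, label: match[2] });
    }
    cursor = match.index + match[0].length;
  }

  // Whatever follows the last tagged span.
  pushTokens(tagged.slice(cursor));
  return { tokens, entities };
}

const parsed = parseTaggedInstance('This library by {bhelx}=PERSON is cool');
console.log(parsed.tokens);   // [ 'This', 'library', 'by', 'bhelx', 'is', 'cool' ]
console.log(parsed.entities); // [ { start: 3, length: 1, label: 'PERSON' } ]
```

The markup is stripped before tokenizing, so the positions refer to the clean token list and the manual "positions one by one" form could still be accepted alongside it.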

Thoughts?

bhelx · 2016-01-10T23:20:09Z

Hey @Salakar

I haven't investigated custom training yet, but it is on my TODO list. I assume we'd want to follow the way the underlying C library does it, so I think it makes sense to pass in the token locations the way the C API does. Here is a Python example:

https://github.com/mit-nlp/MITIE/blob/master/examples/python/train_ner.py

Copying that Python API would probably be the most straightforward approach. We probably wouldn't want to alter the source text, because the trainer needs to know the token locations, but I'd need to know more about how the parser would work.
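
As a rough illustration of what "pass in the token locations" could look like on the JavaScript side, here is a sketch of a training-instance object that takes a token list and entity ranges given as token offsets. The class name, method names, and validation rules are all hypothetical; nothing like this exists in the library yet, and rejecting overlapping ranges is an assumption of the sketch rather than confirmed trainer behavior.

```javascript
// Hypothetical shape for a training instance. No such class exists in this
// library; the names here are placeholders for discussion only.
class NerTrainingInstance {
  constructor(tokens) {
    this.tokens = tokens.slice(); // copy so the caller's array is not mutated
    this.entities = [];
  }

  // start and length are token offsets, not character offsets.
  addEntity(start, length, label) {
    const valid =
      Number.isInteger(start) &&
      Number.isInteger(length) &&
      start >= 0 &&
      length >= 1 &&
      start + length <= this.tokens.length;
    if (!valid) {
      throw new RangeError('entity range falls outside the token list');
    }

    // Assumption: overlapping entities are rejected in this sketch.
    const overlaps = this.entities.some(
      (e) => start < e.start + e.length && e.start < start + length
    );
    if (overlaps) {
      throw new Error('entity range overlaps an existing entity');
    }

    this.entities.push({ start, length, label });
    return this; // allow chaining
  }

  // Returns the tokens covered by the entity at the given index.
  entityTokens(index) {
    const { start, length } = this.entities[index];
    return this.tokens.slice(start, start + length);
  }
}

const sample = new NerTrainingInstance(['This', 'library', 'by', 'bhelx', 'is', 'cool']);
sample.addEntity(3, 1, 'PERSON');
console.log(sample.entityTokens(0)); // [ 'bhelx' ]
```

Keeping the tokens untouched and storing the ranges separately means the source text is never altered, which is the concern raised above.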

RahulPol · 2017-05-18T02:47:27Z

Is this done?

bhelx · 2017-05-18T16:25:18Z

@RahulPol I'm not working on it. I'm not sure about @Salakar.
