Improving NER #294
Comments
@GautamR-Samagra Hi, could I please get access to the datasets? I'd like to contribute to this issue. |
Hi @basedsaksham, the idea is to treat NER as a model that does multiple things under the hood:
As input, I pass an argument listing the entities I want to extract, and the model uses either regex or the seq-seq model to extract those entities. |
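A minimal sketch of the dispatch idea described above, assuming a registry that maps each entity name to either a regex extractor or a model-backed extractor. The names `EXTRACTORS`, `extract_entities` and `extract_crop_with_model` are hypothetical, and the "model" is only a lookup stub standing in for the real seq-seq model.

```python
import re
from typing import Callable, Dict, List

# Hypothetical extractor signature: text in, list of matched strings out.
Extractor = Callable[[str], List[str]]

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_email(text: str) -> List[str]:
    """Regex-backed extractor."""
    return EMAIL_RE.findall(text)


def extract_crop_with_model(text: str) -> List[str]:
    """Stand-in for the seq-seq model: a tiny lookup replaces real inference."""
    known_crops = {"wheat", "paddy", "cotton"}  # assumption, for illustration only
    return [tok for tok in re.findall(r"\w+", text.lower()) if tok in known_crops]


# Registry deciding which backend handles which entity type.
EXTRACTORS: Dict[str, Extractor] = {
    "email": extract_email,
    "crop_name": extract_crop_with_model,
}


def extract_entities(text: str, entities: List[str]) -> Dict[str, List[str]]:
    """Run only the extractors for the requested entity types."""
    results: Dict[str, List[str]] = {}
    for name in entities:
        if name not in EXTRACTORS:
            raise KeyError(f"no extractor registered for entity type {name!r}")
        results[name] = EXTRACTORS[name](text)
    return results
```

With this shape, the caller never needs to know whether an entity is handled by regex or by the model; adding a new entity type means registering one more function.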
Hey @GautamR-Samagra, I have written code that detects the required entities (time, email, phone number, number and unit) and also predicts the extracted time (in both Hindi and English) using regex: https://colab.research.google.com/drive/1DAg0xKBYMnXXcQzFwBK2aQppoyoddj1X?authuser=1#scrollTo=6qNzUpRqjnrS This is what I have done so far. I am working on integrating everything mentioned in the issue. |
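For readers without access to the notebook, here is a rough sketch of what regex extraction for a few of these entity types could look like. It is not the notebook's code. The patterns are assumptions: phone numbers are taken to be 10-digit Indian mobile numbers with an optional +91 prefix, the unit list is illustrative, and only English 12-hour times are covered (Hindi time expressions would need their own patterns).

```python
import re

# Assumption: 10-digit Indian mobile numbers, optionally prefixed with +91.
PHONE_RE = re.compile(r"(?<!\d)(?:\+91[\s-]?)?[6-9]\d{9}(?!\d)")

# Assumption: a small illustrative unit list; longer units come before "l".
QUANTITY_RE = re.compile(
    r"(?<![\w.])(\d+(?:\.\d+)?)\s*(kg|g|litre|l|ml|acre|hectare)\b",
    re.IGNORECASE,
)

# English 12-hour clock times such as "5 pm" or "5:30 pm".
TIME_RE = re.compile(r"\b(1[0-2]|0?[1-9])(?::([0-5]\d))?\s*(am|pm)\b", re.IGNORECASE)


def extract_regex_entities(text: str) -> dict:
    """Return phone numbers, (value, unit) quantities and time strings found in text."""
    return {
        "phone_number": [m.group(0) for m in PHONE_RE.finditer(text)],
        "quantity": [(float(v), u.lower()) for v, u in QUANTITY_RE.findall(text)],
        "time": [m.group(0) for m in TIME_RE.finditer(text)],
    }
```

Returning the number and unit as a pair keeps the value usable for later comparison or conversion.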
unable to open it |
please try now |
Hey @GautamR-Samagra, I worked on the NER notebook you shared and tried to add crop_symptoms to it alongside crop_name and crop_disease. |
@adityathenerd let me know if you were able to fix issues with it. still seeing |
@adityathenerd and @basedsaksham, you have worked on separate aspects of this. We must integrate both parts into a new module, ner --> agri_ner, inside ai-tools.
Proposed folder structure
The structure should mirror the existing model setup, such as the one for text classification, but with extra files for each kind of NER we do. The folder structure can look like this:
Please collaborate with each other and make a PR to ai-tools for this. regex NER here |
Hey @GautamR-Samagra, I found out what the problem was. The dataset didn't have enough pest-related tags, so the model was not able to predict those well. I am working on adding 7-8 more pest-related sentences to the dataset. It should work fine once they are added. Will update by EOD. |
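One way to spot this kind of gap before training is to count how many mentions each entity type has in the dataset. The sketch below assumes sentences stored as lists of (token, tag) pairs with BIO-style tags, which may not match the actual dataset format. The `pest` tag name and the threshold are illustrative.

```python
from collections import Counter


def count_entity_mentions(sentences):
    """Count entity mentions per type; each B- tag starts one mention."""
    counts = Counter()
    for sentence in sentences:
        for _token, tag in sentence:
            if tag.startswith("B-"):
                counts[tag[2:]] += 1
    return counts


def underrepresented(counts, expected_types, min_mentions):
    """List the expected entity types that have fewer than min_mentions mentions."""
    return [t for t in expected_types if counts[t] < min_mentions]


# Toy data in the assumed (token, tag) format.
sentences = [
    [("wheat", "B-crop_name"), ("has", "O"), ("leaf", "B-crop_disease"), ("rust", "I-crop_disease")],
    [("aphids", "B-pest"), ("on", "O"), ("cotton", "B-crop_name")],
]
counts = count_entity_mentions(sentences)
sparse_types = underrepresented(counts, ["crop_name", "crop_disease", "pest"], min_mentions=2)
```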
Model Link |
@Shubh-Goyal-07 can you link your PR here |
The PR for the same has been made here: @GautamR-Samagra |
Goal:
We want to improve our NER model to include entities that come out of this
The overall goal is to extract any relevant entity from a question, and recognize which entity type it is, so that it can help us with a search.
Current state:
The model is created based on this dataset using this code.
The next steps are to extract the data from the PDF/CSV, create sentences in the required format (same as the dataset above), and then train the model. The 30k queries provided in the other ticket can also be used for this.
Some pre-decided entities are also here:
We need to have a common model that is able to detect all these entity types. We should be able to input a sentence and get back the entities detected for the sentence.
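As a sketch of the "sentence in, entities out" step, assuming the model is a token classifier that emits one BIO-style tag per token (the issue does not confirm this), the per-token tags can be grouped into entity spans like this. `decode_bio` and the tag names are hypothetical.

```python
def decode_bio(tokens, tags):
    """Group per-token BIO tags into a list of {"type", "text"} entities."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Emit the entity collected so far, if any.
        if current_type is not None:
            entities.append({"type": current_type, "text": " ".join(current_tokens)})

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" tags, and I- tags that do not continue the open entity, close it.
            flush()
            current_type, current_tokens = None, []
    flush()
    return entities


tokens = ["my", "wheat", "crop", "has", "leaf", "rust"]
tags = ["O", "B-crop_name", "O", "O", "B-crop_disease", "I-crop_disease"]
entities = decode_bio(tokens, tags)
```

The regex-based entities could be merged into the same output list so that callers get a single result per sentence.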