Looks great, maybe a Python version would be more competitive and attractive #89
Comments
Dear @lovejasmine,
I believe that Python has better support for many ML/DL packages such as TensorFlow, Torch, scikit-learn, etc.
All good reasons; however, can't deep learning models be developed in Python and used in Java?
Sure, it works with TensorFlow: you build and train the model in Python and decode in Java. I'm not sure whether Torch works or not.
Hello @lovejasmine! You have seen the DeLFT repo, which is in Python, so the fact that this specific entity-fishing project is written in Java is not an accident. Java is better suited to manipulating large data sets (see Hadoop or Spark), and that is the purpose of entity-fishing, whose knowledge base contains billions of objects; Python PDF parsers are also 20-50 times slower, etc. The ML part may rely on a Java library for now because it is quite basic, but you can see in DeLFT that I have built far superior DL NER models in Python (state of the art, actually), with some constraints related to size, embeddings, etc., with the idea of calling these models, saved in TF format, from Java (although running a service in a Docker container would also be a solution, I think). Embeddings are managed in exactly the same way in DeLFT and in entity-fishing for this purpose (because the inputs have to be similar), and the very small size of the DL models is also motivated by that objective. If you are interested in contributing to entity-fishing, that's really great, and the best way would actually be to contribute to DeLFT on the DL-related parts. In DeLFT, new NER models have been created, and I have started to work on an implementation of https://github.com/openai/deeptype with DeLFT, just for the final biLSTM-CRF labelling, after the type system has been generated. Finally, the plan is also to create a Python wrapper/client so that entity-fishing can be used transparently in Python, similarly to NLP libraries like spaCy (as you know, ML modules are never natively in Python).
Great @kermitt2