Skip to content

Convert the text from PDF to JSON for use with Doccano, an open-source text annotation tool, for validation.

License

Notifications You must be signed in to change notification settings

rabee05/scispacy-nlp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 

Repository files navigation

scispacy-nlp

In a nutshell, the notebook contains multiple Python functions to accomplish the following:

  • Download a list of drug labels from FDA.
  • Convert the text from PDF to JSON for use with Doccano, an open-source text annotation tool, for validation.
  • Run three SciSpacy models on the extracted text for inference to extract Named Entity Recognition (NER) entities, such as drugs, diseases, and genes, from FDA drug label, then save various types of files, including entities and their positions, for validation.

About

Convert the text from PDF to JSON for use with Doccano, an open-source text annotation tool, for validation.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published