Reseractor

An NLP-based tool for domain-specific extraction from research paper PDFs

Overview

Reseractor is a tool for domain-informed scientific data extraction from extensive online resources. It combines a novel algorithm, Whitespace, with advanced NLP techniques to extract relevant information for the materials science domain. In tests across diverse datasets it extracted this information accurately, which shows promise for broader applications.

Features

Integrated OCR and Layout Parsing
Database/Corpus Generation
Relevancy Check
Data Extraction
Multiple PDF uploads

Installation

Clone this repository locally with the command git clone https://github.com/d-mittal-21/Reseractor.git, or simply download the zip file from here.

Make sure all the packages listed in the requirements.txt file are installed before continuing, for example with pip install -r requirements.txt.

Download the pre-trained model from this link and store it in the models folder.

Now run the command python main.py to start the interactive GUI window.

Next, follow the Demo Video to get used to the GUI.
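
The installation prerequisites can be checked before launching the GUI. The following is a minimal sketch, not part of the repository: the function names requirement_names, missing_packages and preflight are hypothetical. It assumes that requirements.txt sits in the repository root and that the downloaded model is stored somewhere inside the models folder. Since the model's file name is not stated here, the check only confirms that the folder is not empty.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(text):
    """Return the package names found in requirements.txt-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        # Skip blank lines, pip options (-r, -e, ...) and direct URL requirements.
        if not line or line.startswith("-") or "://" in line:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def preflight(repo_root="."):
    """Return a list of human-readable problems; an empty list means ready."""
    root = Path(repo_root)
    problems = []

    req = root / "requirements.txt"
    if req.is_file():
        for name in missing_packages(requirement_names(req.read_text())):
            problems.append(f"package not installed: {name}")
    else:
        problems.append("requirements.txt not found")

    # Assumption: any file in models/ counts as "model present".
    models = root / "models"
    if not models.is_dir() or not any(models.iterdir()):
        problems.append("models folder is missing or empty")
    return problems


if __name__ == "__main__":
    issues = preflight()
    print("\n".join(issues) if issues else "Ready: run python main.py")
```

Run from the repository root, the script prints one line per problem it finds. It compares package names only and does not verify pinned versions, so pip remains the authority on whether the requirements are satisfied.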

Implementation

Demo Video

tool_video.mp4
