harvardnlp/image-extraction

Code for extracting a representative image from a PDF file using computer vision (CV).

This code must be run on a GPU. A Colab example is included.
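Because the code needs a GPU, it can help to confirm one is visible before starting. The following is a minimal sketch, not part of this repository: it assumes an NVIDIA GPU and that the `nvidia-smi` tool is on the PATH, and the helper name `gpu_available` is hypothetical.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if nvidia-smi is installed and lists at least one GPU.

    Assumption: an NVIDIA GPU with the standard driver tools installed.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        # "nvidia-smi -L" prints one line per detected GPU.
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("GPU detected" if gpu_available() else "No NVIDIA GPU detected")
```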

Setup

bash setup.sh

Running

First, add the PDF files you want to process to a directory pdfs/.

Next, run:

python run.py pdfs/ pics/

The code will attempt to extract one image for each PDF into the pics/ directory.
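Since extraction is only attempted, some PDFs may end up without an image. A minimal sketch for listing them after a run is shown here. It assumes each output image shares its base filename with the source PDF (for example `paper.pdf` and `paper.png`), which this README does not confirm; the function name `pdfs_without_image` is hypothetical.

```python
from pathlib import Path


def pdfs_without_image(pdf_dir, pic_dir):
    """List PDF filenames in pdf_dir with no same-named file in pic_dir.

    Assumption: outputs keep the PDF's base name, with any extension.
    """
    pdf_dir, pic_dir = Path(pdf_dir), Path(pic_dir)
    produced = set()
    if pic_dir.is_dir():
        produced = {p.stem for p in pic_dir.iterdir() if p.is_file()}
    return sorted(
        p.name for p in pdf_dir.glob("*.pdf") if p.stem not in produced
    )


if __name__ == "__main__":
    for name in pdfs_without_image("pdfs/", "pics/"):
        print("no image extracted for", name)
```

If the output naming differs, adjust the comparison in `pdfs_without_image` accordingly.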
