pdf2info

pdf2info is a simple table extraction library built on tabula.
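
For orientation, here is a minimal sketch of what tabula-based table extraction can look like in Python. It assumes the tabula-py package (which needs a Java runtime) and its `read_pdf` function. The `extract_tables` helper and its `reader` parameter are hypothetical and are not taken from this repository's code.

```python
from typing import Any, Callable, List, Optional


def extract_tables(
    pdf_path: str,
    reader: Optional[Callable[..., List[Any]]] = None,
) -> List[Any]:
    """Return every table found in pdf_path (DataFrames when tabula-py is used)."""
    if reader is None:
        # Imported lazily so this module loads even without tabula-py installed.
        import tabula  # assumption: the tabula-py package

        reader = tabula.read_pdf
    # pages="all" scans the whole document; multiple_tables=True keeps each
    # detected table as a separate item in the returned list.
    return reader(pdf_path, pages="all", multiple_tables=True)
```

The `reader` argument only exists so the helper can be exercised without Java or a real PDF; in normal use it is left out.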

Installation

Choose one of the following:

Local

Using the local python3 interpreter:

$ python3 -m pip install -r requirements.txt

Virtualenv

Create a new virtual environment (the first command installs virtualenv if it is not already present):

$ python3 -m pip install virtualenv
$ virtualenv venv
$ source venv/bin/activate
(venv) $ python3 -m pip install -r requirements.txt

Conda

Create a conda environment (assuming you have conda/miniconda installed):

$ conda create --name venv --file requirements.txt

How to use

Launch extraction

Place all the PDFs you need in a single directory, then call the tables_from_dir.py script:

$ python tables_from_dir.py --dir=path/to/your/dir --out=path/to/out/folder

This will create one CSV file per table.
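
To illustrate the "one CSV file per table" layout, here is a small sketch using only the standard library. The `write_tables` helper and the `<pdf name>_table_<n>.csv` naming scheme are assumptions made for the example; the actual file names produced by the script may differ.

```python
import csv
from pathlib import Path
from typing import List, Sequence


def write_tables(
    tables: Sequence[Sequence[Sequence[str]]],
    pdf_name: str,
    out_dir: str,
) -> List[Path]:
    """Write each table (a list of rows) to its own CSV file and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_name).stem
    written = []
    for index, rows in enumerate(tables, start=1):
        # Hypothetical naming scheme: <pdf stem>_table_<n>.csv
        target = out / f"{stem}_table_{index}.csv"
        with target.open("w", newline="", encoding="utf-8") as handle:
            csv.writer(handle).writerows(rows)
        written.append(target)
    return written
```

When the tables are pandas DataFrames, as returned by tabula-py, calling `DataFrame.to_csv` on each one would play the same role as the `csv.writer` call here.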

Logs are written to the results.log file. To follow the log live in the console, add the --log-console parameter:

$ python tables_from_dir.py --dir=path/to/your/dir --out=path/to/out/folder --log-console
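
The flags above could be wired up roughly as follows. This is a sketch with the standard `argparse` and `logging` modules, not the script's actual code: the function names, the logger name `pdf2info_sketch`, and the choice to keep writing results.log when the console log is enabled are all assumptions.

```python
import argparse
import logging


def build_parser() -> argparse.ArgumentParser:
    """Command-line options mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(description="Extract tables from PDFs")
    parser.add_argument("--dir", required=True, help="folder containing the PDFs")
    parser.add_argument("--out", required=True, help="folder for the CSV output")
    parser.add_argument("--log-console", action="store_true",
                        help="also print log messages to the console")
    return parser


def configure_logging(log_console: bool, log_file: str = "results.log") -> logging.Logger:
    """Always log to a file; add a console handler only when requested."""
    logger = logging.getLogger("pdf2info_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.addHandler(logging.FileHandler(log_file))
    if log_console:
        logger.addHandler(logging.StreamHandler())
    return logger
```

A typical entry point would call `args = build_parser().parse_args()` followed by `configure_logging(args.log_console)`.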

If you only need to extract from one PDF, use tables_from_file.py instead of tables_from_dir.py.
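
The difference between the two scripts comes down to which PDFs get processed. The sketch below shows one way to resolve either a folder or a single file into a list of PDFs. `collect_pdfs` is a hypothetical helper, and the non-recursive folder scan is an assumption about how a directory is traversed.

```python
from pathlib import Path
from typing import List


def collect_pdfs(path: str) -> List[Path]:
    """Return the PDFs to process for a path that is a single file or a folder."""
    target = Path(path)
    if target.is_file():
        # Single-file mode: accept the file only if it has a .pdf extension.
        return [target] if target.suffix.lower() == ".pdf" else []
    # Folder mode (non-recursive): PDFs directly inside the folder, sorted by name.
    # A path that does not exist raises FileNotFoundError here.
    return sorted(
        p for p in target.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```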

Recreate tab2know results

See the instructions in analysis/tab2know_tests.

Recreate pdf2info results

See the instructions in analysis/pdf2info_tests.

Comparison

camelot:

tabula:

linesearch:

tab2know:

fabian57fabian/pdf2info