return of page.extract_table() with vertical_lines information #573
Replies: 1 comment
Hi, I could not find how to get the number of pages, so I used another library (fitz) together with an answer from jsvine to an earlier request for help. Here is the code that solves my problem.
"
def get_vertical_lines(page):
    ...
def get_horizontal_lines(page):
    ...
ifile="/Users/yves/Documents/Facture F21083132 HOTEL COSTES 2.pdf"
doc = fitz.Document(ifile)
for i in range(0,n):
    ...
"
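For readers who want to stay within pdfplumber, here is a minimal sketch of the same idea. It is not the code from the reply above, whose function bodies are not shown. It assumes that the column boundaries are drawn as vertical line objects on each page, and the helper names `vertical_line_positions` and `extract_all_tables` as well as the `tolerance` parameter are hypothetical. The page count is available as `len(pdf.pages)`, so a second library should not be needed for that part.

```python
def vertical_line_positions(lines, tolerance=0.5):
    """Return the sorted x positions of the vertical lines in `lines`.

    `lines` is a list of dicts with "x0" and "x1" keys, which is the
    shape pdfplumber exposes as `page.lines`. A line counts as vertical
    when x0 and x1 differ by at most `tolerance` (an assumed threshold).
    """
    xs = sorted(
        float(line["x0"])
        for line in lines
        if abs(float(line["x1"]) - float(line["x0"])) <= tolerance
    )
    merged = []
    for x in xs:
        # Collapse near-duplicate positions into one boundary.
        if not merged or x - merged[-1] > tolerance:
            merged.append(x)
    return merged


def extract_all_tables(path, table_settings):
    """Hypothetical helper: extract one table per page of the PDF at `path`."""
    import pdfplumber  # imported here so the helper above stays dependency-free

    tables = []
    with pdfplumber.open(path) as pdf:
        # len(pdf.pages) is the number of pages in the document.
        for page in pdf.pages:
            xs = vertical_line_positions(page.lines)
            if len(xs) < 2:
                # The explicit strategy needs at least two boundaries.
                tables.append(None)
                continue
            settings = dict(
                table_settings,
                vertical_strategy="explicit",
                explicit_vertical_lines=xs,
            )
            tables.append(page.extract_table(settings))
    return tables
```

If the invoice draws its columns with rectangles rather than lines, `page.lines` may be empty and the boundaries would have to come from another source, such as the hard-coded list in the question.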
Hi,
pdfplumber version : 0.6.0
python version : 3.9.7
My code:
"
import pdfplumber
from decimal import Decimal
path = '/Users/yves/Documents'
ifile = path + '/Facture F21083132 HOTEL COSTES 2.pdf'  # attached file
# x positions of the column boundaries
vertical_lines = [Decimal('89.64'), Decimal('375.64'), Decimal('409.64'), Decimal('451.64'), Decimal('502.64'), Decimal('564.64')]
table_settings = {
    "horizontal_strategy": "lines",
    "vertical_strategy": "explicit",
    "explicit_vertical_lines": vertical_lines,
    "keep_blank_chars": True,
    "intersection_x_tolerance": 5,
}
pdf = pdfplumber.open(ifile)
page = pdf.pages[0]
df = page.extract_table(table_settings)  # a list of rows, not a DataFrame
"
The result is difficult to use with pandas:
[['Désignation', 'Qté', 'PU', 'Tva', 'Hors Taxe'], ['du 01/08/2021', '', '0,00 €', '41,69 €', ''], [None, None, None, None, '792,15 €'], ['', '', '', '', ''], ["PETIT PAIN COSTES (FARINE ISSUE DE L'AGRICULTURE BIOLOGIQUE\nPETIT PAIN NOIR (farines issues de l'agriculture biologique)\nFICELLE (farine issue de l'agriculture biologique)\nMERINGUE\nCHEESECAKE JLC\nAPPAREIL MOELLEUX\nAPPAREIL TUILE\nMOUSSELINE BARQUETTE DE 450G\nFOND DE PATE SUCRÉ CHOCOLAT ROND COSTES\nFEUILLETÉ CARAMÉLISÉ 4X12\nPAIN DE MIE \nPAIN DE MIE COMPLET\nDEMIE BAGUETTE TRADITION PRECUITE (farine issue de l'agriculture \nbiologique)", '600,00\n700,00\n20,00\n30,00\n30,00\n4,00\n2,00\n3,00\n20,00\n75,00\n1,00\n1,00\n50,00', '0,34 €\n0,40 €\n0,67 €\n1,92 €\n3,41 €\n13,65 €\n10,57 €\n6,75 €\n0,95 €\n0,85 €\n7,08 €\n9,08 €\n0,57 €\n0,00 €', '3\n3\n3\n3\n3\n3\n3\n3\n3\n3\n3\n3\n3\n0,00 €\n29,27 €', '204,00 €\n280,00 €\n13,40 €\n57,60 €\n102,30 €\n54,60 €\n21,14 €\n20,25 €\n19,00 €\n63,75 €\n7,08 €\n9,08 €\n28,45 €'], ['Sous-total', None, None, None, '880,65 €'], ['du 02/08/2021', None, None, None, '880,65 €'], [None, None, None, None, ''], [None, None, None, None, '880,65 €'], ['', '', '', '', '']]
How should I proceed?
Thank you for your feedback.
Greetings
Facture F21083132 HOTEL COSTES 2.pdf
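One possible way to make such a result easier to load into pandas is to split every cell on its newlines and emit one output row per line. The sketch below assumes that the n-th line of each cell in a row belongs to the same invoice item. The helper name `expand_rows` is hypothetical and is not part of pdfplumber.

```python
from itertools import zip_longest


def expand_rows(table):
    """Split cells that contain newlines into one output row per line.

    `table` is the list of rows returned by page.extract_table().
    None cells (merged or missing) are treated as empty strings.
    """
    rows = []
    for row in table:
        columns = [(cell or "").split("\n") for cell in row]
        # Pad shorter columns with "" so every output row has the same width.
        for parts in zip_longest(*columns, fillvalue=""):
            rows.append(list(parts))
    return rows
```

The expanded rows can then be passed to pandas, for example `pandas.DataFrame(rows[1:], columns=rows[0])`. Note that in the output above the line-by-line assumption does not fully hold: the last description wraps onto a second line, and the 'PU' and 'Tva' cells contain extra values, so some rows would still need manual realignment or tighter horizontal boundaries.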