Created framework to convert an image file into word then to pdf file.

nitinkumar30/imageFileToPdf

JPG to PDF converter

A framework for converting an image file to a PDF file.
It uses just 4 Python packages.
For the latest code on converting multiple image files to PDF, check the multiple-files branch.

Working

  1. First, it searches for text inside the image file (most likely a .jpg or .png file).
  2. Then it extracts the text and stores it in a variable.
  3. Next, it writes the text to a .docx file (so that if the wrong text was extracted, you can type the correct text manually).
  4. Finally, it converts the resulting Word file into a PDF file.

Requirements

  1. Download and install Tesseract.
  2. Install the following Python packages:
    1. PIL (installed as Pillow)
    2. pytesseract
    3. docx (installed as python-docx)
    4. docx2pdf
  3. An IDE (PyCharm Community Edition preferable)
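A quick way to confirm the requirements are in place is a small check script like the one below. This is a sketch and not part of the repository: `missingPackages` and `tesseractOnPath` are hypothetical helper names, and the check assumes the `tesseract` executable is on your PATH (if it is not, set tesseractPath in the variables file instead).

```python
import importlib.util
import shutil

# Import name -> name to pass to "pip install" (the two differ for PIL and docx).
REQUIRED_PACKAGES = {
    "PIL": "Pillow",
    "pytesseract": "pytesseract",
    "docx": "python-docx",
    "docx2pdf": "docx2pdf",
}


def missingPackages(required=REQUIRED_PACKAGES):
    """Return the pip names of the packages that cannot be imported."""
    return [
        pipName
        for importName, pipName in required.items()
        if importlib.util.find_spec(importName) is None
    ]


def tesseractOnPath():
    """Return True if a 'tesseract' executable is found on PATH."""
    return shutil.which("tesseract") is not None


if __name__ == "__main__":
    missing = missingPackages()
    if missing:
        print("Install with: pip install " + " ".join(missing))
    if not tesseractOnPath():
        print("tesseract was not found on PATH")
```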

How to run

  1. Go to the main file.
  2. Check the following lines in the code:
text = extractTextFromImg(imgPath) # - extract text from image file
appendDataToWord(wordPath, text) # - append data into word file
convertDocxToPdf(wordPath, pdfPath) # - convert it into pdf
  3. You don't need to change anything in this file. The changes you need to make are in the variables file.
  4. Change the values of the following variables:
    1. tesseractPath # the path to the Tesseract you just downloaded. The one written in the file is the default path
    2. wordPath # the Word file path. Put the Word file under the wordFiles directory for better readability
    3. pdfPath # the PDF file path. Put the PDF file under the pdfFiles directory for better readability
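As an illustration, a variables file along these lines might look like the sketch below. Every value is an assumption: the Tesseract path is only the usual Windows install location, the file names and the imgFiles folder are made up, and `outputPathsFor` is a hypothetical helper that is not in the repository. `imgPath` is included because the main file passes it to `extractTextFromImg`.

```python
import os

# Assumed value: the usual Windows install location. Adjust for your machine.
tesseractPath = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Hypothetical file names; only wordFiles and pdfFiles come from the README.
imgPath = os.path.join("imgFiles", "sample.jpg")
wordPath = os.path.join("wordFiles", "sample.docx")
pdfPath = os.path.join("pdfFiles", "sample.pdf")

# pytesseract reads the executable location from this attribute, so the
# main file would typically set it before running OCR:
#     pytesseract.pytesseract.tesseract_cmd = tesseractPath


def outputPathsFor(imagePath):
    """Hypothetical helper: derive the Word and PDF paths from an image name."""
    stem = os.path.splitext(os.path.basename(imagePath))[0]
    return (
        os.path.join("wordFiles", stem + ".docx"),
        os.path.join("pdfFiles", stem + ".pdf"),
    )
```

Deriving both output paths from the image name keeps the three files easy to match up, but setting wordPath and pdfPath by hand as described above works just as well.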

Author

Nitin Kumar
