This is a simple text extraction application built with Python and Tkinter. It extracts text from PDF and image files and saves it to a text file.
Run the script by executing the following command: python text_extractor.py
To extract text from a PDF file:
- Select a PDF file with the "Browse PDF" button.
- The text will be extracted and saved to a text file with the same name as the PDF but with a .txt extension.
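
The PDF steps above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: it assumes a recent PyPDF2 release that provides the `PdfReader` class, and the function names `output_path_for`, `extract_pdf_text`, and `pdf_to_txt` are hypothetical.

```python
from pathlib import Path


def output_path_for(input_path):
    """Return the output path: same name as the input, with a .txt extension."""
    return Path(input_path).with_suffix(".txt")


def extract_pdf_text(pdf_path):
    """Concatenate the text of every page in the PDF."""
    # Imported here so output_path_for() works even without PyPDF2 installed.
    from PyPDF2 import PdfReader  # assumption: a PyPDF2 version with PdfReader

    reader = PdfReader(str(pdf_path))
    # extract_text() can return nothing for pages without a text layer,
    # so fall back to an empty string for those pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def pdf_to_txt(pdf_path):
    """Extract the PDF text and write it next to the PDF as a .txt file."""
    out_path = output_path_for(pdf_path)
    out_path.write_text(extract_pdf_text(pdf_path), encoding="utf-8")
    return out_path
```

For example, `pdf_to_txt("report.pdf")` would write `report.txt` in the same folder. Scanned PDFs that contain only images will likely produce an empty file with this approach, since no OCR is involved.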
To extract text from an image (JPG, JPEG, or PNG):
- Select an image file with the "Browse Image" button.
- The text will be extracted using pytesseract and saved to a text file with the same name as the image but with a .txt extension.
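
The image steps above might look something like the sketch below. It is an illustration under assumptions rather than the script's real implementation: the names `IMAGE_SUFFIXES`, `is_supported_image`, and `image_to_txt` are hypothetical, and pytesseract needs the Tesseract OCR engine installed on the system to run.

```python
from pathlib import Path

# File types listed in this README (compared case-insensitively).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def is_supported_image(path):
    """Return True if the file extension is JPG, JPEG, or PNG."""
    return Path(path).suffix.lower() in IMAGE_SUFFIXES


def image_to_txt(image_path):
    """Run OCR on the image and write the text next to it as a .txt file."""
    image_path = Path(image_path)
    if not is_supported_image(image_path):
        raise ValueError(f"Unsupported image type: {image_path.suffix}")

    # Imported here so the extension check works without these packages.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        text = pytesseract.image_to_string(img)

    out_path = image_path.with_suffix(".txt")
    out_path.write_text(text, encoding="utf-8")
    return out_path
```

Calling `image_to_txt("scan.png")` would write `scan.txt` beside the image. OCR quality depends heavily on the image resolution and contrast.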
The application displays the input file in red and the output text file in green in the right-side text box.
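
One way to get this colored output is with text tags, sketched below. This assumes the right-side text box is a `tkinter.Text` widget, which the README does not state; the function name `show_result` and the tag names are hypothetical.

```python
def show_result(text_box, input_path, output_path):
    """Append the input path in red and the output path in green."""
    # Tags attach display options to ranges of text in a Text widget.
    text_box.tag_configure("input", foreground="red")
    text_box.tag_configure("output", foreground="green")
    # "end" is the Text index for the end of the content (same as tkinter.END).
    text_box.insert("end", f"{input_path}\n", "input")
    text_box.insert("end", f"{output_path}\n", "output")
```

With a `tkinter.Text` widget named `box`, calling `show_result(box, "report.pdf", "report.txt")` would add two lines, the first in red and the second in green.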
You can customize the appearance and behavior of the application by modifying the Python script. For example, you can change the window size, button colors, fonts, and more.
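
As a rough illustration of such changes, the sketch below adjusts the window size and restyles one button. The function name `apply_custom_style` and the specific size, colors, and font are arbitrary examples, not values taken from the script.

```python
def apply_custom_style(root, button):
    """Resize the main window and restyle a button."""
    # geometry() takes "WIDTHxHEIGHT" in pixels.
    root.geometry("800x500")
    # bg/fg set the background and text colors; font is (family, size, style).
    button.configure(bg="#2d6cdf", fg="white", font=("Helvetica", 12, "bold"))
```

Here `root` would be the `tkinter.Tk` window and `button` any `tkinter.Button` in the script, for example the one labelled "Browse PDF".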
The application uses the following libraries:
- Tkinter for the graphical user interface.
- PyPDF2 for PDF file handling.
- pytesseract for image text extraction (requires the Tesseract OCR engine to be installed separately).
- PIL (Pillow) for image processing.