Welcome to TextTableScoop 🌟, a versatile tool for extracting text from files and CSV tables, with a particular focus on Office files such as Excel and PowerPoint. This project is part of the 'ProjectText' suite, which also includes ProjectTextAgent and ProjectDataBaseQnA.
- Specializes in extracting text from various file formats, including Office files.
- Designed to work on both Windows (via COM) and Linux (via LibreOffice + PyUNO).
- The current implementation supports Linux with LibreOffice + PyUNO.
- Windows support through a COM environment is planned for robust file handling.
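Since the current implementation depends on LibreOffice and PyUNO, it can help to check that both are visible before running the tool. The snippet below is a minimal sketch and not part of TextTableScoop: the helper name check_environment is hypothetical, and it assumes LibreOffice exposes a soffice or libreoffice executable on PATH and that PyUNO is importable as the uno module.

```python
import importlib.util
import platform
import shutil


def check_environment():
    """Report whether the prerequisites listed above appear to be available.

    Hypothetical helper, not part of TextTableScoop itself.
    """
    # Assumption: LibreOffice is launched through `soffice` or `libreoffice`.
    office_found = any(
        shutil.which(name) is not None for name in ("soffice", "libreoffice")
    )
    # Assumption: PyUNO is exposed to this interpreter as the `uno` module.
    pyuno_found = importlib.util.find_spec("uno") is not None
    return {
        "platform": platform.system(),
        "libreoffice_on_path": office_found,
        "pyuno_importable": pyuno_found,
    }


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

On many Linux distributions PyUNO is installed for the system Python only, so a False result inside a virtual environment does not necessarily mean LibreOffice is missing.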
To install TextTableScoop, use the following pip command:
pip3 install git+https://github.com/Flagro/TextTableScoop.git
Run texttablescoop from the bin folder with these arguments:
- path: Path to the file or directory to process.
- -t or --temp: (Optional) Path to a custom temporary folder.
- -p or --project: (Optional) Path to the project folder the file belongs to.
- --ignore: (Optional) Comma-separated list of patterns to ignore.
texttablescoop 'path/to/file' --temp 'path/to/temp' --project 'path/to/project' --ignore 'pattern1,pattern2'
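To call the tool from a Python script, the command line above can be assembled programmatically. This is a minimal sketch that uses only the arguments documented here; the helpers build_command and run_texttablescoop are hypothetical, and the sketch assumes the texttablescoop executable is on PATH.

```python
import subprocess


def build_command(path, temp=None, project=None, ignore=None):
    """Build the argument list for the texttablescoop CLI.

    Hypothetical helper; it only uses the flags documented above.
    """
    cmd = ["texttablescoop", path]
    if temp:
        cmd += ["--temp", temp]
    if project:
        cmd += ["--project", project]
    if ignore:
        # --ignore takes a single comma-separated string of patterns.
        cmd += ["--ignore", ",".join(ignore)]
    return cmd


def run_texttablescoop(path, **options):
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    # Assumption: `texttablescoop` is installed and available on PATH.
    return subprocess.run(build_command(path, **options), check=True)


if __name__ == "__main__":
    print(build_command(
        "path/to/file",
        temp="path/to/temp",
        project="path/to/project",
        ignore=["pattern1", "pattern2"],
    ))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.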
The project is open for collaboration; check the issues page for ongoing discussions.
Here's how you can contribute:
- Fork the Project.
- Create your Feature Branch (git checkout -b feature/AmazingFeature).
- Commit your Changes (git commit -m 'Add some AmazingFeature').
- Push to the Branch (git push origin feature/AmazingFeature).
- Open a Pull Request.