Add PDF Reading Capability #92

gromdimon · 2025-01-19T15:33:03Z

Problem Statement

Is your feature request related to a problem? Please describe.

Currently, the Nevron framework lacks the ability to extract and analyze content from PDFs. This limitation prevents the agent from processing important documents, research papers, and other PDF-based sources, which are common in many workflows.

Describe the solution you'd like

Add functionality to the framework to read and extract content from PDF files. The feature should enable the agent to process PDF documents, analyze the text, and integrate the extracted data into workflows such as memory storage, action planning, or contextual analysis.

Proposed Solution

Proposed Implementation Steps:

Add a PDF Processing Utility:
- Use a Python library like PyPDF2, pdfplumber, or PyMuPDF for PDF text extraction.
- Create a new Execution tool:
  - Extract text from single-page and multi-page PDFs.
  - Handle PDFs with complex layouts (e.g., multi-column, images).
  - Handle encrypted PDFs by rejecting them with a clear message that the PDF must be unencrypted.
Configuration Options:
- Allow users to configure PDF processing settings in settings.py, such as:
  - Maximum file size for PDFs.
  - Page range selection.
  - Enable/disable image-based OCR for non-text PDFs.
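
The options above could be grouped roughly as follows. This is only a sketch: it assumes a plain dataclass rather than whatever mechanism settings.py actually uses, and every name in it (`PDFSettings`, `max_file_size_mb`, `page_range`, `enable_ocr`, `page_indices`) is hypothetical, as are the default values.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PDFSettings:
    """Hypothetical PDF options; names and defaults are illustrative only."""

    max_file_size_mb: int = 20                    # reject files larger than this
    page_range: Optional[Tuple[int, int]] = None  # 1-based inclusive (first, last); None = all pages
    enable_ocr: bool = False                      # OCR fallback for image-only PDFs

    def __post_init__(self) -> None:
        # Fail fast on nonsensical configuration values.
        if self.max_file_size_mb <= 0:
            raise ValueError("max_file_size_mb must be positive")
        if self.page_range is not None:
            first, last = self.page_range
            if first < 1 or last < first:
                raise ValueError("page_range must satisfy 1 <= first <= last")

    def page_indices(self, total_pages: int) -> range:
        """Return the 0-based page indices to process for a document of total_pages pages."""
        if self.page_range is None:
            return range(total_pages)
        first, last = self.page_range
        # Clamp the upper bound so a range past the end of the document is not an error.
        return range(first - 1, min(last, total_pages))
```

With this shape, `PDFSettings(page_range=(2, 4)).page_indices(10)` covers the second through fourth pages, and a range that runs past the end of a shorter document is clamped rather than rejected.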
Error Handling:
- Gracefully handle errors like:
  - Corrupted or unsupported PDF files.
  - Failed text extraction due to complex layouts or encryption.
- Log detailed error messages for debugging.
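
One way to approach this is to run cheap checks before the file ever reaches a PDF library and to raise distinct exception types so callers can tell the failure modes apart. The sketch below uses only the standard library; the exception classes and `validate_pdf` are hypothetical names, and the strict header check is an assumption (some readers tolerate leading bytes before the `%PDF-` marker).

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


class PDFError(Exception):
    """Base class for the hypothetical PDF tool errors."""


class PDFTooLargeError(PDFError):
    """The file exceeds the configured size limit."""


class PDFInvalidError(PDFError):
    """The file is missing or does not look like a PDF."""


def validate_pdf(path: str, max_bytes: int) -> Path:
    """Run cheap sanity checks before handing the file to a PDF library."""
    file_path = Path(path)
    if not file_path.is_file():
        logger.error("PDF not found: %s", file_path)
        raise PDFInvalidError(f"File not found: {file_path}")

    size = file_path.stat().st_size
    if size > max_bytes:
        logger.error("PDF too large: %s (%d bytes > %d)", file_path, size, max_bytes)
        raise PDFTooLargeError(f"{file_path} is {size} bytes; limit is {max_bytes}")

    # Strict check: require the "%PDF-" marker at the very start of the file.
    with file_path.open("rb") as fh:
        header = fh.read(5)
    if header != b"%PDF-":
        logger.error("Not a PDF (bad header %r): %s", header, file_path)
        raise PDFInvalidError(f"{file_path} does not start with a PDF header")

    return file_path
```

Failures inside the extraction library itself (corrupt streams, encryption) would still need to be caught around the extraction call; this helper only filters out the obvious cases early.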
Unit Tests:
- Write unit tests to validate PDF extraction functionality using sample PDFs:
  - Text-only PDFs.
  - PDFs with images and text.
  - Encrypted PDFs.
  - PDFs with complex layouts.
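
The real tests would run against sample PDF fixtures, which are not part of this issue. As a partial illustration, the page-joining logic can be tested without any PDF files by using stand-in page objects. In this sketch `FakePage` and `join_page_text` are hypothetical helpers, and the only assumption about the PDF library is that a page exposes `extract_text()`, which may return no text for image-only pages.

```python
import unittest
from typing import Iterable, Optional


class FakePage:
    """Stand-in for a PDF page object exposing extract_text()."""

    def __init__(self, text: Optional[str]) -> None:
        self._text = text

    def extract_text(self) -> Optional[str]:
        return self._text


def join_page_text(pages: Iterable) -> str:
    """Hypothetical helper: concatenate page text, skipping pages without any."""
    parts = []
    for page in pages:
        text = page.extract_text()
        if text:  # skips both None and empty strings
            parts.append(text)
    return "\n".join(parts)


class JoinPageTextTests(unittest.TestCase):
    def test_text_only_pages(self) -> None:
        pages = [FakePage("first"), FakePage("second")]
        self.assertEqual(join_page_text(pages), "first\nsecond")

    def test_image_only_page_is_skipped(self) -> None:
        pages = [FakePage("intro"), FakePage(None), FakePage("outro")]
        self.assertEqual(join_page_text(pages), "intro\noutro")

    def test_empty_document(self) -> None:
        self.assertEqual(join_page_text([]), "")
```

These can be run with `python -m unittest`. The encrypted and complex-layout cases listed above still need real sample files, since their behavior depends on the chosen extraction library.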
Security:
- We need to scan the file for malware before processing it.

Additional Context

Suggested utility function for extracting text:

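import pdfplumber


def extract_text_from_pdf(file_path: str) -> str:
    """
    Extract text from a PDF file.

    Args:
        file_path (str): Path to the PDF file.

    Returns:
        str: Extracted text, with pages separated by newlines.

    Raises:
        RuntimeError: If the file cannot be opened or parsed.
    """
    try:
        with pdfplumber.open(file_path) as pdf:
            # extract_text() may yield None or "" for pages without a text
            # layer (e.g. scanned images), so fall back to an empty string.
            return "\n".join(page.extract_text() or "" for page in pdf.pages)
    except Exception as e:
        # Chain the original exception so the root cause stays in the traceback.
        raise RuntimeError(f"Failed to extract text from PDF: {e}") from e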

Example use case:
- A user uploads a research paper PDF. The framework extracts the content and uses it to update the agent's memory or plan actions based on the insights.


gromdimon added the "feature" label (New feature or request) Jan 19, 2025

gromdimon added this to the v0.2.0 milestone Jan 19, 2025

gromdimon assigned anderlean Jan 19, 2025

gromdimon changed the milestone from v0.2.0 to v0.3.0 Jan 24, 2025

gromdimon changed the milestone from v0.3.0 to v0.2.2 Feb 11, 2025
