Add support for PDF file uploads as context for LLM queries #3638

Open
wants to merge 22 commits into
base: main

andrewwan0131

Why are these changes needed?

These changes enable users to upload PDF files as context for LLM queries.

Changes made

  1. Added PDF file handling capabilities:

    • Implemented PDF file upload support in the web interface
    • Added PDF text extraction functionality
    • Integrated extracted PDF content as context for LLM queries
  2. Modified relevant files:

    • Updated gradio web server components to handle PDF uploads
    • Added PDF processing utilities
    • Enhanced chat protocol to include document context
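As a rough illustration of the last step in item 1 (using extracted PDF text as context for a query), here is a minimal sketch. It assumes the text has already been extracted page by page; `join_pages`, `build_prompt`, `CONTEXT_TEMPLATE`, and the `max_chars` budget are hypothetical names chosen for this example and are not taken from the PR's code.

```python
from typing import List

# Hypothetical template: the prompt wording actually used in this PR is not
# shown in the conversation, so this is only an illustrative layout.
CONTEXT_TEMPLATE = (
    "Use the following document as context.\n"
    "[DOCUMENT START]\n{document}\n[DOCUMENT END]\n\n"
    "Question: {question}"
)


def join_pages(pages: List[str]) -> str:
    """Join per-page text, dropping pages that are empty after stripping."""
    cleaned = [page.strip() for page in pages]
    return "\n\n".join(page for page in cleaned if page)


def build_prompt(pages: List[str], question: str, max_chars: int = 8000) -> str:
    """Build one prompt string with the document text placed before the question.

    max_chars is an assumed character budget, a crude stand-in for a
    model-specific token limit.
    """
    document = join_pages(pages)
    if len(document) > max_chars:
        # Keep the start of the document and mark the cut explicitly.
        document = document[:max_chars] + "\n[... truncated ...]"
    return CONTEXT_TEMPLATE.format(document=document, question=question.strip())
```

Truncating by characters keeps the sketch dependency-free; a real integration would more likely budget by tokens for the target model.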

Checks

  • I've tested the PDF upload and context integration with various document types by running Chatbot Arena locally

Member

@infwinston infwinston left a comment

Thanks @andrewwan0131, I left some comments!

fastchat/serve/gradio_block_arena_vision_anony.py: 5 review threads (outdated, resolved)
Member

@infwinston infwinston left a comment

Thanks, more comments.

fastchat/serve/gradio_block_arena_vision_anony.py: 9 review threads (outdated, resolved)
@CodingWithTim CodingWithTim self-requested a review December 26, 2024 23:16
@CodingWithTim CodingWithTim self-assigned this Dec 26, 2024
Collaborator

CodingWithTim commented Dec 30, 2024

@andrewwan0131 @PranavB-11 I resolved the old comments because they are no longer relevant. We can start commenting on the new code, as it is pretty different from before. The pdfchat is now operational; I will test it extensively and improve it next.

Next steps:

  1. Fix some existing UI issues that are bothering me at the moment.
  2. Integrate our language detection code into parse_pdf.
  3. Add a PDF moderator.

Member

@infwinston infwinston left a comment

Thanks @CodingWithTim! I left some quick comments.

fastchat/serve/gradio_web_server.py (outdated, resolved)
fastchat/serve/setup_pdfchat.sh (outdated, resolved)
fastchat/serve/gradio_block_arena_vision.py (outdated, resolved)
4 participants