@ngxson
Collaborator

@ngxson ngxson commented Oct 21, 2025

This is a very early WIP

Progress:

  • Only the language model is working now. The vision encoder is not yet implemented
  • Vision encoder is added, but not yet numerically correct
  • The model generates hallucinated text, likely because the projector is incorrect
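
One possible way to narrow down where the vision encoder or projector diverges is to dump the same intermediate activation from the reference implementation and from this port, then compare them stage by stage. The snippet below is only a minimal sketch of such a check, assuming both dumps have already been flattened into plain lists of floats. The name `compare_outputs` and the tolerance `atol` are hypothetical and not part of this PR.

```python
import math


def compare_outputs(reference, candidate, atol=1e-3):
    """Compare two flat lists of floats (e.g. flattened embeddings).

    Returns the largest element-wise absolute difference, the cosine
    similarity, and whether the difference is within `atol`.
    """
    if len(reference) != len(candidate):
        raise ValueError("length mismatch between reference and candidate")
    if not reference:
        raise ValueError("cannot compare empty outputs")

    # Largest element-wise deviation between the two dumps.
    max_abs_diff = max(abs(r - c) for r, c in zip(reference, candidate))

    # Cosine similarity: insensitive to a global scale factor, so a high
    # cosine with a large max_abs_diff hints at a scaling problem.
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm_ref = math.sqrt(sum(r * r for r in reference))
    norm_cand = math.sqrt(sum(c * c for c in candidate))
    if norm_ref == 0.0 or norm_cand == 0.0:
        cosine = float("nan")
    else:
        cosine = dot / (norm_ref * norm_cand)

    return {
        "max_abs_diff": max_abs_diff,
        "cosine": cosine,
        "ok": max_abs_diff <= atol,
    }
```

Running this after each stage (patch embedding, each encoder block, the projector) would show the first stage where `ok` turns false, which is usually a better starting point than inspecting the final generated text.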

@ngxson ngxson linked an issue Oct 21, 2025 that may be closed by this pull request
4 tasks
@github-actions github-actions bot added the python (python script changes) label Oct 21, 2025

Labels

examples, python (python script changes)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature Request: support PaddleOCR-VL

1 participant