
Government Document Chatbot using langchain #3

Open: wants to merge 13 commits into base: main


@AleenDhar AleenDhar commented Apr 14, 2024

Hi @amit-s19, I am a 2nd-year BTech student, and here is my solution to the problem:

I am using Langchain and Gemini-pro to achieve the following results:

Result

(Screenshot: chrome_oAcmNTpTRn)

The steps are as follows:

  1. Read the PDF document with PyPDF2.
  2. Split the extracted text into chunks with langchain.text_splitter.
  3. Create text embeddings with GoogleGenerativeAIEmbeddings.
  4. Store the embeddings in FAISS as the vector store; this can easily be replaced with ChromaDB or Pinecone.
  5. Use a basic prompt template and a question-answering chain to query the PDF data.
  6. Build a simple user interface with Streamlit.
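
To make the data flow in steps 2 to 5 easier to follow, here is a minimal, dependency-free sketch of the same idea: split text into overlapping chunks, index them, retrieve the chunks closest to a question, and fill a prompt template. It is not the code in this PR and it does not call LangChain, FAISS, or Gemini. The names `split_text`, `embed`, `ToyVectorStore`, and `build_prompt` are hypothetical, and the word-count "embedding" is only a stand-in for GoogleGenerativeAIEmbeddings.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character chunks that overlap by `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding (assumption): a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector store such as FAISS."""

    def __init__(self, chunks: list[str]):
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def similarity_search(self, query: str, k: int = 2) -> list[str]:
        query_vec = embed(query)
        ranked = sorted(
            self.entries, key=lambda entry: cosine(query_vec, entry[1]), reverse=True
        )
        return [chunk for chunk, _ in ranked[:k]]


PROMPT_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the answer is not in the context, say so.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def build_prompt(store: ToyVectorStore, question: str, k: int = 2) -> str:
    """Retrieve the k most similar chunks and fill the prompt template."""
    context = "\n---\n".join(store.similarity_search(question, k=k))
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

In the real pipeline, the string returned by `build_prompt` is what would be sent to the chat model. Swapping FAISS for ChromaDB or Pinecone only changes what sits behind `similarity_search`.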

I chose gemini-pro because it supports multiple languages.
Later, we can replace it with our own model fine-tuned on Hindi or any other language.
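
One possible way to keep that later model swap cheap is to have the rest of the app depend on a small interface instead of on Gemini directly. The sketch below is only an illustration under that assumption. `ChatModel`, `EchoModel`, `CallableModel`, and `answer` are hypothetical names and are not part of this PR or of LangChain.

```python
from typing import Callable, Protocol


class ChatModel(Protocol):
    """Anything that turns a prompt string into an answer string."""

    def generate(self, prompt: str) -> str: ...


class EchoModel:
    """Placeholder backend for local testing; echoes the prompt's last line."""

    def generate(self, prompt: str) -> str:
        return "[echo] " + prompt.strip().splitlines()[-1]


class CallableModel:
    """Adapter that wraps any prompt -> text function, e.g. a hosted or fine-tuned model."""

    def __init__(self, fn: Callable[[str], str]):
        self._fn = fn

    def generate(self, prompt: str) -> str:
        return self._fn(prompt)


def answer(model: ChatModel, context: str, question: str) -> str:
    """Build the prompt once; the backend behind `model` can change freely."""
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return model.generate(prompt)
```

With this shape, moving from gemini-pro to a Hindi fine-tuned model would mean passing a different object to `answer`, with no change to the retrieval or UI code.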

We can also use gemini-pro-vision to read the images inside the PDF and convert them into text, which can then be used for question answering.

@AleenDhar AleenDhar changed the title from "[DMP 2024]: Government Document Chatbot: Streamlining Access and Assistance" to "Government Document Chatbot using langchain" on Apr 14, 2024
1 participant