abdullah-k18/Microsoft-RAG-Hack

Query PDF (Enhancing Accessibility For All Users)

Solution Overview:

Query PDF is a voice-powered AI RAG (Retrieval-Augmented Generation) application designed to simplify working with PDFs. Users can upload documents and interact via voice commands, receiving accurate summaries and real-time responses.

Process Flow:

diagram

Why We Built This Solution:

We built this solution to address common challenges people face with large, complex documents. Traditional search tools can be limiting and are often inaccessible for individuals with disabilities. By integrating RAG and voice technology, we aimed to create an app that lets users interact with documents naturally, through conversation.

Target Users:

  • Individuals with Visual Impairments or Learning Disabilities: Benefit from having documents read aloud and interacting using voice commands, promoting accessibility.
  • Business Professionals: Work with lengthy contracts, proposals, or reports and require a fast, accessible way to review documents.
  • Multitaskers: Engage with documents hands-free, listening to summaries or searching documents while focusing on other tasks.
  • Students and Researchers: Need to extract and interact with large volumes of information from academic PDFs, reports, or textbooks quickly.

How RAG Helped:

RAG helps the app give accurate, relevant answers: it retrieves the passages of the uploaded PDF that relate to a request and generates real-time voice summaries and responses from them. Grounding answers in the document reduces errors, making the app a trustworthy tool for users who need precise document-based information.
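The retrieval step can be illustrated with a minimal sketch. It assumes the PDF text has already been split into passages, and it swaps in a plain bag-of-words cosine similarity from the standard library in place of the embedding model and vector store (Hugging Face and Pinecone appear in the technology list, but this README does not show how they are wired up). The function names `tokenize`, `retrieve`, and `build_prompt` are hypothetical and are not taken from the app's code.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, top_k=2):
    """Return up to top_k chunks that share vocabulary with the question."""
    q_vec = Counter(tokenize(question))
    scored = [(cosine_similarity(q_vec, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks with no overlap so unrelated text never reaches the prompt.
    return [chunk for score, chunk in scored[:top_k] if score > 0]


def build_prompt(question, context_chunks):
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )
```

The prompt returned by `build_prompt` is what would be sent to the language model. Because the model only sees passages retrieved from the PDF, its answer stays tied to the document rather than to general knowledge.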


Innovation:

The app combines voice interaction with RAG technology to offer an easy, hands-free way to explore PDFs. It's particularly helpful for users who may find traditional document navigation challenging, such as those with visual impairments or those who prefer voice over reading.


Impact:

The app is set to transform how people engage with digital documents. By providing voice-driven summaries and search, it lets students, professionals, and individuals with accessibility needs access key information without manually scrolling through long PDFs.


Usability:

The app is designed to be simple and accessible. Users upload a PDF, use voice commands to interact with content, and receive voice-based responses. It's intuitive and user-friendly, with no technical skills required.
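The voice round trip described above (spoken question in, spoken answer out) can be sketched as three pluggable stages. This is an illustration under stated assumptions, not the app's actual code: `handle_voice_turn`, `VoiceTurn`, and the three callables are hypothetical names, and the real speech and question-answering services are not specified in this README.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """One round trip: what was heard, what was answered, and the audio reply."""
    transcript: str
    answer: str
    audio: bytes


def handle_voice_turn(
    audio_in: bytes,
    speech_to_text: Callable[[bytes], str],   # assumed: any speech recognizer
    answer_question: Callable[[str], str],    # assumed: the RAG question-answering step
    text_to_speech: Callable[[str], bytes],   # assumed: any speech synthesizer
) -> VoiceTurn:
    """Transcribe the audio, answer the question, and synthesize the reply."""
    transcript = speech_to_text(audio_in).strip()
    if not transcript:
        # Nothing intelligible was heard, so skip the question-answering step.
        answer = "Sorry, I did not catch that. Please repeat your question."
    else:
        answer = answer_question(transcript)
    return VoiceTurn(transcript, answer, text_to_speech(answer))
```

Keeping the stages as plain callables means the recognizer or synthesizer can be replaced, for example to suit a particular accent or assistive device, without touching the rest of the flow.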


Technology & Languages

  • JavaScript
  • Java
  • .NET
  • Python
  • AI Studio
  • AI Search
  • PostgreSQL
  • Cosmos DB
  • Azure SQL

Other Technologies Used

  • Next.js
  • React
  • JavaScript
  • Hugging Face
  • Pinecone
  • OpenAI API Key

YouTube Presentation Link:

Live Website Demo:

App Overview

1. Landing Page

The app begins with a Landing Page that welcomes users. To start using the app, click the "Start to PDF Now" button, which navigates you to the page where you can upload a PDF document.

home1

2. Homepage Features

On the homepage, users can explore the app's three main features by clicking the "Features" tab:

  • PDF Summary: Automatically generates a summary of the uploaded PDF.
  • Ask Questions: Allows users to ask specific questions about the PDF content.
  • Voice Chat: Engage in a voice-based conversation to send messages and interact with the PDF content.

home2

3. Meet the Team

By clicking the "Meet the Team" section on the homepage, users can view the GitHub repositories of the contributors involved in building the app.

home3

4. Chatbot Interaction Sample

Here's an example of user interaction:

  • After clicking "Start PDF Chat Now", the user uploads a PDF file.
  • The app generates a summary of the uploaded document (e.g., a hackathon PDF).
  • The user can then prompt the chatbot (e.g., "When is submission due?"), and the bot will scan the document to respond accordingly.
Pasted Graphic 1

By integrating RAG, this app ensures high-quality, context-aware interactions with PDF documents, enhancing the overall user experience.
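For the bot to scan a document efficiently, the text extracted from the PDF is typically split into smaller passages first. This README does not say how the app does this, so the following is only a sketch of one common approach, overlapping word windows. The function name `chunk_text` and the default sizes are assumptions chosen for illustration.

```python
def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into windows of chunk_size words that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reaches the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary, such as a deadline and its date, intact in at least one passage, which helps a question like "When is submission due?" find the right text.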

5. Additional Sample

home4

Meet The Team

Team

Further Research

To enhance the app's effectiveness and inclusivity, additional research and development can focus on the following areas:

1. Voice Interaction for Differently Abled Individuals

Conduct studies to assess and refine the voice interaction feature for users with various disabilities, including:

  • Speech Impairments: Tailor voice recognition and response features to better accommodate users with speech disabilities.
  • Hearing Impairments: Ensure that voice commands and responses are accessible and clear, possibly integrating text-to-speech and speech-to-text functionalities.

2. Usability Studies with Impaired Groups

Perform detailed usability studies to evaluate how individuals with cognitive, visual, or physical impairments interact with the app. This can include:

  • Cognitive Impairments: Simplify interactions and improve the clarity of instructions and feedback.
  • Visual Impairments: Enhance compatibility with screen readers and ensure that visual elements are accessible.

3. Language Processing and Adaptation

Improve natural language processing (NLP) capabilities to handle diverse speech patterns, accents, and speeds. Research could focus on:

  • Accent and Dialect Recognition: Adapt the app to accurately understand and respond to various accents and dialects.
  • Contextual Understanding: Enhance the app's ability to comprehend and generate relevant responses based on contextual nuances in user queries.

By addressing these research areas, the app can become more inclusive, user-friendly, and effective for a broader range of users.