Do you have difficulty finding relevant research papers? BrainQuery to the rescue! It employs AI Semantic Search to find hidden meaning behind searches and uses that to search through the abstracts of papers from the arXiv database - a large open-access archive of scientific material. (and yes, it has dark mode too!)
It even includes multi language support using OpenAI translation, which makes searches possible in any language.
- ReactJS
- Tailwind CSS
- Flask API
- OpenAI API (for Text Embeddings and Translation)
- Pinecone (Vector Database)
Example Query (Bengali): "মিন রিসোর্স কোয়ান্টাম কম্পিউটেশন বৃদ্ধি" (translation: min resource quantum computation increase)
The dataset consists of the arXiv Dataset (https://www.kaggle.com/datasets/Cornell-University/arxiv), of which, the first 15000 entries have been used for this project (due to Pinecone Restrictions). The embeddings for the abstract of each article were created (using OpenAI's ada-002 model), which has a dimensionality of 1536. These embeddings were then indexed in the Pincone Vector Database. For Queries, this project uses OpenAI's translation (text completion via text-davinci-003 model) and embeddings (ada-002 model) to first translate given query into English and then embed it. A query is then made to the Pinecone database, which computes the top 5 items with consine similarity and returns the results as per ranking (most similar first), which is then displayed using the React frontend.