This is the MVP of Sherlock AI, the winning project in the AI Tinkerers Generative AI hackathon. It was developed by Reza Salehi, Ramsey Khadder, and Pratik Prakash.
Our team's objective was to simplify access to a company's collective knowledge. For the hackathon, we built a Discord bot that draws on a company's documentation and its support threads on Slack or Discord to give more accurate answers to questions. In our final presentation, we used LangChain's markdown documentation and LangChain's Discord support forum as the sources for answering technical questions.
Sherlock AI is built on four key components:
Data Chunking: The system gathers data from multiple knowledge sources and divides it into documents.
Document Retrieval: When a user submits a query, the reranker identifies the top-k documents most relevant to it.
Prompt Engineering: The retrieved documents are inserted as context into an engineered prompt, and a large language model (LLM) generates the answer from that prompt.
Discord Bot User Interface: Sherlock uses Discord as its user interface. Users send their queries to the bot and receive the responses generated by the LLM.
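To make the first three components concrete, here is a minimal sketch of chunking, top-k retrieval, and prompt assembly. It is not the project's actual code: the function names (chunk_text, overlap_score, top_k, build_prompt), the 50-word chunk size, and the prompt wording are all assumptions for illustration, and the word-overlap scorer is only a stand-in for a real reranker.

```python
import re
from typing import Callable, List


def chunk_text(text: str, max_words: int = 50) -> List[str]:
    """Split raw text into documents of at most max_words words (assumed chunk size)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def overlap_score(query: str, document: str) -> float:
    """Stand-in relevance score: fraction of query words that appear in the document."""
    query_words = set(re.findall(r"\w+", query.lower()))
    doc_words = set(re.findall(r"\w+", document.lower()))
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


def top_k(
    query: str,
    documents: List[str],
    k: int = 3,
    score: Callable[[str, str], float] = overlap_score,
) -> List[str]:
    """Return the k documents with the highest relevance score for the query."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context_docs: List[str]) -> str:
    """Place the retrieved documents into a prompt as numbered context (hypothetical wording)."""
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(context_docs))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


if __name__ == "__main__":
    docs = [
        "Agents use tools to take actions.",
        "Install the package with pip.",
        "Memory stores chat history.",
    ]
    question = "How do I install the package?"
    print(build_prompt(question, top_k(question, docs, k=1)))
```

The string returned by build_prompt is what would be sent to the LLM, and the Discord bot would relay the model's reply to the user. Passing the scorer as a parameter keeps the retrieval step swappable for a hosted reranker.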
During development, we found that Cohere Rerank was an accessible tool that let us perform document retrieval (the second component) effectively and efficiently. Implementing this part by storing document embeddings in a vector database would have taken us hours. Interestingly, we learned about Cohere Rerank on the day of the hackathon itself, in the keynote speech by Ivan Zhang, co-founder and CTO of Cohere.
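For readers who have not used a hosted reranker, the retrieval step can be sketched as a thin wrapper around a client's rerank call. This is an illustration, not the code we shipped: the wrapper name rerank_with_cohere is hypothetical, the default model name is an assumption that may need updating, and the client is passed in so the function can be tried with any object that exposes a compatible rerank method.

```python
from typing import Any, List, Tuple


def rerank_with_cohere(
    client: Any,
    query: str,
    documents: List[str],
    k: int = 3,
    model: str = "rerank-english-v2.0",  # assumed model name; check the current Cohere docs
) -> List[Tuple[str, float]]:
    """Return the top-k (document, relevance_score) pairs for the query.

    Assumes `client` behaves like a Cohere SDK client: client.rerank(...) returns
    an object whose `.results` items carry `.index` and `.relevance_score`.
    """
    response = client.rerank(model=model, query=query, documents=documents, top_n=k)
    return [(documents[result.index], result.relevance_score) for result in response.results]


# Example usage (requires the `cohere` package and an API key, so it is left commented out):
# import cohere
# co = cohere.Client("YOUR_API_KEY")
# for doc, score in rerank_with_cohere(co, "How do I install the package?", docs, k=3):
#     print(f"{score:.3f}  {doc}")
```

Because the reranker scores the query against the raw documents at request time, no embedding index or vector database has to be built first, which is what saved us the hours mentioned above.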
You can install the package from source via this command:
pip install -e .