Tutorial Code for the "Build Your Own Search Engine"

andluizsouza/search-engine

Build Your Own Search Engine

Code for the "Build Your Own Search Engine" tutorial.

What we will do:

  • Use FAQ documents from free online courses
  • Create a search engine for retrieving these documents
  • Later, use the search results as the retrieval step of a Q&A RAG system

Tutorial Outline

  1. Preparing the Environment
  2. Basics of Text Search
    • Basics of Information Retrieval
    • Introduction to vector spaces, bag of words, and TF-IDF
  3. Implementing Basic Text Search
    • TF-IDF scoring with sklearn
    • Keyword filtering using pandas
    • Creating a class for relevance search
  4. Embeddings and Vector Search
    • Vector embeddings
    • Word2Vec and other approaches for word embeddings
    • LSA (Latent Semantic Analysis) for document embeddings
    • Implementing vector search with LSA
    • BERT embeddings
  5. Combining Text and Vector Search
  6. Practical Implementation Aspects and Tools
    • Real-world implementation tools:
      • Inverted indexes for text search
      • LSH for vector search (using random projections)
    • Technologies:
      • Lucene/Elasticsearch for text search
      • FAISS and other vector databases
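
To make steps 2 and 3 of the outline concrete, here is a minimal sketch of TF-IDF relevance search. The tutorial itself uses sklearn for this; the version below is a from-scratch stand-in built only on the Python standard library, so it is an illustration of the idea rather than the tutorial's code. The sample documents and the names `tokenize`, `build_idf`, `vectorize`, `cosine`, and `search` are all hypothetical, and the smoothed IDF formula is just one common choice.

```python
import math
from collections import Counter

# Hypothetical FAQ snippets, invented for illustration only.
docs = [
    "How do I join the course after it has started?",
    "Where can I find the course videos?",
    "How do I install Docker on Windows?",
]


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return cleaned.split()


def build_idf(tokenized_docs):
    """Compute a smoothed inverse document frequency for every term."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(term for tokens in tokenized_docs for term in set(tokens))
    return {
        term: math.log((1 + n_docs) / (1 + count)) + 1
        for term, count in doc_freq.items()
    }


def vectorize(tokens, idf):
    """Build a sparse TF-IDF vector (term -> weight); unknown terms are dropped."""
    term_freq = Counter(t for t in tokens if t in idf)
    return {term: count * idf[term] for term, count in term_freq.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, documents, top_k=2):
    """Return up to top_k (score, document) pairs with a non-zero score."""
    tokenized = [tokenize(d) for d in documents]
    idf = build_idf(tokenized)
    doc_vectors = [vectorize(tokens, idf) for tokens in tokenized]
    query_vector = vectorize(tokenize(query), idf)
    scored = [(cosine(query_vector, v), d) for v, d in zip(doc_vectors, documents)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(score, d) for score, d in scored[:top_k] if score > 0.0]


if __name__ == "__main__":
    for score, doc in search("course videos", docs):
        print(f"{score:.3f}  {doc}")
```

Rebuilding the index on every call keeps the sketch short; a real implementation would compute the document vectors once and reuse them, which is what the relevance-search class in step 3 is for.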

References

  1. GitHub: Build Your Own Search Engine
  2. YouTube Tutorial: Implement a Search Engine
