Ansah Mohammad edited this page May 8, 2024 · 3 revisions

Welcome to the Phantom wiki!

Phantom Search Engine

Phantom Search Engine is a web search engine designed to return fast, relevant results. It is built with a focus on performance, scalability, and accuracy, so that it can handle a large amount of data and still respond quickly to user queries.

The Phantom Search Engine consists of several main components:

  1. Crawler System: The Crawler System is responsible for crawling the web and fetching the content of web pages. It includes a multithreaded crawler for concurrent crawling and a distributed crawler system for large-scale crawling.

  2. Phantom Indexer: The Phantom Indexer processes the fetched data to create an index for faster search and retrieval. It uses the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm to measure how important a term is to a document relative to the rest of the corpus.

  3. Phantom Query Engine: The Phantom Query Engine is a crucial component that takes a user's search query and returns the most relevant documents from the database. It uses the TF-IDF algorithm to rank the documents based on their relevance to the query.
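
As a rough illustration of how TF-IDF indexing and ranking fit together, here is a minimal sketch in Python. It is not Phantom's actual implementation: the function names (`tokenize`, `build_index`, `rank`), the whitespace tokenizer, and the plain `log(N / df)` weighting are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting; a real indexer would
    # likely also strip punctuation and stop words.
    return text.lower().split()


def build_index(docs):
    """Build per-document term counts and corpus-wide IDF weights.

    `docs` maps a URL to the page text (hypothetical input shape).
    """
    tf = {url: Counter(tokenize(text)) for url, text in docs.items()}
    doc_freq = Counter()
    for counts in tf.values():
        doc_freq.update(counts.keys())  # each document counts once per term
    n_docs = len(docs)
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}
    return tf, idf


def rank(query, tf, idf):
    """Return URLs ordered by the summed TF-IDF score of the query terms."""
    scores = {}
    for url, counts in tf.items():
        total = sum(counts.values())
        if total == 0:
            continue  # skip empty documents
        score = sum(
            (counts[term] / total) * idf.get(term, 0.0)
            for term in tokenize(query)
        )
        if score > 0:
            scores[url] = score
    return sorted(scores, key=scores.get, reverse=True)


docs = {
    "a": "cats chase mice",
    "b": "dogs chase cats and cats",
    "c": "birds sing",
}
tf, idf = build_index(docs)
results = rank("cats", tf, idf)
```

With this weighting, a term that appears in every document gets an IDF of zero and so contributes nothing to the ranking, which is the usual reason TF-IDF favours distinctive terms over common ones.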

Each of these components is designed to work together seamlessly to provide a comprehensive search engine solution. The following sections provide a detailed overview of each component and how they interact with each other.

This documentation is intended to provide a comprehensive understanding of the Phantom Search Engine's architecture, functionality, and usage. It is designed to be a valuable resource for developers, users, and anyone interested in understanding the inner workings of a web search engine.

Use of remote database

This application uses Supabase as its remote database. To use it, create a Supabase account and then create two tables:

  1. Table index with the following fields:
  • url (text)
  • content (json)
  • title (text)
  2. Table query with the following field:
  • query (text)

The query table stores the queries that users make so that they can be analysed.
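
The sketch below shows one way rows for these two tables could be built and written from Python. It assumes the `supabase` Python client, whose `client.table(...).insert(...).execute()` chain is used here. The helper names (`index_row`, `query_row`, `save`) are hypothetical, and so is the idea that `content` holds a term-count mapping, since the page only says the column is JSON.

```python
def index_row(url, title, content):
    """Build one row for the `index` table: url (text), content (json), title (text)."""
    return {"url": url, "content": content, "title": title}


def query_row(query):
    """Build one row for the `query` table: query (text)."""
    return {"query": query}


def save(client, table, row):
    # Assumption: `client` is a supabase-py client, for example one created
    # with supabase.create_client(SUPABASE_URL, SUPABASE_KEY) using the URL
    # and key of your own project.
    return client.table(table).insert(row).execute()


# Hypothetical example rows; the shape of `content` is an assumption.
page = index_row("https://example.com", "Example", {"example": 2, "domain": 1})
search = query_row("example domain")
```

Calling `save(client, "index", page)` and `save(client, "query", search)` would then write one row to each table, provided the tables were created with the columns listed above.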
