Home
Welcome to the Phantom wiki!
Phantom Search Engine is a web search engine designed to return fast, relevant results. It is built with a focus on performance, scalability, and accuracy, so that it can handle large amounts of data while still responding quickly to user queries.
The Phantom Search Engine consists of several main components:
- Crawler System: The Crawler System is responsible for crawling the web and fetching the content of web pages. It includes a multithreaded crawler for concurrent crawling and a distributed crawler system for large-scale crawling.
- Phantom Indexer: The Phantom Indexer processes the fetched data to create an index for faster search and retrieval. It uses the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm to measure how important a term is to a document within the corpus.
- Phantom Query Engine: The Phantom Query Engine takes a user's search query and returns the most relevant documents from the database. It uses the TF-IDF algorithm to rank the documents by their relevance to the query.
Each of these components is designed to work together seamlessly to provide a comprehensive search engine solution. The following sections provide a detailed overview of each component and how they interact with each other.
This documentation is intended to provide a comprehensive understanding of the Phantom Search Engine's architecture, functionality, and usage. It is designed to be a valuable resource for developers, users, and anyone interested in understanding the inner workings of a web search engine.
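As a rough illustration of the TF-IDF idea that the indexer and query engine are described as using, the sketch below builds per-document term weights and ranks documents for a query. It is not Phantom's actual implementation: the function names (`tokenize`, `build_index`, `rank`), the whitespace tokenizer, and the plain `log(N / df)` form of IDF are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real indexer would likely
    # also strip punctuation and stop words.
    return text.lower().split()


def build_index(docs):
    """Compute term frequencies per document and IDF per term.

    docs maps a document id (for example a URL) to its text.
    Returns (tf, idf) where tf[doc_id][term] is the relative frequency of
    the term in that document and idf[term] = log(N / document_frequency).
    """
    tf = {}
    df = Counter()
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty documents
        tf[doc_id] = {term: c / total for term, c in counts.items()}
        df.update(counts.keys())  # each document counts once per term
    n_docs = len(docs)
    idf = {term: math.log(n_docs / d) for term, d in df.items()}
    return tf, idf


def rank(query, tf, idf):
    """Return document ids sorted by summed TF-IDF score, best first.

    Documents with a score of zero are omitted.
    """
    terms = tokenize(query)
    scores = {}
    for doc_id, weights in tf.items():
        score = sum(weights.get(t, 0.0) * idf.get(t, 0.0) for t in terms)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    pages = {
        "a": "python web crawler",
        "b": "python search engine",
        "c": "cooking recipes",
    }
    tf, idf = build_index(pages)
    print(rank("python crawler", tf, idf))  # ['a', 'b']
```

Note that a term appearing in every document gets an IDF of zero under this formula, so it contributes nothing to the ranking; smoothed variants of IDF avoid that at the cost of a slightly more involved formula.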
This application uses Supabase as its database. To use it, create a Supabase account and then create two tables:
- Table index with the following fields:
  - url (text)
  - content (json)
  - title (text)
- Table query with the following field:
  - query (text)
The query table stores the queries that users make so that they can be analysed later.
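To make the table layout concrete, here is a minimal sketch of how rows for the two tables might be built and written. The helper names (`make_index_row`, `make_query_row`, `save_row`) are hypothetical and not part of Phantom. `save_row` assumes that `client` is a client object from the `supabase` Python package, whose query builder exposes `table(...).insert(...).execute()`; the project URL and key are placeholders you would supply yourself.

```python
import json


def make_index_row(url, title, content):
    """Build a row for the `index` table: url (text), content (json), title (text)."""
    # Fail early if content cannot be stored in a json column.
    json.dumps(content)
    return {"url": url, "title": title, "content": content}


def make_query_row(query):
    """Build a row for the `query` table: query (text)."""
    return {"query": query}


def save_row(client, table, row):
    """Insert one row into the named table.

    Assumption: `client` behaves like a supabase-py client, for example one
    created with `supabase.create_client(project_url, api_key)`.
    """
    return client.table(table).insert(row).execute()


# Hypothetical usage, assuming the `supabase` package is installed and
# SUPABASE_URL / SUPABASE_KEY hold your own project's values:
#
#   from supabase import create_client
#   client = create_client(SUPABASE_URL, SUPABASE_KEY)
#   save_row(client, "index", make_index_row(
#       "https://example.com", "Example", {"words": ["example", "domain"]}))
#   save_row(client, "query", make_query_row("example search"))
```

Keeping row construction separate from the database call means the row shape can be checked without a network connection.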