BetterSearch is a desktop search tool that brings natural language to traditional file search. BetterSearch allows you to ask questions directly to your laptop/PC and get detailed answers, without sending your files to the cloud. Currently in its alpha version, BetterSearch is available only for Windows machines, with plans to support Windows, MacOS, and Linux in the full release.
Leveraging the powerful indexing features of existing search systems on Windows and MacOS, BetterSearch performs on-the-fly indexing and updates its content index automatically, even when files are added, deleted, or modified. Users do not need to manually add files for querying.
BetterSearch employs two state-of-the-art models for embedding and querying:
- SQLCoder - A fine-tuned Llama-3 model from Defog.ai, designed for SQL generation from natural language queries.
- gte-v1.5 - Alibaba’s gte-v1.5 series of models, known for its advanced embeddings, extended context lengths, and efficient memory usage.
CAUTION: Only Windows is currently supported; BetterSearch will not work on a Linux or MacOS installation.
Ensure you have Python >= 3.9 installed, either through a local setup or virtual environment. I recommend creating a virtual environment using conda or venv.
Using conda:

```sh
git clone https://github.com/sandesh-bharadwaj/BetterSearch.git
conda env create -f bettersearch_env.yml # Creates a conda environment called 'bettersearch' with all dependencies
conda activate bettersearch
```
To start the application, run:

```sh
cd BetterSearch
python app.py
```
On the first few runs, BetterSearch will take time for initial indexing and downloading the necessary models (depending on internet speed), so please be patient. You can speed up file indexing by starting the application, switching the Compute Mode setting to a GPU-based option (if you have a compatible Nvidia GPU), and then restarting the application. For more information, see Compute Mode.
Once the initial setup is complete, the application will start up much faster on subsequent launches.
BetterSearch can answer questions related to both file properties and file contents.
By default, BetterSearch uses the CPU-Only setting. However, GPU options are also available, and you can create your custom configurations by modifying the respective JSON files.
- CPU-Only - This setting loads both the vector embedding model and SQLCoder on the CPU, using the OpenVINO-optimized version of SQLCoder available here. This setting consumes a significant amount of memory, so expect slower responses if you don't have sufficient RAM. (Tested and verified on Intel i7-12800HX, 32GB of RAM)
- GPU VRAM < 10GB - SQLCoder is loaded using 4-bit quantization on the GPU. Requires at least 6GB of VRAM to work correctly. (Tested and verified on Nvidia RTX 3070 Ti)
- GPU VRAM < 16GB - SQLCoder is loaded using 8-bit quantization on the GPU. Requires at least 10GB of VRAM to work correctly. (Not tested; please report any bugs)
- GPU VRAM > 16GB - SQLCoder is loaded using 16-bit precision on the GPU. (Not tested; please report any bugs)
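The tiering above can be summarized as a simple lookup from available VRAM to quantization level. This is only an illustrative sketch of the selection logic, not BetterSearch's actual code (in the app itself, the tier is chosen via the Compute Mode setting):

```python
def pick_quantization(vram_gb: float) -> str:
    """Map available GPU VRAM (in GB) to a SQLCoder quantization tier,
    mirroring the Compute Mode options described above (illustrative)."""
    if vram_gb < 6:
        return "cpu-only"   # below the minimum for the 4-bit GPU tier
    if vram_gb < 10:
        return "4-bit"      # "GPU VRAM < 10GB" tier (needs at least 6GB)
    if vram_gb < 16:
        return "8-bit"      # "GPU VRAM < 16GB" tier (needs at least 10GB)
    return "16-bit"         # "GPU VRAM > 16GB" tier
```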
Additionally, you can choose to load only the vector embedding model on the GPU, while loading SQLCoder on the CPU. This can be done by setting `embd_model_device` to `"cuda"` instead of `"cpu"` in `cpu_only.json`. This configuration allows for fast file content indexing without requiring a powerful GPU to run SQLCoder.
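For example, the relevant line of `cpu_only.json` would then read as follows (a minimal fragment; the remaining keys are unchanged and omitted here):

```json
{
  "embd_model_device": "cuda"
}
```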
- "How many files were modified after September 10, 2021?"
- "What are the three largest files on my system?"
- "What is the penalty for not wearing a seatbelt in a passenger vehicle in Massachusetts?" - Information available in the Massachusetts Driving Manual PDF on my local machine.
- "Give me a brief summary of Sandesh's thesis during his MS at Boston University." - Information available in my resume. 😉
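To make the file-property queries concrete, here is the kind of SQL a text-to-SQL model like SQLCoder might emit for "What are the three largest files on my system?". The table name and columns below are assumptions for illustration, not BetterSearch's actual schema:

```python
import sqlite3

# Hypothetical file-properties table standing in for the OS search index.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE files (path TEXT, size_bytes INTEGER, modified TEXT)")
con.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [
        ("notes.txt", 120, "2021-08-30"),
        ("manual.pdf", 52000, "2021-09-15"),
        ("thesis.docx", 4800, "2022-01-02"),
        ("photo.jpg", 2100, "2021-09-11"),
    ],
)

# SQL of the sort a text-to-SQL model could generate for the question above.
largest = con.execute(
    "SELECT path, size_bytes FROM files ORDER BY size_bytes DESC LIMIT 3"
).fetchall()
```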
- Llama-3 seems to be prone to strange errors and failure to follow the prompt due to quantization, and I have experienced the same when running BetterSearch in CPU-Only and GPU VRAM < 10GB modes. At the moment, there isn't a solution to these issues, but using the latter two GPU settings in Compute Mode should yield better results.
- Answers to file content queries can be poor, due to the chunk size and chunk overlap settings. This can be improved through smarter indexing; a good reference point is Greg Kamradt's tutorial. Alternatively, this could also be due to the nature of the embedding model being used, but extensive testing is required to confirm this.
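For context, the chunking idea is a sliding window over the text where consecutive chunks share some overlap. The sketch below is a minimal character-level version (defaults mirror the chunk_size and chunk_overlap settings in the configuration files); BetterSearch's actual splitter may work differently:

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 150):
    """Split text into overlapping character chunks. Each window advances
    by (chunk_size - chunk_overlap), so neighbors share chunk_overlap chars."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

Larger overlap reduces the chance of splitting a relevant passage across chunk boundaries, at the cost of storing and embedding more redundant text.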
To create your own configuration files, follow the structure provided in any of the pre-defined configuration files and adjust them as needed.
- "model_name": This specifies the LLM used for search. I recommend:
  - "defog/llama-3-sqlcoder-8b" - If you have an Nvidia GPU that supports one of the available GPU configs. Intel GPUs are also supported using the Intel Extension for PyTorch.
  - "sandeshb/llama-3-sqlcoder-8b-int8-ov" - If you have an Intel CPU or GPU. Note that AMD CPUs may not experience the same level of speedup with OpenVINO models.
- "cache_dir": The cache directory where the models are downloaded ("cache_dir/" by default).
- "bnb_config": Configuration for BitsAndBytes; refer to the documentation for more details.
- "kv_cache_flag": Sets the use_cache flag for generation models in HuggingFace Transformers. It is recommended to always set this to true.
- "num_beams": Number of beams for beam search (default=4).
- "db_path": Location of the content index (Chroma) ("better_search_content_db/" by default).
- "embd_model_device": Decides where gte-v1.5 will be loaded. (Options: "cpu", "cuda")
- "check_interval": Interval (in seconds) at which BetterSearch checks the filesystem for changes and updates its content index (default=30).
- "chunk_size": Chunk size for storing vector embeddings in Chroma (default=500).
- "chunk_overlap": Overlap between vector embedding chunks in Chroma (default=150). It is recommended to keep this value between 10%-20% of "chunk_size".
- "chunk_batch_size": Batch size for adding embedding chunks to Chroma. Set this based on the amount of RAM available, as setting it too high can crash the app. (default=500; adjust according to your preference)
- "top_k": Number of documents retrieved based on the query in Chroma (default=3).
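Putting the keys above together, a custom configuration file might look like the following sketch. The values shown match the defaults listed above, but the exact JSON shape of "bnb_config" is an assumption here, so use the pre-defined configuration files as the authoritative reference:

```json
{
  "model_name": "defog/llama-3-sqlcoder-8b",
  "cache_dir": "cache_dir/",
  "bnb_config": {
    "load_in_4bit": true
  },
  "kv_cache_flag": true,
  "num_beams": 4,
  "db_path": "better_search_content_db/",
  "embd_model_device": "cpu",
  "check_interval": 30,
  "chunk_size": 500,
  "chunk_overlap": 150,
  "chunk_batch_size": 500,
  "top_k": 3
}
```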
- Folder whitelists/blacklists for content indexing.
- Improve content querying through chunk size and overlap controls.
- Add more settings for controlling generation.
- Migrate from using OS-specific search indexes to custom SQL database for better SQL querying and customizability.
- Add MacOS support
- Add Linux support
See the open issues for a full list of proposed features (and known issues).
If you have a suggestion that would improve this project, please open an issue with the tag "enhancement". You can also fork the repo and create a pull request. Your feedback is greatly appreciated! Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
Llama-3 is used under the Meta Llama-3 License. See 'LLAMA-3-LICENSE' for more information.
Sandesh Bharadwaj - [email protected]
Project Link: https://github.com/sandesh-bharadwaj/BetterSearch
- Meta for their continued contribution to open-source AI.
- Defog.ai for their fine-tuned versions of text-to-SQL models.