
Add Habana Gaudi (HPU) Support & Performance Benchmarks for Khoj #1125

Open · wants to merge 9 commits into master
Conversation

BartoszBLL

This PR introduces support for Habana Gaudi accelerators (HPUs) to the project, enabling the application to run on HPU devices in addition to the existing support for CUDA, MPS, and CPU. The changes include:

🚀 Key Updates

  1. HPU Dockerfile (Dockerfile.hpu)
    • Added a new Dockerfile for running Khoj on Habana Gaudi devices.
    • Installs necessary dependencies (optimum-habana).
    • Configures environment variables for Habana optimizations.
  2. Device Selection (helpers.py)
    • Enhanced get_device() to detect HPU if available.
    • Supports cuda, hpu, mps, or cpu based on availability or user preference.
  3. Memory Management (helpers.py)
    • get_device_memory() now supports Habana HPU memory queries.
  4. Dependency Updates (pyproject.toml)
    • Added optimum-habana, torch-geometric, and numba.
  5. Documentation
    • README.md & src/khoj/app/README.md: Instructions for building and running Khoj with HPU.

💎 Why This Matters:

HPU Support

This PR enables the application to leverage Habana Gaudi accelerators, which can provide significant performance improvements for deep learning workloads.

Flexibility

Users can now choose their preferred device (CUDA, HPU, MPS, or CPU) for running the application, making it more versatile across different hardware setups.

Optimization

The addition of optimum-habana ensures that models are optimized for HPU and other hardware, improving efficiency and performance.

⚡ Performance Benchmarks

HPU: ~0.2703s average runtime (10 runs)
CPU: ~76.3144s average runtime (10 runs)
Result: ~282× speedup using HPU compared to CPU.
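The quoted speedup follows directly from the two averaged runtimes:

```python
cpu_avg = 76.3144  # seconds, mean of 10 runs (from the benchmark above)
hpu_avg = 0.2703   # seconds, mean of 10 runs

speedup = cpu_avg / hpu_avg
print(f"~{speedup:.0f}x")  # ~282x
```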

🛠 How to Test

Use the new Dockerfile.hpu to build and run the application on a system with Habana Gaudi accelerators.

# Build the HPU Docker image
docker build -t khoj-hpu -f Dockerfile.hpu .

# Run with Habana runtime
docker run --runtime=habana -e HABANA_VISIBLE_DEVICES=all -p <PORT>:<PORT> khoj-hpu

Check logs to confirm that HPU is recognized and in use.

✅ Checklist

  • Tested on Habana Gaudi accelerator
  • Verified compatibility with CPU and other devices
  • Updated documentation
  • Added required dependencies

📝 Notes

This PR is part of the effort to expand hardware support for the application, ensuring it can run efficiently on a wide range of devices.
