Add Habana Gaudi (HPU) Support & Performance Benchmarks for Khoj #1125

BartoszBLL · 2025-02-26T11:11:34Z

This PR adds support for Habana Gaudi accelerators (HPUs) to Khoj, so the application can run on HPU devices in addition to the existing CUDA, MPS, and CPU backends. The changes include:

🚀 Key Updates

HPU Dockerfile (Dockerfile.hpu)
- Added a new Dockerfile for running Khoj on Habana Gaudi devices.
- Installs necessary dependencies (optimum-habana).
- Configures environment variables for Habana optimizations.
Device Selection (helpers.py)
- Enhanced get_device() to detect HPU if available.
- Supports cuda, hpu, mps, or cpu based on availability or user preference.
Memory Management (helpers.py)
- get_device_memory() now supports Habana HPU memory queries.
Dependency Updates (pyproject.toml)
- Added optimum-habana, torch-geometric, and numba.
Documentation
- README.md & src/khoj/app/README.md: Instructions for building and running Khoj with HPU.
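
The device selection described above could look roughly like the following. This is a minimal sketch, not the code in helpers.py: the names `detect_backends`, `pick_device`, and `PRIORITY` are hypothetical, and the fallback order (cuda, hpu, mps, cpu) is an assumption taken from the order in which the backends are listed in this PR. The HPU probe assumes the Habana PyTorch bridge is importable as `habana_frameworks.torch.hpu` and falls back silently when it is not.

```python
import importlib
import importlib.util

# Assumed fallback order; the PR only states that all four backends are supported.
PRIORITY = ("cuda", "hpu", "mps", "cpu")


def detect_backends():
    """Return the set of backends usable in this process (always includes 'cpu')."""
    found = {"cpu"}
    if importlib.util.find_spec("torch") is None:
        return found  # no PyTorch installed, so only the CPU label applies

    import torch

    if torch.cuda.is_available():
        found.add("cuda")

    # Older PyTorch builds have no torch.backends.mps attribute.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        found.add("mps")

    # The Habana bridge ships separately from stock PyTorch.
    if importlib.util.find_spec("habana_frameworks") is not None:
        try:
            hthpu = importlib.import_module("habana_frameworks.torch.hpu")
            if hthpu.is_available():
                found.add("hpu")
        except Exception:
            pass  # bridge present but unusable: treat HPU as unavailable

    return found


def pick_device(preferred=None, available=None):
    """Honor the user's preference when possible, else use the first available backend."""
    if available is None:
        available = detect_backends()
    else:
        available = set(available) | {"cpu"}
    if preferred in available:
        return preferred
    return next(name for name in PRIORITY if name in available)
```

Passing `available` explicitly keeps the selection logic testable on machines without any accelerator, for example `pick_device(available={"hpu"})` returns `"hpu"`, while `pick_device("cuda", available={"mps"})` falls back to `"mps"`.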

💎 Why This Matters

HPU Support

This PR enables Khoj to run on Habana Gaudi accelerators. In the benchmark reported in this PR, the HPU run averaged roughly 282× faster than the CPU run.

Flexibility

Users can now choose their preferred device (CUDA, HPU, MPS, or CPU) for running the application, making it more versatile across different hardware setups.

Optimization

The addition of optimum-habana, Hugging Face's integration layer for Gaudi, lets models run through HPU-optimized code paths, improving efficiency and performance on Gaudi devices.

⚡ Performance Benchmarks

HPU: ~0.2703s average runtime (10 runs)
CPU: ~76.3144s average runtime (10 runs)
Result: ~282× speedup using HPU compared to CPU.
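
For readers who want to reproduce this kind of measurement, the averaging and the speedup ratio can be sketched as below. This is an illustration only: the PR does not include its benchmark script, so `average_runtime`, `speedup`, and the optional `sync` callback are hypothetical names. The `sync` hook is there because accelerator work is often queued asynchronously, so a timing loop usually needs to wait for the device before stopping the clock.

```python
import time
from statistics import mean


def average_runtime(fn, runs=10, sync=None):
    """Mean wall-clock seconds per call of fn over `runs` calls."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for queued device work before stopping the clock
        samples.append(time.perf_counter() - start)
    return mean(samples)


def speedup(baseline_seconds, accelerated_seconds):
    """How many times faster the accelerated run is than the baseline."""
    return baseline_seconds / accelerated_seconds


# Averages reported in this PR: CPU 76.3144 s, HPU 0.2703 s.
ratio = speedup(76.3144, 0.2703)
print(f"{ratio:.1f}x")
```

Dividing the two reported averages gives a ratio of about 282, which matches the speedup quoted above.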

🛠 How to Test

Use the new Dockerfile.hpu to build and run the application on a system with Habana Gaudi accelerators.

# Build the HPU Docker image
docker build -t khoj-hpu -f Dockerfile.hpu .

# Run with Habana runtime
docker run --runtime=habana -e HABANA_VISIBLE_DEVICES=all -p <PORT>:<PORT> khoj-hpu

Check logs to confirm that HPU is recognized and in use.

✅ Checklist

Tested on Habana Gaudi accelerator
Verified compatibility with CPU and other devices
Updated documentation
Added required dependencies

📝 Notes

This PR is part of the effort to expand hardware support for the application, ensuring it can run efficiently on a wide range of devices.

BartoszBLL added 9 commits January 13, 2025 14:54

feat: Add HPU to supported backends.

3c891e1

BL359-96

feat: Add Habana dependencies to pyproject.toml

4e80e82

feat: Add Dockerfile

a17d22e

Add Dockerfile for HPU runtime along with installation requirements.

docs: Update README.md

d9f7fea

Remove vanilla Pytorch to remove conflicts with Gaudi Pytorch

a69780c

Merge pull request #1 from BlueLabelLabs/feat/hpu-support

9ecdeb7

Add HPUs (Intel® Gaudi®) support

fix: Device loading

d3c3244

Merge pull request #2 from BlueLabelLabs/feat/hpu-support

8564d1e

fix: Device loading

Merge branch 'master' into master

c403a05
