Conversation

@PratikDavidson

Description

This PR adds Ollama as an alternative embedding provider to Hugging Face, enabling local CPU/GPU embedding inference, which improves data privacy and reduces cloud dependency and cost. Users can now choose between cloud-based (Hugging Face) and self-hosted (Ollama) embedding generation.

Changes

  • Added ollama client dependency to pyproject.toml.
  • Implemented Ollama configuration and centralized error handling in config.py.
  • Added model definitions for Ollama embedding models and extended EmbeddingService to handle Ollama-based embedding requests in embeddings.py (a rough sketch follows this list).
  • Introduced unit tests for Ollama integration via TestEmbeddingServiceOllama in test_embedding_service.py (a mocked-test sketch also follows).
  • Updated README.md with documentation on using Ollama as an embedding provider.
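
As a rough illustration of the shape of the new code path, here is a minimal sketch of an Ollama-backed embedding service. This is not the PR's actual implementation: the class name, method names, and error handling are placeholders, and only the ollama client calls (ollama.Client, Client.embed) are real library API.

```python
import os

import ollama


class OllamaEmbeddingService:
    """Illustrative sketch only; the real logic lives in config.py / embeddings.py."""

    def __init__(self) -> None:
        # Env-driven configuration, mirroring the variables listed under
        # Implementation Details below.
        host = os.getenv("OLLAMA_HOST", "localhost")
        port = os.getenv("OLLAMA_PORT", "11434")
        self.model = os.getenv("OLLAMA_MODEL", "nomic-embed-text")
        self.client = ollama.Client(host=f"http://{host}:{port}")

    def embed(self, texts: list[str]) -> list[list[float]]:
        # Batch-embed the inputs; ollama's embed() accepts a list of strings.
        try:
            response = self.client.embed(model=self.model, input=texts)
        except ollama.ResponseError as exc:
            # The PR centralizes error handling in config.py; a bare re-raise
            # stands in for that here.
            raise RuntimeError(f"Ollama embedding request failed: {exc.error}") from exc
        return [list(vector) for vector in response["embeddings"]]
```

And a correspondingly hedged idea of what a mocked unit test could look like (again illustrative, not the PR's TestEmbeddingServiceOllama):

```python
from unittest.mock import patch


def test_embed_returns_vectors():
    # Patch the ollama client so the test needs no running server.
    fake_response = {"embeddings": [[0.1, 0.2, 0.3]]}
    with patch("ollama.Client") as mock_client_cls:
        mock_client_cls.return_value.embed.return_value = fake_response
        service = OllamaEmbeddingService()  # the sketch class above
        assert service.embed(["hello"]) == [[0.1, 0.2, 0.3]]
```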

Implementation Details

  • Start the Ollama server (ollama serve) and make sure the embedding model is available (ollama pull nomic-embed-text)
  • Configure .env with:
    • EMBEDDING_PROVIDER=ollama
    • OLLAMA_HOST=localhost
    • OLLAMA_PORT=11434
    • OLLAMA_MODEL="nomic-embed-text"
  • Start the app server: uv run server.py (a sanity-check snippet follows below)
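
Once the server is up, a quick sanity check (independent of this PR's code) is to call Ollama directly with its Python client and confirm an embedding comes back:

```python
# Assumes `ollama serve` is running locally and the model has been pulled
# with `ollama pull nomic-embed-text`.
import ollama

client = ollama.Client(host="http://localhost:11434")
response = client.embed(model="nomic-embed-text", input=["hello world"])
print(len(response["embeddings"][0]))  # 768 dimensions for nomic-embed-text
```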

Checklist

✅ Code compiles successfully
✅ Unit tests added and passing
✅ Documentation updated
✅ No breaking changes introduced

@PratikDavidson changed the title from "Added Ollama as an Embedding Provider" to "Add Ollama as an Embedding Provider" on Oct 18, 2025
@nicholasericksen

I was also looking for this capability. Thanks!

