feat: Draft ollama test #566

Open · wants to merge 26 commits into base: dev
Changes from 23 commits
84 changes: 84 additions & 0 deletions .github/workflows/test_ollama.yml
@@ -0,0 +1,84 @@
name: test | ollama

on:
workflow_dispatch:
pull_request:
types: [ labeled, synchronize ]

jobs:

run_simple_example_test:

runs-on: ubuntu-latest
services:
ollama:
image: ollama/ollama
ports:
- 11434:11434

steps:
- name: Checkout repository
uses: actions/checkout@v4

- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: '3.12.x'

- name: Install Poetry
uses: snok/[email protected]
with:
virtualenvs-create: true
virtualenvs-in-project: true
installer-parallel: true

- name: Install dependencies
run: |
poetry install --no-interaction --all-extras

- name: Install ollama
run: curl -fsSL https://ollama.com/install.sh | sh
- name: Run ollama
run: |
ollama serve &
ollama pull llama3.2 &
ollama pull avr/sfr-embedding-mistral:latest
- name: Call ollama API
run: |
curl -d '{"model": "llama3.2", "stream": false, "prompt":"Whatever I say, asnwer with Yes"}' http://localhost:11434/api/generate
Comment on lines +46 to +48
Contributor

⚠️ Potential issue

API Call to Ollama:

The API call appears functionally correct. However, note the typo in the prompt ("asnwer" should be "answer"). Correcting this will ensure clarity in the test.

-          curl -d '{"model": "llama3.2", "stream": false, "prompt":"Whatever I say, asnwer with Yes"}' http://localhost:11434/api/generate
+          curl -d '{"model": "llama3.2", "stream": false, "prompt":"Whatever I say, answer with Yes"}' http://localhost:11434/api/generate
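The `/api/generate` call from the workflow can also be exercised from Python. A minimal sketch, assuming a local Ollama server on port 11434; the helper names (`build_generate_payload`, `call_generate`) are illustrative and not part of the repository:

```python
import json
import urllib.request


def build_generate_payload(model: str, prompt: str) -> dict:
    # Mirror the curl body from the workflow: a non-streaming generate request.
    return {"model": model, "stream": False, "prompt": prompt}


def call_generate(base_url: str, payload: dict) -> dict:
    # POST the JSON payload to Ollama's generate endpoint and decode the reply.
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    body = build_generate_payload("llama3.2", "Whatever I say, answer with Yes")
    print(call_generate("http://localhost:11434", body).get("response"))
```

Separating payload construction from transport keeps the request body testable without a running server.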


- name: Wait for Ollama to be ready
run: |
for i in {1..30}; do
if curl -s http://localhost:11434/api/tags > /dev/null; then
echo "Ollama is ready"
exit 0
fi
echo "Waiting for Ollama... attempt $i"
sleep 2
done
echo "Ollama failed to start"
exit 1

- name: Dump Docker logs
run: |
docker ps
docker logs $(docker ps --filter "ancestor=ollama/ollama" --format "{{.ID}}")


- name: Run example test
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
GRAPHISTRY_USERNAME: ${{ secrets.GRAPHISTRY_USERNAME }}
GRAPHISTRY_PASSWORD: ${{ secrets.GRAPHISTRY_PASSWORD }}
PYTHONFAULTHANDLER: 1
LLM_API_KEY: "ollama"
LLM_PROVIDER: "ollama"
LLM_ENDPOINT: "http://127.0.0.1:11434/v1/chat/completions"
LLM_MODEL: "ollama/llama3.2"
EMBEDDING_PROVIDER: "ollama"
EMBEDDING_MODEL: "avr/sfr-embedding-mistral:latest"
EMBEDDING_ENDPOINT: "http://127.0.0.1:11434/api/embeddings"
EMBEDDING_DIMENSIONS: "4096"
HUGGINGFACE_TOKENIZER: "Salesforce/SFR-Embedding-Mistral"
run: poetry run python ./examples/python/simple_example.py
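The environment block above wires the example script to Ollama through an OpenAI-compatible endpoint. A hedged sketch of how such settings might be collected in Python — the `OllamaSettings` class and `load_settings` helper are illustrative, not taken from the repository; the defaults simply echo the workflow's values:

```python
import os
from dataclasses import dataclass


@dataclass
class OllamaSettings:
    provider: str
    model: str
    endpoint: str
    embedding_model: str
    embedding_dimensions: int


def load_settings(env: dict) -> OllamaSettings:
    # Read the same variables the workflow exports, falling back to
    # the workflow's own values when a variable is unset.
    return OllamaSettings(
        provider=env.get("LLM_PROVIDER", "ollama"),
        model=env.get("LLM_MODEL", "ollama/llama3.2"),
        endpoint=env.get("LLM_ENDPOINT", "http://127.0.0.1:11434/v1/chat/completions"),
        embedding_model=env.get("EMBEDDING_MODEL", "avr/sfr-embedding-mistral:latest"),
        embedding_dimensions=int(env.get("EMBEDDING_DIMENSIONS", "4096")),
    )
```

Passing the environment as a plain dict (e.g. `load_settings(dict(os.environ))`) keeps the loader easy to unit-test.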
Contributor

⚠️ Potential issue

Fix File Formatting: Append a Newline at File End

YAML standards require a newline at the end of the file. Adding one will prevent linting issues and ensure compatibility with various tools.

🪛 YAMLlint (1.35.1): [error] 84-84: no new line character at the end of file (new-line-at-end-of-file)
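The shell readiness loop in the workflow ("Wait for Ollama to be ready") can be sketched in Python with an injectable probe, which makes the retry logic testable without a live server. `wait_for_ollama` is a hypothetical helper, not part of the repository:

```python
import time
from typing import Callable


def wait_for_ollama(probe: Callable[[], bool], attempts: int = 30, delay: float = 2.0) -> bool:
    # Poll the probe up to `attempts` times, sleeping `delay` seconds between
    # tries — mirroring the shell loop that curls /api/tags until it answers.
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```

In the workflow, the probe would be an HTTP GET against `http://localhost:11434/api/tags`; in tests, any callable returning a bool will do.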

23 changes: 22 additions & 1 deletion .github/workflows/upgrade_deps.yml
@@ -2,8 +2,29 @@ name: Update Poetry Dependencies

on:
schedule:
- cron: '0 3 * * 0'
- cron: '0 3 * * 0' # Runs at 3 AM every Sunday
push:
paths:
- 'poetry.lock'
- 'pyproject.toml'
branches:
- main
- dev
pull_request:
paths:
- 'poetry.lock'
- 'pyproject.toml'
types: [opened, synchronize, reopened]
branches:
- main
- dev
workflow_dispatch:
inputs:
debug_enabled:
type: boolean
description: 'Run the update with debug logging'
required: false
default: false

jobs:
update-dependencies:
65 changes: 62 additions & 3 deletions cognee/infrastructure/llm/ollama/adapter.py
@@ -1,20 +1,28 @@
from sys import api_version
Contributor

⚠️ Potential issue

Remove unused import.

The api_version import from sys is not used and conflicts with the class attribute. Remove this import to fix the redefinition error.

-from sys import api_version

🪛 GitHub Actions: lint | ruff format — [error] 1-1: Ruff formatting check failed. The file would be reformatted.

from typing import Type
from pydantic import BaseModel
import instructor
from cognee.infrastructure.llm.llm_interface import LLMInterface
from cognee.infrastructure.llm.config import get_llm_config
from openai import OpenAI

import base64
from pathlib import Path
import os

class OllamaAPIAdapter(LLMInterface):
"""Adapter for a Generic API LLM provider using instructor with an OpenAI backend."""
"""Adapter for an Ollama API LLM provider using instructor with an OpenAI backend."""

api_version: str

MAX_RETRIES = 5

def __init__(self, endpoint: str, api_key: str, model: str, name: str, max_tokens: int):
def __init__(self, endpoint: str, api_key: str, model: str, name: str, max_tokens: int, api_version: str = None) -> None:
self.name = name
self.model = model
self.api_key = api_key
self.endpoint = endpoint
self.max_tokens = max_tokens
self.api_version = api_version

self.aclient = instructor.from_openai(
OpenAI(base_url=self.endpoint, api_key=self.api_key), mode=instructor.Mode.JSON
Expand Down Expand Up @@ -42,3 +50,54 @@ async def acreate_structured_output(
)

return response


def create_transcript(self, input):
"""Generate an audio transcript from a user query."""

if not os.path.isfile(input):
raise FileNotFoundError(f"The file {input} does not exist.")

# with open(input, 'rb') as audio_file:
# audio_data = audio_file.read()

transcription = self.aclient.transcription(
model=self.transcription_model,
file=Path(input),
api_key=self.api_key,
api_base=self.endpoint,
api_version=self.api_version,
max_retries=self.MAX_RETRIES,
)

return transcription
Comment on lines +55 to +73
Contributor

⚠️ Potential issue

Fix implementation issues in create_transcript.

  1. The transcription_model attribute is not defined in the class.
  2. There is commented-out code that should be removed or implemented.

Apply this diff to fix the issues:

 def create_transcript(self, input):
     """Generate a audio transcript from a user query."""

     if not os.path.isfile(input):
         raise FileNotFoundError(f"The file {input} does not exist.")

-    # with open(input, 'rb') as audio_file:
-    #     audio_data = audio_file.read()
-
     transcription = self.aclient.transcription(
-        model=self.transcription_model,
+        model=self.model,  # Use the model defined in __init__
         file=Path(input),
         api_key=self.api_key,
         api_base=self.endpoint,
         api_version=self.api_version,
         max_retries=self.MAX_RETRIES,
     )

     return transcription


def transcribe_image(self, input) -> BaseModel:
with open(input, "rb") as image_file:
encoded_image = base64.b64encode(image_file.read()).decode("utf-8")

return self.aclient.completion(
model=self.model,
messages=[
{
"role": "user",
"content": [
{
"type": "text",
"text": "What’s in this image?",
},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{encoded_image}",
},
},
],
}
],
api_key=self.api_key,
api_base=self.endpoint,
api_version=self.api_version,
max_tokens=300,
max_retries=self.MAX_RETRIES,
)
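The `transcribe_image` method above inlines the image as a base64 data URL inside the chat message. That encoding step can be sketched on its own; the `to_data_url` helper is illustrative, not part of the adapter:

```python
import base64


def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    # Encode raw image bytes the same way transcribe_image does before
    # embedding them in the message's image_url field.
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{encoded}"
```

The resulting string can be passed directly as the `"url"` value in an OpenAI-style `image_url` content part.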