[ENH] Add support for Ollama assistants #376
Merged: pmeier merged 32 commits into Quansight:main from smokestacklightnin:assistants/ollama/basic-functionality on Jun 10, 2024
Changes from 27 commits
Commits
868febb Added almost empty `OllamaApiAssistant` (smokestacklightnin)
a0c8499 Add `_make_system_content` method (smokestacklightnin)
ae0720a Add preliminary (untested) `_call_api` method (smokestacklightnin)
eeba8a1 Using JSONL for responses (smokestacklightnin)
420c9e8 Add kwargs for compatibility and TODO messages to remove in a future … (smokestacklightnin)
20fb764 Add Ollama gemma:2b model (smokestacklightnin)
906f2a1 Fix `OllamaApiAssistant._call_api` signature by adding types (smokestacklightnin)
50f19a1 Add temperature option (smokestacklightnin)
0ce77d8 Add `_assert_api_call_is_success()` (smokestacklightnin)
7bbbafb Add `answer()` (smokestacklightnin)
301c815 Add `__init__()` (smokestacklightnin)
a4a2608 Set url through initializer or environment variable (smokestacklightnin)
14e14c5 Add `is_available()` (smokestacklightnin)
0cae498 Rename Gemma2B to OllamaGemma2B (smokestacklightnin)
d02b501 Remove unnecessary `else` clause (smokestacklightnin)
1ce1982 Handle error in http response (smokestacklightnin)
fd5c34b Remove unnecessary `_call_api()` abstraction (smokestacklightnin)
6f2055c Fix typing errors (smokestacklightnin)
e5e8e30 Add docstring (smokestacklightnin)
72161a0 Add `OllamaPhi2` (smokestacklightnin)
c5e79e0 Remove unnecessary exclusion from test (smokestacklightnin)
f6edb19 Simplify check for availability of Ollama model (smokestacklightnin)
6460d1e Simplify call to superclass `is_available()` (smokestacklightnin)
c9b2e01 Correct incorrect grammar on system instruction (smokestacklightnin)
086ce23 Add several Ollama models (smokestacklightnin)
9724dd6 Order alphabetically (smokestacklightnin)
9bebbb0 Add Ollama to listings in docs (smokestacklightnin)
bc211d3 Merge branch 'main' into assistants/ollama/basic-functionality (pmeier)
4a737e0 refactor streaming again (pmeier)
3e2a682 more (pmeier)
5a4d89d fix (pmeier)
9de0920 cleanup (pmeier)
@@ -0,0 +1,138 @@
import contextlib
import json
import os
from typing import AsyncIterator, cast

import httpx
from httpx import Response

import ragna
from ragna.core import Assistant, RagnaException, Source


class OllamaApiAssistant(Assistant):
    _MODEL: str

    @classmethod
    def display_name(cls) -> str:
        return f"Ollama/{cls._MODEL}"

    def __init__(self, url: str = "http://localhost:11434/api/chat") -> None:
        self._client = httpx.AsyncClient(
            headers={"User-Agent": f"{ragna.__version__}/{self}"},
            timeout=60,
        )
        self._url = os.environ.get("RAGNA_ASSISTANTS_OLLAMA_URL", url)

    @classmethod
    def is_available(cls) -> bool:
        if not super().is_available():
            return False

        try:
            return httpx.get("http://localhost:11434/").raise_for_status().is_success
        except httpx.HTTPError:
            return False

    def _make_system_content(self, sources: list[Source]) -> str:
        instruction = (
            "You are a helpful assistant that answers user questions given the context below. "
            "If you don't know the answer, just say so. Don't try to make up an answer. "
            "Only use the following sources to generate the answer."
        )
        return instruction + "\n\n".join(source.content for source in sources)

    async def _assert_api_call_is_success(self, response: Response) -> None:
        if response.is_success:
            return

        content = await response.aread()
        with contextlib.suppress(Exception):
            content = json.loads(content)

        raise RagnaException(
            "API call failed",
            request_method=response.request.method,
            request_url=str(response.request.url),
            response_status_code=response.status_code,
            response_content=content,
        )

    async def answer(
        self, prompt: str, sources: list[Source], *, max_new_tokens: int = 256
    ) -> AsyncIterator[str]:
        async with self._client.stream(
            "POST",
            self._url,
            headers={
                "Content-Type": "application/json",
            },
            json={
                "messages": [
                    {
                        "role": "system",
                        "content": self._make_system_content(sources),
                    },
                    {
                        "role": "user",
                        "content": prompt,
                    },
                ],
                "model": self._MODEL,
                "stream": True,
                "temperature": 0.0,
            },
        ) as response:
            await self._assert_api_call_is_success(response)

            async for chunk in response.aiter_lines():
                # This part modeled after https://github.com/ollama/ollama/blob/06a1508bfe456e82ba053ea554264e140c5057b5/examples/python-loganalysis/readme.md?plain=1#L57-L62
                if chunk:
                    json_data = json.loads(chunk)

                    if "error" in json_data:
                        raise RagnaException(json_data["error"])
                    if not json_data["done"]:
                        yield cast(str, json_data["message"]["content"])


class OllamaGemma2B(OllamaApiAssistant):
    """[Gemma:2B](https://ollama.com/library/gemma)"""

    _MODEL = "gemma:2b"


class OllamaLlama2(OllamaApiAssistant):
    """[Llama 2](https://ollama.com/library/llama2)"""

    _MODEL = "llama2"


class OllamaLlava(OllamaApiAssistant):
    """[Llava](https://ollama.com/library/llava)"""

    _MODEL = "llava"


class OllamaMistral(OllamaApiAssistant):
    """[Mistral](https://ollama.com/library/mistral)"""

    _MODEL = "mistral"


class OllamaMixtral(OllamaApiAssistant):
    """[Mixtral](https://ollama.com/library/mixtral)"""

    _MODEL = "mixtral"


class OllamaOrcaMini(OllamaApiAssistant):
    """[Orca Mini](https://ollama.com/library/orca-mini)"""

    _MODEL = "orca-mini"


class OllamaPhi2(OllamaApiAssistant):
    """[Phi-2](https://ollama.com/library/phi)"""

    _MODEL = "phi"
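
For readers following the streaming loop in `answer()`, the per-line handling can be tried without a running Ollama server. The sketch below is only an illustration of that loop: the keys `error`, `done`, and `message.content` are taken from the diff, while `StreamError`, `iter_message_content`, and the sample lines are hypothetical names and made-up data, not part of the PR or of ragna.

```python
import json
from typing import Iterable, Iterator


class StreamError(Exception):
    """Hypothetical stand-in for ragna's RagnaException."""


def iter_message_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield message content from JSONL chunks that are not marked done."""
    for line in lines:
        if not line:
            # Skip empty lines, as the `if chunk:` guard in the diff does.
            continue
        data = json.loads(line)
        if "error" in data:
            # An error object inside the stream aborts the iteration.
            raise StreamError(data["error"])
        if not data["done"]:
            yield data["message"]["content"]


# Made-up sample lines shaped like the chunks the diff expects.
sample = [
    '{"message": {"role": "assistant", "content": "Hel"}, "done": false}',
    "",
    '{"message": {"role": "assistant", "content": "lo"}, "done": false}',
    '{"message": {"role": "assistant", "content": ""}, "done": true}',
]
result = "".join(iter_message_content(sample))
```

As in the diff, a chunk with `done` set to true is never yielded, so any content carried by such a chunk would be dropped.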
@nenb This violates "same response schema" part of #425 (comment) 😖
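
A separate observation on `_make_system_content` in the diff: `instruction + "\n\n".join(...)` uses `"\n\n"` only between sources, so the first source follows the instruction with no separator. The snippet below demonstrates this with made-up strings; the `separated` variant is one possible alternative offered as an assumption, not something the PR contains.

```python
instruction = "Only use the following sources to generate the answer."
contents = ["First source.", "Second source."]  # made-up source texts

# Pattern from the diff: join() only separates the sources from each other.
as_in_diff = instruction + "\n\n".join(contents)

# Possible alternative (an assumption, not part of the PR): treat the
# instruction as the first element so it is separated as well.
separated = "\n\n".join([instruction, *contents])
```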