
Enable streaming with get_chat_response() #79

Open
Rumeysakeskin opened this issue Sep 18, 2024 · 0 comments
I want to use streaming together with the chat history. Here is my current setup:

# Import the Llama class of llama-cpp-python and the LlamaCppPythonProvider of llama-cpp-agent
from llama_cpp import Llama
from llama_cpp_agent.providers import LlamaCppPythonProvider
from llama_cpp_agent.chat_history import BasicChatHistory, BasicChatMessageStore, BasicChatHistoryStrategy


# Create an instance of the Llama class and load the model
llama_model = Llama("gemma-2-2b-it-IQ3_M.gguf", n_batch=1024, n_threads=10, n_gpu_layers=0)
# llama_model = Llama("gemma-2-9b-it-IQ2_M.gguf", n_batch=1024, n_threads=10, n_gpu_layers=40)


# Create the provider by passing the Llama class instance to the LlamaCppPythonProvider class
provider = LlamaCppPythonProvider(llama_model)

from llama_cpp_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType
# Pass the provider to the LlamaCppAgent class and define the system prompt and predefined message formatter
agent = LlamaCppAgent(provider,
                      system_prompt="You are a helpful assistant.",
                      predefined_messages_formatter_type=MessagesFormatterType.CHATML)


settings = provider.get_provider_default_settings()
settings.stream = True
settings.temperature = 0.1

# Create a message store for the chat history
chat_history_store = BasicChatMessageStore()

# Create the actual chat history by passing the desired chat history strategy;
# it can be last_k_messages or last_k_tokens. The default strategy uses the
# last 20 messages. Here we use the last_k_tokens strategy, which includes the
# last k tokens in the chat history; this strategy requires passing the provider.
chat_history = BasicChatHistory(
    message_store=chat_history_store,
    chat_history_strategy=BasicChatHistoryStrategy.last_k_tokens,
    k=7000,
    llm_provider=provider,
)

# "neler yapabiliyorsun" is Turkish for "what can you do?"
# Note: the chat history created above must be passed in explicitly,
# otherwise it is never used.
agent_output = agent.get_chat_response("neler yapabiliyorsun",
                                       llm_sampling_settings=settings,
                                       chat_history=chat_history)

print(agent_output.strip())
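For what it's worth, setting `settings.stream = True` alone does not appear to make `get_chat_response()` yield tokens. A minimal sketch of what I am trying to achieve, assuming `get_chat_response()` accepts `chat_history` and `returns_streaming_generator` keyword arguments (please verify against the installed llama-cpp-agent version):

```python
# Hedged sketch: consume get_chat_response() as a token generator while
# still passing the chat history. The `returns_streaming_generator` and
# `print_output` parameters are assumptions about the library's API.
def stream_reply(agent, prompt, settings, chat_history):
    stream = agent.get_chat_response(
        prompt,
        llm_sampling_settings=settings,      # settings.stream = True
        chat_history=chat_history,           # include prior turns
        returns_streaming_generator=True,    # assumed: yield tokens as produced
        print_output=False,
    )
    reply = ""
    for token in stream:
        print(token, end="", flush=True)     # show each token as it arrives
        reply += token                       # accumulate the full reply
    return reply
```

With this shape, the caller gets incremental output on the console and the complete string back for storing in the history afterwards.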