Chat options #111
Conversation
logger = logging.getLogger(__name__)

STANDARD_K = 20 if COMPLETIONS_MODEL == 'gpt-4' else 10
all of these are moved over into the Settings class to be able to pass them around
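Roughly this sort of thing, say (the field names come from elsewhere in this PR; the defaults and structure here are illustrative):

```python
class Settings:
    # Sketch only - the real class carries more fields (prompts,
    # fractions, etc.) so they can all be passed around together.
    def __init__(self, completions: str = 'gpt-3.5-turbo'):
        self.completions = completions
        self.topKBlocks = 20 if completions == 'gpt-4' else 10  # was STANDARD_K
        self.numTokens = 8191 if completions == 'gpt-4' else 4095
```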
  # limit a string to a certain number of tokens
- def cap(text: str, max_tokens: int) -> str:
+ def cap(text: str, max_tokens: int, encoder) -> str:
the other changes in this file are basically to get it to use the settings object, rather than various constants
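For reference, a minimal sketch of what `cap` might look like with the encoder threaded through (assuming a tiktoken-style encoder; only the signature comes from the diff above):

```python
def cap(text: str, max_tokens: int, encoder) -> str:
    # Leave the text alone if it already fits, otherwise truncate it
    # to the first max_tokens tokens and decode back to a string.
    tokens = encoder.encode(text)
    if len(tokens) <= max_tokens:
        return text
    return encoder.decode(tokens[:max_tokens])
```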
- max_tokens_completion = remaining_tokens(prompt)
+ max_tokens_completion = remaining_tokens(prompt, settings)
  if max_tokens_completion < 40:
      raise ValueError(f"{max_tokens_completion} tokens left for the actual query after constructing the context - aborting, as that's not going to be enough")
should this be changed? I arbitrarily put 40, but 100 would probably also be too few
maybe it should be a class or instance variable? Also, what is the trade-off here? More tokens potentially cost more per response, but have a better chance of giving the user what we want?
There's already a `settings.numTokens` setting which should handle that. The problem here is that a whole lot of additional context is added to the prompt before it gets to this point. Specifically:

1. `settings.source_prompt_prefix`
2. up to `settings.topKBlocks` citations from pinecone
3. `settings.source_prompt_suffix`
4. (optional) the last `settings.maxHistory` items from the conversation
5. `settings.question_prompt`
6. (optional) `settings.mode_prompt` to regulate the complexity level of the answer
7. the actual query

This prompt can get quite big, taking up a lot of the `settings.numTokens` available tokens (in the worst case taking all of them), which limits the number left for the actual response. A separate setting might help, but won't solve the underlying problem, which would probably require some creative bookkeeping to limit the number of tokens available for each step. Steps 1, 3, 5, 6 and 7 should always be sent to the LLM (but even then they can take up all the tokens), so it's really only the context and history steps that should be limited.
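A rough sketch of what that bookkeeping could look like, reusing the `cap` helper from this PR (the `build_prompt` name, the assembly order, and the exact budgeting are hypothetical):

```python
def build_prompt(query: str, history: list[str], blocks: list[str],
                 settings, encoder) -> str:
    # Steps 1, 3, 5, 6 and 7 always go to the LLM, so reserve their
    # tokens first, then split what's left between context and history
    # using the configured fractions.
    fixed = (settings.source_prompt_prefix + settings.source_prompt_suffix
             + settings.question_prompt + (settings.mode_prompt or "") + query)
    budget = settings.numTokens - len(encoder.encode(fixed))
    context = cap("\n\n".join(blocks),
                  int(budget * settings.contextFraction), encoder)
    past = cap("\n".join(history[-settings.maxHistory:]),
               int(budget * settings.historyFraction), encoder)
    return (settings.source_prompt_prefix + context
            + settings.source_prompt_suffix + past
            + settings.question_prompt + (settings.mode_prompt or "") + query)
```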
# we add the title and authors inside the contents of the block, so that
# searches for the title or author will be more likely to pull it up. This
# strips it back out.
def strip_block(text: str) -> str:
this continuously raises warnings that it can't find the titles. Is it even used anymore? Wasn't that the whole point of adding the other fields?
maybe just change the logic to `return r.group(1) if r else text`? No need for the warning. I'm good to delete this too.
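i.e. something like this (the regex is a stand-in for the real title/author pattern; the point is just the fallback):

```python
import re

# Stand-in for the real pattern that matches the title/author header
# prepended to each block.
BLOCK_PATTERN = re.compile(r'^".*?"\n.*?\n\n(.*)$', re.DOTALL)

def strip_block(text: str) -> str:
    r = BLOCK_PATTERN.match(text)
    # Fall back to the raw text instead of warning when no header is found.
    return r.group(1) if r else text
```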
@@ -0,0 +1,155 @@
import { useState, useEffect } from "react";
I moved the actual chat functionality/component bits into a separate component so it can be imported more easily. This will also make it easier to move it over to the stampy-ui repo...
web/src/pages/playground.tsx
Outdated
  historyFraction: 0.25, // the (approximate) fraction of num_tokens to use for history text before truncating
  contextFraction: 0.5, // the (approximate) fraction of num_tokens to use for context text before truncating
}
const COMPLETION_MODELS = ['gpt-3.5-turbo', 'gpt-4']
more can be added - these are the only ones I saw mentioned in the code
web/src/pages/playground.tsx
Outdated
  parser?: (v: string) => number,
}

const ChatSettings = ({settings, updateSettings}: ChatSettingsParams) => {
web/src/pages/playground.tsx
Outdated
  updateSettings: (updater: (settings: LLMSettings) => LLMSettings) => void,
}

const ChatPrompts = ({settings, query, history, updateSettings}: ChatPromptParams) => {
web/src/pages/playground.tsx
Outdated
  />
)}
</details>
{history.length > 0 && (
web/src/pages/playground.tsx
Outdated
  value={settings.prompts.question}
  onChange={updatePrompt('question')}
/>
<TextareaAutosize
this is where the prompt specifying the user's level goes
Epic pull request. Left a couple comments, nothing serious. Nice quality code here :)
elif completions == 'gpt-4':
    self.numTokens = 8191
else:
    self.numTokens = 4095
would be cool to drop the link to the API docs here in a comment: https://platform.openai.com/docs/models/gpt-4
This really should have a proper dict with the various models, but I thought it better to change as little as possible here :D
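Something like this, perhaps (the token counts are the ones used above; the helper name is made up):

```python
# https://platform.openai.com/docs/models/gpt-4
MAX_TOKENS = {
    'gpt-3.5-turbo': 4095,
    'gpt-4': 8191,
}

def num_tokens_for(completions: str) -> int:
    # Fall back to the smaller context window for unknown models.
    return MAX_TOKENS.get(completions, 4095)
```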
export type Mode = "rookie" | "concise" | "default";

export type LLMSettings = {
  prompts?: {
Are you sure all these settings should have the option of being undefined?
Yes, they should have sane defaults in the API; they're mainly here to allow playing about with them, so there's no point in enforcing that they all be set.
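For example, the API could merge whatever subset of settings the playground sends over its own defaults, so every field can stay optional on the client (a sketch; the fraction defaults appear elsewhere in this PR, the other values and the helper name are illustrative):

```python
DEFAULT_SETTINGS = {
    'completions': 'gpt-3.5-turbo',
    'numTokens': 4095,
    'topKBlocks': 10,
    'historyFraction': 0.25,
    'contextFraction': 0.5,
}

def with_defaults(overrides: dict) -> dict:
    # Drop unset keys so a client-side None can't clobber a default.
    return {**DEFAULT_SETTINGS,
            **{k: v for k, v in overrides.items() if v is not None}}
```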
This one is a big boy. It adds something like OpenAI's playground for the chatbot.
I can split this into smaller PRs if this one is too large.
I'd also like to have it password protected or something, but thought it's probably best to get the basic version out first.