Design proposal: Chat Completions API (rev. 1.1) #144
Comments
Thanks @dlqqq for this very complete proposal. 👍 for defining it in the frontend. In addition to the list of benefits, I would add the compatibility with extensions running in Jupyterlite, without server extensions. Some details of the current state of the completion
Comments on the proposal
|
Maybe I am misunderstanding, but I think this requirement may be too strict. Example: Would this still trigger the Can you sketch out how one may use the
|
@brichet I think we should avoid adding features specific to the current frontend component library (Material UI), as that may change in the future. I don't think we should add a
You may be conflating the |
Thanks for calling this use-case out. You're right that if we take the entire input string and match it against a regex, the regex will not match an I think we can handle this use-case with a simple modification. Instead of testing the whole input string against each regex, take the substring of the input up to the user's cursor position, then test that substring against each regex. This effectively changes Let me know if this sounds reasonable, in which case I'll patch the existing design. |
I haven't designed how we can "link" 1+ files to chats yet, so retrieving this data from the completer is still an open question. We could pass a reference of the current `function getCompletions(ychat: YChat, match: string): ChatCompletion[]` This by itself would allow for use-case 2), since a completer can call I will do some experimentation today to explore ways we can get a reference to any arbitrary files accessible from the
The current design doesn't allow for this, but we can allow completers to return some kind of sentinel value to indicate that this is an invalid input. For example, if a completer returns I'm hesitant on allowing completers to arbitrarily define how their completions are rendered, as that allows other extensions a little too much freedom to deviate from Jupyter's UI design. @ellisonbg has highlighted this concern in the past. However, allowing completers to define how their completions are rendered is certainly possible. We can modify the type signature of `function getCompletions(match: string): (ChatCompletion | JSX.Element)[]` For each generated completion:
Currently, yes, the expectation is that command parsing & handling is done separately. The issue is that the concept of "chat commands" doesn't exist in Jupyter Chat. Jupyter AI exclusively defines how messages are parsed and handled, which is all currently done in the backend. We could extend this design to be more general, i.e. build a Chat Commands API instead of just a Chat Completions API. This was my original intent, but I scaled back the design after I realized how much of an overhaul this would entail. I think that this may be worthwhile, but also that we would need to first identify what specific & significant benefits this change would provide. |
I think that sounds fine. Checking how slack handles middle edits (
As long as files and variables have really great UX, I don't have a strong view on others defining their completion.
Won't this make it hard to let things edit later? Or does editing change the state back to text? |
Thanks for confirming. I've patched this revision of the design.
Can you clarify what you mean here? On another note: I really appreciate how much thought your team has put into building the best UX for including files & variables. To build a design that allows for this, it would be helpful to have a visualization of the UX your team is envisioning. This will allow me to know what additional capabilities are required from the completer. Even a pen-and-paper sketch showing the UI across a user's actions would be tremendously helpful. Perhaps @govinda18 has some materials to share? |
One shortcoming of this design is that the I'm going to think more deeply about how we can unify most of the chat command logic in the frontend. In the meantime, I've opened this issue to track the need to support message attachments: #147 |
Given
We don't yet have mocks for variables. The key thoughts so far include:
Files to me are a bit easier. See what the other top assistants (cursor, copilot, windsurf) are doing and assume that if it's useful there, we want something at least as good :) |
Some quick thoughts:
|
Let's call this feature rich command previews for the sake of discussion. Thank you for sharing more details about your team's vision for chat completion! This is helpful. I will take this into account while writing a draft implementation to explore different approaches to providing rich command previews. The draft implementation will help us find the right solution to use here.
I'm still a little lost about what you mean by "changing it". I'm assuming you mean changing the user input, since we haven't had any discussion for editing/deleting command previews. I'll explore an example here and propose how this would affect the command previews shown in the completions menu. Let
This shows just 1 command preview for
This now shows 2 command previews:
This now just shows the command preview Does this clarify the behavior? |
@krassowski Wow, thanks for the great feedback!
The rationale is that async functions are easily called in the constructor, but we can't await a constructor easily. It seems easier to bundle any & all async init tasks into a single function which can be awaited. I know there are some Lumino data structures which are used in JupyterLab for this use-case, but I don't see the value in preferring those over simply awaiting an async function. Jupyter Chat will be the only extension calling
Thanks, I agree with you. It's more future-proof to use only one argument with an object type that can be extended as needed. I'll queue this change for the next revision.
Yeah, this also makes sense. Passing the entire input string to each completer is equivalent in performance, and it also simplifies the mental model of how completers work. I'll queue this change for the next revision.
I agree that this feature is currently ambiguous in its precise behavior & may introduce "unknown unknown" risk. Therefore, I'm writing a draft implementation which we can all use to test different strategies and iterate on this further. My thinking is that we may find a way to support rich (image/multimedia) command previews in the completions menu as we experiment. If this proves too challenging however, we may just want to do this later in a future release of Jupyter Chat. |
Not sure what you are referring to here.
True. But then you have a disconnect between initialization and construction logic which can bite you later. A common pattern for async initialization would be:

```typescript
interface IX {
  ready: Promise<void>;
}

class X implements IX {
  constructor() {
    this._ready = this._initialize();
  }
  get ready(): Promise<void> {
    return this._ready;
  }
  private async _initialize() {
    // do stuff
  }
  private _ready: Promise<void>;
}
```

This reduces the API contract a little bit and avoids the risk of Anyways, this is a minor detail and more me saying "I did that before and it bit me" than "it must be done this way". |
@krassowski Ah, thanks for clarifying. I must've been confusing your suggestion with something else. This makes much more sense to me now seeing your example. I agree this is better, so let's also queue this for the next revision. 👍 Thanks for taking the time to leave all this helpful feedback! I actually can't own the draft implementation right now; I need to upgrade Jupyter AI to use Others, please feel free to own the draft implementation of this design while I work on the LangChain upgrade. Any contribution would be appreciated! 🤗 Just leave a note in this thread for others to avoid duplicate work. |
We could use the model to attach the file or variables, as commented at #147 (comment). To be able to attach those files/variables from the completion, I can see 2 options (not exhaustive), with a preference for the second one:

Option 1

Add a callback function in the `ChatCompletion` type:

```typescript
type ChatCompletion = {
  // e.g. "/ask" if the input was `/`
  value: string;
  // if set, use this as the label. otherwise use `value`.
  label?: string;
  // if set, show this as a subtitle.
  description?: string;
  // identifies which icon should be used, if any.
  // Jupyter Chat should choose a default if one is not provided.
  icon?: LabIcon;
  // callback function to call when the completion is accepted
  callback?: (model: IChatModel) => void;
}
```

An example of implementation could simply be:

```typescript
callback = (model: IChatModel) => model.addAttachments(attachment);
```

That function can be called There are downsides with this option:
Option 2

Make the That way, we could handle any change in the input, parsing it and updating the list of attachments. In my opinion, this option is more robust: it should work for any use case and does not depend on MUI autocomplete.

```typescript
type IAttachment = {
  type: 'file' | 'variable' | 'image',
  value: string,
  mimetype?: string
}

type ChatCompletion = {
  // e.g. "/ask" if the input was `/`
  value: string;
  // if set, use this as the label. otherwise use `value`.
  label?: string;
  // if set, show this as a subtitle.
  description?: string;
  // identifies which icon should be used, if any.
  // Jupyter Chat should choose a default if one is not provided.
  icon?: LabIcon;
  // Description of the attachment if the completion is a file or a variable.
  attachment?: IAttachment;
}
```

|
@brichet Thanks for doing the research here. I think exploring some combination of these 2 designs would make sense:
I should be able to work on the draft implementation again soon, so I'll take this into account. Thanks again for this. |
@brichet Let's also chat about the edge cases you mentioned, since I've been thinking a lot about these too:
I think this is fine. A user can always manually remove the attachment if it's no longer desired. My thinking is that these chat commands should be handled in the frontend. So if a user types That way, both the user & the AI receive a readable prompt (that doesn't contain custom Jupyter chat commands), while also getting the context through the Attachments API.
This is the edge case that I'm not sure how we will handle. The main problem is that there's two ways of running a command:
I think we will want to replace
Again, I think a draft implementation would help clarify things here. I'll try again on Friday. |
Hello, does this design still allow users to extend it with their own custom context providers, each with its own completion logic, as in v2? Also, with the @<var_name> syntax, how would you handle autocomplete when a variable name clashes with a context provider name (e.g. a variable called 'file' vs. the context provider 'file')? |
Yes! However, the implementation of context commands will likely be moved to the TypeScript frontend instead of the Python backend. This allows command handlers to attach files to the input box and provide completions with minimal network requests.
I've designed this API to retrieve completions from multiple completers, which will all be shown in the same command completions menu (autocomplete menu). So in this case, if a user types
These would have different icons as well to help distinguish the two for users. |
Cool, that is fine as well.
What if the completer doesn't have an arg? E.g. something like '@wiki'? Personally, I would prefer if vars also have the '@vars' prefix for consistency and unambiguity. Perhaps for convenience, if the user types '@<var_name>' the autocomplete would suggest '@var:<var_name>'. Just my 2 cents. It's not a big deal either way. Will there be a way to disable referencing variables? I may need to due to data leakage concerns. I've already implemented it in v2 but just haven't released it because of this. Would it be possible to override the existing ones? I may need to alter "@file" to restrict / allow certain file types. |
Description
This issue proposes a design for a new Chat Completions API. This API will allow consumer extensions to provide completions for the user's current input from the UI. In this context, a consumer extension is any frontend + server extension that intends to provide completions for substrings in the chat input.
cc @ellisonbg @mlucool @michaelchia
Supersedes Design proposal: Chat Completions API (rev. 0) #143.
Motivation
Suppose a user types `/` in the chat with Jupyter AI installed. Today, Jupyter Chat responds by showing a menu of chat completions.

The opening of this completions menu is triggered simply by typing `/`. However, the current implementation only allows a single "trigger character" (`/`). This means that `@` commands in Jupyter AI cannot be autocompleted. Furthermore, the completer makes a network call every time a user types `/`.

This design aims to:
To help explain the proposed design, this document will start from the perspective of a consumer extension, then work backwards towards the necessary changes in Jupyter Chat.
Step 1: Define a new `IChatCompleter` interface

To register completions for partial inputs, a consumer extension must provide a set of chat completers. A chat completer is a JavaScript/TypeScript class which provides:

- `id` (property): Defines a unique ID for this chat completer. We will see why this is useful later.
- `regex` (property): Defines a regex which matches any incomplete input.
  - The regex should end with `$` to ensure this regex only matches partial inputs just typed by the user. Without `$`, the completer may generate completions for commands which were already typed.
- `async initialize(): Promise<void>`: called and awaited by Jupyter Chat.
- `async getCompletions(match: string): Promise<ChatCompletion[]>`: Defines a method which accepts a substring matched by its regex, and returns a list of potential completions for that input. This list may be empty.

It's important to note that a consumer extension may provide more than 1 completer. This allows extensions to provide completions for different commands which aren't easily captured by a single regex. For example, Jupyter AI can have a completer for `/` commands and another completer for `@` commands.

Jupyter Chat will define a new `IChatCompleter` interface which chat completers must implement, shown below.

The consumer extension will construct/instantiate the class itself before providing it to Jupyter Chat. Jupyter Chat will call `await initialize()` on each completer on init. The details of this will be discussed later.

To define a chat completer, a consumer extension should implement the `IChatCompleter` interface. Here is an example of how Jupyter AI may implement a chat completer to provide completions for its slash commands:

Step 2: Create a new completers registry
For Jupyter Chat to have awareness of completers in other extensions, the consumer extension must register each of its chat completers to a `ChatCompletersRegistry` object on init. This registry is a simple class which will provide the following methods:

- `add_completer(completer: IChatCompleter): void`: adds a completer to its memory. A completer is said to be registered after this method is called on it.
- `get_completers(): IChatCompleter[]`: returns a list of all registered completers.
- `init_completers(): void`: calls `await initialize()` on all registered completers.

To provide access to this `ChatCompletersRegistry` object, Jupyter Chat will define a plugin which provides an `IChatCompletersRegistry` token. When consumer extensions require this token in their frontend plugins, they receive a reference to the `ChatCompletersRegistry` singleton initialized by Jupyter Chat, allowing them to register their completers. This system of providing & consuming tokens to build modular applications is common to all of JupyterLab's frontend.

Jupyter Chat already defines an `IAutocompletionRegistry` using a similar approach, used by Jupyter AI to provide completion for `/` commands. Because an implementation reference is already available, we will not go into detail here. It is sufficient to know that at this point, we have a way of allowing consumer extensions to define multiple completers and provide them to Jupyter Chat for use.

Step 3: Integrate new chat completions API
From the example `SlashCommandCompleter` implementation in Step 1, we can piece together how the application should behave:

1. On init, each consumer extension instantiates its completers and adds them to the `ChatCompletersRegistry` singleton, provided by Jupyter Chat.
2. Jupyter Chat should call `ChatCompletersRegistry.init_completers()` in the background.
3. Perform the following on input changes:
   1. Take the substring ending at the user's cursor, and store this as a local variable, e.g. `partial_input`.
   2. For each completer, test `partial_input` against the completer's regex. If a match `m` is found, call `getCompletions(m)`. Store a reference to this Promise.
   3. Add a callback to the Promise to append the new completions to the existing list of completions.
4. If a completion is accepted, replace the substring of the input matched by the completer's regex with the completion.
5. If a user ignores completions and continues typing, cancel all Promises and return to 3).
The frontend implementation may debounce how frequently it tests the input against each regex, as testing an input against multiple regexes may be expensive. However, I think it is important we test the performance as-is first before making an optimization, since debouncing any callback adds a fixed amount of latency (the debounce delay).
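The steps above can be sketched roughly as follows. This is an illustration under assumptions rather than the proposed implementation: `CompletionSession`, `CompletionEntry`, and `acceptCompletion` are hypothetical names, `icon` is omitted from `ChatCompletion`, and because JavaScript Promises cannot actually be cancelled, the sketch swaps "cancel all Promises" for a generation counter that discards results from stale requests.

```typescript
type ChatCompletion = { value: string; label?: string; description?: string };

interface IChatCompleter {
  id: string;
  regex: RegExp;
  initialize(): Promise<void>;
  getCompletions(match: string): Promise<ChatCompletion[]>;
}

// Hypothetical record tying a completion to the input span that triggered it.
type CompletionEntry = {
  completerId: string;
  start: number; // index in the input where the matched substring begins
  end: number; // index just past the matched substring
  completion: ChatCompletion;
};

class CompletionSession {
  completions: CompletionEntry[] = [];
  private _completers: IChatCompleter[];
  private _generation = 0;

  constructor(completers: IChatCompleter[]) {
    this._completers = completers;
  }

  // Step 3: called on every input change.
  onInputChange(input: string, cursor: number): Promise<void> {
    // Bumping the generation makes callbacks from older requests no-ops (step 5).
    const generation = ++this._generation;
    this.completions = [];
    const partialInput = input.slice(0, cursor);
    const pending: Promise<void>[] = [];

    for (const completer of this._completers) {
      const m = partialInput.match(completer.regex);
      if (m === null) {
        continue;
      }
      // `index` is missing for global regexes; fall back to a `$`-anchored guess.
      const start = m.index ?? (partialInput.length - m[0].length);
      const end = start + m[0].length;
      pending.push(
        completer.getCompletions(m[0]).then(found => {
          if (generation !== this._generation) {
            return; // the user kept typing; drop stale results
          }
          for (const completion of found) {
            this.completions.push({ completerId: completer.id, start, end, completion });
          }
        })
      );
    }
    return Promise.all(pending).then(() => undefined);
  }
}

// Step 4: replace the matched substring with the accepted completion.
function acceptCompletion(input: string, entry: CompletionEntry): string {
  return input.slice(0, entry.start) + entry.completion.value + input.slice(entry.end);
}
```

Each completer's results are appended as soon as its own Promise resolves, so a slow completer does not delay the others. Debouncing, if it turns out to be needed, could wrap the call to `onInputChange()` without changing this logic.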
Conclusion
The `IChatCompleter` interface defined in Step 1 and the `ChatCompletersRegistry` defined in Step 2 give consumer extensions a way of defining and providing chat completers. This interface and registry together define the Chat Completions API. Step 3 of this document provides guidance on how to use the new chat completions API to provide better completions in Jupyter Chat.

Benefits & applications
- Because completers live in the frontend, they may not need to make a network call when triggered by the input. Some completers may allow completions to be statically defined (e.g. emoji names) and others may only need to make a network call at init (e.g. slash commands).
- Because completers live in the frontend, they can choose to use any API to communicate with the server. If a Python-only API is required, a custom server handler can be defined to provide the same capabilities to the completer.
- Completers are uniquely identified by their `id`, so two completers can use the same regex but yield two different sets of completions.
  - Application: Another extension could use the same `/` command regex to provide completions for its own custom `/` commands.
  - Application: `@` can trigger multiple completers; one may provide usernames of other users in the chat, and another may provide the `@` commands available in Jupyter AI (e.g. `@file`).
- A completion doesn't need to share a prefix with the substring that triggered completions.
  - Application: Define a completer that matches `$` and returns the completion `\\$`. Pressing "Enter" to accept the completion allows a user to easily type a literal dollar sign instead of opening math mode. If typing math was the user's intention, typing any character other than "Enter" hides the `\\$` completion and allows math to be written.
- Regex allows the triggering of completions to be strictly controlled. This means that "complete-able" suffixes don't need some unique identifier like `/` or `@`.
  - Application: Define a completer that matches `./` following whitespace and returns filenames for the current directory. For example, this could trigger the completions `./README.md`, `./pyproject.toml`, etc.
  - Application: Define a completer that matches `:` following whitespace and returns a list of emojis.

Shortcomings & risks
This design doesn't provide a clear way for a completer to open a custom UI instead of adding another completion entry.
Risk: If we don't address this shortcoming and this design makes it into Jupyter Chat v1, then we would likely need a major release to implement this in the future.
From @mlucool in Design proposal: Chat Completions API (rev. 0) #143: "I think a file completer would want a different experience than an variable one. As an example, for the @var completer, we envisioned users could click on the variable and interact with it. For example, maybe it lets the user have a preview of what will be sent or maybe it lets the user specify some parameters (e.g. you want the verbose mode of a specific variable). While these are only half-formed ideas, it's good to not restrict."
I agree that this could bring a lot of user benefit. At the same time, I have to be mindful of the engineering effort to implement this, as some stakeholders would like Jupyter AI v3.0.0 released by March. @mlucool Let's briefly discuss whether this is something we should do once you're back on Monday.
If a major revision of this design is needed, I will close this issue, revise the design, and open a new issue with a bumped revision number.