Simplify createChatCompletion #673
base: main
Conversation
PR Summary
This PR refactors the LLM client interface to provide more intuitive and specialized methods for different use cases. Here's a summary of the key changes:
- Split `createChatCompletion` into three focused methods: `generateText`, `generateObject`, and `streamText`, across all LLM clients
- Added new example `llm_usage_wordle.ts` demonstrating usage of these simplified methods
- Added streaming support through a `createChatCompletionStream` implementation in all clients
- Standardized response formats to include both `data` and `response` fields for better data access
- Implemented comprehensive error handling, logging, and caching across new methods
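As a rough sketch, the split interface could look like the following. Only the three method names come from the PR; the option and result shapes, and the stub implementation, are assumptions for illustration:

```typescript
// Sketch only: result shapes and option fields are assumptions, not the PR's real types.
type TextResult = { data: string; response: Record<string, unknown> };

interface LLMClientSketch {
  // One-shot text generation
  generateText(opts: { prompt: string }): Promise<TextResult>;
  // Structured output parsed by a caller-supplied schema
  generateObject<T>(opts: {
    prompt: string;
    parse: (raw: unknown) => T; // stand-in for a zod schema
  }): Promise<{ data: T; response: Record<string, unknown> }>;
  // Incremental token stream
  streamText(opts: { prompt: string }): AsyncIterable<string>;
}

// A stub implementation showing how each method would be called.
const stub: LLMClientSketch = {
  async generateText({ prompt }) {
    return { data: `echo: ${prompt}`, response: { model: "stub" } };
  },
  async generateObject({ prompt, parse }) {
    return { data: parse({ text: prompt }), response: { model: "stub" } };
  },
  async *streamText({ prompt }) {
    for (const word of prompt.split(" ")) yield word;
  },
};
```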
Key points to review:
- `createChatCompletionStream` has unimplemented TODOs for response transformation and validation in several clients
- Inconsistent error handling patterns between external clients (langchain, aisdk) and core clients
- Missing tests for the Cerebras implementation due to API key unavailability
- Debug console.log statements need cleanup in the langchain client
- Response type safety needs improvement in some implementations
11 file(s) reviewed, 25 comment(s)
```typescript
schema: z.object({
  guess: z
    .string()
    .describe("The 5-letter word that would be the best guess"),
}),
```
logic: Schema should validate that `guess` is exactly 5 letters using `.length(5)`, since this is for Wordle
Suggested change:

```diff
 schema: z.object({
   guess: z
     .string()
+    .length(5)
     .describe("The 5-letter word that would be the best guess"),
 }),
```
```typescript
  logger,
  retries = 3,
}: CreateChatCompletionOptions): Promise<T> {
  console.log(logger, retries);
```
style: Remove debug console.log statement
Suggested change:

```diff
-  console.log(logger, retries);
```
```typescript
const response = await generateObject({
  model: this.model,
  prompt: prompt,
  schema: schema,
  ...options,
});
```
style: options are spread after required parameters, which could accidentally override them
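The concern is standard spread semantics: later entries win, so a stray key in `options` silently replaces a required parameter. A minimal, self-contained illustration (the field names here are made up, not the client's real options):

```typescript
// Later spread entries override earlier ones in an object literal.
const required = { model: "gpt-4o-mini", prompt: "hello" }; // hypothetical fields
const options = { model: "overridden" };

const risky = { ...required, ...options }; // options can clobber model
const safer = { ...options, ...required }; // required fields always win

// risky.model === "overridden"; safer.model === "gpt-4o-mini"
```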
```typescript
try {
  // Log the generation attempt
  logger({
    category: "anthropic",
```
logic: Category 'anthropic' is incorrect for LangchainClient - should be 'langchain'
Suggested change:

```diff
-    category: "anthropic",
+    category: "langchain",
```
```typescript
    category: "anthropic",
    message: "Text generation successful",
    level: 2,
```
logic: Log message indicates 'Text generation' but this is object generation
```typescript
const objResponse = {
  object: generatedObject,
  finishReason: response.response.choices[0].finish_reason,
  usage: response.response.usage,
  ...response,
} as T;
```
logic: Spreading response after setting specific fields may override the object/finishReason/usage properties
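One way to avoid this, sketched with mocked shapes (not the client's real types): spread `response` first, then set the explicit fields last so they can never be overridden.

```typescript
// Mocked shapes for illustration only.
const generatedObject = { guess: "crane" };
const response = {
  object: "raw-provider-string",
  response: { choices: [{ finish_reason: "stop" }], usage: { total_tokens: 12 } },
};

// Spread first; explicit fields last, so they win over any same-named keys in response.
const objResponse = {
  ...response,
  object: generatedObject,
  finishReason: response.response.choices[0].finish_reason,
  usage: response.response.usage,
};
// objResponse.object is generatedObject, not the raw "object" string.
```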
```typescript
export interface LLMObjectResponse extends BaseResponse {
  data: Record<string, unknown>;
  usage: UsageMetrics;
  response: LLMResponse;
}
```
logic: LLMObjectResponse includes both 'data' and 'response' fields which seem redundant since 'response' already contains the full LLM response. This could lead to data inconsistency.
```typescript
// Object Response type that can include LLM properties
export interface ObjectResponse extends BaseResponse {
  object: string;
  choices?: LLMChoice[];
  usage?: UsageMetrics;
}
```
logic: ObjectResponse makes usage and choices optional but doesn't include the 'data' field from LLMObjectResponse. This inconsistency between response types could cause confusion.
```typescript
// TODO: O1 models - parse the tool call response manually and add it to the response
// TODO: Response model validation
// TODO: Caching
```
logic: Missing implementation for tool calls, response validation, and caching in streaming mode could cause issues with O1 models
```typescript
!response.data ||
response.data.length === 0 ||
response.data == undefined
```
logic: response.data.length check is unnecessary since response.data is an object, not an array
Suggested change:

```diff
-!response.data ||
-response.data.length === 0 ||
-response.data == undefined
+!response.data ||
+response.data === null ||
+response.data === undefined
```
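Worth noting when applying the suggestion: `!response.data` is already true for `null` and `undefined` (and any other falsy value), so the extra equality checks are redundant guards rather than new behavior:

```typescript
// Truthiness check: null, undefined, "", 0, NaN, and false are all falsy.
const isMissing = (data: unknown): boolean => !data;

// isMissing(null) and isMissing(undefined) are both true;
// isMissing({}) is false, since an empty object is truthy.
```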
why

Simplified `createChatCompletion` into `generateText`, `generateObject`, and `streamText`, similar to the Vercel AI SDK.

what changed
Added `generateText`, `generateObject`, and `streamText` functions to `LLMClient` for all supported providers (`OpenAI`, `Anthropic`, `Groq`, `Google`, `Cerebras`) and external clients (`aisdk`, `customOpenAI`, `langchain`).
test plan
Tested `generateText`, `generateObject`, and `streamText` with:
- `OpenAI`
- `Anthropic`
- `Groq`
- `Google`
- `aisdk`
- `customOpenAI`

Unable to test `Cerebras` due to unavailability of an API key.