test stt #5634
base: main
Conversation
@DDMeaqua is attempting to deploy a commit to the NextChat Team on Vercel. A member of the Team first needs to authorize it.
Walkthrough

The changes in this pull request enhance the chat application's functionality by integrating voice recognition features. Modifications include the addition of state variables for managing speech recognition, the implementation of methods for starting and stopping voice input, and the incorporation of a new button for toggling voice input in the chat interface. Additionally, the configuration for text-to-speech (TTS) and speech-to-text (STT) functionalities has been updated, and a new API interface for transcription has been introduced.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
app/components/chat.tsx (2)
**1565-1565**: Use consistent language in code comments

The comment on line 1565 is in Chinese (`// 快捷键`, "keyboard shortcuts"), whereas the rest of the code comments are in English. For consistency and better collaboration, it's recommended to use a single language for all code comments.
Line range hint **1566-1597**: Prevent multiple event listener registrations

The `useEffect` hook adds an event listener to `window` and removes it on cleanup. However, since the dependencies array includes `messages`, `chatStore`, and `navigate`, the effect will re-run and re-register the event listener whenever any of these dependencies change. This can lead to multiple event listeners being attached, causing unexpected behavior or memory leaks.

To fix this, consider adjusting the dependencies array to an empty array `[]` if the `handleKeyDown` function does not rely on changing variables. Alternatively, you can memoize the `handleKeyDown` function using `useCallback` with the necessary dependencies. Apply this diff to adjust the dependencies:

```diff
- }, [messages, chatStore, navigate]);
+ }, []);
```
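For reference, a minimal sketch of the `useCallback` alternative described above; the handler body and names here are illustrative, not the PR's actual code:

```tsx
import { useCallback, useEffect } from "react";

function useChatShortcuts(onNewSession: () => void) {
  // Memoized so the effect below can list it as its only dependency;
  // the listener is re-registered only when the callback identity changes.
  const handleKeyDown = useCallback(
    (event: KeyboardEvent) => {
      if ((event.metaKey || event.ctrlKey) && event.key.toLowerCase() === "o") {
        event.preventDefault();
        onNewSession();
      }
    },
    [onNewSession],
  );

  useEffect(() => {
    window.addEventListener("keydown", handleKeyDown);
    return () => window.removeEventListener("keydown", handleKeyDown);
  }, [handleKeyDown]);
}
```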
Your build has completed!
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (6)
app/client/api.ts (1)
**66-74**: LGTM! Consider adding a range for the `temperature` property.

The `TranscriptionOptions` interface is well-defined and covers essential options for transcription. The use of specific types for `model` and `response_format` enhances type safety.

Consider adding a more specific range for the `temperature` property, as it typically has a limited range (e.g., 0 to 1). You could use a type like this:

```ts
temperature?: number & { __brand: 'Temperature' };
```

And then define a type guard function:

```ts
function isValidTemperature(value: number): value is number & { __brand: 'Temperature' } {
  return value >= 0 && value <= 1;
}
```

This approach would provide better type safety and documentation for the expected range of the `temperature` property.
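To illustrate how such a guard might be used at a call site (the surrounding names are hypothetical, not part of the reviewed code):

```ts
declare function isValidTemperature(
  value: number,
): value is number & { __brand: "Temperature" };

function buildOptions(raw: number) {
  if (!isValidTemperature(raw)) {
    throw new RangeError(`temperature must be in [0, 1], got ${raw}`);
  }
  // `raw` is now narrowed to the branded type and can be assigned
  // to a `temperature` field declared with that brand.
  return { temperature: raw };
}
```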
app/constant.ts (2)
Line range hint **268-273**: LGTM: Comprehensive TTS defaults added

The new constants for Text-to-Speech (TTS) provide a robust set of default settings, covering engines, models, and voices. This addition enhances the flexibility and configurability of the TTS feature.

Consider adding a brief comment explaining the purpose of these constants and their relationship to the TTS feature for improved code documentation.

**274-276**: LGTM: STT defaults added with good flexibility

The new constants for Speech-to-Text (STT) provide a good set of default settings, including multiple engine options. This addition enhances the flexibility of the STT feature.

Consider adding a brief comment explaining the purpose of these constants and their relationship to the STT feature for improved code documentation; a sketch follows below.
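As an illustration of the suggested documentation, commented constants might look like this. The values shown are plausible placeholders (only the Firefox default, "OpenAI Whisper", is confirmed elsewhere in this review):

```ts
// Speech-to-text (STT) defaults used by the voice-input feature.
// "WebAPI" uses the browser's built-in speech recognition; "OpenAI Whisper"
// routes audio through the server-side transcription API instead.
export const DEFAULT_STT_ENGINE = "WebAPI";
export const DEFAULT_STT_ENGINES = ["WebAPI", "OpenAI Whisper"];
// Firefox does not support the Web Speech recognition API, so fall back
// to the server-side engine there.
export const FIREFOX_DEFAULT_STT_ENGINE = "OpenAI Whisper";
```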
app/components/settings.tsx (1)
**86-86**: LGTM! Consider grouping related configurations.

The addition of the STTConfigList component and its integration into the Settings component looks good. It follows the existing pattern used for TTSConfigList, which maintains consistency in the codebase.

To improve code organization, consider grouping related configurations together. You could move the STTConfigList next to the TTSConfigList, as they are both related to voice interactions. This would make it easier for developers to find and maintain related features.
```diff
 <List>
   <TTSConfigList
     ttsConfig={config.ttsConfig}
     updateConfig={(updater) => {
       const ttsConfig = { ...config.ttsConfig };
       updater(ttsConfig);
       config.update((config) => (config.ttsConfig = ttsConfig));
     }}
   />
+  <STTConfigList
+    sttConfig={config.sttConfig}
+    updateConfig={(updater) => {
+      const sttConfig = { ...config.sttConfig };
+      updater(sttConfig);
+      config.update((config) => (config.sttConfig = sttConfig));
+    }}
+  />
 </List>
-<List>
-  <STTConfigList
-    sttConfig={config.sttConfig}
-    updateConfig={(updater) => {
-      const sttConfig = { ...config.sttConfig };
-      updater(sttConfig);
-      config.update((config) => (config.sttConfig = sttConfig));
-    }}
-  />
-</List>
```

Also applies to: 1707-1716
app/components/chat.tsx (1)
**553-589**: Speech recognition functionality implemented.

The addition of speech recognition functionality is well-implemented, with proper state management and error handling. The code considers different STT engines and handles them appropriately.

However, there's a minor improvement to be made: consider removing the `console.log` statement on line 585 before deploying to production:

```diff
- console.log(finalTranscript);
```
app/client/platforms/openai.ts (1)
**185-193**: Enhance readability of conditional `formData.append` statements

The conditional `formData.append` statements could be formatted for better readability and consistency. Consider using braces and proper indentation:

```ts
if (options.language) {
  formData.append("language", options.language);
}
if (options.prompt) {
  formData.append("prompt", options.prompt);
}
if (options.response_format) {
  formData.append("response_format", options.response_format);
}
if (options.temperature) {
  formData.append("temperature", options.temperature.toString());
}
```

Alternatively, streamline the appending of optional parameters:

```diff
+const optionalParams = ['language', 'prompt', 'response_format', 'temperature'];
+optionalParams.forEach((param) => {
+  if (options[param]) {
+    formData.append(param, options[param].toString());
+  }
+});
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
- app/client/api.ts (2 hunks)
- app/client/platforms/openai.ts (1 hunks)
- app/components/chat.tsx (7 hunks)
- app/components/settings.tsx (2 hunks)
- app/constant.ts (2 hunks)
- app/store/config.ts (4 hunks)
🧰 Additional context used
🔇 Additional comments (16)
app/client/api.ts (2)
**111-111**: LGTM! The `transcription` method is a good addition.

The new `transcription` method in the `LLMApi` abstract class is well-defined and consistent with the `TranscriptionOptions` interface. The use of `Promise<string>` as the return type is appropriate for an asynchronous transcription operation.

This addition enhances the API's capabilities and provides a clear contract for implementing transcription functionality across different LLM providers.

Line range hint **66-111**: Summary: Transcription feature addition is well-implemented

The changes to `app/client/api.ts` successfully introduce transcription capabilities to the API. The new `TranscriptionOptions` interface and the `transcription` method in the `LLMApi` abstract class are well-structured and consistent with the existing codebase. These additions follow good TypeScript practices and enhance the API's functionality.

The implementation provides a solid foundation for integrating transcription features across different LLM providers. Great job on this feature addition!
app/constant.ts (3)
**153-153**: LGTM: Addition of TranscriptionPath

The new `TranscriptionPath` constant in the `OpenaiPath` object is correctly defined and aligns with OpenAI's API structure for audio transcription endpoints. This addition supports the integration of speech-to-text functionality.

Line range hint **153-276**: Summary: Good additions for speech-related features, minor improvements suggested

The changes to `app/constant.ts` successfully introduce new constants for Text-to-Speech (TTS) and Speech-to-Text (STT) functionality. These additions provide good flexibility and configuration options for speech-related features. Consider adding brief comments to explain the purpose of these new constants for improved code documentation. Additionally, please provide clarification on the need for a Firefox-specific STT engine default.
**276-276**: Please clarify the need for a Firefox-specific STT engine

The addition of `FIREFOX_DEFAULT_STT_ENGINE` suggests a different default Speech-to-Text engine for Firefox browsers. Could you provide more context on why Firefox requires a different default engine? This information would be valuable for understanding the rationale behind this browser-specific configuration.

To verify the necessity of this Firefox-specific setting, we can check for any Firefox-related conditionals in the codebase:

✅ Verification successful

**FIREFOX_DEFAULT_STT_ENGINE confirmed**

The addition of `FIREFOX_DEFAULT_STT_ENGINE` is justified by the Firefox-specific conditional found in the codebase:

- `app/utils/speech.ts`: `if (isFirefox()) return;`
🏁 Scripts executed
The following scripts were executed for the analysis:
```bash
#!/bin/bash
# Search for Firefox-related conditionals
rg --type typescript --type javascript 'if.*firefox|firefox.*if' -i
```

Length of output: 104

```bash
#!/bin/bash
# Search for Firefox-related conditionals in .ts and .js files
rg --glob '*.ts' --glob '*.js' 'if.*firefox|firefox.*if' -i
```

Length of output: 110
app/components/chat.tsx (5)
**13-13**: New icon import added for voice functionality.

The addition of the `VoiceWhiteIcon` import is consistent with the existing import style and suggests the introduction of voice-related features to the chat component.

**76-76**: Firefox-specific utility function imported.

The addition of the `isFirefox` import suggests the implementation of browser-specific functionality, which is a good practice for ensuring cross-browser compatibility.

**102-104**: Speech-to-text related constants imported.

The addition of STT-related constant imports (`DEFAULT_STT_ENGINE`, `DEFAULT_TTS_ENGINE`, `FIREFOX_DEFAULT_STT_ENGINE`) indicates the integration of speech-to-text functionality, enhancing the chat component's capabilities.

**124-124**: Speech transcription APIs imported.

The addition of the `OpenAITranscriptionApi` and `WebTranscriptionApi` imports provides multiple options for speech-to-text functionality, allowing for flexibility in the implementation.

**826-834**: Voice input action added to chat interface.

A new ChatAction component for voice input has been successfully integrated into the chat interface. The implementation is conditional based on the STT configuration, ensuring it's only available when enabled. The component correctly toggles between start and stop functionality, maintaining consistency with other ChatAction components.
app/store/config.ts (6)
**8-9**: Importing `DEFAULT_STT_ENGINE` and `DEFAULT_STT_ENGINES` is appropriate.

The addition of `DEFAULT_STT_ENGINE` and `DEFAULT_STT_ENGINES` to the imports ensures that the necessary constants for STT functionality are available.

**25-25**: Definition of `STTEngineType` is correct.

The type `STTEngineType` correctly derives from the `DEFAULT_STT_ENGINES` array, representing valid STT engine options.

**87-87**: Enabling TTS by default: Confirm if this is intentional.

Changing `ttsConfig.enable` to `true` activates text-to-speech by default for all users. Please verify that this aligns with user expectations and does not inadvertently impact those who may not wish to use TTS.

**94-97**: Adding `sttConfig` with `enable: true`: Review default settings.

The new `sttConfig` enables speech-to-text by default. Consider whether users should opt in to this feature, as it may require microphone access and could have privacy implications. Ensuring explicit user consent might enhance the user experience.

**104-104**: Definition of `STTConfig` type is appropriate.

The `STTConfig` type accurately references `ChatConfig["sttConfig"]`, ensuring type consistency.

**134-138**: `STTConfigValidator` is correctly implemented.

The validator for `sttConfig` properly enforces that the `engine` property conforms to `STTEngineType`, maintaining data integrity.
app/client/platforms/openai.ts (Outdated)

```ts
options.onController?.(controller);

try {
  const path = this.path(OpenaiPath.TranscriptionPath, options.model);
```
**Incorrect number of arguments in `this.path` method call**

In line 200, the `this.path` method is called with two arguments (`OpenaiPath.TranscriptionPath, options.model`), but the `path` method is defined to accept only one argument. This will result in a TypeError.

To fix this, update the `path` method signature to accept the optional `model` parameter:
```diff
-export class ChatGPTApi implements LLMApi {
-  // Existing code...
-  path(path: string): string {
+export class ChatGPTApi implements LLMApi {
+  // Existing code...
+  path(path: string, model?: string): string {
     // Method implementation...
   }
```
Ensure that you handle the `model` parameter within the `path` method as needed; an illustrative sketch follows.
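For illustration only, one way a `path` method might use an optional `model` parameter. The Azure-style branching and field names here are hypothetical assumptions, not the PR's actual routing logic:

```ts
// Hypothetical sketch: how an optional `model` parameter could flow into path().
class ExampleApi {
  constructor(
    private baseUrl: string,
    private useAzure: boolean, // assumption: Azure-style deployments embed the model in the URL
  ) {}

  path(path: string, model?: string): string {
    // When the provider encodes the model/deployment in the URL, substitute it here.
    if (this.useAzure && model) {
      path = path.replace("{model}", model);
    }
    return [this.baseUrl, path].join("/");
  }
}
```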
Committable suggestion was skipped due to low confidence.
```ts
const res = await fetch(path, payload);
clearTimeout(requestTimeoutId);
const json = await res.json();
return json.text;
```
**Add response status check before parsing JSON**

In lines 214-217, the code attempts to parse the JSON response without verifying that the request was successful. If the request fails, `res.json()` might throw an error or the response might not contain the expected `text` property.

Consider checking `res.ok` before parsing the response and handle errors appropriately:
```diff
 const res = await fetch(path, payload);
 clearTimeout(requestTimeoutId);
+if (!res.ok) {
+  const errorText = await res.text();
+  console.error(`[Response] Transcription request failed: ${errorText}`);
+  throw new Error(`Transcription request failed with status ${res.status}`);
+}
 const json = await res.json();
 return json.text;
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
const res = await fetch(path, payload);
clearTimeout(requestTimeoutId);
if (!res.ok) {
  const errorText = await res.text();
  console.error(`[Response] Transcription request failed: ${errorText}`);
  throw new Error(`Transcription request failed with status ${res.status}`);
}
const json = await res.json();
return json.text;
```
Actionable comments posted: 7
🧹 Outside diff range and nitpick comments (15)
app/client/platforms/moonshot.ts (2)
**67-69**: Implement transcription method

The addition of the `transcription` method is consistent with the `LLMApi` interface. However, to improve the placeholder implementation:

- Add a TODO comment to remind about the implementation.
- Consider adding a console warning to alert developers during development.

Here's a suggested improvement:

```diff
 transcription(options: TranscriptionOptions): Promise<string> {
-  throw new Error("Method not implemented.");
+  // TODO: Implement transcription method
+  console.warn("transcription method not implemented yet");
+  return Promise.reject(new Error("Method not implemented."));
 }
```
Line range hint **1-70**: Summary: Transcription feature addition

The changes in this file lay the groundwork for adding transcription capabilities to the `MoonshotApi` class. This aligns with the PR objectives of introducing a new feature and the AI summary mentioning voice recognition features. However, the current implementation is a placeholder. To complete this feature:

1. Implement the `transcription` method with the actual logic for transcribing audio.
2. Update the `speech` method if it's related to this feature.
3. Add unit tests to ensure the new functionality works as expected.
4. Update any relevant documentation to reflect these new capabilities.

Consider how this transcription feature will integrate with the rest of the application. Will it require updates to the user interface or other parts of the codebase?
app/client/platforms/bytedance.ts (2)
**87-89**: Method signature looks good, but needs implementation and documentation.

The `transcription` method has been correctly added to the `DoubaoApi` class with the appropriate signature. However, it's currently not implemented. Consider the following suggestions:

1. Add a TODO comment explaining the intended functionality and timeline for implementation.
2. Provide documentation for the method, including parameter description and expected return value.
3. If possible, implement a basic version of the method or provide a more informative error message.

Would you like assistance in drafting the documentation or a basic implementation?
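As an illustration of the suggested documentation, a TSDoc sketch for the stub. The described behavior is what the interface implies, not what the stub currently does:

```ts
/**
 * Transcribes the given audio to text.
 *
 * @param options - Transcription settings; `options.file` is the audio blob
 *   to transcribe, and optional fields such as `language` or `prompt` can
 *   hint the engine.
 * @returns A promise resolving to the transcribed text.
 * @throws If the provider does not support transcription or the request fails.
 */
transcription(options: TranscriptionOptions): Promise<string> {
  throw new Error("Method not implemented.");
}
```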
**Issues Found: Unimplemented `transcription` method is being used across the codebase.**

- `app/utils/speech.ts`: `const transcription = await llm.transcription({ file: audioBlob });`
- `app/client/api.ts`: `abstract transcription(options: TranscriptionOptions): Promise<string>;`
- `app/client/platforms/tencent.ts`: `transcription(options: TranscriptionOptions): Promise<string> {`
- `app/client/platforms/iflytek.ts`: `transcription(options: TranscriptionOptions): Promise<string> {`
- `app/client/platforms/google.ts`: `transcription(options: TranscriptionOptions): Promise<string> {`
- `app/client/platforms/baidu.ts`: `transcription(options: TranscriptionOptions): Promise<string> {`
- `app/client/platforms/anthropic.ts`: `transcription(options: TranscriptionOptions): Promise<string> {`
- `app/client/platforms/alibaba.ts`: `transcription(options: TranscriptionOptions): Promise<string> {`
- `app/client/platforms/bytedance.ts`: `transcription(options: TranscriptionOptions): Promise<string> {`
- `app/client/platforms/moonshot.ts`: `transcription(options: TranscriptionOptions): Promise<string> {`
- `app/client/platforms/openai.ts`: `async transcription(options: TranscriptionOptions): Promise<string> {`

It is crucial to implement the `transcription` method to ensure that all dependent functionalities operate correctly. Alternatively, if the implementation is pending, consider adding proper error handling or temporarily disabling the usages to prevent runtime issues.

🔗 Analysis chain
Line range hint **1-89**: Overall impact: Transcription capability added but not yet implemented.

The changes introduce a new `transcription` method to the `DoubaoApi` class, expanding its capabilities. However, the method is not yet implemented. Suggestions for next steps:

1. Implement the `transcription` method.
2. Update any relevant documentation or API references to include this new capability.
3. Consider adding unit tests for the new method once implemented.
4. Review other parts of the codebase that might need to be updated to utilize this new functionality.

To ensure no existing code is expecting this method to be functional, we can run the following check. This will help identify any premature usage of the unimplemented method.
🏁 Scripts executed
The following scripts were executed for the analysis:
```bash
#!/bin/bash
# Search for any usage of the transcription method
rg 'transcription\(' --type typescript
```

Length of output: 75

```bash
#!/bin/bash
# Search for any usage of the transcription method
rg 'transcription\(' --type ts
```

Length of output: 1109
app/client/platforms/iflytek.ts (1)
Line range hint **27-70**: Consider a plan for completing the SparkApi implementation.

The `SparkApi` class is partially implemented, with some methods like `chat` being fully functional while others (`speech`, `transcription`) are not yet implemented. To ensure a complete and consistent implementation:

1. Create a roadmap for implementing all remaining methods.
2. Prioritize the implementation of these methods based on project requirements.
3. Consider adding TODO comments with issue/ticket numbers for tracking purposes.
4. Ensure that error messages for unimplemented methods are more informative, possibly indicating that the feature is not supported by this API if that's the case.
5. Review the `LLMApi` interface to confirm that all required methods are present in this implementation.

Would you like assistance in creating this implementation roadmap or updating the error messages for unimplemented methods?
app/client/platforms/tencent.ts (1)
**97-99**: LGTM: Addition of transcription method

The `transcription` method is correctly added with the appropriate signature. However, it's currently not implemented. Would you like assistance in implementing this method? We could discuss the requirements and draft an initial implementation.
app/client/platforms/baidu.ts (1)
Line range hint **1-287**: Summary of changes and next steps

The changes in this file are focused on adding transcription capability to the `ErnieApi` class. The additions are consistent with the PR objectives and don't introduce any issues in the existing code. However, the `transcription` method needs to be implemented to complete this feature. Next steps:

1. Implement the `transcription` method in the `ErnieApi` class.
2. Add appropriate error handling and logging to the new method.
3. Consider adding unit tests for the new functionality.
4. Update any relevant documentation or comments to reflect these changes.
app/client/platforms/google.ts (2)
**72-74**: Consider adding a TODO comment and improving the error message.

While it's common to add placeholder methods when implementing new interface requirements, it's beneficial to:

1. Add a TODO comment to track the need for implementation.
2. Provide a more informative error message to aid in debugging if the method is accidentally called before implementation.

Consider updating the method as follows:

```diff
 transcription(options: TranscriptionOptions): Promise<string> {
-  throw new Error("Method not implemented.");
+  // TODO: Implement transcription method
+  throw new Error("Transcription method not yet implemented in GeminiProApi");
 }
```

This change will help track the pending implementation and provide clearer error information if the method is called prematurely.
Line range hint **1-324**: Summary and Next Steps

The changes to `app/client/platforms/google.ts` lay the groundwork for adding transcription functionality to the `GeminiProApi` class. While the implementation is not complete, the structure is in place with the new import and method stub. To move this feature forward:

1. Implement the `transcription` method with the actual logic for transcription.
2. Add unit tests to cover the new functionality.
3. Update the PR description with details about the new transcription feature and its intended use.
4. Consider adding documentation for the new method, especially if it introduces new behavior or requirements for users of the `GeminiProApi` class.

🧰 Tools
🪛 Biome
[error] 77-77: This aliasing of this is unnecessary. Arrow functions inherits `this` from their enclosing scope. Safe fix: Use this instead of an alias. (lint/complexity/noUselessThisAlias)
app/client/platforms/anthropic.ts (1)
Line range hint **1-424**: Summary: Transcription feature initiated but incomplete.

The changes in this PR introduce the groundwork for a transcription feature in the `ClaudeApi` class. While the necessary import and method signature have been added, the implementation is still pending. To complete this feature:

1. Implement the `transcription` method logic.
2. Add error handling and appropriate logging.
3. Update the API documentation to reflect the new capability.
4. Add unit tests for the new functionality.
5. Consider adding integration tests if this feature interacts with external services.

These steps will ensure the new feature is fully functional and maintainable. Consider whether this transcription feature should be a separate service or module, depending on its complexity and potential for reuse across different parts of the application.
app/locales/cn.ts (1)
**541-550**: LGTM! Consider adding a description for the STT feature.

The new STT (Speech-to-Text) section is well-structured and consistent with the existing localization patterns. The translations are appropriate for the Chinese language. For improved consistency with other sections, consider adding a brief description of the STT feature, similar to how the TTS section has a description at line 530. You could add a description like this:

```diff
 STT: {
+  Description: {
+    Title: "语音转文本",
+    SubTitle: "将语音输入转换为文本",
+  },
   Enable: {
     Title: "启用语音转文本",
     SubTitle: "启用语音转文本",
   },
   Engine: {
     Title: "转换引擎",
     SubTitle: "音频转换引擎",
   },
 },
```

app/components/chat.tsx (4)
**13-13**: Speech recognition feature looks good, consider error handling.

The addition of speech recognition functionality is a great enhancement to the chat interface. The implementation handles different speech recognition engines and provides appropriate fallbacks. A few suggestions for improvement:

1. Consider adding error handling for cases where speech recognition fails to initialize or encounters runtime errors.
2. It might be helpful to add user feedback (e.g., a visual indicator) when speech recognition is active.

Consider adding error handling:

```ts
const startListening = async () => {
  if (speechApi) {
    try {
      await speechApi.start();
      setIsListening(true);
    } catch (error) {
      console.error("Failed to start speech recognition:", error);
      showToast("Failed to start speech recognition. Please try again.");
    }
  }
};
```

Also applies to: 554-590
**827-835**: Speech input toggle looks good, consider adding aria-label.

The integration of the speech recognition feature into the ChatActions component is well-implemented. The conditional rendering based on the config is a good practice. Suggestion for improvement: add an `aria-label` to the ChatAction button for better accessibility:

```tsx
<ChatAction
  onClick={async () =>
    isListening ? await stopListening() : await startListening()
  }
  text={isListening ? Locale.Chat.StopSpeak : Locale.Chat.StartSpeak}
  icon={<VoiceWhiteIcon />}
  aria-label={isListening ? "Stop speech recognition" : "Start speech recognition"}
/>
```
Line range hint **1561-1637**: Keyboard shortcuts are a great addition, consider extracting to a separate function.

The implementation of keyboard shortcuts significantly improves the application's usability. The shortcuts are well-thought-out and consider different operating systems. Suggestion for improvement: consider extracting the keyboard shortcut logic into a separate function for better code organization and reusability:

```tsx
const handleKeyboardShortcuts = useCallback((event: KeyboardEvent) => {
  const isMac = navigator.platform.toUpperCase().indexOf("MAC") >= 0;
  const modifierKey = isMac ? event.metaKey : event.ctrlKey;

  if (modifierKey && event.shiftKey && event.key.toLowerCase() === "o") {
    event.preventDefault();
    chatStore.newSession();
    navigate(Path.Chat);
  } else if (event.shiftKey && event.key.toLowerCase() === "escape") {
    event.preventDefault();
    inputRef.current?.focus();
  }
  // ... (other shortcuts)
}, [messages, chatStore, navigate]);

useEffect(() => {
  window.addEventListener("keydown", handleKeyboardShortcuts);
  return () => {
    window.removeEventListener("keydown", handleKeyboardShortcuts);
  };
}, [handleKeyboardShortcuts]);
```
Line range hint **1638-1930**: Image paste handling is a great addition, consider adding visual feedback.

The implementation of image paste handling for vision models is a valuable feature that enhances the user experience. The code correctly checks if the current model supports vision before processing the pasted image. Suggestion for improvement: consider adding visual feedback when an image is successfully pasted, such as a brief toast notification or a subtle animation:

```tsx
const handlePaste = useCallback(
  async (event: React.ClipboardEvent<HTMLTextAreaElement>) => {
    // ... existing code ...
    if (file) {
      // ... existing code ...
      setAttachImages(images);
      showToast("Image successfully attached"); // Add this line
    }
    // ... existing code ...
  },
  [attachImages, chatStore],
);
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (12)
- app/client/platforms/alibaba.ts (2 hunks)
- app/client/platforms/anthropic.ts (2 hunks)
- app/client/platforms/baidu.ts (2 hunks)
- app/client/platforms/bytedance.ts (2 hunks)
- app/client/platforms/google.ts (2 hunks)
- app/client/platforms/iflytek.ts (2 hunks)
- app/client/platforms/moonshot.ts (2 hunks)
- app/client/platforms/openai.ts (2 hunks)
- app/client/platforms/tencent.ts (2 hunks)
- app/components/chat.tsx (7 hunks)
- app/locales/cn.ts (1 hunks)
- app/locales/en.ts (1 hunks)
🧰 Additional context used
🔇 Additional comments (13)
app/client/platforms/moonshot.ts (1)
**23-23**: LGTM: Import of TranscriptionOptions

The addition of `TranscriptionOptions` to the import statement is consistent with the implementation of the new `transcription` method.

app/client/platforms/bytedance.ts (1)
**17-17**: LGTM: Import statement updated correctly.

The addition of `TranscriptionOptions` to the import statement is consistent with the new `transcription` method implementation.

app/client/platforms/iflytek.ts (1)
**16-16**: LGTM: Import statement addition is appropriate.

The addition of `TranscriptionOptions` to the import statement is consistent with the new `transcription` method being introduced. This ensures type safety and maintains consistency with the LLMApi interface.

app/client/platforms/alibaba.ts (2)
**16-16**: LGTM: Import statement added correctly.

The `TranscriptionOptions` import is correctly added and is consistent with the new `transcription` method being implemented.
Line range hint **1-295**: Summary: Incomplete implementation of transcription feature.

This PR introduces the groundwork for adding a transcription feature to the `QwenApi` class, which aligns with the PR objective of adding a new feature (test stt). However, the implementation is currently incomplete:

1. The `TranscriptionOptions` import has been added correctly.
2. The `transcription` method has been added with the correct signature.
3. The actual implementation of the transcription functionality is missing.

To fully meet the PR objectives:

1. Complete the implementation of the `transcription` method.
2. Add any necessary helper methods or utilities for transcription.
3. Update the PR description with details about the new transcription feature and its usage.
4. Consider adding tests for the new functionality.

To ensure no other parts of the codebase are expecting this method to be fully implemented, run the following script. This will help identify if there are any other parts of the codebase that might be affected by this incomplete implementation.
✅ Verification successful
**Verified: No existing usages of `QwenApi.transcription` or imports of `TranscriptionOptions` found.**

The implementation of the `transcription` method in the `QwenApi` class is currently incomplete. However, verification confirms that there are no existing usages of this method or imports of `TranscriptionOptions` in the codebase. This means that adding the unimplemented method does not impact other parts of the project at this time. To fully meet the PR objectives:

1. Complete the implementation of the `transcription` method.
2. Add any necessary helper methods or utilities for transcription.
3. Update the PR description with details about the new transcription feature and its usage.
4. Consider adding tests for the new functionality.
🏁 Scripts executed

The following scripts were executed for the analysis:

```bash
#!/bin/bash
# Description: Check for any usage of the transcription method in the codebase

# Search for any calls to transcription method of QwenApi
echo "Searching for usage of QwenApi.transcription:"
rg --type typescript --type javascript 'QwenApi.*transcription'

# Search for any imports of TranscriptionOptions that might be using it
echo "Searching for imports of TranscriptionOptions:"
rg --type typescript --type javascript 'import.*TranscriptionOptions'
```

Length of output: 411

```bash
#!/bin/bash
# Description: Check for any usage of the transcription method in the codebase

# Search for any calls to transcription method of QwenApi
echo "Searching for usage of QwenApi.transcription in *.ts and *.js files:"
rg 'QwenApi\.transcription' -g '*.ts' -g '*.js'

# Search for any imports of TranscriptionOptions in *.ts and *.js files
echo "Searching for imports of TranscriptionOptions in *.ts and *.js files:"
rg 'import.*TranscriptionOptions' -g '*.ts' -g '*.js'
```

Length of output: 401
app/client/platforms/tencent.ts (2)
12-12
: LGTM: Import of TranscriptionOptionsThe addition of
TranscriptionOptions
to the import list is correct and necessary for the newtranscription
method.
Line range hint
1-99
: Incomplete implementation of speech-related featuresThe changes in this file lay the groundwork for speech-to-text and text-to-speech functionality, which aligns with the PR objective of adding a new feature. However, the actual implementation of these features is missing.
To ensure the PR meets its objectives, please clarify:
- Is this PR intended to be a partial implementation or a complete feature?
- Are there other files in this PR that contain the actual implementations?
If this is meant to be a complete feature, consider updating the PR description to reflect the current state of the implementation.
app/client/platforms/baidu.ts (1)
18-18
: LGTM: Import statement addition is appropriate.The addition of
TranscriptionOptions
to the import statement is consistent with the newtranscription
method being added to theErnieApi
class.app/client/platforms/google.ts (1)
9-9
: LGTM: Import statement updated correctly.The addition of
TranscriptionOptions
to the import statement is consistent with the newtranscription
method being added to theGeminiProApi
class.app/client/platforms/anthropic.ts (1)
2-8
: LGTM: Import changes are consistent with new feature addition.The addition of
TranscriptionOptions
to the imports is in line with the introduction of transcription functionality to theClaudeApi
class.app/client/platforms/openai.ts (2)
Line range hint
1-223
: LGTM with minor suggestionsThe implementation of the
transcription
method and the updates to theChatGPTApi
class are well-done and consistent with the existing code structure. The new functionality for audio transcription is a valuable addition to the API client.Please consider the suggested improvements in the previous comments to enhance error handling, response parsing, and code consistency. Once these minor adjustments are made, the changes will be ready for merging.
**201-201**: Verify the `path` method implementation for the new `model` parameter

The `path` method signature has been updated to accept an optional `model` parameter, which is used in the `transcription` method. However, the implementation of the `path` method is not visible in this code snippet. Please ensure that the `path` method correctly handles the new `model` parameter. You can use the following script to verify the implementation. If the `path` method implementation is not found or doesn't handle the `model` parameter, please update it accordingly.

app/components/chat.tsx (1)
Line range hint **1-2099**: Overall, excellent enhancements to the chat component.

The changes made to this file significantly improve the functionality and user experience of the chat component. Key improvements include:

- Addition of speech recognition capabilities
- Implementation of useful keyboard shortcuts
- Enhanced image handling for vision models
- Improved accessibility and user feedback

These features are well-integrated into the existing component structure, maintaining code consistency and readability. The new capabilities make the chat interface more versatile and user-friendly. Suggestions for future improvements:

- Consider adding unit tests for the new functionality (a test sketch follows below)
- Explore opportunities for further code modularization, especially for complex logic blocks
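As a starting point for the suggested unit tests, a hedged sketch of one test case. The Vitest APIs, the no-argument `ChatGPTApi` constructor, and the elided import of `ChatGPTApi` are assumptions about the project's test setup:

```ts
import { describe, expect, it, vi } from "vitest"; // assumed test runner
// import { ChatGPTApi } from "../app/client/platforms/openai"; // assumed import path

describe("ChatGPTApi.transcription", () => {
  it("returns the text field from a successful response", async () => {
    // Stub global fetch to return a fake transcription payload.
    vi.stubGlobal("fetch", vi.fn(async () =>
      new Response(JSON.stringify({ text: "hello world" }), { status: 200 }),
    ));

    const api = new ChatGPTApi(); // assumed constructible without arguments
    const text = await api.transcription({ file: new Blob(["fake audio"]) });
    expect(text).toBe("hello world");
  });
});
```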
```ts
transcription(options: TranscriptionOptions): Promise<string> {
  throw new Error("Method not implemented.");
}
```
**Implement the transcription method.**

The `transcription` method has been added with the correct signature, which is good. However, it currently throws a "Method not implemented" error. To complete this feature:

1. Implement the transcription logic using the `options` parameter (see the sketch after this list).
2. Ensure proper error handling.
3. Add unit tests for this new method.
4. Update any relevant documentation.

Would you like assistance in implementing this method or creating a task to track its implementation?
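For illustration, a minimal sketch of what an implementation might look like, assuming the provider exposes an OpenAI-compatible `/audio/transcriptions` endpoint. The endpoint path and the `this.baseUrl` field are assumptions, not this provider's confirmed API; `getHeaders` is the auth-header helper already used elsewhere in this codebase:

```ts
async transcription(options: TranscriptionOptions): Promise<string> {
  // Assumed multipart upload, mirroring the OpenAI-style transcription API.
  const formData = new FormData();
  formData.append("file", options.file, "audio.wav");
  if (options.language) formData.append("language", options.language);

  const res = await fetch(`${this.baseUrl}/audio/transcriptions`, { // baseUrl is an assumed field
    method: "POST",
    headers: getHeaders(true),
    body: formData,
  });
  if (!res.ok) {
    throw new Error(`Transcription failed with status ${res.status}`);
  }
  const json = await res.json();
  return json.text ?? "";
}
```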
```ts
transcription(options: TranscriptionOptions): Promise<string> {
  throw new Error("Method not implemented.");
}
```
**Implement the `transcription` method.**

The `transcription` method is currently a placeholder. To complete this feature:

1. Implement the actual transcription logic.
2. Handle potential errors and edge cases.
3. Ensure the implementation adheres to the `TranscriptionOptions` interface.
4. Add appropriate error handling and logging.
5. Consider adding unit tests for this new functionality.

Would you like assistance in implementing the `transcription` method or creating a GitHub issue to track this task?
```ts
transcription(options: TranscriptionOptions): Promise<string> {
  throw new Error("Method not implemented.");
}
```
🛠️ Refactor suggestion

**Consider implementing both speech and transcription methods**

Both the `speech` and `transcription` methods are currently unimplemented. To fully realize the speech-to-text and text-to-speech functionality mentioned in the PR summary, both methods should be implemented. Would you like assistance in drafting implementations for both methods? This would ensure the full functionality of the new feature.
```ts
transcription(options: TranscriptionOptions): Promise<string> {
  throw new Error("Method not implemented.");
}
```
**Implement the `transcription` method and add documentation.**

The `transcription` method has been added with the correct signature, but it currently throws a "Method not implemented" error. Please implement the method to handle transcription requests. Additionally, consider adding documentation to explain the purpose and expected behavior of this method. Would you like assistance in implementing the `transcription` method or creating documentation for it?
```ts
transcription(options: TranscriptionOptions): Promise<string> {
  throw new Error("Method not implemented.");
}
```
**Implement the `transcription` method.**

The `transcription` method has been added to the `ClaudeApi` class, which is a good start for introducing transcription functionality. However, the method is currently not implemented. To complete this feature:

1. Implement the transcription logic using the Anthropic API or any other appropriate service.
2. Handle potential errors and edge cases.
3. Add unit tests to verify the functionality.

Would you like assistance in implementing this method or creating a GitHub issue to track this task?
app/client/platforms/openai.ts (Outdated)

```ts
async transcription(options: TranscriptionOptions): Promise<string> {
  const formData = new FormData();
  formData.append("file", options.file, "audio.wav");
  formData.append("model", options.model ?? "whisper-1");
  if (options.language) formData.append("language", options.language);
  if (options.prompt) formData.append("prompt", options.prompt);
  if (options.response_format)
    formData.append("response_format", options.response_format);
  if (options.temperature)
    formData.append("temperature", options.temperature.toString());

  console.log("[Request] openai audio transcriptions payload: ", options);

  const controller = new AbortController();
  options.onController?.(controller);

  try {
    const path = this.path(OpenaiPath.TranscriptionPath, options.model);
    const headers = getHeaders(true);
    const payload = {
      method: "POST",
      body: formData,
      signal: controller.signal,
      headers: headers,
    };

    // make a fetch request
    const requestTimeoutId = setTimeout(
      () => controller.abort(),
      REQUEST_TIMEOUT_MS,
    );
    const res = await fetch(path, payload);
    clearTimeout(requestTimeoutId);
    const json = await res.json();
    return json.text;
  } catch (e) {
    console.log("[Request] failed to make a audio transcriptions request", e);
    throw e;
  }
}
```
🛠️ Refactor suggestion

**Enhance error handling and response parsing in the `transcription` method**

The `transcription` method implementation looks good overall, but there are a few areas for improvement:

1. Error handling could be more specific to provide better debugging information.
2. The response parsing assumes the presence of a `text` property without checking.
3. The `headers` variable is declared but not used in the fetch call.

Consider applying the following improvements. First, enhance error handling:
```diff
 } catch (e) {
-  console.log("[Request] failed to make a audio transcriptions request", e);
-  throw e;
+  console.error("[Request] failed to make an audio transcriptions request", e);
+  throw new Error(`Transcription request failed: ${e.message}`);
 }
```
Second, add a response status check and error handling:
```diff
 const res = await fetch(path, payload);
 clearTimeout(requestTimeoutId);
+if (!res.ok) {
+  throw new Error(`Transcription request failed with status ${res.status}`);
+}
 const json = await res.json();
-return json.text;
+return json.text ?? '';
```
Third, use the `headers` shorthand in the fetch call:
```diff
 const payload = {
   method: "POST",
   body: formData,
   signal: controller.signal,
-  headers: headers,
+  headers,
 };
```
These changes will improve the robustness and reliability of the `transcription` method.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
async transcription(options: TranscriptionOptions): Promise<string> {
  const formData = new FormData();
  formData.append("file", options.file, "audio.wav");
  formData.append("model", options.model ?? "whisper-1");
  if (options.language) formData.append("language", options.language);
  if (options.prompt) formData.append("prompt", options.prompt);
  if (options.response_format)
    formData.append("response_format", options.response_format);
  if (options.temperature)
    formData.append("temperature", options.temperature.toString());

  console.log("[Request] openai audio transcriptions payload: ", options);

  const controller = new AbortController();
  options.onController?.(controller);

  try {
    const path = this.path(OpenaiPath.TranscriptionPath, options.model);
    const headers = getHeaders(true);
    const payload = {
      method: "POST",
      body: formData,
      signal: controller.signal,
      headers,
    };

    // make a fetch request
    const requestTimeoutId = setTimeout(
      () => controller.abort(),
      REQUEST_TIMEOUT_MS,
    );
    const res = await fetch(path, payload);
    clearTimeout(requestTimeoutId);
    if (!res.ok) {
      throw new Error(`Transcription request failed with status ${res.status}`);
    }
    const json = await res.json();
    return json.text ?? '';
  } catch (e) {
    console.error("[Request] failed to make an audio transcriptions request", e);
    throw new Error(`Transcription request failed: ${e.message}`);
  }
}
```
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (3)
app/store/config.ts (1)
**136-141**: Add language validation to STTConfigValidator

The addition of `STTConfigValidator` is good, but it could be more comprehensive:

1. The `engine` validation looks correct.
2. Consider adding validation for the `language` property to ensure it matches the expected format or is within a set of valid languages.

Here's a suggested improvement:

```ts
export const STTConfigValidator = {
  engine(x: string) {
    return x as STTEngineType;
  },
  language(x: string) {
    // Add logic to validate the language
    // For example:
    // return VALID_LANGUAGES.includes(x) ? x : DEFAULT_STT_LANGUAGE;
    return x;
  },
};
```

Replace the comment with appropriate validation logic for the `language` property; one concrete sketch follows below.
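As one possible concrete validation, a hedged sketch. The BCP-47-shaped regex and the import path are illustrative assumptions:

```ts
import { DEFAULT_STT_LANGUAGE } from "../constant"; // assumed import path

// Accept strings shaped like BCP-47 language tags (e.g. "en-US", "zh-CN");
// anything else falls back to the configured default.
const LANGUAGE_TAG = /^[a-z]{2,3}(-[A-Za-z0-9]{2,8})*$/;

export function validateSttLanguage(x: string): string {
  return LANGUAGE_TAG.test(x) ? x : DEFAULT_STT_LANGUAGE;
}
```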
app/constant.ts (1)
**274-289**: LGTM with suggestion: STT constants addition

The new constants for speech-to-text (STT) functionality are well-defined and provide a comprehensive set of options for engines and languages. However, consider the following suggestion: the default language (`DEFAULT_STT_LANGUAGE`) is set to "zh-CN" (Chinese Simplified). Consider using a more neutral default, such as "en-US", or making it configurable based on the user's locale or preferences (see the sketch below). Would you like assistance in implementing a more flexible default language selection?
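A minimal sketch of locale-based selection, assuming it runs in a browser context where `navigator.language` is available; the fallback value is an assumption:

```ts
// Prefer the browser's reported locale when available,
// otherwise fall back to a fixed default.
export function defaultSttLanguage(): string {
  const locale = typeof navigator !== "undefined" ? navigator.language : "";
  return locale || "en-US";
}
```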
app/components/chat.tsx (1)
**589-594**: Function to handle speech recognition results.

The `onRecognitionEnd` function processes the final transcript and updates the user input. It also resets the `isTranscription` state if not using the default engine. However, there's a minor issue: consider removing or replacing the `console.log` statement with a more appropriate logging mechanism for production code (see the sketch below):

```diff
- console.log(finalTranscript);
+ // Consider using a proper logging mechanism here
```
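One lightweight option, sketched here as an assumption about how the project might gate debug output (Next.js inlines `NODE_ENV` at build time):

```ts
// Only emit debug logs outside production builds.
const debugLog = (...args: unknown[]) => {
  if (process.env.NODE_ENV !== "production") {
    console.log("[STT]", ...args);
  }
};

debugLog(finalTranscript);
```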
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (2)
- `app/icons/vioce-close.svg` is excluded by `!**/*.svg`
- `app/icons/vioce-open.svg` is excluded by `!**/*.svg`
📒 Files selected for processing (6)
- app/components/chat.tsx (9 hunks)
- app/components/stt-config.tsx (1 hunks)
- app/constant.ts (2 hunks)
- app/locales/cn.ts (2 hunks)
- app/locales/en.ts (2 hunks)
- app/store/config.ts (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- app/locales/cn.ts
- app/locales/en.ts
🧰 Additional context used
🪛 Biome
app/components/stt-config.tsx
[error] 23-23: The assignment should not be in an expression. The use of assignments in expressions is confusing. Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)

[error] 34-36: The assignment should not be in an expression. The use of assignments in expressions is confusing. Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)

[error] 61-62: The assignment should not be in an expression. The use of assignments in expressions is confusing. Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)
🔇 Additional comments (18)
app/components/stt-config.tsx (2)
**1-11**: LGTM: Imports and component declaration are well-structured.

The imports are appropriate for the component's functionality, and the `STTConfigList` component is correctly exported with properly typed props.
**1-75**: Overall assessment: Well-implemented component with minor improvements needed.

The `STTConfigList` component is well-structured and correctly implements the speech-to-text configuration functionality. It handles browser compatibility and provides a good user interface for managing STT settings. Key points:

- The component correctly uses React hooks and props for state management.
- Browser compatibility is handled appropriately, especially for Firefox.
- The UI is clear and easy to understand.

Areas for improvement:

- Refactor state updates to avoid assignments in expressions, as noted in previous comments.
- Ensure consistent use of localization throughout the component.
- Consider adding prop-types or TypeScript for better type checking of props (a typed-props sketch follows below).

Once these minor issues are addressed, the component will be in excellent shape and ready for production use.
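For illustration, typed props for the component might look like this; the exact prop names and import path are inferred from the surrounding review, so treat them as assumptions:

```tsx
import { STTConfig } from "../store/config"; // assumed import path

interface STTConfigListProps {
  sttConfig: STTConfig;
  // Updater receives a mutator over a draft copy of the config.
  updateConfig: (updater: (config: STTConfig) => void) => void;
}

export function STTConfigList(props: STTConfigListProps) {
  // ... component body as reviewed ...
  return null; // placeholder for the reviewed JSX
}
```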
🧰 Tools
🪛 Biome
[error] 23-23: The assignment should not be in an expression. The use of assignments in expressions is confusing. Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)

[error] 34-36: The assignment should not be in an expression. The use of assignments in expressions is confusing. Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)

[error] 61-62: The assignment should not be in an expression. The use of assignments in expressions is confusing. Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)
app/store/config.ts (4)
**8-10**: LGTM: New STT-related constants imported

The addition of `DEFAULT_STT_ENGINE`, `DEFAULT_STT_ENGINES`, and `DEFAULT_STT_LANGUAGE` constants is consistent with the existing structure and supports the new Speech-to-Text functionality.
STTEngineType
is consistent with the existing type definitions and provides type safety for STT engine selection.
106-106
: LGTM: New STTConfig type addedThe addition of
STTConfig
type is consistent with the existing structure and provides type safety for the STT configuration.
88-99
: Review TTS default setting and approve STT config additionThe changes look good overall:
- The new
sttConfig
is well-structured and uses appropriate default values.- The addition of STT configuration aligns with the PR objective.
However, please review the change to
ttsConfig.enable
:Changing
ttsConfig.enable
totrue
by default might affect user experience. Consider if this change is intentional and if it might surprise users who are not expecting TTS to be enabled automatically.✅ Verification successful
Approve TTS and STT Configuration Changes
The updates look good overall:
- The new
sttConfig
is well-structured and uses appropriate default values.- The addition of STT configuration aligns with the PR objective.
- Verified that
ttsConfig.enable
is set totrue
by default with no overrides disabling it.No issues found with the current configurations.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Verify the impact of enabling TTS by default rg "ttsConfig.*enable.*false" --type typescriptLength of output: 84
Script:
#!/bin/bash # Verify the impact of enabling TTS by default by searching for 'ttsConfig.enable = false' in .ts files rg "ttsConfig\.enable\s*=\s*false" --type-add 'ts:*.ts' --type tsLength of output: 67
app/constant.ts (3)
**153-153**: LGTM: Addition of TranscriptionPath constant

The new `TranscriptionPath` constant in the `OpenaiPath` object is correctly defined and aligns with OpenAI's API structure for audio transcription endpoints. This addition supports the integration of speech-to-text functionality.
1-291
: Summary of changes and recommendationsThe additions to
app/constant.ts
primarily focus on implementing speech-to-text (STT) functionality. The changes are generally well-structured and provide necessary configurations for STT features. However, consider the following recommendations:
- Review the default STT language selection to ensure it's appropriate for your user base.
- Provide clarification on the Firefox-specific STT engine constant.
- Consider adding comments to explain the purpose and usage of new constants, especially browser-specific ones.
These changes will enhance the maintainability and clarity of the code for future development.
290-290
: Please clarify: Firefox-specific STT engine constantA new constant
FIREFOX_DEFAULT_STT_ENGINE
has been added, set to "OpenAI Whisper". Could you please provide context on why Firefox requires a separate default STT engine? Consider adding a comment to explain the rationale behind this browser-specific constant.To better understand the usage of this constant, let's search for its occurrences in the codebase:
This will help us verify if and how this constant is being used in the application.
✅ Verification successful
FIREFOX_DEFAULT_STT_ENGINE Usage Verified
The constant
FIREFOX_DEFAULT_STT_ENGINE
is defined inapp/constant.ts
and appropriately used inapp/components/chat.tsx
to set the STT engine for Firefox browsers. This browser-specific configuration ensures compatibility and optimal performance for Firefox users. No issues found.🏁 Scripts executed
The following scripts were executed for the analysis:
```bash
#!/bin/bash
# Search for usage of FIREFOX_DEFAULT_STT_ENGINE
rg "FIREFOX_DEFAULT_STT_ENGINE" --type typescript
```

Length of output: 84

```bash
#!/bin/bash
# Corrected search for usage of FIREFOX_DEFAULT_STT_ENGINE in TypeScript files
rg "FIREFOX_DEFAULT_STT_ENGINE" --glob "*.ts" --glob "*.tsx"
```

Length of output: 289
app/components/chat.tsx (9)
**13-14**: New icons imported for voice functionality.

Two new icons have been imported: `VoiceOpenIcon` and `VoiceCloseIcon`. These are likely used to represent the state of voice input in the UI.
77-77
: Added import for Firefox detection.The
isFirefox
function has been imported. This is likely used to provide browser-specific behavior for speech recognition.
104-106
: New constants imported for speech-to-text configuration.Constants for default speech-to-text (STT) engines have been added, including a specific one for Firefox. This supports the new voice input feature.
126-126
: New import for speech transcription APIs.The
OpenAITranscriptionApi
andWebTranscriptionApi
have been imported. These are likely used to handle the speech-to-text functionality.
556-558
: New state variables for speech recognition.Three new state variables have been added:
isListening
: Likely used to track if speech recognition is active.isTranscription
: Possibly used to indicate if transcription is in progress.speechApi
: Stores the speech recognition API instance.These variables are crucial for managing the state of the new voice input feature.
573-579
: Function to start speech recognition.The
startListening
function initializes speech recognition and updates theisListening
state. It also shows a toast notification to the user.
580-588
: Function to stop speech recognition.The
stopListening
function halts speech recognition, updates theisListening
state, and shows a toast notification. It also setsisTranscription
to true if not using the default engine.
831-840
: Added voice input button to ChatActions component.A new button for toggling voice input has been added to the ChatActions component. It uses the new icons and state variables to represent the current listening state.
Line range hint
1-2000
: Summary of speech-to-text integrationThe changes in this file successfully integrate speech-to-text functionality into the chat interface. Key additions include:
- New state variables for managing speech recognition status.
- Functions for starting and stopping speech recognition.
- A useEffect hook for initializing the speech API.
- UI elements for toggling voice input.
The implementation is generally well-done, with appropriate error handling and user feedback. However, there are a few minor suggestions for improvement:
- Consider adding dependencies to the useEffect hook for speech API initialization.
- Remove or replace the console.log statement in the onRecognitionEnd function.
Overall, these changes enhance the chat interface by providing users with a voice input option, improving accessibility and user experience.
```tsx
<ListItem
  title={Locale.Settings.STT.Enable.Title}
  subTitle={Locale.Settings.STT.Enable.SubTitle}
>
  <input
    type="checkbox"
    checked={props.sttConfig.enable}
    onChange={(e) =>
      props.updateConfig(
        (config) => (config.enable = e.currentTarget.checked),
      )
    }
  ></input>
</ListItem>
```
**Refactor the config update to avoid assignment in expression.**

The checkbox implementation for enabling/disabling STT is correct. However, the update logic can be improved to address the static analysis warning about assignment in expression.

Consider refactoring the `onChange` handler as follows:

```diff
 onChange={(e) =>
   props.updateConfig(
-    (config) => (config.enable = e.currentTarget.checked),
+    (config) => ({ ...config, enable: e.currentTarget.checked })
   )
 }
```

This change creates a new object with the updated `enable` property, which is a more idiomatic way to update state in React and avoids the assignment in expression issue.
🧰 Tools

🪛 Biome

[error] 23-23: The assignment should not be in an expression.
The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)
app/components/stt-config.tsx (Outdated)
```tsx
{props.sttConfig.engine === "Web Speech API" && !isFirefox() && (
  <ListItem title="语言选择">
    <Select
      value={props.sttConfig.language}
      onChange={(e) => {
        props.updateConfig(
          (config) => (config.language = e.currentTarget.value),
        );
      }}
    >
      {DEFAULT_STT_LANGUAGES.map((v, i) => (
        <option value={v} key={i}>
          {v}
        </option>
      ))}
    </Select>
  </ListItem>
)}
```
**Refactor the language update and localize the title.**

The conditional rendering of the language selection for the Web Speech API is implemented correctly. However, there are two issues to address:

1. The update logic can be improved to address the static analysis warning about assignment in expression.
2. The title for language selection is in Chinese, which might not be consistent with the rest of the application.

Refactor the `onChange` handler as follows:

```diff
 onChange={(e) => {
   props.updateConfig(
-    (config) => (config.language = e.currentTarget.value),
+    (config) => ({ ...config, language: e.currentTarget.value }),
   );
 }}
```

This change creates a new object with the updated `language` property, which is a more idiomatic way to update state in React and avoids the assignment in expression issue.

Then replace the hardcoded Chinese title with a localized string:

```diff
- <ListItem title="语言选择">
+ <ListItem title={Locale.Settings.STT.Language.Title}>
```

Make sure to add the corresponding translation key in the locale files.
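For illustration, the en.ts entry could look something like this. This is a sketch: the key path follows the `Locale.Settings.STT.Language.Title` reference suggested above, and the strings themselves are placeholder assumptions:

```ts
STT: {
  Enable: {
    Title: "Enable STT",
    SubTitle: "Enable Speech-to-Text",
  },
  Engine: {
    Title: "STT Engine",
    SubTitle: "Speech-to-Text engine",
  },
  Language: {
    Title: "Language",
    SubTitle: "Recognition language for the Web Speech API",
  },
},
```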
🧰 Tools

🪛 Biome

[error] 61-62: The assignment should not be in an expression.
The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)
```tsx
<ListItem title={Locale.Settings.STT.Engine.Title}>
  <Select
    value={props.sttConfig.engine}
    onChange={(e) => {
      props.updateConfig(
        (config) =>
          (config.engine = STTConfigValidator.engine(
            e.currentTarget.value,
          )),
      );
    }}
  >
    {isFirefox()
      ? DEFAULT_STT_ENGINES.filter((v) => v !== "Web Speech API").map(
          (v, i) => (
            <option value={v} key={i}>
              {v}
            </option>
          ),
        )
      : DEFAULT_STT_ENGINES.map((v, i) => (
          <option value={v} key={i}>
            {v}
          </option>
        ))}
  </Select>
</ListItem>
```
**Refactor the engine update to avoid assignment in expression.**

The STT engine selection implementation is correct and handles browser compatibility well. However, the update logic can be improved to address the static analysis warning about assignment in expression.

Consider refactoring the `onChange` handler as follows:

```diff
 onChange={(e) => {
   props.updateConfig(
-    (config) =>
-      (config.engine = STTConfigValidator.engine(
-        e.currentTarget.value,
-      )),
+    (config) => ({
+      ...config,
+      engine: STTConfigValidator.engine(e.currentTarget.value),
+    }),
   );
 }}
```

This change creates a new object with the updated `engine` property, which is a more idiomatic way to update state in React and avoids the assignment in expression issue.
🧰 Tools

🪛 Biome

[error] 34-36: The assignment should not be in an expression.
The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)
app/components/chat.tsx (Outdated)
```tsx
useEffect(() => {
  if (isFirefox()) config.sttConfig.engine = FIREFOX_DEFAULT_STT_ENGINE;
  setSpeechApi(
    config.sttConfig.engine !== DEFAULT_STT_ENGINE
      ? new WebTranscriptionApi((transcription) =>
          onRecognitionEnd(transcription),
        )
      : new OpenAITranscriptionApi((transcription) =>
          onRecognitionEnd(transcription),
        ),
  );
}, []);
```
**useEffect hook for initializing speech recognition.**

This hook initializes the speech recognition API based on the configured engine. It uses a Firefox-specific engine if the browser is Firefox.

However, there's a potential issue: the `useEffect` hook has an empty dependency array, which means it will only run once when the component mounts. If `config.sttConfig.engine` changes, this effect won't re-run. Consider adding the necessary dependencies to the effect:

```diff
 useEffect(() => {
   if (isFirefox()) config.sttConfig.engine = FIREFOX_DEFAULT_STT_ENGINE;
   setSpeechApi(
     config.sttConfig.engine !== DEFAULT_STT_ENGINE
       ? new WebTranscriptionApi((transcription) =>
           onRecognitionEnd(transcription),
         )
       : new OpenAITranscriptionApi((transcription) =>
           onRecognitionEnd(transcription),
         ),
   );
- }, []);
+ }, [config.sttConfig.engine]);
```

Also, make sure `onRecognitionEnd` is memoized with `useCallback` to prevent unnecessary re-creations of the speech API.
Committable suggestion was skipped due to low confidence.
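For concreteness, a minimal sketch of that memoization, written against the names in the diff above and assuming it sits inside the same component scope:

```tsx
const onRecognitionEnd = useCallback(
  (finalTranscript: string) => {
    if (finalTranscript) props.setUserInput(finalTranscript);
    if (config.sttConfig.engine !== DEFAULT_STT_ENGINE)
      setIsTranscription(false);
  },
  [config.sttConfig.engine, props.setUserInput],
);

useEffect(() => {
  // Recreate the API whenever the engine or the memoized callback changes.
  setSpeechApi(
    config.sttConfig.engine !== DEFAULT_STT_ENGINE
      ? new WebTranscriptionApi(onRecognitionEnd)
      : new OpenAITranscriptionApi(onRecognitionEnd),
  );
}, [config.sttConfig.engine, onRecognitionEnd]);
```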
Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📒 Files selected for processing (1)

- app/components/chat.tsx (7 hunks)

🧰 Additional context used

🔇 Additional comments (5)

app/components/chat.tsx (5)

`13-14`: **New icon imports added for voice functionality.**

The addition of `VoiceOpenIcon` and `VoiceCloseIcon` imports suggests new voice-related features are being implemented. These imports follow the existing naming conventions and file structure.

`77-77`: **Firefox detection utility imported.**

The `isFirefox` utility function is imported, indicating potential Firefox-specific handling in the component. This addition is consistent with the existing import structure.

`104-106`: **Speech-related constants imported.**

New constants for speech-to-text (STT) and text-to-speech (TTS) engines are added, including a Firefox-specific STT engine. These additions are consistent with the component's evolving voice-related functionality.

`126-126`: **Transcription API utilities imported.**

The addition of `OpenAITranscriptionApi` and `WebTranscriptionApi` imports indicates the implementation of multiple transcription methods for speech-to-text functionality. This aligns with the earlier constant imports and enhances the component's voice-related capabilities.

`832-840`: **Voice input action button added with appropriate conditional rendering.**

A new voice input action button is implemented within the ChatActions component. The button's appearance and behavior change based on the listening state, and it's only rendered when the STT feature is enabled in the configuration. This implementation provides a user-friendly way to access the new voice input functionality.
app/components/chat.tsx (Outdated)
```tsx
const [isListening, setIsListening] = useState(false);
const [isTranscription, setIsTranscription] = useState(false);
const [speechApi, setSpeechApi] = useState<any>(null);

useEffect(() => {
  if (isFirefox()) config.sttConfig.engine = FIREFOX_DEFAULT_STT_ENGINE;
  setSpeechApi(
    config.sttConfig.engine !== DEFAULT_STT_ENGINE
      ? new WebTranscriptionApi((transcription) =>
          onRecognitionEnd(transcription),
        )
      : new OpenAITranscriptionApi((transcription) =>
          onRecognitionEnd(transcription),
        ),
  );
}, []);

const startListening = async () => {
  showToast(Locale.Chat.StartSpeak);
  if (speechApi) {
    await speechApi.start();
    setIsListening(true);
    document.getElementById("chat-input")?.focus();
  }
};
const stopListening = async () => {
  showToast(Locale.Chat.CloseSpeak);
  if (speechApi) {
    if (config.sttConfig.engine !== DEFAULT_STT_ENGINE)
      setIsTranscription(true);
    await speechApi.stop();
    setIsListening(false);
  }
  document.getElementById("chat-input")?.focus();
};
const onRecognitionEnd = (finalTranscript: string) => {
  console.log(finalTranscript);
  if (finalTranscript) props.setUserInput(finalTranscript);
  if (config.sttConfig.engine !== DEFAULT_STT_ENGINE)
    setIsTranscription(false);
};
```
**Speech recognition functionality implemented with potential improvements.**

The addition of speech recognition state variables and functions enhances the component's capabilities. However, there are two points to consider:

1. The `useEffect` hook initializing the speech API has an empty dependency array. This means it will only run once when the component mounts. If `config.sttConfig.engine` changes, the effect won't re-run. Consider adding the necessary dependencies:

```diff
 useEffect(() => {
   if (isFirefox()) config.sttConfig.engine = FIREFOX_DEFAULT_STT_ENGINE;
   setSpeechApi(
     config.sttConfig.engine !== DEFAULT_STT_ENGINE
       ? new WebTranscriptionApi((transcription) =>
           onRecognitionEnd(transcription),
         )
       : new OpenAITranscriptionApi((transcription) =>
           onRecognitionEnd(transcription),
         ),
   );
- }, []);
+ }, [config.sttConfig.engine]);
```

2. There's a `console.log` statement in the `onRecognitionEnd` function. Consider removing it or replacing it with a more appropriate logging mechanism for production code.
Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (1)

app/utils/speech.ts (1)

`56-56`: **Fix typo in console warning message**

There's a typo in the console warning on line 56: "Media Decives" should be "Media Devices".

Apply this diff to correct the typo:

```diff
- console.warn("Media Decives will work only with SSL");
+ console.warn("Media Devices will work only with SSL");
```

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📒 Files selected for processing (4)

- app/components/chat.tsx (7 hunks)
- app/components/stt-config.tsx (1 hunks)
- app/store/config.ts (4 hunks)
- app/utils/speech.ts (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

- app/store/config.ts

🧰 Additional context used

🪛 Biome

app/components/stt-config.tsx

[error] 23-23: The assignment should not be in an expression.
The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)

[error] 34-36: The assignment should not be in an expression.
The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)

[error] 61-62: The assignment should not be in an expression.
The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free. (lint/suspicious/noAssignInExpressions)

🔇 Additional comments (9)

app/components/stt-config.tsx (5)

`1-11`: **LGTM: Imports and component declaration are well-structured.**

The imports cover all necessary dependencies, and the component declaration with its props is correctly defined.

`1-75`: **Overall assessment: Well-implemented STT configuration component with minor improvements needed.**

The `STTConfigList` component effectively manages and displays configuration options for speech-to-text (STT) settings. It handles browser compatibility and conditional rendering appropriately. The suggested refactors will improve code quality by addressing assignment-in-expression issues and ensuring consistent localization.

Key points:

- The component structure and logic are sound.
- Update logic for config changes should be refactored to avoid assignments in expressions.
- Localization should be applied consistently, especially for the language selection title.

Once these minor improvements are implemented, the component will be more robust and maintainable.

`14-27`: ⚠️ Potential issue

**Refactor the config update to avoid assignment in expression.**

The checkbox implementation for enabling/disabling STT is correct. However, the update logic can be improved to address the static analysis warning about assignment in expression.

Consider refactoring the `onChange` handler as follows:

```diff
 onChange={(e) =>
   props.updateConfig(
-    (config) => (config.enable = e.currentTarget.checked),
+    (config) => ({ ...config, enable: e.currentTarget.checked })
   )
 }
```

This change creates a new object with the updated `enable` property, which is a more idiomatic way to update state in React and avoids the assignment in expression issue.

`28-54`: ⚠️ Potential issue

**Refactor the engine update to avoid assignment in expression.**

The STT engine selection implementation is correct and handles browser compatibility well. However, the update logic can be improved to address the static analysis warning about assignment in expression.

Consider refactoring the `onChange` handler as follows:

```diff
 onChange={(e) => {
   props.updateConfig(
-    (config) =>
-      (config.engine = STTConfigValidator.engine(
-        e.currentTarget.value,
-      )),
+    (config) => ({
+      ...config,
+      engine: STTConfigValidator.engine(e.currentTarget.value),
+    }),
   );
 }}
```

This change creates a new object with the updated `engine` property, which is a more idiomatic way to update state in React and avoids the assignment in expression issue.

`55-72`: ⚠️ Potential issue

**Refactor the language update and localize the title.**

The conditional rendering of the language selection for the Web Speech API is implemented correctly. However, there are two issues to address:

1. The update logic can be improved to address the static analysis warning about assignment in expression.
2. The title for language selection is in Chinese, which might not be consistent with the rest of the application.

Refactor the `onChange` handler as follows:

```diff
 onChange={(e) => {
   props.updateConfig(
-    (config) => (config.language = e.currentTarget.value),
+    (config) => ({ ...config, language: e.currentTarget.value }),
   );
 }}
```

Then replace the hardcoded Chinese title with a localized string:

```diff
- <ListItem title="语言选择">
+ <ListItem title={Locale.Settings.STT.Language.Title}>
```

Make sure to add the corresponding translation key in the locale files.
app/components/chat.tsx (4)

`13-14`: **New voice-related icons added.**

The addition of `VoiceOpenIcon` and `VoiceCloseIcon` suggests new voice functionality is being introduced to the chat component. The import statements are correctly formatted and follow the existing naming conventions.

`77-77`: **Firefox detection utility added.**

The `isFirefox` utility function is imported, indicating potential Firefox-specific functionality or checks. This addition is consistent with the existing import style.

`104-106`: **Speech-related constants added.**

New constants for speech-to-text (STT) and text-to-speech (TTS) engines have been imported, including a Firefox-specific STT engine. These additions align with the previously imported `isFirefox` utility, suggesting the implementation of browser-specific speech functionality.

`126-126`: **Transcription APIs imported.**

The addition of `OpenAITranscriptionApi` and `WebTranscriptionApi` indicates the implementation of transcription functionality, likely for speech-to-text features. This change supports multiple transcription services and is correctly implemented.
app/components/chat.tsx (Outdated)
```tsx
const [isListening, setIsListening] = useState(false);
const [isTranscription, setIsTranscription] = useState(false);
const [speechApi, setSpeechApi] = useState<any>(null);

useEffect(() => {
  if (isFirefox()) config.sttConfig.engine = FIREFOX_DEFAULT_STT_ENGINE;
  const lang = config.sttConfig.lang;
  setSpeechApi(
    config.sttConfig.engine !== DEFAULT_STT_ENGINE
      ? new WebTranscriptionApi(
          (transcription) => onRecognitionEnd(transcription),
          lang,
        )
      : new OpenAITranscriptionApi((transcription) =>
          onRecognitionEnd(transcription),
        ),
  );
}, []);

const startListening = async () => {
  showToast(Locale.Chat.StartSpeak);
  if (speechApi) {
    await speechApi.start();
    setIsListening(true);
    document.getElementById("chat-input")?.focus();
  }
};
const stopListening = async () => {
  showToast(Locale.Chat.CloseSpeak);
  if (speechApi) {
    if (config.sttConfig.engine !== DEFAULT_STT_ENGINE)
      setIsTranscription(true);
    await speechApi.stop();
    setIsListening(false);
  }
  document.getElementById("chat-input")?.focus();
};
const onRecognitionEnd = (finalTranscript: string) => {
  console.log(finalTranscript);
  if (finalTranscript) props.setUserInput(finalTranscript);
  if (config.sttConfig.engine !== DEFAULT_STT_ENGINE)
    setIsTranscription(false);
};
```
🛠️ Refactor suggestion

**Voice input functionality implemented with potential improvements.**

The addition of speech recognition state variables and functions enhances the component's capabilities. However, there are two points to consider:

1. The `useEffect` hook initializing the speech API has an empty dependency array. This means it will only run once when the component mounts. If `config.sttConfig.engine` changes, the effect won't re-run. Consider adding the necessary dependencies:

```diff
 useEffect(() => {
   if (isFirefox()) config.sttConfig.engine = FIREFOX_DEFAULT_STT_ENGINE;
   setSpeechApi(
     config.sttConfig.engine !== DEFAULT_STT_ENGINE
       ? new WebTranscriptionApi((transcription) =>
           onRecognitionEnd(transcription),
         )
       : new OpenAITranscriptionApi((transcription) =>
           onRecognitionEnd(transcription),
         ),
   );
- }, []);
+ }, [config.sttConfig.engine]);
```

2. There's a `console.log` statement in the `onRecognitionEnd` function. Consider removing it or replacing it with a more appropriate logging mechanism for production code.
Committable suggestion was skipped due to low confidence.
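As an aside, the two transcription classes used above appear to share a common surface. A sketch of that implied shape, assumed from the reviewed code rather than quoted from the PR:

```ts
type TranscriptionCallback = (transcription: string) => void;

interface TranscriptionApi {
  // Begin capturing audio / recognition.
  start(): Promise<void>;
  // Stop capturing; the final transcript is delivered via the callback.
  stop(): Promise<void>;
  // Register the callback invoked with the final transcript.
  onTranscriptionReceived(callback: TranscriptionCallback): void;
}
```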
```ts
// @ts-ignore
navigator.getUserMedia =
  // @ts-ignore
  navigator.getUserMedia ||
  // @ts-ignore
  navigator.webkitGetUserMedia ||
  // @ts-ignore
  navigator.mozGetUserMedia ||
  // @ts-ignore
  navigator.msGetUserMedia;
```
**Remove unnecessary and unused code setting `navigator.getUserMedia`**

The code from lines 35-44 assigns `navigator.getUserMedia` with cross-browser support shims. However, this variable is not used later in the code, since you're using `navigator.mediaDevices.getUserMedia` directly on line 46, which is the standard and widely supported method. Additionally, `navigator.getUserMedia` is deprecated. Removing these lines will clean up the code and prevent potential confusion.

Apply this diff to remove the unused code:

```diff
  // @ts-ignore
- navigator.getUserMedia =
-   // @ts-ignore
-   navigator.getUserMedia ||
-   // @ts-ignore
-   navigator.webkitGetUserMedia ||
-   // @ts-ignore
-   navigator.mozGetUserMedia ||
-   // @ts-ignore
-   navigator.msGetUserMedia;
```
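For reference, the standard path needs no shims at all; a minimal feature-detected sketch (standalone, not taken from the PR):

```ts
// Request microphone access via the modern MediaDevices API.
async function requestMicrophone(): Promise<MediaStream | null> {
  if (!navigator.mediaDevices?.getUserMedia) {
    // getUserMedia is only exposed in secure (HTTPS) contexts.
    console.warn("Media Devices will work only with SSL");
    return null;
  }
  try {
    return await navigator.mediaDevices.getUserMedia({ audio: true });
  } catch (e) {
    console.warn("Microphone permission denied or unavailable", e);
    return null;
  }
}
```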
```ts
async start(): Promise<void> {
  this.listeningStatus = true;
  await this.recognitionInstance.start();
}

async stop(): Promise<void> {
  this.listeningStatus = false;
  await this.recognitionInstance.stop();
}
```
🛠️ Refactor suggestion

**Remove unnecessary `async` and `await` in `start()` and `stop()` methods**

The `start()` and `stop()` methods of the Web Speech API's `SpeechRecognition` are synchronous and do not return promises. Using `async` and `await` is unnecessary and may cause confusion. Update the methods to be synchronous by removing `async` and `await`, and adjust the method signatures accordingly.

Apply this diff to simplify the methods:

```diff
- async start(): Promise<void> {
+ start(): void {
    this.listeningStatus = true;
    if (!this.recognitionInstance) {
      console.warn("Speech recognition is not supported or has been disabled.");
      return;
    }
-   await this.recognitionInstance.start();
+   this.recognitionInstance.start();
  }

- async stop(): Promise<void> {
+ stop(): void {
    this.listeningStatus = false;
    if (!this.recognitionInstance) {
      console.warn("Speech recognition is not supported or has been disabled.");
      return;
    }
-   await this.recognitionInstance.stop();
+   this.recognitionInstance.stop();
  }
```
```ts
if (isFirefox()) return;
const SpeechRecognition =
  (window as any).SpeechRecognition ||
  (window as any).webkitSpeechRecognition;
this.recognitionInstance = new SpeechRecognition();
this.recognitionInstance.continuous = true;
this.recognitionInstance.interimResults = true;
this.recognitionInstance.lang = lang ?? getSTTLang();
if (transcriptionCallback) {
  this.onTranscriptionReceived(transcriptionCallback);
}
this.recognitionInstance.onresult = (event: any) => {
  const result = event.results[event.results.length - 1];
  if (result.isFinal) {
    this.onTranscription(result[0].transcript);
  }
};
}

async start(): Promise<void> {
  this.listeningStatus = true;
  await this.recognitionInstance.start();
}

async stop(): Promise<void> {
  this.listeningStatus = false;
  await this.recognitionInstance.stop();
}
```
**Handle potential null `recognitionInstance` to prevent runtime errors**

If `isFirefox()` returns true, the constructor exits early, leaving `recognitionInstance` as `null`. This can lead to null reference errors when `start()`, `stop()`, or `isListening()` methods are called. To prevent runtime errors, add checks to ensure `recognitionInstance` is not `null` before using it.

Apply this diff to add null checks in the methods:

```diff
  start(): void {
+   if (!this.recognitionInstance) {
+     console.warn("Speech recognition is not supported or has been disabled.");
+     return;
+   }
    this.listeningStatus = true;
    this.recognitionInstance.start();
  }

  stop(): void {
+   if (!this.recognitionInstance) {
+     console.warn("Speech recognition is not supported or has been disabled.");
+     return;
+   }
    this.listeningStatus = false;
    this.recognitionInstance.stop();
  }

  isListening = () => this.listeningStatus && this.recognitionInstance !== null;
```
Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (1)

app/components/chat.module.scss (1)

`78-84`: **LGTM with a minor suggestion for improvement**

The addition of the `.listening` state for `.chat-input-action` enhances the visual feedback, likely for voice input functionality. The implementation is consistent with the existing hover styles, providing a cohesive user experience.

To ensure smooth transitions between states, consider adding transition properties to the `.listening` class:

```diff
 &.listening {
   width: var(--full-width);
+  transition: width 0.3s ease;

   .text {
     opacity: 1;
     transform: translate(0);
+    transition: opacity 0.3s ease, transform 0.3s ease;
   }
 }
```

This will create a smoother visual effect when entering and exiting the listening state, matching the existing hover transition behavior.
```tsx
import VoiceOpenIcon from "../icons/vioce-open.svg";
import VoiceCloseIcon from "../icons/vioce-close.svg";
```
**Fix typo in icon file paths**

There's a typo in the file paths for the newly imported voice icons: "vioce" should be "voice".

Please apply the following changes:

```diff
-import VoiceOpenIcon from "../icons/vioce-open.svg";
-import VoiceCloseIcon from "../icons/vioce-close.svg";
+import VoiceOpenIcon from "../icons/voice-open.svg";
+import VoiceCloseIcon from "../icons/voice-close.svg";
```
app/components/chat.tsx (Outdated)
```tsx
const onRecognitionEnd = (finalTranscript: string) => {
  console.log(finalTranscript);
  if (finalTranscript) props.setUserInput(finalTranscript);
  if (config.sttConfig.engine !== DEFAULT_STT_ENGINE)
    setIsTranscription(false);
};
```
**Remove console.log in production code**

There's a `console.log` statement in the `onRecognitionEnd` function. Consider removing it or replacing it with a more appropriate logging mechanism for production code.

```diff
 const onRecognitionEnd = (finalTranscript: string) => {
-  console.log(finalTranscript);
   if (finalTranscript) props.setUserInput(finalTranscript);
   if (config.sttConfig.engine !== DEFAULT_STT_ENGINE)
     setIsTranscription(false);
 };
```
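If some logging is still wanted during development, one option is a small environment-gated helper. This is a sketch, not part of the PR; the helper name and `[STT]` tag are illustrative:

```ts
// Log only outside production builds, so transcripts never reach prod logs.
const debugLog = (...args: unknown[]) => {
  if (process.env.NODE_ENV !== "production") {
    console.log("[STT]", ...args);
  }
};
```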
```diff
@@ -546,6 +555,50 @@
     }
   }, [chatStore, currentModel, models]);

+  const [isListening, setIsListening] = useState(false);
+  const [isTranscription, setIsTranscription] = useState(false);
+  const [speechApi, setSpeechApi] = useState<any>(null);
```
🛠️ Refactor suggestion

**Consider using a more specific type for the speechApi state**

The `speechApi` state is initialized with the `any` type. Consider using a more specific type to improve type safety.

```diff
- const [speechApi, setSpeechApi] = useState<any>(null);
+ const [speechApi, setSpeechApi] = useState<WebTranscriptionApi | OpenAITranscriptionApi | null>(null);
```
```tsx
useEffect(() => {
  if (isFirefox()) config.sttConfig.engine = FIREFOX_DEFAULT_STT_ENGINE;
  const lang = config.sttConfig.lang;
  setSpeechApi(
    config.sttConfig.engine !== DEFAULT_STT_ENGINE
      ? new WebTranscriptionApi(
          (transcription) => onRecognitionEnd(transcription),
          lang,
        )
      : new OpenAITranscriptionApi((transcription) =>
          onRecognitionEnd(transcription),
        ),
  );
}, []);
```
**Consider adding dependencies to useEffect**

The useEffect hook initializing the speech API has an empty dependency array. This means it will only run once when the component mounts. If `config.sttConfig.engine` or `config.sttConfig.lang` changes, the effect won't re-run.

Consider adding the necessary dependencies:

```diff
 useEffect(() => {
   if (isFirefox()) config.sttConfig.engine = FIREFOX_DEFAULT_STT_ENGINE;
   const lang = config.sttConfig.lang;
   setSpeechApi(
     config.sttConfig.engine !== DEFAULT_STT_ENGINE
       ? new WebTranscriptionApi(
           (transcription) => onRecognitionEnd(transcription),
           lang,
         )
       : new OpenAITranscriptionApi((transcription) =>
           onRecognitionEnd(transcription),
         ),
   );
- }, []);
+ }, [config.sttConfig.engine, config.sttConfig.lang]);
```

Also, consider memoizing the `onRecognitionEnd` function with useCallback to prevent unnecessary re-creations of the speech API.
Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📒 Files selected for processing (1)

- app/components/chat.tsx (9 hunks)

🧰 Additional context used

🔇 Additional comments (3)

app/components/chat.tsx (3)

`13-14`: ⚠️ Potential issue

**Fix typo in icon file paths**

There's a typo in the file paths for the newly imported voice icons: "vioce" should be "voice".

Please apply the following changes:

```diff
-import VoiceOpenIcon from "../icons/vioce-open.svg";
-import VoiceCloseIcon from "../icons/vioce-close.svg";
+import VoiceOpenIcon from "../icons/voice-open.svg";
+import VoiceCloseIcon from "../icons/voice-close.svg";
```

Likely invalid or redundant comment.

`560-560`: 🛠️ Refactor suggestion

**Consider using a more specific type for the speechApi state**

The `speechApi` state is initialized with the `any` type. Consider using a more specific type to improve type safety.

```diff
- const [speechApi, setSpeechApi] = useState<any>(null);
+ const [speechApi, setSpeechApi] = useState<WebTranscriptionApi | OpenAITranscriptionApi | null>(null);
```

Likely invalid or redundant comment.

`562-575`: ⚠️ Potential issue

**Consider adding dependencies to useEffect**

The useEffect hook initializing the speech API has an empty dependency array. This means it will only run once when the component mounts. If `config.sttConfig.engine` or `config.sttConfig.lang` changes, the effect won't re-run.

Consider adding the necessary dependencies:

```diff
 useEffect(() => {
   if (isFirefox()) config.sttConfig.engine = FIREFOX_DEFAULT_STT_ENGINE;
   const lang = config.sttConfig.lang;
   setSpeechApi(
     config.sttConfig.engine !== DEFAULT_STT_ENGINE
       ? new WebTranscriptionApi(
           (transcription) => onRecognitionEnd(transcription),
           lang,
         )
       : new OpenAITranscriptionApi((transcription) =>
           onRecognitionEnd(transcription),
         ),
   );
- }, []);
+ }, [config.sttConfig.engine, config.sttConfig.lang]);
```

Also, consider memoizing the `onRecognitionEnd` function with useCallback to prevent unnecessary re-creations of the speech API.

Likely invalid or redundant comment.
```tsx
const onRecognitionEnd = (finalTranscript: string) => {
  console.log(finalTranscript);
  if (finalTranscript) {
    props.setUserInput((prevInput) => prevInput + finalTranscript);
  }
  if (config.sttConfig.engine !== DEFAULT_STT_ENGINE)
    setIsTranscription(false);
};
```
**Remove console.log in production code**

There's a `console.log` statement in the `onRecognitionEnd` function. Consider removing it or replacing it with a more appropriate logging mechanism for production code.

```diff
 const onRecognitionEnd = (finalTranscript: string) => {
-  console.log(finalTranscript);
   if (finalTranscript) {
     props.setUserInput((prevInput) => prevInput + finalTranscript);
   }
   if (config.sttConfig.engine !== DEFAULT_STT_ENGINE)
     setIsTranscription(false);
 };
```
```tsx
const startListening = async () => {
  playSound("/Recordingstart.mp3");
  showToast(Locale.Chat.StartSpeak);
  if (speechApi) {
    await speechApi.start();
    setIsListening(true);
    document.getElementById("chat-input")?.focus();
  }
};
const stopListening = async () => {
  showToast(Locale.Chat.CloseSpeak);
  if (speechApi) {
    if (config.sttConfig.engine !== DEFAULT_STT_ENGINE)
      setIsTranscription(true);
    await speechApi.stop();
    setIsListening(false);
  }
  playSound("/Recordingdone.mp3");
  document.getElementById("chat-input")?.focus();
};
```
🛠️ Refactor suggestion

**Improve focus management in startListening and stopListening**

Both `startListening` and `stopListening` functions set focus to the chat input. This might not be ideal for all users, especially those using screen readers. Consider making this behavior configurable:

```diff
+ const [autoFocusAfterListening, setAutoFocusAfterListening] = useState(true);

 const startListening = async () => {
   // ... existing code ...
-  document.getElementById("chat-input")?.focus();
+  if (autoFocusAfterListening) {
+    document.getElementById("chat-input")?.focus();
+  }
 };

 const stopListening = async () => {
   // ... existing code ...
-  document.getElementById("chat-input")?.focus();
+  if (autoFocusAfterListening) {
+    document.getElementById("chat-input")?.focus();
+  }
 };
```

Add a setting in the user preferences to control this behavior.
Committable suggestion was skipped due to low confidence.
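A sketch of what that preference could look like in the STT settings list. The `autoFocusAfterListening` key is hypothetical (it does not exist in this PR); the pattern mirrors the enable checkbox reviewed above:

```tsx
<ListItem
  title="Focus input after voice input"
  subTitle="Return focus to the chat box when recording starts or stops"
>
  <input
    type="checkbox"
    checked={props.sttConfig.autoFocusAfterListening}
    onChange={(e) =>
      props.updateConfig((config) => ({
        ...config,
        autoFocusAfterListening: e.currentTarget.checked, // hypothetical key
      }))
    }
  ></input>
</ListItem>
```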
Merge the main branch
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (7)
- app/client/platforms/openai.ts (2 hunks)
- app/components/chat.tsx (9 hunks)
- app/components/markdown.tsx (1 hunks)
- app/components/settings.tsx (2 hunks)
- app/locales/cn.ts (2 hunks)
- app/locales/en.ts (2 hunks)
- app/store/config.ts (4 hunks)
✅ Files skipped from review due to trivial changes (1)
- app/components/markdown.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
- app/components/settings.tsx
- app/store/config.ts
🧰 Additional context used
🔇 Additional comments (11)
app/client/platforms/openai.ts (3)
37-37
: LGTM: Import statement updated correctlyThe addition of
TranscriptionOptions
to the import statement is consistent with the newtranscription
method implementation. This change is necessary and doesn't introduce any issues.
Line range hint
71-71
: LGTM:path
method signature updated correctlyThe
path
method signature has been updated to include an optionalmodel
parameter, as mentioned in the AI-generated summary. This change:
- Allows for more flexibility in constructing API paths.
- Is consistent with the needs of the new
transcription
method.- Addresses the existing comment about updating the
path
method signature.This update improves the method's versatility without introducing any issues.
Line range hint
1-1
: Summary of changes: Audio transcription functionality addedThe changes in this file enhance the
ChatGPTApi
class by:
- Adding a new
transcription
method for audio transcription functionality.- Updating the
path
method signature to support dynamic API path construction.These modifications improve the class's capabilities and flexibility. While the implementation is generally solid, there are a few suggested improvements for error handling and response parsing in the
transcription
method.Overall, these changes are a valuable addition to the codebase, extending the API's functionality to include audio transcription services.
app/locales/cn.ts (3)
542-551
: LGTM: New STT (Speech-to-Text) section addedThe new STT section has been successfully added to the Chinese localization file. It includes appropriate translations for enabling the STT feature and selecting the conversion engine. This addition aligns well with the integration of voice recognition features mentioned in the PR summary.
95-97
: LGTM: Chat section updated for voice input functionalityThe Chat section has been successfully updated to include translations related to voice input functionality. The changes to StartSpeak, CloseSpeak, and StopSpeak properties are clear and appropriate. The StopSpeak property now includes an informative message about the recording state, which enhances user experience.
Line range hint
1-1000
: Summary: Successful integration of voice recognition features in Chinese localizationThe changes made to
app/locales/cn.ts
effectively integrate voice recognition features into the Chinese localization. The additions include:
- A new STT (Speech-to-Text) section in the Settings object, providing translations for enabling the feature and selecting the conversion engine.
- Updated translations in the Chat section for starting, closing, and stopping voice input.
These modifications align well with the PR objectives and maintain consistency with the existing code structure and style. The translations are clear and appropriate for Chinese users.
app/locales/en.ts (1)
Line range hint
1-1064
: LGTM with minor suggestionsThe changes made to the localization file are appropriate for supporting the new STT (Speech-to-Text) feature. The structure and naming conventions are consistent with the rest of the file. Please consider the two suggestions made earlier:
- Fix the inconsistency in the STT Engine subtitle.
- Consider renaming the
StopSpeak
property to better reflect its dual purpose for speaking and recording states.Once these minor issues are addressed, the localization changes will be fully aligned with the new STT functionality.
app/components/chat.tsx (4)
607-614
:⚠️ Potential issueRemove console.log in production code
There's a
console.log
statement in theonRecognitionEnd
function. Consider removing it or replacing it with a more appropriate logging mechanism for production code.const onRecognitionEnd = (finalTranscript: string) => { - console.log(finalTranscript); if (finalTranscript) { props.setUserInput((prevInput) => prevInput + finalTranscript); } if (config.sttConfig.engine !== DEFAULT_STT_ENGINE) setIsTranscription(false); };
Likely invalid or redundant comment.
587-606
: 🛠️ Refactor suggestionImprove focus management in startListening and stopListening
Both
startListening
andstopListening
functions set focus to the chat input. This might not be ideal for all users, especially those using screen readers. Consider making this behavior configurable:+ const [autoFocusAfterListening, setAutoFocusAfterListening] = useState(true); const startListening = async () => { // ... existing code ... - document.getElementById("chat-input")?.focus(); + if (autoFocusAfterListening) { + document.getElementById("chat-input")?.focus(); + } }; const stopListening = async () => { // ... existing code ... - document.getElementById("chat-input")?.focus(); + if (autoFocusAfterListening) { + document.getElementById("chat-input")?.focus(); + } };Add a setting in the user preferences to control this behavior.
Likely invalid or redundant comment.
13-14: ⚠️ Potential issue

Fix typo in icon import paths

There's a typo in the file paths for the newly imported voice icons. "vioce" should be "voice".

Please apply the following changes:

```diff
-import VoiceOpenIcon from "../icons/vioce-open.svg";
-import VoiceCloseIcon from "../icons/vioce-close.svg";
+import VoiceOpenIcon from "../icons/voice-open.svg";
+import VoiceCloseIcon from "../icons/voice-close.svg";
```

Likely invalid or redundant comment.
565-578: ⚠️ Potential issue

Consider adding dependencies to useEffect

The useEffect hook initializing the speech API has an empty dependency array. This means it will only run once when the component mounts. If `config.sttConfig.engine` or `config.sttConfig.lang` changes, the effect won't re-run.

Consider adding the necessary dependencies:

```diff
 useEffect(() => {
   if (isFirefox()) config.sttConfig.engine = FIREFOX_DEFAULT_STT_ENGINE;
   const lang = config.sttConfig.lang;
   setSpeechApi(
     config.sttConfig.engine !== DEFAULT_STT_ENGINE
       ? new WebTranscriptionApi(
           (transcription) => onRecognitionEnd(transcription),
           lang,
         )
       : new OpenAITranscriptionApi((transcription) =>
           onRecognitionEnd(transcription),
         ),
   );
-}, []);
+}, [config.sttConfig.engine, config.sttConfig.lang]);
```

Also, consider memoizing the `onRecognitionEnd` function with useCallback to prevent unnecessary re-creations of the speech API.

Likely invalid or redundant comment.
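As a hedged sketch of that memoization — the dependency list below is an assumption based on the snippet above, so verify it against the component's actual closure:

```typescript
import { useCallback } from "react";

// Inside the component body; props, config, setIsTranscription, and
// DEFAULT_STT_ENGINE come from the surrounding component as shown above.
const onRecognitionEnd = useCallback(
  (finalTranscript: string) => {
    if (finalTranscript) {
      props.setUserInput((prevInput) => prevInput + finalTranscript);
    }
    if (config.sttConfig.engine !== DEFAULT_STT_ENGINE) setIsTranscription(false);
  },
  // Assumed dependencies; adjust if the real callback closes over more state.
  [props.setUserInput, config.sttConfig.engine],
);
```

With this in place, the speech API effect above can safely list `onRecognitionEnd` in its own dependency array without the API being re-created on every render.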
```typescript
async transcription(options: TranscriptionOptions): Promise<string> {
  const formData = new FormData();
  formData.append("file", options.file, "audio.wav");
  formData.append("model", options.model ?? "whisper-1");
  if (options.language) formData.append("language", options.language);
  if (options.prompt) formData.append("prompt", options.prompt);
  if (options.response_format)
    formData.append("response_format", options.response_format);
  if (options.temperature)
    formData.append("temperature", options.temperature.toString());

  console.log("[Request] openai audio transcriptions payload: ", options);

  const controller = new AbortController();
  options.onController?.(controller);

  try {
    const path = this.path(OpenaiPath.TranscriptionPath);
    const headers = getHeaders(true);
    const payload = {
      method: "POST",
      body: formData,
      signal: controller.signal,
      headers: headers,
    };

    // make a fetch request
    const requestTimeoutId = setTimeout(
      () => controller.abort(),
      REQUEST_TIMEOUT_MS,
    );
    const res = await fetch(path, payload);
    clearTimeout(requestTimeoutId);
    const json = await res.json();
    return json.text;
  } catch (e) {
    console.log("[Request] failed to make a audio transcriptions request", e);
    throw e;
  }
}
```
🛠️ Refactor suggestion
Enhance error handling and response parsing in the `transcription` method

The implementation of the `transcription` method looks good overall, but there are a few areas for improvement:

- Error handling could be more specific to provide better debugging information.
- The response parsing assumes the presence of a `text` property without checking.
- The `headers` object is assigned with the redundant `headers: headers` form instead of the object shorthand.
I acknowledge the existing comment about these issues. To address them, consider applying the following improvements:

1. Enhance error handling:

```diff
 } catch (e) {
-  console.log("[Request] failed to make a audio transcriptions request", e);
-  throw e;
+  console.error("[Request] failed to make an audio transcriptions request", e);
+  throw new Error(`Transcription request failed: ${e.message}`);
 }
```

2. Add a response status check before parsing:

```diff
 const res = await fetch(path, payload);
 clearTimeout(requestTimeoutId);
+if (!res.ok) {
+  throw new Error(`Transcription request failed with status ${res.status}`);
+}
 const json = await res.json();
-return json.text;
+return json.text ?? '';
```

3. Use the object shorthand for `headers` in the fetch payload:

```diff
 const payload = {
   method: "POST",
   body: formData,
   signal: controller.signal,
-  headers: headers,
+  headers,
 };
```
These changes will improve the robustness and reliability of the `transcription` method.

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested replacement:

```typescript
async transcription(options: TranscriptionOptions): Promise<string> {
  const formData = new FormData();
  formData.append("file", options.file, "audio.wav");
  formData.append("model", options.model ?? "whisper-1");
  if (options.language) formData.append("language", options.language);
  if (options.prompt) formData.append("prompt", options.prompt);
  if (options.response_format)
    formData.append("response_format", options.response_format);
  if (options.temperature)
    formData.append("temperature", options.temperature.toString());

  console.log("[Request] openai audio transcriptions payload: ", options);

  const controller = new AbortController();
  options.onController?.(controller);

  try {
    const path = this.path(OpenaiPath.TranscriptionPath);
    const headers = getHeaders(true);
    const payload = {
      method: "POST",
      body: formData,
      signal: controller.signal,
      headers,
    };

    // make a fetch request
    const requestTimeoutId = setTimeout(
      () => controller.abort(),
      REQUEST_TIMEOUT_MS,
    );
    const res = await fetch(path, payload);
    clearTimeout(requestTimeoutId);
    if (!res.ok) {
      throw new Error(`Transcription request failed with status ${res.status}`);
    }
    const json = await res.json();
    return json.text ?? '';
  } catch (e) {
    console.error("[Request] failed to make an audio transcriptions request", e);
    throw new Error(`Transcription request failed: ${e.message}`);
  }
}
```
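For context, a hedged sketch of how a caller might feed this method from the browser. The `MediaRecorder` wiring is standard Web API usage; the minimal `TranscriptionClient` type and the five-second recording window are illustrative assumptions, not part of this PR:

```typescript
// Minimal shape of the client we call into (an assumption for this sketch).
type TranscriptionClient = {
  transcription(options: { file: Blob }): Promise<string>;
};

// Record a short clip from the microphone and transcribe it.
async function recordAndTranscribe(api: TranscriptionClient): Promise<string> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);
  const stopped = new Promise<void>((resolve) => {
    recorder.onstop = () => resolve();
  });

  recorder.start();
  await new Promise((r) => setTimeout(r, 5000)); // record ~5 seconds
  recorder.stop();
  await stopped;
  stream.getTracks().forEach((t) => t.stop()); // release the microphone

  // Note: the actual container is browser-dependent (often webm/opus);
  // the method above names the upload "audio.wav" regardless.
  const file = new Blob(chunks, { type: recorder.mimeType });
  return api.transcription({ file });
}
```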
```typescript
CloseSpeak: "Stop Speak",
StopSpeak: "Recording...",
```
🛠️ Refactor suggestion
Consider renaming the `StopSpeak` property

The `StopSpeak` property has been updated to include a "Recording..." state, which is likely related to the new STT functionality. However, the property name "StopSpeak" might be confusing when used for both speaking and recording states.

Consider renaming `StopSpeak` to something more generic that covers both speaking and recording actions, such as `CloseAudio` or `StopAudioInteraction`. For example:

```diff
-CloseSpeak: "Stop Speak",
-StopSpeak: "Recording...",
+CloseAudio: "Stop Speak",
+StopAudioInteraction: "Recording...",
```

This change would make the property names more descriptive and less ambiguous when used in different contexts.
Committable suggestion was skipped due to low confidence.
```typescript
STT: {
  Enable: {
    Title: "Enable STT",
    SubTitle: "Enable Speech-to-Text",
  },
  Engine: {
    Title: "STT Engine",
    SubTitle: "Text-to-Speech Engine",
  },
},
```
Fix inconsistency in STT Engine subtitle

The `STT` section has been added successfully, but there's an inconsistency in the `Engine.SubTitle` property. It currently refers to "Text-to-Speech" instead of "Speech-to-Text".

Please update the `Engine.SubTitle` to correctly reflect Speech-to-Text functionality:

```diff
 STT: {
   Enable: {
     Title: "Enable STT",
     SubTitle: "Enable Speech-to-Text",
   },
   Engine: {
     Title: "STT Engine",
-    SubTitle: "Text-to-Speech Engine",
+    SubTitle: "Speech-to-Text Engine",
   },
 },
```
📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested replacement:

```typescript
STT: {
  Enable: {
    Title: "Enable STT",
    SubTitle: "Enable Speech-to-Text",
  },
  Engine: {
    Title: "STT Engine",
    SubTitle: "Speech-to-Text Engine",
  },
},
```
```typescript
function playSound(fileName: string) {
  const audio = new Audio(fileName);
  audio.play().catch((error) => {
    console.error("error:", error);
  });
}
```
🛠️ Refactor suggestion
Improve error handling in playSound function

The `playSound` function catches errors but only logs them to the console. Consider adding user-facing error handling:

```diff
 function playSound(fileName: string) {
   const audio = new Audio(fileName);
   audio.play().catch((error) => {
     console.error("error:", error);
+    showToast(Locale.Chat.AudioPlayError);
   });
 }
```

Also, consider adding a check to ensure the audio file exists before attempting to play it.
Committable suggestion was skipped due to low confidence.
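One way to do that existence check, sketched with a `HEAD` probe — this assumes the sound files are served from the same origin, and `showToast`/`Locale.Chat.AudioPlayError` are the (assumed) names from the diff above:

```typescript
async function playSound(fileName: string) {
  try {
    // Probe for the asset first so a missing file fails fast and visibly.
    const probe = await fetch(fileName, { method: "HEAD" });
    if (!probe.ok) {
      throw new Error(`sound file not found: ${fileName} (${probe.status})`);
    }
    await new Audio(fileName).play();
  } catch (error) {
    console.error("error:", error);
    // showToast(Locale.Chat.AudioPlayError); // assumed locale key, per the suggestion above
  }
}
```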
TTS and STT related features may be suspended for a while
💻 Change Type

🔀 Description of Change

📝 Additional Information
Summary by CodeRabbit

- New Features
- Bug Fixes
- Chores