-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Improve speech-to-text setup UX and error passing #4587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
🦋 Changeset detectedLatest commit: a25cb52 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
4121827 to
89d7428
Compare
89d7428 to
56c428f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ No Issues Found
8 files reviewed | Confidence: 90% | Recommendation: Merge
Review Details
Files Reviewed:
webview-ui/src/components/chat/STTSetupPopover.tsx(NEW) - Well-structured popover componentwebview-ui/src/hooks/useSTT.ts- Optimistic state pattern properly implementedwebview-ui/src/components/chat/ChatTextArea.tsx- Main integration, on-demand availability checksrc/services/stt/OpenAIWhisperClient.ts- WebSocket null safety improvementssrc/core/webview/ClineProvider.ts- STT status removed from global statesrc/core/webview/speechToTextCheck.ts- Caching logic removedsrc/core/webview/webviewMessageHandler.ts- New message handler addedwebview-ui/src/components/chat/MicrophoneButton.tsx- Visual-only disabled state
Checked: Security, bugs, performance, error handling, null safety
Key Observations:
- ✅
setTimeout(() => { onSend() })without delay is intentional - defers to next event loop tick to ensure state update completes - ✅
MicrophoneButtondisabledprop now controls visual styling only - button remains clickable to show setup popover (intentional UX improvement) - ✅ WebSocket null checks in
OpenAIWhisperClient.tsproperly guard against nullthis.ws - ✅ Initial
undefinedstate forspeechToTextStatuscorrectly triggers popover until availability is confirmed - ✅ On-demand availability checking pattern is cleaner than eager loading at global state init
Localization: 20+ language files updated with consistent translation keys
- Remove cachedResult variable and CACHE_DURATION_MS constant - Remove cache checking logic and forceRecheck parameter - Always perform fresh check without caching - Update function documentation to reflect no caching behavior Remove speechToTextStatus from getStateToPostToWebview - Remove expensive checkSpeechToTextAvailable call from state generation - Remove speechToTextStatus from returned state object - Remove unused import - Status will now be fetched on-demand instead of on every state update Add speechToTextStatus field to ExtensionMessage interface - Add field for speechToTextStatusResponse message type - Place near other STT-related fields for consistency Add checkSpeechToTextAvailable message handler - Add handler case for checkSpeechToTextAvailable message type - Dynamically import checkSpeechToTextAvailable function - Send speechToTextStatusResponse with status to webview - Place handler near other STT-related handlers Update ChatTextArea to use local state for speechToTextStatus - Remove speechToTextStatus from useExtensionState() destructuring - Add local useState for speechToTextStatus - Add useEffect to request STT check on mount (only if experiment enabled) - Add message handler for speechToTextStatusResponse to update local state Add onMouseEnter prop to MicrophoneButton component - Add onMouseEnter optional prop to MicrophoneButtonProps interface - Pass onMouseEnter handler to button element - Enables hover-triggered STT availability checks Implement hover handler for microphone button - Add handleMicrophoneHover callback that requests STT availability check - Connect handler to MicrophoneButton onMouseEnter prop - Enables real-time status updates when user hovers over microphone icon Rename checkSpeechToTextAvailable to stt:checkAvailability - Rename message type to follow stt: prefix convention - Update handler case in webviewMessageHandler - Update message calls in ChatTextArea (useEffect and hover handler) - Consistent with other STT events (stt:start, stt:stop, stt:cancel) Rename speechToTextStatusResponse to stt:statusResponse - Rename response message type to follow stt: prefix convention - Update handler in webviewMessageHandler - Update message listener in ChatTextArea - Consistent with other STT message types Move stt:statusResponse handler to separate useEvent block - Separate STT status handler from TTS handlers - Add comment explaining it's separate from recording events in useSTT hook - Remove debug console.log from useEffect - Better organization: TTS handlers, then STT status handler Create STTSetupPopover component and update translations - Create new STTSetupPopover component for interactive STT setup help - Update English translations: simplify FFmpeg message, add popover strings - Update Arabic translations: add popover strings - Component shows error message and help actions based on availability reason Update FFmpeg help to use Trans component with clickable link - Change FFmpeg help from Button to Trans component with VSCodeLink - Only 'Click here' text is clickable, rest is plain text - Update translation strings to include <ffmpegLink> placeholder - Matches pattern used in other components like TelemetryBanner Update MicrophoneButton to remove tooltipContent and onMouseEnter props - Remove tooltipContent prop (no longer needed for disabled state) - Remove onMouseEnter prop (hover check removed) - Button is always clickable (removed disabled attribute) - Keep visual disabled styling via className when disabled prop is true Integrate STTSetupPopover into ChatTextArea - Add popover state management - Update handleMicrophoneClick to open popover when STT unavailable - Remove handleMicrophoneHover (no longer needed) - Wrap MicrophoneButton with STTSetupPopover - Implement handleFfmpegHelpClick to send help message to chat - Remove tooltipContent and onMouseEnter props from MicrophoneButton usage Improve both unavailable case in STTSetupPopover - Show both FFmpeg and OpenAI help options when both are missing - Display help actions in a flex column layout - Provides complete setup guidance for users missing both requirements Add optimistic UI updates for microphone button and fix text clearing issue - Add optimistic state management in useSTT hook for immediate UI feedback - Update microphone button icon immediately on click without waiting for backend - Fix text clearing issue by keeping live transcript visible after stop until onComplete - Sync optimistic state with backend state when events arrive - Remove debug logging while keeping optimistic functionality Enhance STTSetupPopover with detailed setup guidance and translations - Introduce STTSetupPopoverContent for improved structure and clarity - Update translations for multiple languages to include detailed setup instructions - Modify popover behavior to show relevant help based on missing requirements (FFmpeg and OpenAI) - Remove redundant code and streamline the component for better maintainability
- Update OpenAIWhisperClient to ensure WebSocket event listeners are removed only if initialized. - Add error handling in ChatTextArea to set and clear STT error state during recording. - Integrate error display in STTSetupPopover for improved user feedback on STT setup issues. - Sync optimistic UI state in useSTT hook for immediate feedback on recording actions.
56c428f to
fc20b1f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ No Issues Found
13 source files reviewed | Confidence: 95% | Recommendation: Merge
Review Details
Files Reviewed:
webview-ui/src/components/chat/STTSetupPopover.tsx(new)webview-ui/src/components/chat/ChatTextArea.tsxwebview-ui/src/components/chat/MicrophoneButton.tsxwebview-ui/src/hooks/useSTT.tssrc/core/webview/speechToTextCheck.tssrc/core/webview/webviewMessageHandler.tssrc/core/webview/ClineProvider.tssrc/services/stt/OpenAIWhisperClient.tssrc/shared/ExtensionMessage.tssrc/shared/WebviewMessage.tsapps/storybook/stories/STTSetupPopover.stories.tsx(new)- 20+ localization files
Checked:
- ✅ Security: No vulnerabilities found. API keys handled through existing secure channels.
- ✅ Bug detection: Optimistic state pattern correctly implemented with proper cleanup.
- ✅ Code quality: Proper TypeScript types, consistent patterns, good separation of concerns.
- ✅ Performance: On-demand availability checking is appropriate for the UX improvement.
- ✅ Error handling: WebSocket cleanup properly handled with try-catch.
Architecture Notes:
- Moving STT availability checking from global state to on-demand is a good design choice
- The optimistic UI pattern in
useSTT.tsprovides immediate feedback while waiting for actual status - Proper cleanup of event listeners and timeouts throughout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ No New Issues in Latest Changes
Changes since last review (commit 56c428fd → fc20b1fa) look good.
| File | Changes | Status |
|---|---|---|
OpenAIWhisperClient.ts |
WebSocket null safety, cleanup on error | ✅ Good |
ChatTextArea.tsx |
Error state management | ✅ Good |
STTSetupPopover.tsx |
Error display integration | ✅ Good |
useSTT.ts |
Optimistic state sync on stop | ✅ Good |
Review Details
Latest Commit Analysis:
- ✅ WebSocket null checks properly guard against null
this.wsusing optional chaining - ✅ Error cleanup in catch block (
removeAllListeners(),close()) prevents resource leaks - ✅ Timeout reduction (10s → 5s) is reasonable for connection establishment
- ✅ Error state properly flows from
useSTT→ChatTextArea→STTSetupPopover - ✅ Optimistic state immediately syncs on
stt:stoppedfor accurate UI
Checked: Null safety, error handling, resource cleanup, state management
- Improve error handling in STTService to ensure frontend is notified of errors during STT start. - Add defensive measures to cleanup resources on errors, preserving sessionId for accurate error reporting. - Update STTSetupPopover to integrate FFmpeg help functionality, allowing users to send help messages directly to chat. - Refactor ChatTextArea to streamline STT setup interactions and remove deprecated FFmpeg help logic.
7ce7a71 to
25c0008
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚠️ 1 Issue Found
| Severity | Issue | Location |
|---|---|---|
| SUGGESTION | Unused variable errorSessionId |
src/services/stt/STTService.ts:96-99 |
Recommendation: Minor cleanup - can be addressed before or after merge
Review Details (36 files)
Files Reviewed (since commit fc20b1fa):
src/services/stt/STTService.ts(1 suggestion)src/core/webview/sttHandlers.ts✅webview-ui/src/components/chat/STTSetupPopover.tsx✅webview-ui/src/components/chat/ChatTextArea.tsx✅src/services/stt/OpenAIWhisperClient.ts✅webview-ui/src/hooks/useSTT.ts✅- 20+ localization files ✅
.changeset/clear-hoops-melt.md✅
Checked: Security, bugs, error handling, resource cleanup, state management
Key Observations:
- ✅ Enhanced error handling in
STTService.tsproperly setsisActive = falseimmediately to prevent race conditions - ✅ Defensive error notification in
sttHandlers.tsensures frontend is notified even if service error handling fails - ✅ FFmpeg help integration in
STTSetupPopover.tsxcorrectly usessetTimeoutto deferonSend()after state update - ✅ WebSocket cleanup in
OpenAIWhisperClient.tsproperly handles null checks with optional chaining - ✅ Optimistic UI state in
useSTT.tsimmediately syncs on stop for accurate feedback
…tton components - Simplify cleanupOnError method in STTService by consolidating cleanup tasks into a single Promise.allSettled call. - Update ChatTextArea to manage speech-to-text availability state through the MicrophoneButton, removing deprecated logic. - Introduce onStatusChange prop in MicrophoneButton to handle STT status updates and check availability on hover.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ No New Issues in Latest Changes
Changes since last review (commit 7ce7a71 → f93855b7) look good.
| File | Changes | Status |
|---|---|---|
STTService.ts |
Simplified cleanupOnError with Promise.allSettled |
✅ Good |
ChatTextArea.tsx |
Moved STT status management to MicrophoneButton | ✅ Good |
MicrophoneButton.tsx |
Added onStatusChange prop, availability check on hover |
✅ Good |
Review Details
Latest Commit Analysis:
- ✅
cleanupOnErrorrefactored to use cleanerPromise.allSettledpattern with inline.catch(() => {})handlers - ✅ STT availability state properly lifted from global state to local component state
- ✅
MicrophoneButtoncorrectly checks availability on mount viauseEffectand on hover viaonMouseEnter - ✅
useEventhook properly listens forstt:statusResponsemessages
Previous Issue Status:
- The unused
errorSessionIdvariable (SUGGESTION level) flagged in previous review is still present - existing comment remains active
Checked: Code quality, error handling, state management, component architecture
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ Previous Issue Addressed
The unused errorSessionId variable flagged in my previous review has been removed in commit 4833a08.
No new issues found in the latest changes.
Review Details
Latest Commit: 4833a08 - "address bot comment"
Changes: Removed unused variable declaration in STTService.ts
Status: Ready for merge
4833a08 to
a25cb52
Compare
Currently, the speech-to-text experiment is a little tricky to set up since it requires both FFmpeg and a valid OpenAI provider