Skip to content

Conversation

@hassoncs
Copy link
Contributor

@hassoncs hassoncs commented Dec 19, 2025

Currently, the speech-to-text experiment is a little tricky to set up since it requires both FFmpeg and a valid OpenAI provider

  • add a popover to help users figure out what they need to do in order to get speech-to-text working
  • Check for STT availability anytime you mouse over the microphone
  • Pass errors that happen in the Whisper socket all the way through to the popover
  • Optimistically update the state of the recording button so there's no UI delay
CleanShot 2025-12-19 at 11 37 18

@changeset-bot
Copy link

changeset-bot bot commented Dec 19, 2025

🦋 Changeset detected

Latest commit: a25cb52

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
kilo-code Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@hassoncs hassoncs force-pushed the stt-setup-helper branch 2 times, most recently from 4121827 to 89d7428 Compare December 19, 2025 20:27
@hassoncs hassoncs marked this pull request as ready for review December 19, 2025 20:27
Copy link
Contributor

@kiloconnect kiloconnect bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ No Issues Found

8 files reviewed | Confidence: 90% | Recommendation: Merge

Review Details

Files Reviewed:

  • webview-ui/src/components/chat/STTSetupPopover.tsx (NEW) - Well-structured popover component
  • webview-ui/src/hooks/useSTT.ts - Optimistic state pattern properly implemented
  • webview-ui/src/components/chat/ChatTextArea.tsx - Main integration, on-demand availability check
  • src/services/stt/OpenAIWhisperClient.ts - WebSocket null safety improvements
  • src/core/webview/ClineProvider.ts - STT status removed from global state
  • src/core/webview/speechToTextCheck.ts - Caching logic removed
  • src/core/webview/webviewMessageHandler.ts - New message handler added
  • webview-ui/src/components/chat/MicrophoneButton.tsx - Visual-only disabled state

Checked: Security, bugs, performance, error handling, null safety

Key Observations:

  1. setTimeout(() => { onSend() }) without delay is intentional - defers to next event loop tick to ensure state update completes
  2. MicrophoneButton disabled prop now controls visual styling only - button remains clickable to show setup popover (intentional UX improvement)
  3. ✅ WebSocket null checks in OpenAIWhisperClient.ts properly guard against null this.ws
  4. ✅ Initial undefined state for speechToTextStatus correctly triggers popover until availability is confirmed
  5. ✅ On-demand availability checking pattern is cleaner than eager loading at global state init

Localization: 20+ language files updated with consistent translation keys

- Remove cachedResult variable and CACHE_DURATION_MS constant
- Remove cache checking logic and forceRecheck parameter
- Always perform fresh check without caching
- Update function documentation to reflect no caching behavior

Remove speechToTextStatus from getStateToPostToWebview

- Remove expensive checkSpeechToTextAvailable call from state generation
- Remove speechToTextStatus from returned state object
- Remove unused import
- Status will now be fetched on-demand instead of on every state update

Add speechToTextStatus field to ExtensionMessage interface

- Add field for speechToTextStatusResponse message type
- Place near other STT-related fields for consistency

Add checkSpeechToTextAvailable message handler

- Add handler case for checkSpeechToTextAvailable message type
- Dynamically import checkSpeechToTextAvailable function
- Send speechToTextStatusResponse with status to webview
- Place handler near other STT-related handlers

Update ChatTextArea to use local state for speechToTextStatus

- Remove speechToTextStatus from useExtensionState() destructuring
- Add local useState for speechToTextStatus
- Add useEffect to request STT check on mount (only if experiment enabled)
- Add message handler for speechToTextStatusResponse to update local state

Add onMouseEnter prop to MicrophoneButton component

- Add onMouseEnter optional prop to MicrophoneButtonProps interface
- Pass onMouseEnter handler to button element
- Enables hover-triggered STT availability checks

Implement hover handler for microphone button

- Add handleMicrophoneHover callback that requests STT availability check
- Connect handler to MicrophoneButton onMouseEnter prop
- Enables real-time status updates when user hovers over microphone icon

Rename checkSpeechToTextAvailable to stt:checkAvailability

- Rename message type to follow stt: prefix convention
- Update handler case in webviewMessageHandler
- Update message calls in ChatTextArea (useEffect and hover handler)
- Consistent with other STT events (stt:start, stt:stop, stt:cancel)

Rename speechToTextStatusResponse to stt:statusResponse

- Rename response message type to follow stt: prefix convention
- Update handler in webviewMessageHandler
- Update message listener in ChatTextArea
- Consistent with other STT message types

Move stt:statusResponse handler to separate useEvent block

- Separate STT status handler from TTS handlers
- Add comment explaining it's separate from recording events in useSTT hook
- Remove debug console.log from useEffect
- Better organization: TTS handlers, then STT status handler

Create STTSetupPopover component and update translations

- Create new STTSetupPopover component for interactive STT setup help
- Update English translations: simplify FFmpeg message, add popover strings
- Update Arabic translations: add popover strings
- Component shows error message and help actions based on availability reason

Update FFmpeg help to use Trans component with clickable link

- Change FFmpeg help from Button to Trans component with VSCodeLink
- Only 'Click here' text is clickable, rest is plain text
- Update translation strings to include <ffmpegLink> placeholder
- Matches pattern used in other components like TelemetryBanner

Update MicrophoneButton to remove tooltipContent and onMouseEnter props

- Remove tooltipContent prop (no longer needed for disabled state)
- Remove onMouseEnter prop (hover check removed)
- Button is always clickable (removed disabled attribute)
- Keep visual disabled styling via className when disabled prop is true

Integrate STTSetupPopover into ChatTextArea

- Add popover state management
- Update handleMicrophoneClick to open popover when STT unavailable
- Remove handleMicrophoneHover (no longer needed)
- Wrap MicrophoneButton with STTSetupPopover
- Implement handleFfmpegHelpClick to send help message to chat
- Remove tooltipContent and onMouseEnter props from MicrophoneButton usage

Improve both unavailable case in STTSetupPopover

- Show both FFmpeg and OpenAI help options when both are missing
- Display help actions in a flex column layout
- Provides complete setup guidance for users missing both requirements

Add optimistic UI updates for microphone button and fix text clearing issue

- Add optimistic state management in useSTT hook for immediate UI feedback
- Update microphone button icon immediately on click without waiting for backend
- Fix text clearing issue by keeping live transcript visible after stop until onComplete
- Sync optimistic state with backend state when events arrive
- Remove debug logging while keeping optimistic functionality

Enhance STTSetupPopover with detailed setup guidance and translations

- Introduce STTSetupPopoverContent for improved structure and clarity
- Update translations for multiple languages to include detailed setup instructions
- Modify popover behavior to show relevant help based on missing requirements (FFmpeg and OpenAI)
- Remove redundant code and streamline the component for better maintainability
- Update OpenAIWhisperClient to ensure WebSocket event listeners are removed only if initialized.
- Add error handling in ChatTextArea to set and clear STT error state during recording.
- Integrate error display in STTSetupPopover for improved user feedback on STT setup issues.
- Sync optimistic UI state in useSTT hook for immediate feedback on recording actions.
Copy link
Contributor

@kiloconnect kiloconnect bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ No Issues Found

13 source files reviewed | Confidence: 95% | Recommendation: Merge

Review Details

Files Reviewed:

  • webview-ui/src/components/chat/STTSetupPopover.tsx (new)
  • webview-ui/src/components/chat/ChatTextArea.tsx
  • webview-ui/src/components/chat/MicrophoneButton.tsx
  • webview-ui/src/hooks/useSTT.ts
  • src/core/webview/speechToTextCheck.ts
  • src/core/webview/webviewMessageHandler.ts
  • src/core/webview/ClineProvider.ts
  • src/services/stt/OpenAIWhisperClient.ts
  • src/shared/ExtensionMessage.ts
  • src/shared/WebviewMessage.ts
  • apps/storybook/stories/STTSetupPopover.stories.tsx (new)
  • 20+ localization files

Checked:

  • ✅ Security: No vulnerabilities found. API keys handled through existing secure channels.
  • ✅ Bug detection: Optimistic state pattern correctly implemented with proper cleanup.
  • ✅ Code quality: Proper TypeScript types, consistent patterns, good separation of concerns.
  • ✅ Performance: On-demand availability checking is appropriate for the UX improvement.
  • ✅ Error handling: WebSocket cleanup properly handled with try-catch.

Architecture Notes:

  • Moving STT availability checking from global state to on-demand is a good design choice
  • The optimistic UI pattern in useSTT.ts provides immediate feedback while waiting for actual status
  • Proper cleanup of event listeners and timeouts throughout

Copy link
Contributor

@kiloconnect kiloconnect bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ No New Issues in Latest Changes

Changes since last review (commit 56c428fdfc20b1fa) look good.

File Changes Status
OpenAIWhisperClient.ts WebSocket null safety, cleanup on error ✅ Good
ChatTextArea.tsx Error state management ✅ Good
STTSetupPopover.tsx Error display integration ✅ Good
useSTT.ts Optimistic state sync on stop ✅ Good
Review Details

Latest Commit Analysis:

  • ✅ WebSocket null checks properly guard against null this.ws using optional chaining
  • ✅ Error cleanup in catch block (removeAllListeners(), close()) prevents resource leaks
  • ✅ Timeout reduction (10s → 5s) is reasonable for connection establishment
  • ✅ Error state properly flows from useSTTChatTextAreaSTTSetupPopover
  • ✅ Optimistic state immediately syncs on stt:stopped for accurate UI

Checked: Null safety, error handling, resource cleanup, state management

- Improve error handling in STTService to ensure frontend is notified of errors during STT start.
- Add defensive measures to cleanup resources on errors, preserving sessionId for accurate error reporting.
- Update STTSetupPopover to integrate FFmpeg help functionality, allowing users to send help messages directly to chat.
- Refactor ChatTextArea to streamline STT setup interactions and remove deprecated FFmpeg help logic.
Copy link
Contributor

@kiloconnect kiloconnect bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ 1 Issue Found

Severity Issue Location
SUGGESTION Unused variable errorSessionId src/services/stt/STTService.ts:96-99

Recommendation: Minor cleanup - can be addressed before or after merge

Review Details (36 files)

Files Reviewed (since commit fc20b1fa):

  • src/services/stt/STTService.ts (1 suggestion)
  • src/core/webview/sttHandlers.ts
  • webview-ui/src/components/chat/STTSetupPopover.tsx
  • webview-ui/src/components/chat/ChatTextArea.tsx
  • src/services/stt/OpenAIWhisperClient.ts
  • webview-ui/src/hooks/useSTT.ts
  • 20+ localization files ✅
  • .changeset/clear-hoops-melt.md

Checked: Security, bugs, error handling, resource cleanup, state management

Key Observations:

  1. ✅ Enhanced error handling in STTService.ts properly sets isActive = false immediately to prevent race conditions
  2. ✅ Defensive error notification in sttHandlers.ts ensures frontend is notified even if service error handling fails
  3. ✅ FFmpeg help integration in STTSetupPopover.tsx correctly uses setTimeout to defer onSend() after state update
  4. ✅ WebSocket cleanup in OpenAIWhisperClient.ts properly handles null checks with optional chaining
  5. ✅ Optimistic UI state in useSTT.ts immediately syncs on stop for accurate feedback

…tton components

- Simplify cleanupOnError method in STTService by consolidating cleanup tasks into a single Promise.allSettled call.
- Update ChatTextArea to manage speech-to-text availability state through the MicrophoneButton, removing deprecated logic.
- Introduce onStatusChange prop in MicrophoneButton to handle STT status updates and check availability on hover.
Copy link
Contributor

@kiloconnect kiloconnect bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ No New Issues in Latest Changes

Changes since last review (commit 7ce7a71f93855b7) look good.

File Changes Status
STTService.ts Simplified cleanupOnError with Promise.allSettled ✅ Good
ChatTextArea.tsx Moved STT status management to MicrophoneButton ✅ Good
MicrophoneButton.tsx Added onStatusChange prop, availability check on hover ✅ Good
Review Details

Latest Commit Analysis:

  • cleanupOnError refactored to use cleaner Promise.allSettled pattern with inline .catch(() => {}) handlers
  • ✅ STT availability state properly lifted from global state to local component state
  • MicrophoneButton correctly checks availability on mount via useEffect and on hover via onMouseEnter
  • useEvent hook properly listens for stt:statusResponse messages

Previous Issue Status:

  • The unused errorSessionId variable (SUGGESTION level) flagged in previous review is still present - existing comment remains active

Checked: Code quality, error handling, state management, component architecture

@hassoncs hassoncs changed the title Improve speech-to-text setup UX Improve speech-to-text setup UX and error passing Dec 19, 2025
Copy link
Contributor

@kiloconnect kiloconnect bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Previous Issue Addressed

The unused errorSessionId variable flagged in my previous review has been removed in commit 4833a08.

No new issues found in the latest changes.

Review Details

Latest Commit: 4833a08 - "address bot comment"

Changes: Removed unused variable declaration in STTService.ts

Status: Ready for merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants