This directory contains locally cached documentation for the libraries and APIs used in the EasyTranslator project.
Location: /docs/web-speech-api/
Browser-native speech recognition and synthesis API.
Files:
- README.md - Overview and quick start guide
- API.md - Complete API reference with all methods, properties, and events
- USAGE.md - Detailed usage guide with best practices and patterns
- examples/ - Working code examples
  - basic-recognition.js - Simple speech recognition
  - continuous-recognition.js - Continuous recognition with interim results
  - on-device-recognition.js - On-device processing with language pack management
  - contextual-biasing.js - Domain-specific recognition improvement
  - speech-synthesis.js - Text-to-speech with voice selection
- version.txt - Version info and compatibility notes
Key Features:
- Speech recognition (audio to text)
- Speech synthesis (text to audio)
- On-device processing (experimental)
- Contextual biasing
- Multilingual support
Browser Compatibility: Speech recognition is exposed under a vendor prefix (webkitSpeechRecognition) in Chrome and Safari, so feature-detect it before use.
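Because recognition may be prefixed or missing entirely, a feature check before use avoids runtime errors. The following is a minimal sketch; `getSpeechRecognitionCtor` and `SpeechGlobals` are hypothetical names introduced for illustration, and the global object is passed in so the check can be exercised outside a browser.

```typescript
// Hypothetical structural type: only the two globals this check inspects.
interface SpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

// Returns the unprefixed constructor if present, then the webkit-prefixed
// one, and null when the environment exposes neither.
function getSpeechRecognitionCtor(scope: SpeechGlobals): unknown {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition ?? null;
}

// Browser usage (not executed here):
//   const Ctor = getSpeechRecognitionCtor(window as SpeechGlobals)
//   if (Ctor === null) { /* hide the microphone button or use Voxtral */ }
```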
Location: /docs/mistral-voxtral/
Mistral AI's frontier speech understanding models for transcription and audio Q&A.
Files:
- README.md - Overview, model variants, and quick start
- API.md - Complete API reference with endpoints and parameters
- examples/ - Integration examples
  - basic-transcription.js - Simple audio transcription
  - transcription-with-timestamps.js - Segment-level timestamps for subtitles
  - audio-chat.js - Q&A and summarization from audio
  - function-calling.js - Voice commands triggering backend functions
  - vue-composable.ts - Vue 3 composable for EasyTranslator integration
  - edge-function.ts - Serverless function to hide API key
- version.txt - Version info, pricing, and capabilities
Model Variants:
- Voxtral (24B) - Production scale (voxtral-small-latest)
- Voxtral Mini (3B) - Edge deployment (voxtral-mini-latest)
- Voxtral Mini Transcribe - API-optimized transcription
Key Features:
- 32k token context (up to 30 minutes of audio for transcription, about 40 minutes for Q&A)
- Built-in Q&A and summarization
- Native multilingual support (auto-detection)
- Function-calling from voice
- $0.001 per minute pricing
API Endpoints:
- POST /v1/audio/transcriptions - Transcription only
- POST /v1/chat/completions - Chat with audio input
Use Web Speech API when:
- Need browser-native recognition (no API costs)
- Building voice commands for UI
- Want text-to-speech synthesis
- Privacy is critical (recognition can run on-device where supported)
- Working offline (requires on-device recognition)
Limitations:
- Server-based by default in Chrome (requires network unless on-device recognition is available)
- Less accurate than Voxtral
- No built-in translation
- Variable browser support
Use Mistral Voxtral when:
- Need high-accuracy transcription
- Want to ask questions about audio
- Building translation features
- Need function-calling from voice
- Working with long-form audio (up to 30 min)
Limitations:
- Requires API key and network
- Costs $0.001 per minute
- Not available offline
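The two checklists above can be condensed into a small decision helper. This is only a sketch of the trade-offs as listed; `chooseEngine`, `Engine`, and `Needs` are hypothetical names, and a real application would likely weigh cost and privacy as well.

```typescript
type Engine = "web-speech" | "voxtral";

// Hypothetical description of what a given request needs.
interface Needs {
  online: boolean;        // is the network reachable?
  highAccuracy?: boolean; // accuracy matters more than cost
  audioQA?: boolean;      // questions or summaries over the audio
  longForm?: boolean;     // long recordings (up to about 30 minutes)
}

function chooseEngine(needs: Needs): Engine {
  // Voxtral is not available offline, so the Web Speech API is the only
  // candidate (and it still needs on-device recognition to work).
  if (!needs.online) return "web-speech";
  // Features the Web Speech API lacks or handles less accurately.
  if (needs.highAccuracy || needs.audioQA || needs.longForm) return "voxtral";
  // Default to the free, browser-native option.
  return "web-speech";
}
```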
EasyTranslator currently uses Voxtral Mini for transcription:
// In useAudioRecorder.ts
const formData = new FormData()
formData.append('file', audioBlob, 'recording.webm')
formData.append('model', 'voxtral-mini') // ← Current model
const response = await fetch('https://api.mistral.ai/v1/audio/transcriptions', {
  method: 'POST',
  headers: { 'x-api-key': apiKey },
  body: formData,
})
Add Web Speech API as Fallback
- Use for quick, local recognition
- Fall back to Voxtral for accuracy
- Enable offline mode
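One way the fallback could be wired up is sketched below. The recognizers are injected as plain async functions so the logic stays independent of any browser or network API; `recognizeWithFallback`, `Transcript`, and the 0.8 confidence threshold are assumptions made for illustration, not existing EasyTranslator code.

```typescript
interface Transcript {
  text: string;
  confidence: number; // 0..1, as reported by the recognizer
}

type Recognizer = () => Promise<Transcript>;

async function recognizeWithFallback(
  local: Recognizer,   // e.g. a wrapper around the Web Speech API
  remote: Recognizer,  // e.g. a wrapper around the Voxtral transcription call
  online: boolean,
  minConfidence = 0.8, // assumed threshold; tune against real recordings
): Promise<Transcript> {
  let localResult: Transcript | null = null;
  try {
    localResult = await local();
  } catch {
    // Local recognition unavailable or failed; fall through to the remote path.
  }
  // Quick local result that is good enough: use it and skip the API call.
  if (localResult !== null && localResult.confidence >= minConfidence) {
    return localResult;
  }
  // Otherwise prefer the more accurate remote transcription when reachable.
  if (online) return remote();
  // Offline mode: a low-confidence local result is better than nothing.
  if (localResult !== null) return localResult;
  throw new Error("No recognizer available while offline");
}
```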
Implement Audio Q&A
- Let users ask questions about conversations
- Summarize long recordings
- Extract action items from meetings
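A request body for audio Q&A against POST /v1/chat/completions might be assembled as in the sketch below. The `input_audio` content part carrying a base64 string reflects one reading of the chat API's audio input format and should be treated as an assumption to verify against API.md; `buildAudioQuestionBody` is a hypothetical helper name.

```typescript
// Assumed request shape; verify field names against API.md before relying on it.
interface AudioQuestionBody {
  model: string;
  messages: Array<{
    role: "user";
    content: Array<
      | { type: "input_audio"; input_audio: string }
      | { type: "text"; text: string }
    >;
  }>;
}

function buildAudioQuestionBody(
  audioBase64: string,
  question: string,
  model = "voxtral-small-latest",
): AudioQuestionBody {
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          // The recording first, then the question about it.
          { type: "input_audio", input_audio: audioBase64 },
          { type: "text", text: question },
        ],
      },
    ],
  };
}
```

The same helper covers summaries and action items by changing only the question text, for example "List the action items from this meeting."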
Enable Function Calling
- Voice commands: "Translate to French"
- Auto-detect: "Send this to my email"
- Smart actions: "Save this conversation"
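Voice commands like these could be routed through a small dispatcher once the model returns a tool call. The sketch assumes the common chat-completions convention in which a tool call carries a function name and a JSON-encoded argument string; `translateTool`, `translate_to`, and `dispatchToolCall` are hypothetical names.

```typescript
// Hypothetical tool definition in JSON-schema style, sent with the request.
const translateTool = {
  type: "function",
  function: {
    name: "translate_to",
    description: "Translate the current conversation into a target language",
    parameters: {
      type: "object",
      properties: { language: { type: "string" } },
      required: ["language"],
    },
  },
};

// Assumed shape of a tool call returned by the model.
interface ToolCall {
  name: string;
  arguments: string; // JSON-encoded arguments
}

type Handler = (args: Record<string, unknown>) => string;

// Looks up the handler by tool name and invokes it with the parsed arguments.
function dispatchToolCall(call: ToolCall, handlers: Record<string, Handler>): string {
  const handler = handlers[call.name];
  if (handler === undefined) {
    throw new Error(`Unknown tool: ${call.name}`);
  }
  return handler(JSON.parse(call.arguments) as Record<string, unknown>);
}
```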
Add Timestamp Support
- Generate subtitles from transcriptions
- Enable seeking in long recordings
- Create conversation chapters
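Subtitle generation from segment-level timestamps can be sketched as a conversion to WebVTT. The `Segment` shape (start and end in seconds, plus text) is an assumption about what a timestamped transcription returns; check it against transcription-with-timestamps.js before use.

```typescript
// Assumed segment shape: times in seconds from the start of the recording.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Formats seconds as a WebVTT timestamp, HH:MM:SS.mmm.
function formatVttTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n: number, width = 2) => String(n).padStart(width, "0");
  return `${pad(h)}:${pad(m)}:${pad(s)}.${pad(ms, 3)}`;
}

// Builds a WebVTT document: a header followed by one cue per segment.
function segmentsToVtt(segments: Segment[]): string {
  const cues = segments.map(
    (seg) => `${formatVttTime(seg.start)} --> ${formatVttTime(seg.end)}\n${seg.text.trim()}`,
  );
  return ["WEBVTT", ...cues].join("\n\n") + "\n";
}
```

The same `start` values can drive seeking: setting an audio element's `currentTime` to a segment's `start` jumps to that part of the recording.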
Optimize with Edge Functions
- Hide API key from frontend
- Add rate limiting
- Implement caching
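For the rate-limiting part, a fixed-window counter is one simple option. The sketch below keeps state in memory, which is an assumption that only holds within a single function instance; a deployed edge function would more likely need shared storage. `createRateLimiter` is a hypothetical helper, not part of edge-function.ts.

```typescript
// Fixed-window limiter: at most `limit` requests per client per `windowMs`.
function createRateLimiter(limit: number, windowMs: number) {
  const windows = new Map<string, { start: number; count: number }>();

  // `now` is injectable so the behaviour can be checked without real timers.
  return function allow(clientId: string, now: number = Date.now()): boolean {
    const current = windows.get(clientId);
    // First request from this client, or the previous window has expired.
    if (current === undefined || now - current.start >= windowMs) {
      windows.set(clientId, { start: now, count: 1 });
      return true;
    }
    if (current.count < limit) {
      current.count += 1;
      return true;
    }
    return false; // over the limit; the caller would respond with HTTP 429
  };
}
```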
Last Updated: 2025-11-26
Update Schedule:
- Check for API changes monthly
- Update examples when EasyTranslator architecture changes
- Add new examples based on user needs
How to Update:
- Visit source URLs listed in each version.txt
- Check for API changes or new features
- Update relevant markdown files
- Add new examples if needed
- Update this README with changes
- MDN: https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API
- Spec: https://wicg.github.io/speech-api/
- Announcement: https://mistral.ai/news/voxtral
- API Docs: https://docs.mistral.ai/capabilities/audio_transcription
- Pricing: https://mistral.ai/pricing
When adding new documentation:
- Follow the existing structure
- Include working code examples
- Document browser/API compatibility
- Update this README
- Add version info to version.txt