|
1 | 1 | --- |
2 | | -title: Voice Fallback Plan |
| 2 | +title: Voice fallback configuration |
3 | 3 | subtitle: Configure fallback voices that activate automatically if your primary voice fails. |
4 | 4 | slug: voice-fallback-plan |
5 | 5 | --- |
6 | 6 |
|
7 | | -<Note> |
8 | | - Voice fallback plans can currently only be configured through the API. We are working on making this available through our dashboard. |
9 | | -</Note> |
10 | | - |
11 | | -## Introduction |
| 7 | +## Overview |
12 | 8 |
|
13 | | -Voice fallback plans give you the ability to continue your call in the event that your primary voice fails. Your assistant will sequentially fallback to only the voices you configure within your plan, in the exact order you specify. |
| 9 | +Voice fallback configuration lets your call continue if your primary voice fails. Your assistant falls back sequentially to only the voices you configure in your plan, in the exact order you specify. |
14 | 10 |
|
15 | 11 | <Note> |
16 | 12 | Without a fallback plan configured, your call will end with an error in the event that your chosen voice provider fails. |
17 | 13 | </Note> |
18 | 14 |
|
19 | | -## How It Works |
| 15 | +## How it works |
20 | 16 |
|
21 | 17 | When a voice failure occurs, Vapi will: |
22 | 18 | 1. Detect the failure of the primary voice |
23 | 19 | 2. If a custom fallback plan exists: |
24 | | - - Switch to the first fallback voice in your plan |
25 | | - - Continue through your specified list if subsequent failures occur |
26 | | - - Terminate only if all voices in your plan have failed |
| 20 | + - Switch to the first fallback voice in your plan |
| 21 | + - Continue through your specified list if subsequent failures occur |
| 22 | + - Terminate only if all voices in your plan have failed |
27 | 23 |
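
The sequence described under "How it works" can be sketched roughly as follows. This is an illustration only, not Vapi's internal implementation: `VoiceConfig`, `Synthesize`, and `speakWithFallback` are hypothetical names, and `synthesize` is an assumed callback that throws when a voice provider fails.

```typescript
// Hypothetical types and names for illustration; not Vapi's actual code.
interface VoiceConfig {
  provider: string;
  voiceId: string;
}

// Assumed callback: returns audio for the text, or throws if the provider fails.
type Synthesize = (voice: VoiceConfig, text: string) => string;

function speakWithFallback(
  primary: VoiceConfig,
  fallbacks: VoiceConfig[],
  text: string,
  synthesize: Synthesize,
): { voice: VoiceConfig; audio: string } {
  // Try the primary voice first, then each fallback in the configured order.
  const candidates = [primary, ...fallbacks];
  for (const voice of candidates) {
    try {
      return { voice, audio: synthesize(voice, text) };
    } catch {
      // This voice failed; move on to the next one in the list.
    }
  }
  // Every configured voice failed (or no fallbacks were configured).
  throw new Error("All configured voices failed");
}
```

With an empty `fallbacks` array, a primary failure goes straight to the final error, which mirrors the behavior described for calls without a fallback plan.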
|
28 | | -## Configuration |
| 24 | +## Configure via Dashboard |
| 25 | + |
| 26 | +<Steps> |
| 27 | + <Step title="Open Voice tab"> |
| 28 | + Navigate to your assistant and select the **Voice** tab. |
| 29 | + </Step> |
| 30 | + <Step title="Expand Fallback Voices section"> |
| 31 | + Scroll down to find the **Fallback Voices** collapsible section. A warning indicator appears if no fallback voices are configured. |
| 32 | + </Step> |
| 33 | + <Step title="Add a fallback voice"> |
| 34 | + Click **Add Fallback Voice** to configure your first fallback: |
| 35 | + - Select a **provider** from the dropdown (supports 20+ voice providers) |
| 36 | + - Choose a **voice** from the searchable popover (shows gender, language, and deprecated status) |
| 37 | + - The **model** is automatically selected based on your voice choice |
| 38 | + </Step> |
| 39 | + <Step title="Configure provider-specific settings (optional)"> |
| 40 | + Expand **Additional Configuration** to access provider-specific settings like stability, speed, and emotion controls. |
| 41 | + </Step> |
| 42 | + <Step title="Add more fallbacks"> |
| 43 | + Repeat to add more fallback voices. Order matters: the first fallback in your list is tried first. |
| 44 | + </Step> |
| 45 | +</Steps> |
| 46 | + |
| 47 | +## Configure via API |
29 | 48 |
|
30 | 49 | Add the `fallbackPlan` property to your assistant's voice configuration, and specify the fallback voices within the `voices` property. |
31 | | -- Please note that fallback voices must be valid JSON configurations, and not strings. |
32 | | -- The order matters. Vapi will choose fallback voices starting from the beginning of the list. |
| 50 | + |
| 51 | +<Note> |
| 52 | + Each fallback voice must be a JSON object, not a string. Order matters: Vapi tries fallback voices starting from the beginning of the list. |
| 53 | +</Note> |
33 | 54 |
|
34 | 55 | ```json |
35 | 56 | { |
36 | 57 | "voice": { |
37 | 58 | "provider": "openai", |
38 | 59 | "voiceId": "shimmer", |
39 | 60 | "fallbackPlan": { |
40 | | - "voices": [ |
41 | | - { |
42 | | - "provider": "cartesia", |
43 | | - "voiceId": "248be419-c632-4f23-adf1-5324ed7dbf1d" |
44 | | - }, |
45 | | - { |
46 | | - "provider": "11labs", |
47 | | - "voiceId": "cgSgspJ2msm6clMCkdW9" |
48 | | - } |
49 | | - ] |
| 61 | + "voices": [ |
| 62 | + { |
| 63 | + "provider": "cartesia", |
| 64 | + "voiceId": "248be419-c632-4f23-adf1-5324ed7dbf1d" |
| 65 | + }, |
| 66 | + { |
| 67 | + "provider": "11labs", |
| 68 | + "voiceId": "cgSgspJ2msm6clMCkdW9", |
| 69 | + "stability": 0.5, |
| 70 | + "similarityBoost": 0.75 |
| 71 | + } |
| 72 | + ] |
50 | 73 | } |
51 | 74 | } |
52 | 75 | } |
53 | 76 | ``` |
54 | 77 |
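
A quick client-side check can catch string entries before the configuration is sent. The sketch below is not an official Vapi validator: `findInvalidFallbacks` is a hypothetical helper, and it assumes that every fallback entry carries a string `provider`, as in the JSON example above.

```typescript
// Hypothetical helper; field names mirror the JSON example above.
// Returns the indexes of fallback entries that are not voice objects.
function findInvalidFallbacks(
  voice: { fallbackPlan?: { voices?: unknown } },
): number[] {
  const raw: unknown = voice.fallbackPlan ? voice.fallbackPlan.voices : undefined;
  const voices: unknown[] = Array.isArray(raw) ? raw : [];
  const invalid: number[] = [];
  voices.forEach((entry: unknown, index: number) => {
    // A fallback voice must be an object, not a string (or null, or an array).
    const isObject =
      typeof entry === "object" && entry !== null && !Array.isArray(entry);
    // Assumption: each entry names its provider as a string.
    const hasProvider =
      isObject && typeof (entry as Record<string, unknown>).provider === "string";
    if (!hasProvider) invalid.push(index);
  });
  return invalid;
}
```

Because the helper reports indexes instead of throwing, the caller can point at the exact position in the list, which matters since list order determines fallback order.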
|
| 78 | +## Provider-specific settings |
| 79 | + |
| 80 | +Each voice provider supports different configuration options. Expand a provider's accordion below to see its available settings. |
| 81 | + |
| 82 | +<AccordionGroup> |
| 83 | + <Accordion title="ElevenLabs"> |
| 84 | + - **stability** (0-1): Controls voice consistency. Lower values allow more emotional range; higher values produce more stable output. |
| 85 | + - **similarityBoost** (0-1): Enhances similarity to the original voice. Higher values make the voice more similar to the reference. |
| 86 | + - **style** (0-1): Voice style intensity. Higher values amplify the speaker's style. |
| 87 | + - **useSpeakerBoost** (boolean): Enable to boost similarity to the original speaker. |
| 88 | + - **speed** (0.7-1.2): Speech speed multiplier. Default is 1.0. |
| 89 | + - **optimizeStreamingLatency** (0-4): Controls streaming latency optimization. Default is 3. |
| 90 | + - **enableSsmlParsing** (boolean): Enable SSML pronunciation support. |
| 91 | + - **model**: Select from `eleven_multilingual_v2`, `eleven_turbo_v2`, `eleven_turbo_v2_5`, `eleven_flash_v2`, `eleven_flash_v2_5`, or `eleven_monolingual_v1`. |
| 92 | + </Accordion> |
| 93 | + <Accordion title="Cartesia"> |
| 94 | + - **model**: Model selection (`sonic-english`, `sonic-3`, etc.). |
| 95 | + - **language**: Language code for the voice. |
| 96 | + - **experimentalControls.speed**: Speech speed adjustment (-1 to 1). Negative values slow down; positive values speed up. |
| 97 | + - **experimentalControls.emotion**: Array of emotion configurations (e.g., `["happiness:high", "curiosity:medium"]`). |
| 98 | + - **generationConfig** (sonic-3 only): |
| 99 | + - **speed** (0.6-1.5): Fine-grained speed control. |
| 100 | + - **volume** (0.5-2.0): Volume adjustment. |
| 101 | + - **experimental.accentLocalization** (0 or 1): Toggle accent localization. |
| 102 | + </Accordion> |
| 103 | + <Accordion title="Azure"> |
| 104 | + - **speed** (0.5-2): Speech rate multiplier. Default is 1.0. |
| 105 | + </Accordion> |
| 106 | + <Accordion title="OpenAI"> |
| 107 | + - **speed** (0.25-4): Speech speed multiplier. Default is 1.0. |
| 108 | + - **model**: Select from `tts-1`, `tts-1-hd`, or realtime models. |
| 109 | + - **instructions**: Voice prompt to control the generated audio style. Does not work with `tts-1` or `tts-1-hd` models. |
| 110 | + </Accordion> |
| 111 | + <Accordion title="LMNT"> |
| 112 | + - **speed** (0.25-2): Speech rate multiplier. Default is 1.0. |
| 113 | + - **language**: Two-letter ISO 639-1 language code, or `auto` for auto-detection. |
| 114 | + </Accordion> |
| 115 | + <Accordion title="Rime AI"> |
| 116 | + - **model**: Select from `arcana`, `mistv2`, or `mist`. Defaults to `arcana`. |
| 117 | + - **speed** (0.1+): Speech speed multiplier. |
| 118 | + - **pauseBetweenBrackets** (boolean): Enable pause control using angle brackets (e.g., `<200>` for 200ms pause). |
| 119 | + - **phonemizeBetweenBrackets** (boolean): Enable phonemization using curly brackets (e.g., `{h'El.o}`). |
| 120 | + - **reduceLatency** (boolean): Optimize for reduced streaming latency. |
| 121 | + - **inlineSpeedAlpha**: Inline speed control using alpha notation. |
| 122 | + </Accordion> |
| 123 | + <Accordion title="PlayHT"> |
| 124 | + - **speed** (0.1-5): Speech rate multiplier. |
| 125 | + - **temperature** (0.1-2): Controls voice variance. Lower values are more predictable; higher values allow more variation. |
| 126 | + - **emotion**: Emotion preset (e.g., `female_happy`, `male_sad`, `female_angry`, `male_surprised`). |
| 127 | + - **voiceGuidance** (1-6): Controls voice uniqueness. Lower values reduce uniqueness. |
| 128 | + - **styleGuidance** (1-30): Controls emotion intensity. Higher values create more emotional performance. |
| 129 | + - **textGuidance** (1-2): Controls text adherence. Higher values are more accurate to input text. |
| 130 | + - **model**: Select from `PlayHT2.0`, `PlayHT2.0-turbo`, `Play3.0-mini`, or `PlayDialog`. |
| 131 | + </Accordion> |
| 132 | + <Accordion title="Deepgram"> |
| 133 | + - **model**: Select from `aura` or `aura-2`. Defaults to `aura-2`. |
| 134 | + - **mipOptOut** (boolean): Opt out of the Deepgram Model Improvement Partnership program. |
| 135 | + </Accordion> |
| 136 | + <Accordion title="Hume"> |
| 137 | + - **model**: Model selection (e.g., `octave2`). |
| 138 | + - **description**: Natural language instructions describing how the speech should sound (tone, intonation, pacing, accent). |
| 139 | + - **isCustomHumeVoice** (boolean): Indicates whether the voice is a custom Hume voice. |
| 140 | + </Accordion> |
| 141 | + <Accordion title="Minimax"> |
| 142 | + - **model**: Select from `speech-02-hd` (high-fidelity) or `speech-02-turbo` (low latency). Defaults to `speech-02-turbo`. |
| 143 | + - **emotion**: Emotion preset (`happy`, `sad`, `angry`, `fearful`, `surprised`, `disgusted`, `neutral`). |
| 144 | + - **pitch** (-12 to 12): Voice pitch adjustment in semitones. |
| 145 | + - **speed** (0.5-2): Speech speed adjustment. |
| 146 | + - **volume** (0.5-2): Volume adjustment. |
| 147 | + </Accordion> |
| 148 | + <Accordion title="WellSaid"> |
| 149 | + - **model**: Model selection. |
| 150 | + - **enableSsml** (boolean): Enable limited SSML translation for input text. |
| 151 | + - **libraryIds**: Array of library IDs to use for voice synthesis. |
| 152 | + </Accordion> |
| 153 | + <Accordion title="Neuphonic"> |
| 154 | + - **model**: Model selection (e.g., `neu_fast`). |
| 155 | + - **language**: Language code (required). |
| 156 | + - **speed** (0.25-2): Speech speed multiplier. |
| 157 | + </Accordion> |
| 158 | + <Accordion title="SmallestAI"> |
| 159 | + - **model**: Model selection (e.g., `lightning`). |
| 160 | + - **speed**: Speech speed multiplier. |
| 161 | + </Accordion> |
| 162 | +</AccordionGroup> |
| 163 | + |
55 | 164 | ## Best practices |
56 | 165 |
|
57 | | -- Use <b>different providers</b> for your fallback voices to protect against provider-wide outages. |
| 166 | +- Use **different providers** for your fallback voices to protect against provider-wide outages. |
58 | 167 | - Select voices with **similar characteristics** (tone, accent, gender) to maintain consistency in the user experience. |
| 168 | +- Test your fallback configuration to ensure smooth transitions between voices. |
59 | 169 |
|
60 | | -## How will pricing work? |
| 170 | +## FAQ |
61 | 171 |
|
62 | | -There is no change to the pricing of the voices. Your call will not incur any extra fees while using fallback voices, and you will be able to see the cost for each voice in your end-of-call report. |
| 172 | +<AccordionGroup> |
| 173 | + <Accordion title="How will pricing work?"> |
| 174 | + There is no change to the pricing of the voices. Your call will not incur any extra fees while using fallback voices, and you will be able to see the cost for each voice in your end-of-call report. |
| 175 | + </Accordion> |
| 176 | + <Accordion title="How many fallback voices can I configure?"> |
| 177 | + You can configure as many fallback voices as you need. However, we recommend 2-3 fallbacks from different providers for optimal reliability. |
| 178 | + </Accordion> |
| 179 | + <Accordion title="Will users notice when a fallback is activated?"> |
| 180 | + Users may notice a brief pause and a change in voice characteristics when switching to a fallback voice. Selecting voices with similar properties helps minimize this disruption. |
| 181 | + </Accordion> |
| 182 | +</AccordionGroup> |