Supported Languages
Live Translate Live supports 103 languages for real-time speech recognition, 74 with live AI voice playback in Audio mode, and any-to-any translation across all 103. That’s 10,506 source-to-target language pairs in the marquee. This page is the canonical reference: which languages are supported, how accurately each is recognized, and where AI voice is available.
Speech Recognition Accuracy Tiers
We use ElevenLabs Scribe v2 Realtime for live speech recognition. ElevenLabs publishes word-error-rate (WER) benchmarks for Scribe across its supported languages, grouped into four tiers. A lower WER means more words come through correctly. The tiers below are the published benchmarks; in real conversation, microphone quality and ambient noise matter more than the difference between the top two tiers.
| Tier | WER | Languages |
|---|---|---|
| Excellent | ≤ 5% | Belarusian, Bosnian, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Latvian, Macedonian, Malay, Malayalam, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Spanish, Swedish, Turkish, Ukrainian, Vietnamese (36) |
| High | 5–10% | Armenian, Azerbaijani, Bengali, Cantonese, Filipino, Georgian, Gujarati, Hindi, Kazakh, Lithuanian, Maltese, Mandarin Chinese, Marathi, Nepali, Odia, Persian, Serbian, Slovenian, Swahili, Tamil, Telugu (21) |
| Good | 10–15% | Afrikaans, Arabic, Assamese, Asturian, Burmese, Hausa, Hebrew, Javanese, Korean, Kyrgyz, Luxembourgish, Māori, Occitan, Punjabi, Tajik, Thai, Uzbek, Welsh (18) |
| Developing | 15%+ | Amharic, Ganda, Igbo, Irish, Khmer, Kurdish, Lao, Mongolian, Northern Sotho, Pashto, Shona, Sindhi, Somali, Urdu, Wolof, Xhosa, Yoruba, Zulu (18) |
WER ranges and language groupings are based on ElevenLabs’ published Scribe v2 benchmarks. Tier dots appear next to each language in the in-app language picker so you can see at a glance what to expect. The benchmark covers about 93 languages. Our app also supports a small additional set (Albanian, Bashkir, Basque, Breton, Faroese, Haitian Creole, Hawaiian, Latin, Lingala, Malagasy, Sanskrit, Sinhala, Sundanese, Tatar, Tibetan, Turkmen, Yiddish) that Scribe handles but for which ElevenLabs hasn’t published a WER tier. Those languages show in the picker without a tier dot and work in conversation; we just don’t have an official accuracy number to attach.
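To make the tier numbers concrete, here is a minimal sketch of how a word error rate is usually computed (word-level edit distance divided by the number of reference words) and how a WER value maps onto the four tiers in the table. The function names are hypothetical, and the handling of values that fall exactly on a boundary (10% and 15%) is an assumption, since the table's ranges touch at the edges. This is not ElevenLabs' benchmark code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def tier_for(wer_percent: float) -> str:
    """Map a WER percentage to a tier name from the table.

    Assumption: exactly 5% counts as Excellent, exactly 10% as Good,
    and exactly 15% as Developing.
    """
    if wer_percent <= 5:
        return "Excellent"
    if wer_percent < 10:
        return "High"
    if wer_percent < 15:
        return "Good"
    return "Developing"
```

For example, transcribing "the cat sat on the mat" as "the cat sat on a mat" is one substitution in six reference words, a WER of about 16.7%. On a single short sentence that lands in the Developing range, which is why published benchmarks average over many hours of speech.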
Live AI Voice (Audio Mode)
Audio mode plays the translated sentence aloud through your device speaker. The voice is generated by ElevenLabs v3 (with Flash v2.5 as a faster fallback for the languages it covers). When a language isn’t in either TTS model, Audio mode still works — you’ll get the translated text on screen, just without spoken playback.
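The fallback order described above can be sketched as a small decision function. Everything here is illustrative: the function name, the `prefer_fast` flag, and the placeholder language codes are assumptions, and the coverage sets are made-up sample data rather than the actual v3 or Flash v2.5 language lists.

```python
from enum import Enum
from typing import AbstractSet


class Playback(Enum):
    V3 = "ElevenLabs v3"
    FLASH = "Flash v2.5"
    TEXT_ONLY = "text on screen only"


def choose_playback(lang: str,
                    v3_langs: AbstractSet[str],
                    flash_langs: AbstractSet[str],
                    prefer_fast: bool = False) -> Playback:
    """Pick how a translated sentence is delivered in Audio mode.

    Assumption: v3 is used whenever it covers the language, Flash v2.5
    is used when v3 does not cover it (or when the caller asks for the
    faster model), and anything else falls back to on-screen text.
    """
    in_v3 = lang in v3_langs
    in_flash = lang in flash_langs
    if in_flash and (prefer_fast or not in_v3):
        return Playback.FLASH
    if in_v3:
        return Playback.V3
    return Playback.TEXT_ONLY


# Placeholder coverage sets; NOT the real model coverage.
SAMPLE_V3 = {"lang_a", "lang_b"}
SAMPLE_FLASH = {"lang_b", "lang_c"}

for code in ("lang_a", "lang_c", "lang_d"):
    print(code, "->", choose_playback(code, SAMPLE_V3, SAMPLE_FLASH).value)
```

The useful property is that the function never fails for an uncovered language: the worst case is text without speech, which matches the behavior described above.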
74 languages have live voice playback today. ElevenLabs adds to this list periodically, and the app pulls the current coverage list from the API at startup — so when v3 grows, your Audio mode grows with it automatically.
In the language picker on the Audio page, the “Their Language” dropdown is automatically filtered to the languages that support voice playback. The marquee picker shows all 103 because the scrolling display doesn’t need TTS.
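A minimal sketch of that picker behavior, assuming the voice coverage list has already been fetched at startup and is passed in as a set. The page names, the function name, and the sample codes are hypothetical placeholders, not the app's real identifiers or coverage.

```python
from typing import AbstractSet, Iterable, List


def picker_options(page: str,
                   all_langs: Iterable[str],
                   voice_langs: AbstractSet[str]) -> List[str]:
    """Return the languages a picker should offer on a given page.

    Assumption: "audio" and "marquee" are the only two page kinds.
    The Audio page keeps only languages with voice playback; the
    marquee shows everything because it needs no TTS.
    """
    if page == "audio":
        return [lang for lang in all_langs if lang in voice_langs]
    if page == "marquee":
        return list(all_langs)
    raise ValueError(f"unknown page: {page!r}")


# Placeholder data standing in for the full list and the coverage
# set pulled at startup.
ALL_LANGS = ["lang_a", "lang_b", "lang_c", "lang_d"]
VOICE_LANGS = {"lang_a", "lang_c"}

print(picker_options("audio", ALL_LANGS, VOICE_LANGS))
print(picker_options("marquee", ALL_LANGS, VOICE_LANGS))
```

Because the filter reads from whatever coverage set it is given, a longer list from the API widens the Audio picker without any other change.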
Translation Across All 103
Translation runs on Google Gemini 2.5. Every Scribe-recognized language can translate to and from every other one, with no English intermediary needed. You can speak Japanese and have it land in Portuguese, or Hindi to Arabic, or Korean to Swahili. With 103 languages that can each serve as source or target, that’s 103 × 102 = 10,506 source-to-target pairs.
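The pair count is easy to check. The sketch below counts ordered (source, target) combinations of 103 distinct languages, which is where 10,506 comes from; if direction is ignored, the same languages form half as many unordered pairs. The variable names are just for illustration.

```python
from itertools import permutations
from math import comb

LANGUAGE_COUNT = 103

# Each language can be the source, paired with any of the other 102 as target.
directed_pairs = LANGUAGE_COUNT * (LANGUAGE_COUNT - 1)

# Cross-check by enumerating every ordered (source, target) combination.
enumerated = sum(1 for _ in permutations(range(LANGUAGE_COUNT), 2))

# Ignoring direction (Japanese/Portuguese counted once, not twice).
unordered_pairs = comb(LANGUAGE_COUNT, 2)

print(directed_pairs, enumerated, unordered_pairs)  # 10506 10506 5253
```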
Gemini 2.5 carries conversational context across turns. The translator sees what was said previously and translates the next turn with that context in mind, which helps with pronouns, gendered agreement, and idiomatic phrasing. This matters more than people realize: a sentence in isolation is far harder to translate naturally than the same sentence as part of an ongoing conversation.
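One common way to carry context across turns is to keep a short rolling window of recent turns and include it alongside the sentence to translate. The sketch below shows only that bookkeeping. The class, the prompt wording, and the window size are assumptions for illustration; it makes no model call and is not a description of how the app actually talks to Gemini.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str
    text: str


class ConversationContext:
    """Keeps the most recent turns so each new turn is translated in context."""

    def __init__(self, max_turns: int = 6) -> None:
        # deque(maxlen=...) silently drops the oldest turn once full.
        self._turns = deque(maxlen=max_turns)

    def add(self, turn: Turn) -> None:
        self._turns.append(turn)

    def build_prompt(self, new_turn: Turn, source_lang: str, target_lang: str) -> str:
        lines = [
            f"Translate the final line from {source_lang} to {target_lang}.",
            "Earlier lines are context only; do not translate them.",
        ]
        lines += [f"[context] {t.speaker}: {t.text}" for t in self._turns]
        lines.append(f"[translate] {new_turn.speaker}: {new_turn.text}")
        return "\n".join(lines)


ctx = ConversationContext(max_turns=2)
ctx.add(Turn("A", "My sister arrives tomorrow."))
ctx.add(Turn("B", "Where is she staying?"))
print(ctx.build_prompt(Turn("A", "She booked a hotel."), "English", "Portuguese"))
```

With the earlier turns in view, "she" in the last line has a clear referent, which is the kind of pronoun and agreement decision a sentence-by-sentence translator has to guess at.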
Regional Variants
Many of the 103 languages above have multiple regional dialects. Some are handled as a single model, some have per-region variants you can select, and a few language families collapse into one dominant variant at the speech-recognition layer.
| Language | Regional variants | Notes |
|---|---|---|
| French | fr (default), fr-CA (Quebec) | Quebec French is a selectable variant in the picker, useful when one speaker is Québécois and the regional vocabulary matters. |
| Portuguese | pt (Brazilian default), pt-PT (European) | Brazilian Portuguese dominates the training data. European Portuguese is a selectable variant for users in Portugal or with Lusophone African speakers. |
| Chinese | zh (Mandarin, Simplified), zh-TW (Traditional) | Mandarin is the primary spoken target. Cantonese is also supported as its own entry in the High tier above. Traditional and Simplified scripts both render on the translation side. |
| Spanish | Single model (handles Iberian and Latin American) | Scribe handles both regional varieties cleanly in one model. Regional vocabulary differences (coche vs carro) are preserved in transcription and rendered appropriately in target languages. |
| English | Single model (US, UK, AU, IN, NZ, ZA accents) | The most heavily trained language with strong cross-accent coverage. Indian English in particular is well-handled. |
| Arabic | Modern Standard Arabic (primary) | MSA works best. Egyptian, Gulf, Levantine, and Maghrebi dialects transcribe with varying accuracy — colloquial dialect speech is the hardest case in the whole language set. |
| Hindi / Urdu | hi, ur | Linguistically very close at the spoken level but written in different scripts (Devanagari vs Nastaliq). Both are supported as separate ASR targets. |
| Norwegian | no, nn (Nynorsk) | Bokmål is the default. Nynorsk is selectable for speakers of that written standard. |
| Serbian / Croatian / Bosnian | Separate entries per language | Mutually intelligible at the spoken level but treated as three separate languages with their own scripts and norms. |
When a regional variant is available in the picker, choosing the one that matches the speaker usually improves transcription accuracy noticeably — especially for Portuguese and French.
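Choosing a variant comes down to a simple lookup: use the regional code if the picker offers it, otherwise fall back to the base language. The sketch below uses the codes from the table above; `es` and `en` for the single-model Spanish and English entries are assumed codes, and the function name is hypothetical.

```python
from typing import AbstractSet, Optional

# Codes taken from the variants table; "es" and "en" are assumed
# codes for the single-model Spanish and English entries.
PICKER_CODES = frozenset({
    "fr", "fr-CA", "pt", "pt-PT", "zh", "zh-TW",
    "hi", "ur", "no", "nn", "es", "en",
})


def resolve_variant(requested: str,
                    available: AbstractSet[str] = PICKER_CODES) -> Optional[str]:
    """Return the best matching picker code for a requested language tag.

    Tries the exact regional variant first (e.g. "fr-CA"), then the
    base language (e.g. "fr"), and returns None if neither is offered.
    """
    parts = requested.replace("_", "-").split("-")
    base = parts[0].lower()
    if len(parts) > 1:
        candidate = f"{base}-{parts[1].upper()}"
        if candidate in available:
            return candidate
    return base if base in available else None


for tag in ("fr-CA", "pt_pt", "es-MX", "xx-YY"):
    print(tag, "->", resolve_variant(tag))
```

A Mexican Spanish tag resolves to the single `es` model here, while `fr-CA` and `pt-PT` keep their regional variants, the two cases where the choice matters most.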
Two-Way Conversation
In a two-way conversation, the app keeps each speaker’s words on their own side of the screen. The marquee almost never puts a speaker’s words on the wrong side; the hardest cases in practice are extremely short utterances (“OK”, “hmm”, single proper nouns) and code-switching, where a bilingual speaker flips languages mid-sentence. Both languages are translated simultaneously, so neither speaker has to wait for the other to finish.
Try It
Pick your two languages and start a real-time bilingual conversation. No app to download — everything runs in the browser. Translation credits start at $1 for 15 minutes; transcription in Audio mode is free until you tap Translate.
Start in the marquee · Try Audio mode · View pricing · See all features
Want the behind-the-scenes on how we got here? Read why we swapped engines or the 103-language launch announcement.