103 Languages Supported for Real-Time Translation — Any Direction
Updated May 2026
Looking for a real-time translation app that supports your language pair? Live Translate Live supports 103 languages with simultaneous two-way translation in any direction. That’s 10,506 possible source-to-target combinations (103 × 102), all with live scrolling translation on screen, and AI voice playback for 74 of the languages. The full canonical reference (with accuracy tiers per language) lives at /languages; this post walks through the list and what to expect from different language pairs.
Full List of Supported Languages
Every language below works as both a source and target language. Pick any two and start a bilingual conversation with real-time translation.
- Afrikaans
- Albanian
- Amharic
- Arabic
- Armenian
- Assamese
- Azerbaijani
- Bashkir
- Basque
- Belarusian
- Bengali
- Bosnian
- Breton
- Bulgarian
- Burmese
- Catalan
- Chinese (Mandarin)
- Chinese (Traditional)
- Croatian
- Czech
- Danish
- Dutch
- English
- Estonian
- Faroese
- Filipino
- Finnish
- French
- French (Canada)
- Galician
- Georgian
- German
- Greek
- Gujarati
- Haitian Creole
- Hausa
- Hawaiian
- Hebrew
- Hindi
- Hungarian
- Icelandic
- Indonesian
- Italian
- Japanese
- Javanese
- Kannada
- Kazakh
- Khmer
- Korean
- Lao
- Latin
- Latvian
- Lingala
- Lithuanian
- Luxembourgish
- Macedonian
- Malagasy
- Malay
- Malayalam
- Maltese
- Māori
- Marathi
- Mongolian
- Nepali
- Norwegian
- Norwegian Nynorsk
- Occitan
- Pashto
- Persian
- Polish
- Portuguese
- Portuguese (Portugal)
- Punjabi
- Romanian
- Russian
- Sanskrit
- Serbian
- Shona
- Sindhi
- Sinhala
- Slovak
- Slovenian
- Somali
- Spanish
- Sundanese
- Swahili
- Swedish
- Tagalog
- Tajik
- Tamil
- Tatar
- Telugu
- Thai
- Tibetan
- Turkish
- Turkmen
- Ukrainian
- Urdu
- Uzbek
- Vietnamese
- Welsh
- Yiddish
- Yoruba
Speech Recognition vs Translation: Two Different Layers
A quick thing worth understanding if you’re comparing tools: “language support” in a live translation app is actually two different things stacked on top of each other. First the app has to recognize what you said (speech-to-text, or ASR). Then it has to translate that text into the other language (machine translation, or MT). These are separate systems with separate coverage.
Live Translate Live uses ElevenLabs Scribe v2 Realtime for speech recognition and Google Gemini 2.5 for translation. The 103-language list above is what Scribe v2 streams; Gemini 2.5 covers all of those plus many more on the text-translation side. The limiting factor for live speech translation is almost always the ASR layer, not the translation layer.
This is why a language can appear in Google Translate’s text interface but not in a live speech app. Text translation only needs the MT layer. Live speech translation needs both layers to work in milliseconds, on a streaming audio feed, with whatever accent and background noise the microphone happens to pick up. When a language drops below usable streaming-ASR quality, we’d rather leave it out than ship something that mis-transcribes half the conversation.
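One way to picture the two layers is as a set intersection: a language is available for live speech only if both layers cover it. The sketch below is illustrative only; the function name and the toy language sets are assumptions, not the real Scribe v2 or Gemini coverage lists.

```python
def live_speech_languages(asr_languages: set, mt_languages: set) -> set:
    """Languages usable for live speech translation: covered by BOTH layers."""
    return asr_languages & mt_languages


# Toy data for illustration only (not real coverage lists).
asr = {"English", "Spanish", "Japanese"}
mt = {"English", "Spanish", "Japanese", "Language X"}

print(sorted(live_speech_languages(asr, mt)))  # ['English', 'Japanese', 'Spanish']
print(sorted(mt - asr))  # ['Language X']: text translation only, no streaming ASR
```

Because the translation layer is the broader of the two, the intersection is effectively the ASR list, which matches the point that ASR is usually the limiting factor.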
ElevenLabs publishes a four-tier word-error-rate (WER) grid for Scribe v2 — Excellent (≤5%), High (5–10%), Good (10–15%), and Developing (15%+). We surface those tiers as colored dots next to each language in the in-app picker, and the full breakdown is at /languages. The new launch announcement post covers the engine swap and what changed for users.
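For readers unfamiliar with WER, it is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer's output, divided by the number of reference words. The sketch below computes it and maps the result onto the four tiers. The function names are illustrative, and the treatment of values that sit exactly on a boundary (10% or 15%) is an assumption, since the published ranges overlap at their edges.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def wer_tier(wer_percent: float) -> str:
    """Map a WER percentage to a tier name (boundary handling is assumed)."""
    if wer_percent < 0:
        raise ValueError("WER cannot be negative")
    if wer_percent <= 5:
        return "Excellent"
    if wer_percent <= 10:
        return "High"
    if wer_percent <= 15:
        return "Good"
    return "Developing"


wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(round(wer, 1), wer_tier(wer))  # 16.7 Developing
```

One wrong word in a six-word sentence is already a 16.7% error rate, which gives a sense of how demanding the Excellent tier is.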
Languages and Regional Variants
Many of the 103 languages above have multiple regional dialects. Some are handled as a single model, some have per-region variants you can select, and a few large language families collapse into one dominant variant at the ASR layer. Here’s how the most common ones break down.
| Language | Regional variants | Notes |
|---|---|---|
| Spanish | Single model (handles Iberian and Latin American) | Scribe handles both regional varieties in one model. Regional vocabulary differences (e.g. coche vs carro) are preserved in transcription, and the translator carries them cleanly into other languages. |
| Portuguese | pt (Brazilian default), pt-PT (European) | Brazilian Portuguese dominates the training data and transcribes very reliably. European Portuguese is a selectable variant, useful for users in Portugal or with Lusophone African speakers. |
| French | fr (default), fr-CA (Quebec) | Quebec French is a selectable variant. The default model handles fr-FR, fr-BE, and fr-CH cleanly; very strong regional Quebec slang can occasionally get normalized toward standard French in the transcript unless you pick the fr-CA variant. |
| English | Single model (US, UK, AU, IN, NZ, ZA accents) | The most heavily trained language, with excellent coverage across major accents. Indian English in particular is well-handled. |
| Chinese | Mandarin (zh), Traditional (zh-TW); Cantonese as its own entry | Mandarin is the primary spoken target. Cantonese is supported as its own language in the picker. Simplified and Traditional script both render correctly on the translation side. |
| Arabic | Modern Standard Arabic (primary) | Arabic is supported, but dialect handling varies. Modern Standard Arabic (MSA) works best. Egyptian, Gulf, Levantine, and Maghrebi dialects transcribe with varying accuracy; colloquial dialect speech is the hardest case in the whole language set. |
| Hindi / Urdu | hi, ur | Linguistically very close at the spoken level but written in different scripts (Devanagari vs Perso-Arabic Nastaliq). Both are supported as separate ASR targets. |
| Norwegian | no (Bokmål), nn (Nynorsk) | Bokmål is the default. Nynorsk is a selectable variant for speakers of that written standard. |
| Serbian / Croatian / Bosnian | Separate models per language | Mutually intelligible at the spoken level but treated as three separate languages with their own scripts and norms. Serbian supports both Cyrillic and Latin script output. |
Where the picker offers a specific regional variant, picking the one that matches the speaker usually improves transcription accuracy by a noticeable margin — especially for Portuguese and French.
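Variant selection can be sketched as a most-specific-match lookup: try the language plus region first, then fall back to the base language. The codes below are the ones named in the table; the `resolve_variant` helper and its fallback rule are assumptions for illustration, not the app's actual picker logic.

```python
from typing import Optional

# Variant codes mentioned in the table above (a subset of the full list).
SUPPORTED_VARIANTS = {"pt", "pt-PT", "fr", "fr-CA", "zh", "zh-TW", "no", "nn", "hi", "ur"}


def resolve_variant(tag: str, supported: set = SUPPORTED_VARIANTS) -> Optional[str]:
    """Pick the most specific supported code for a locale tag such as 'fr-CA'."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    # Prefer language + two-letter region when that exact variant exists.
    for part in parts[1:]:
        if len(part) == 2 and part.isalpha():
            candidate = f"{language}-{part.upper()}"
            if candidate in supported:
                return candidate
    # Otherwise fall back to the base language, or None if it is unknown.
    return language if language in supported else None


print(resolve_variant("fr-ca"))       # fr-CA
print(resolve_variant("pt_BR"))       # pt
print(resolve_variant("zh-Hant-TW"))  # zh-TW
```

A device set to pt-BR falls back to the default `pt` entry, which is the Brazilian model, while pt-PT and fr-CA pick up their dedicated variants.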
Which Language Pairs Perform Best
All 10,506 pairs work, but they don’t all feel the same. Three factors drive the real-world experience of a given pair:
- Both sides in the top two tiers. When both languages are in the Excellent (≤5% WER) or High (5–10%) tier, recognition essentially disappears as a bottleneck. Examples: English ↔ Spanish, English ↔ French, Spanish ↔ Portuguese, English ↔ Japanese, German ↔ Dutch.
- Language-family proximity. Pairs inside the same family (Spanish ↔ Portuguese, Dutch ↔ German, Czech ↔ Slovak, Hindi ↔ Urdu) translate with the highest fluency. The models have seen huge amounts of parallel data, and the syntactic structure often lines up closely enough that translation reads like natural speech.
- Word-order alignment. English is SVO (subject-verb-object). Japanese and Korean are SOV (subject-object-verb). That means a full Japanese sentence often can’t be translated until the verb has been spoken — the translator has to wait for the end of the clause. Pairs with the same word order produce more fluid incremental translations; pairs with inverted word order produce chunkier output that updates at clause boundaries.
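The first factor can be sketched as a simple rule: a pair is limited by whichever side has the rougher recognition tier. This is a minimal illustration that assumes the four tier names from the WER grid; the helper names and the weakest-side rule are assumptions, not the app's actual logic.

```python
# Tiers ordered from best to worst recognition accuracy.
TIER_ORDER = ["Excellent", "High", "Good", "Developing"]


def pair_tier(tier_a: str, tier_b: str) -> str:
    """Return the weaker of the two tiers (assumed to limit the pair)."""
    return max(tier_a, tier_b, key=TIER_ORDER.index)


def recognition_is_bottleneck(tier_a: str, tier_b: str) -> bool:
    """True unless both sides are in the Excellent or High tier."""
    return pair_tier(tier_a, tier_b) not in ("Excellent", "High")


print(pair_tier("Excellent", "Developing"))         # Developing
print(recognition_is_bottleneck("Excellent", "High"))  # False
print(recognition_is_bottleneck("High", "Good"))       # True
```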
The pairs that feel fastest and smoothest in practice:
- English ↔ Spanish, Spanish ↔ Portuguese, Spanish ↔ Italian: shared SVO order and extremely well-trained, with the latter two inside the Romance family
- English ↔ French, English ↔ German, English ↔ Dutch — high-resource pairs with similar structure
- Hindi ↔ Urdu — near-identical spoken language, only the script differs
- Czech ↔ Slovak, Serbian ↔ Croatian — linguistically close pairs inside Slavic
Pairs that work well but show more visible “thinking” before each sentence appears:
- English ↔ Japanese, English ↔ Korean — SOV structure forces clause-level translation
- English ↔ Arabic — right-to-left script, different morphology, dialect variance
- Pairs involving Developing-tier languages (Pashto, Lao, Sindhi, etc.) — transcripts are rougher but the conversation still flows; Audio mode can be friendlier here because you can review the transcript before tapping Translate
What “Real-Time” Actually Means
“Real-time” is a loose word, so here’s what it looks like in practice. As you speak, words start showing up on the screen almost immediately — interim text that firms up as the recognizer hears more of the sentence. ElevenLabs publishes ~150 ms streaming latency for Scribe v2, and most words appear within ~400–800 ms of being spoken.
Translation kicks in once each sentence is finalized. Gemini 2.5 returns a translation in the low hundreds of milliseconds, with the benefit that it sees the previous turns and maintains conversational context across them. End-to-end latency from “speaker finishes a sentence” to “listener sees the translation” is usually under a second.
This is why word order matters: the recognizer will happily show interim text as you talk, but the translator needs a coherent clause to work with. In English ↔ Spanish, that clause boundary comes at roughly the same place in both languages. In English ↔ Japanese, it doesn’t — which is why Japanese translations appear in slightly larger bursts than Spanish ones.
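The interim-versus-final behavior can be sketched as a small event handler: interim text only refreshes the display, while a finalized sentence is sent for translation together with the earlier turns. Everything named here (`LiveSession`, `on_transcript`, the `translate` callable) is a hypothetical stand-in, not the ElevenLabs or Gemini API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    source: str
    translation: str


@dataclass
class LiveSession:
    # translate(sentence, previous_turns) -> translated sentence (hypothetical).
    translate: Callable[[str, List[Turn]], str]
    interim: str = ""
    turns: List[Turn] = field(default_factory=list)

    def on_transcript(self, text: str, is_final: bool) -> None:
        """Handle one recognizer update."""
        if not is_final:
            # Interim text replaces what is on screen; nothing is translated yet.
            self.interim = text
            return
        # A finalized sentence is translated with the prior turns as context.
        translation = self.translate(text, list(self.turns))
        self.turns.append(Turn(text, translation))
        self.interim = ""
```

In this sketch a language pair with inverted word order would simply finalize later and in larger chunks; the handler itself stays the same.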
Edge Cases: Dialects, Scripts, and Names
Live conversation rarely stays inside the neat boundaries of a single language. A few edge cases are worth calling out because they come up a lot.
Code-switching. Bilingual speakers often mix languages mid-sentence — Spanglish, Hinglish, Franglais. The app handles this gracefully most of the time, putting each language on the right side of the screen even when a speaker flips mid-utterance. For predictable code-switching it still helps to keep one language assigned per speaker; for incidental words, the translation layer usually forgives small transcription quirks.
Proper nouns and names. Names of people, companies, and places are the hardest thing for an ASR system to get right because they don’t follow dictionary distributions. You’ll occasionally see a name come out phonetically odd. The translation layer typically leaves proper nouns untranslated, which is the correct behavior — “Guillermo” should stay “Guillermo” in English, not become “William”. For critical names (medical appointments, legal conversations), it helps to also type or show the name in writing once. You can also add names as keyterms from the menu — Scribe v2 supports keyterm prompting, which biases the engine toward recognizing the names you’ve told it about.
“Untranslatable” cultural terms. Words like Schadenfreude, saudade, hygge, or giri don’t have clean one-word equivalents in English. Gemini 2.5 will either transliterate (keep the original word) or paraphrase (e.g. “a cozy feeling of contentment” for hygge). Paraphrases can be long — occasionally a single word in the source produces a clause in the target. This is expected behavior, not a bug.
Technical jargon and industry vocabulary. Medical terms, legal terms, and product names are generally well-handled by the translator but can trip up ASR for languages in the Good or Developing accuracy tiers. If you’re using the app for a domain-specific conversation in a smaller language, expect occasional fuzzy transcripts on specialized terms — and for anything truly safety-critical, a human interpreter is still the right call.
Mixed-language households. For bilingual families where one person speaks one language and another answers in a different one, Live Translate Live’s two-stream architecture handles this natively — each speaker’s side of the conversation gets translated to the other side’s language, regardless of who’s talking when.
Any-to-Any Direction
Most translation apps only translate to and from English. Live Translate Live is different — it supports any-to-any translation between all 103 languages. That means you can translate directly between Japanese and French, Arabic and Korean, Hindi and Portuguese, Vietnamese and Polish, or any other combination. No English intermediary required.
With 103 languages and any-direction pairing, you get 103 × 102 = 10,506 directed language pairs, with each source → target direction counted separately. Both sides of the conversation are translated simultaneously with two-way translation, so neither speaker has to wait.
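The 10,506 figure counts each direction separately: every one of the 103 languages can be paired with any of the other 102 as a target. A quick check of the arithmetic (function names are illustrative):

```python
def directed_pairs(n: int) -> int:
    """Source -> target pairs, where A -> B and B -> A count separately."""
    return n * (n - 1)


def unordered_pairs(n: int) -> int:
    """Two-language combinations, ignoring direction."""
    return n * (n - 1) // 2


print(directed_pairs(103))   # 10506
print(unordered_pairs(103))  # 5253
```

Counted without direction, that is 5,253 distinct two-language conversations, each translated both ways.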
Live AI Voice in 74 Languages (Audio Mode)
In addition to the scrolling marquee, Audio mode plays the translated sentence aloud through your device speaker using ElevenLabs v3 (with Flash v2.5 as a faster alternative for the languages it covers). 74 of the 103 supported languages have AI voice playback today; the rest still work in Audio mode, but you see the translated text on screen without spoken playback.
Audio mode is the right tool when reading a scrolling display isn’t practical — markets, taxis, hospital waiting rooms, anywhere you’d naturally pass the phone back and forth. The deep dive is in our Audio mode post.
Popular Language Pairs
While every combination works, here are some of the most commonly used pairs on Live Translate Live:
- English ↔ Spanish — the most popular pair worldwide for business and travel
- English ↔ French — essential across Europe, Africa, and Canada
- English ↔ Arabic — bridging communication across the Middle East and North Africa
- English ↔ Hindi — connecting multilingual families and businesses in South Asia
- English ↔ Japanese — critical for international business and travel in Japan
- English ↔ Korean — growing demand for business and cultural exchange
- English ↔ Mandarin / Cantonese — available as separate language entries
- Spanish ↔ Portuguese — natural pairing across Latin America
- French ↔ German — the key European business corridor
- Japanese ↔ Korean — one of our most-used non-English pairs
- Hindi ↔ Arabic — connecting South Asia and the Middle East without English
Don’t see your pair listed? If both languages are in the supported list above, they work together — all 10,506 combinations are fully supported.
Frequently Asked Questions
Does it handle accents?
Yes, across a broad range. Scribe v2 is trained on diverse accent data, and English in particular handles US, UK, Australian, Indian, and South African accents cleanly. For other languages, heavy regional accents within a country (thick rural dialects, for example) will see slightly higher error rates than news-anchor speech, but the overall trend in 2026 is that modern ASR is noticeably less accent-sensitive than older systems. If a specific speaker is getting poor recognition, the biggest single fix is usually a better microphone position rather than a different language setting.
What if a word doesn’t exist in the target language?
Gemini 2.5 will either transliterate (keep the original word, sometimes rendered in the target script) or paraphrase (use a short phrase that captures the meaning). For loanwords that are already common in the target language — “computer,” “taxi,” “sushi” — it simply passes them through. For genuinely culture-specific words without a direct equivalent, expect a paraphrase. Neither case breaks the conversation; you just occasionally get a slightly longer sentence than the speaker produced.
Do both speakers have to use the same browser?
No, the second speaker doesn’t need a browser of their own. Live Translate Live is a single-screen app: the usual setup is two people on one device, either sitting side-by-side or using vis-à-vis mode to read the display from opposite sides of a table. There’s no separate login or app install required for the second speaker; they just need to be in range of the microphone. For setups across a room or on a TV, the display can be cast or opened on a separate screen.
Can I switch languages mid-conversation?
Yes. The language picker is live — you can change either side’s language at any time without ending the session. This is useful when a third person joins a conversation in a different language, or when a bilingual speaker switches from one language to another halfway through. Credits continue uninterrupted across the switch.
Try It with Your Language Pair
Pick your two languages and start a real-time bilingual conversation in seconds. Live Translate Live works in your browser — no app to download. Both speakers see a scrolling translation display with live translations as they talk, or you can switch to Audio mode for one-on-one handheld use with AI voice playback.
Translation credits start at just $1 for 15 minutes. No subscription required.
Start translating · Audio mode · Full language reference · View pricing