47 Languages Supported for Real-Time Translation — Any Direction
February 2026
Looking for a real-time translation app that supports your language pair? Live Translate Live supports 47 languages with simultaneous two-way translation in any direction. That's 2,162 possible language combinations (47 × 46 ordered directions), all with live scrolling translation on screen.
Full List of Supported Languages
Every language below works as both a source and target language. Pick any two and start a bilingual conversation with real-time translation.
- Arabic
- Belarusian
- Bengali
- Bosnian
- Bulgarian
- Catalan
- Croatian
- Czech
- Danish
- Dutch
- English
- Estonian
- Filipino
- Finnish
- French
- German
- Greek
- Hebrew
- Hindi
- Hungarian
- Indonesian
- Italian
- Japanese
- Kannada
- Korean
- Latvian
- Lithuanian
- Macedonian
- Malay
- Marathi
- Norwegian
- Persian
- Polish
- Portuguese
- Romanian
- Russian
- Serbian
- Slovak
- Slovenian
- Spanish
- Swedish
- Tamil
- Telugu
- Turkish
- Ukrainian
- Urdu
- Vietnamese
Speech Recognition vs Translation: Two Different Layers
A quick thing worth understanding if you're comparing tools: "language support" in a live translation app is actually two different things stacked on top of each other. First the app has to recognize what you said (speech-to-text, or ASR). Then it has to translate that text into the other language (machine translation, or MT). These are separate systems with separate coverage.
Live Translate Live uses Deepgram for speech recognition and Google Cloud Translation for translation. The 47-language list above is the intersection — languages that both engines handle well enough for conversation. Google Cloud Translation by itself covers well over 130 languages; Deepgram's real-time streaming model covers around 40–50 at production quality. The limiting factor for live speech translation is almost always the ASR side, not the MT side.
This is why a language can appear in Google Translate's text interface but not in a live speech app. Text translation only needs the MT layer. Live speech translation needs both layers to work in milliseconds, on a streaming audio feed, with whatever accent and background noise the microphone happens to pick up. The minute a language drops to "beta" quality on the ASR side, we'd rather leave it out than ship something that mis-transcribes half the conversation.
A useful way to think about it: MT quality in 2026 is strong and roughly comparable across the top engines. ASR quality varies widely — it's the part that determines whether a live translation app works for a given language at all.
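The two-layer split is easiest to see in code. The sketch below is illustrative only: the function names and the tiny language sets are hypothetical stand-ins, not the real Deepgram or Google Cloud Translation APIs, and the stubs just mark where each vendor call would go. The point it demonstrates is that live-speech coverage is the *intersection* of the two layers, not the union.

```python
# Illustrative sketch: "language support" in a live translation app is
# two layers chained together. All names here are hypothetical stand-ins,
# not the real Deepgram / Google Cloud Translation APIs.

# Truncated example coverage sets for each layer.
ASR_LANGUAGES = {"en", "es", "pt", "ja", "ko", "hi"}        # speech-to-text
MT_LANGUAGES = {"en", "es", "pt", "ja", "ko", "hi", "yo"}   # text translation

# A language works for LIVE SPEECH translation only if both layers
# cover it -- the intersection, not the union. "yo" (Yoruba, example)
# has MT coverage here but no production ASR, so it drops out.
LIVE_SPEECH_LANGUAGES = ASR_LANGUAGES & MT_LANGUAGES

def recognize_speech(audio: bytes, language: str) -> str:
    """Stand-in for the ASR layer (Deepgram in the real app)."""
    return "<transcript>"

def translate_text(text: str, source: str, target: str) -> str:
    """Stand-in for the MT layer (Google Cloud Translation in the real app)."""
    return "<translation>"

def translate_speech(audio: bytes, src: str, dst: str) -> str:
    """The two layers chained: recognize first, then translate."""
    if src not in LIVE_SPEECH_LANGUAGES or dst not in LIVE_SPEECH_LANGUAGES:
        raise ValueError(f"unsupported live pair: {src} -> {dst}")
    text = recognize_speech(audio, language=src)
    return translate_text(text, source=src, target=dst)
```

This is why the intersection, and therefore the ASR set, is the binding constraint on the 47-language list.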
Languages and Regional Variants
Many of the 47 languages above have multiple regional dialects. Some are handled as a single model, some have per-region variants, and a few large language families collapse into one dominant variant at the ASR layer. Here's how the most common ones break down.
| Language | Regional variants | Notes |
|---|---|---|
| Spanish | es-ES (Spain), es-MX / es-419 (Latin America), es-US | Deepgram handles both Iberian and Latin American Spanish well. Regional vocabulary differences (e.g. coche vs carro) are preserved in transcription and the MT layer handles them cleanly into English. |
| Portuguese | pt-BR (Brazil), pt-PT (Portugal) | Brazilian Portuguese dominates the training data and transcribes very reliably. European Portuguese works but can see slightly higher error rates on fast colloquial speech. |
| French | fr-FR (France), fr-CA (Quebec), fr-BE, fr-CH | Quebec French has its own phonetic quirks; the model handles it well but very strong regional slang can occasionally get normalized toward standard French in the transcript. |
| English | en-US, en-GB, en-AU, en-IN, en-NZ, en-ZA | The most heavily trained language, with excellent coverage across major accents. Indian English in particular is well-handled. |
| Chinese | Mandarin only (zh-CN / zh-TW) | Only Mandarin Chinese is supported for live speech. Cantonese, Shanghainese, and other Chinese languages are not currently in the ASR set. Simplified and Traditional script both render correctly on the translation side. |
| Arabic | Modern Standard Arabic (primary) | Arabic is supported, but dialect handling varies. Modern Standard Arabic (MSA) works best. Egyptian, Gulf, Levantine, and Maghrebi dialects transcribe with varying accuracy — colloquial dialect speech is the hardest case in the whole language set. |
| Hindi / Urdu | hi-IN, ur-PK | Linguistically very close at the spoken level but written in different scripts (Devanagari vs Nastaliq). Both are supported as separate ASR targets. |
| Norwegian | Bokmål (primary) | Bokmål is the default. Nynorsk speakers are recognized but transcripts will normalize toward Bokmål. |
| Serbian / Croatian / Bosnian | Separate models per language | Mutually intelligible at the spoken level but treated as three separate languages with their own scripts and norms. Serbian supports both Cyrillic and Latin script output. |
Where an app setting lets you pick a specific regional variant, picking the one that matches the speaker usually improves transcription accuracy by a noticeable margin — especially for Portuguese, French, and English.
Which Language Pairs Perform Best
All 2,162 pairs work, but they don't all feel the same. Three factors drive the real-world experience of a given pair:
- Language-family proximity. Pairs inside the same family (Spanish ↔ Portuguese, Dutch ↔ German, Czech ↔ Slovak, Hindi ↔ Urdu) translate with the highest fluency. The models have seen huge amounts of parallel data, and the syntactic structure often lines up closely enough that machine translation reads like natural speech.
- Script and tokenization. Languages that share a script (Latin ↔ Latin, Cyrillic ↔ Cyrillic) tend to have slightly faster end-to-end rendering. Pairs that mix scripts (Japanese ↔ English, Arabic ↔ French, Hindi ↔ Korean) all work, but can show a small extra rendering step on the client as the display font switches.
- Word-order alignment. English is SVO (subject-verb-object). Japanese and Korean are SOV (subject-object-verb). That means a full Japanese sentence often can't be translated until the verb has been spoken — the translator has to wait for the end of the clause. Pairs with the same word order (English ↔ Spanish, German ↔ Dutch) produce more fluid incremental translations; pairs with inverted word order produce chunkier output that updates at clause boundaries.
The pairs that feel fastest and smoothest in practice:
- English ↔ Spanish, Spanish ↔ Portuguese, Spanish ↔ Italian — Romance family, shared SVO order, extremely well-trained
- English ↔ French, English ↔ German, English ↔ Dutch — high-resource pairs with similar structure
- Hindi ↔ Urdu — near-identical spoken language, only the script differs
- Czech ↔ Slovak, Serbian ↔ Croatian — linguistically close pairs inside Slavic
Pairs that work well but show more visible "thinking" before each sentence appears:
- English ↔ Japanese, English ↔ Korean — SOV structure forces clause-level translation
- English ↔ Arabic — right-to-left script, different morphology, dialect variance
- Any pair involving Tamil, Telugu, Kannada, or Marathi — smaller training corpora than the big Indo-European languages, so MT is slightly less fluent on rare vocabulary
What "Real-Time" Actually Means
"Real-time" is a loose word, so here's what it actually looks like under the hood. When you speak, audio is streamed to the server in small chunks (about 100 milliseconds each). The ASR model is emitting partial hypotheses continuously — interim transcripts that update as more context arrives, then firming up into finalized transcripts once the engine is confident. Sub-second recognition is normal for the supported 47 languages; most words appear on screen within ~400–800 ms of being spoken.
Translation happens on each finalized sentence chunk. Google Cloud Translation returns a translated string in the low tens of milliseconds. The translated text is pushed to both participants over a server-sent event stream and rendered on the scrolling marquee immediately. End-to-end latency from "speaker finishes a sentence" to "listener sees the translation" is usually under a second, with most of that budget spent on the ASR finalization step rather than the translation step itself.
This is why word order matters: the ASR will happily show interim text as you talk, but the translation engine needs a coherent clause to work with. In English ↔ Spanish, that clause boundary comes at roughly the same place in both languages. In English ↔ Japanese, it doesn't — which is why Japanese translations appear in slightly larger bursts than Spanish ones.
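The latency budget described above can be checked with back-of-the-envelope arithmetic. The ASR and MT figures are the approximate values quoted in this section; the push-and-render figure is an illustrative guess, not a measured number.

```python
# Back-of-the-envelope latency budget using the figures quoted above.
# ASR and MT numbers come from the section text; PUSH_RENDER_MS is an
# illustrative assumption, not a measurement.

ASR_FINALIZE_MS = (400, 800)   # word appears ~400-800 ms after being spoken
MT_MS = 30                     # translation returns in the low tens of ms
PUSH_RENDER_MS = 50            # SSE push + client render (assumed)

def end_to_end_ms(asr_ms: int) -> int:
    """Speaker finishes a sentence -> listener sees the translation."""
    return asr_ms + MT_MS + PUSH_RENDER_MS

best = end_to_end_ms(ASR_FINALIZE_MS[0])    # fast ASR finalization
worst = end_to_end_ms(ASR_FINALIZE_MS[1])   # slow ASR finalization
# Even the worst case stays under one second, and ASR finalization
# dominates the budget -- the MT call is a rounding error by comparison.
```

Even without precise per-hop numbers, the shape of the budget holds: shaving milliseconds off translation barely matters, while anything that delays ASR finalization (like waiting for an SOV verb) is immediately visible on screen.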
Edge Cases: Dialects, Scripts, and Names
Live conversation rarely stays inside the neat boundaries of a single language. A few edge cases are worth calling out because they come up a lot.
Code-switching. Bilingual speakers often mix languages mid-sentence — Spanglish, Hinglish, Franglais. Current-generation ASR picks a single language model per stream, so a code-switched phrase will get interpreted through the dominant language of the stream. If you set "yours" to Spanish and say an English word, the ASR will do its best to fit that word into Spanish phonetics and may transcribe it unusually. For predictable code-switching, a workaround is to flip the language setting mid-conversation; for incidental English words inside an otherwise non-English sentence, the translation layer usually forgives small transcription quirks.
Proper nouns and names. Names of people, companies, and places are the hardest thing for an ASR system to get right because they don't follow dictionary distributions. You'll occasionally see a name come out phonetically odd. The translation layer typically leaves proper nouns untranslated, which is the correct behavior — "Guillermo" should stay "Guillermo" in English, not become "William". For critical names (medical appointments, legal conversations), it helps to also type or show the name in writing once.
"Untranslatable" cultural terms. Words like Schadenfreude, saudade, hygge, or giri don't have clean one-word equivalents in English. The MT engine will either transliterate (keep the original word) or paraphrase (e.g. "a cozy feeling of contentment" for hygge). Paraphrases can be long — occasionally a single word in the source produces a clause in the target. This is expected behavior, not a bug.
Technical jargon and industry vocabulary. Medical terms, legal terms, and product names are generally well-handled by the translation layer but can trip up ASR for languages with smaller training corpora. If you're using the app for a domain-specific conversation in a smaller language, expect occasional fuzzy transcripts on specialized terms — and for anything truly safety-critical, a human interpreter is still the right call.
Mixed-language households. For bilingual families where one person speaks one language and another answers in a different one, Live Translate Live's two-stream architecture handles this natively — each speaker's microphone feed gets its own language model rather than forcing the whole conversation into one language.
Any-to-Any Direction
Most translation apps only translate to and from English. Live Translate Live is different — it supports any-to-any translation between all 47 languages. That means you can translate directly between Japanese and French, Arabic and Korean, Hindi and Portuguese, or any other combination. No English intermediary required.
With 47 languages and any-direction pairing, you get 2,162 unique language pairs. Both sides of the conversation are translated simultaneously with two-way translation, so neither speaker has to wait.
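The pair count is plain arithmetic: with any-direction pairing, every ordered (source, target) pair of distinct languages is a valid direction, which is where 2,162 comes from. (Counted as unordered conversation pairings instead, it's half that.)

```python
# Pair-count arithmetic for any-to-any pairing among 47 languages.
LANGUAGES = 47

# Every ordered (source, target) pair of distinct languages is a
# valid translation direction.
ordered_pairs = LANGUAGES * (LANGUAGES - 1)   # 47 * 46 = 2162

# Since each session translates both ways simultaneously, the number
# of distinct two-way conversation pairings is half that.
conversation_pairings = ordered_pairs // 2    # 1081
```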
Popular Language Pairs
While every combination works, here are some of the most commonly used pairs on Live Translate Live:
- English ↔ Spanish — the most popular pair worldwide for business and travel
- English ↔ French — essential across Europe, Africa, and Canada
- English ↔ Arabic — bridging communication across the Middle East and North Africa
- English ↔ Hindi — connecting multilingual families and businesses in South Asia
- English ↔ Japanese — critical for international business and travel in Japan
- English ↔ Korean — growing demand for business and cultural exchange
- Spanish ↔ Portuguese — natural pairing across Latin America
- French ↔ German — the key European business corridor
- Japanese ↔ Korean — one of our most-used non-English pairs
- Hindi ↔ Arabic — connecting South Asia and the Middle East without English
Don't see your pair listed? If both languages are in the supported list above, they work together — all 2,162 combinations are fully supported.
Frequently Asked Questions
Does it handle accents?
Yes, across a broad range. Deepgram's streaming models are trained on diverse accent data, and English in particular handles US, UK, Australian, Indian, and South African accents cleanly. For other languages, heavy regional accents within a country (thick rural dialects, for example) will see slightly higher error rates than news-anchor speech, but the overall trend in 2026 is that modern ASR is noticeably less accent-sensitive than older systems. If a specific speaker is getting poor recognition, the biggest single fix is usually a better microphone position rather than a different language setting.
What if a word doesn't exist in the target language?
The translation engine will either transliterate (keep the original word, sometimes rendered in the target script) or paraphrase (use a short phrase that captures the meaning). For loanwords that are already common in the target language — "computer," "taxi," "sushi" — it simply passes them through. For genuinely culture-specific words without a direct equivalent, expect a paraphrase. Neither case breaks the conversation; you just occasionally get a slightly longer sentence than the speaker produced.
Do both speakers have to use the same browser?
No. Live Translate Live is a single-screen app — the usual setup is two people on one device, either sitting side-by-side or using vis-à-vis mode to read the display from opposite sides of a table. There's no separate login or app install required for the second speaker; they just need to be in range of the microphone. For setups across a room or on a TV, the display can be cast or opened on a separate screen.
Can I switch languages mid-conversation?
Yes. The language picker is live — you can change either side's language at any time without ending the session. This is useful when a third person joins a conversation in a different language, or when a bilingual speaker switches from one language to another halfway through. Credits continue uninterrupted across the switch.
Try It with Your Language Pair
Pick your two languages and start a real-time bilingual conversation in seconds. Live Translate Live works in your browser — no app to download. Both speakers see a scrolling translation display with live translations as they talk.
Translation credits start at just $1 for 15 minutes. No subscription required.
Start translating · View pricing · See all features