Deepgram today announced Flux Multilingual, a conversational speech recognition model that supports 10 languages with real-time language detection and the ability to switch languages during an active call. It understands English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian and Dutch in a single API, replacing what Deepgram described as a fragmented stack of separate transcription engines, detection layers and routing logic. Flux Multilingual is available now through Deepgram's cloud API and as a self-hosted deployment, with European Union endpoint support.
Key Takeaways
- Deepgram launched Flux Multilingual with 10 languages and mid-call code-switching
- Model-based turn detection delivers end-of-turn decisions in under 400ms
- Available as a cloud API or self-hosted, priced the same as the English-only version
- Platform used by 200,000+ developers and 1,400 organizations
AI-generated summary, reviewed by an editor. More on our AI guidelines.
The Architecture
Flux Multilingual belongs to a category Deepgram calls conversational speech recognition, which the company distinguishes from traditional automatic speech recognition because it is designed for dialogue flow rather than transcription. The system determines turn boundaries from conversational context rather than audio gaps, with end-of-turn decisions delivered in under 400 milliseconds according to Deepgram's documentation.
Developers can name target languages with the optional language_hint parameter; without it, the system auto-detects the spoken language on each turn. All TurnInfo events include a languages field that reports detected languages sorted by word count. Flux Multilingual handles code-switching natively when a speaker moves between languages inside the same conversation, and mid-stream reconfiguration lets developers update language hints with a Configure control message without disconnecting the stream.
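As a rough illustration of how a client might consume these events, the Python sketch below reads the languages field from a TurnInfo event and builds a Configure message. The JSON layout is an assumption made for illustration: the article confirms only that TurnInfo events carry a languages field sorted by word count and that a Configure control message updates language hints. The "type" key, the descending sort order, the language-code format and the helper names are hypothetical, so check Deepgram's documentation for the actual schema.

```python
import json

def primary_language(event):
    """Return the first entry of a TurnInfo event's languages field.

    Assumes (not confirmed by the article) that events are dicts with a
    "type" key and that "languages" is sorted with the most-spoken first.
    """
    if event.get("type") != "TurnInfo":
        return None
    languages = event.get("languages") or []
    return languages[0] if languages else None

def configure_message(language_hints):
    """Serialize a Configure control message carrying new language hints.

    The payload shape here is hypothetical; only the message name and the
    language_hint parameter name come from the article.
    """
    return json.dumps({"type": "Configure", "language_hint": list(language_hints)})

# Example: a turn spoken mostly in Spanish with some English.
event = {"type": "TurnInfo", "languages": ["es", "en"]}
detected = primary_language(event)
```

A voice agent could use the detected language to pick a text-to-speech voice for its reply, and send the string from configure_message over the open stream once the caller's languages are known.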
The release handles interruptions natively and delivers what Deepgram described as monolingual-grade accuracy across all 10 supported languages. It is also backward-compatible with existing Flux API integrations.
Pricing and Availability
Flux Multilingual carries the same pricing as the English-only flux-general-en, according to Deepgram's changelog. The company is offering a limited-time promotional rate on streaming speech-to-text covering both Flux Multilingual and Nova-3.
Developers connect by setting model=flux-general-multi on the /v2/listen endpoint, with no new credentials or endpoints required. The EU endpoint is available at wss://api.eu.deepgram.com/v2/listen. The release supports real-time streaming only, with SDK and self-hosted options available. A self-hosted release from April 16 added Flux Multilingual to Deepgram's on-premises container images, requiring a minimum NVIDIA driver version of 570.172.08.
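The connection details above can be sketched as a small URL builder in Python. The EU host, the /v2/listen path and the model name come from the article. Passing language_hint as a repeated query parameter is an assumption, as is the build_listen_url helper itself. Authentication and the WebSocket client are left out.

```python
from urllib.parse import urlencode

EU_HOST = "api.eu.deepgram.com"  # EU endpoint host named in the article

def build_listen_url(host, model="flux-general-multi", language_hints=()):
    """Build a wss:// URL for the /v2/listen endpoint.

    Encoding language_hint as a repeated query parameter is an assumption,
    not something the article specifies.
    """
    params = [("model", model)]
    params.extend(("language_hint", lang) for lang in language_hints)
    return f"wss://{host}/v2/listen?{urlencode(params)}"

eu_url = build_listen_url(EU_HOST)
hinted_url = build_listen_url(EU_HOST, language_hints=["en", "es"])
```

Because the model is selected by a query parameter, an existing Flux integration would in principle only need to change the model value to move from flux-general-en to the multilingual model.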
Enterprise Traction
More than 200,000 developers and 1,400 organizations use Deepgram's platform, which spans speech-to-text, text-to-speech and full speech-to-speech capabilities. The company has processed over 50,000 years of audio and transcribed more than 1 trillion words. Deepgram has raised approximately $216 million in funding, including a $130 million Series C round in January at a $1.3 billion valuation.
Twilio Vice President of Products Omar Paul said in the announcement that the release eliminates the need for customers to sacrifice accuracy with legacy multilingual systems or stitch multiple models together. "With Flux Multilingual, teams take the exact conversational experience they built for English and extend it across languages with a single system," Paul stated.
Deepgram's offerings have seen broader infrastructure adoption in recent months. Together AI began hosting Deepgram's Nova-3, Flux and Aura-2 natively on its Dedicated Model Inference platform in February, and Cloudflare integrated Deepgram Flux and Nova-3 as built-in speech-to-text providers for its experimental voice pipeline in the Agents SDK, released April 15.
Deepgram co-founder and CEO Scott Stephenson described voice AI agents as the eventual default for enterprise customer interactions and positioned Flux Multilingual as a single perception system for building global voice agents with mid-call language switching. Investors in the company include Madrona Venture Group, Tiger Global Management, Nvidia, Citi Ventures and Goldman Sachs Asset Management.
Frequently Asked Questions
What languages does Flux Multilingual support?
English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch in a single model and API.
How does Flux Multilingual differ from traditional speech recognition?
It uses conversational speech recognition (CSR) designed for dialogue flow rather than transcription, with model-based turn detection instead of silence detection.
Can Flux Multilingual switch languages mid-call?
Yes. It supports native code-switching when a speaker moves between languages and mid-stream reconfiguration without disconnecting the stream.
How much does Flux Multilingual cost?
It carries the same pricing as the English-only flux-general-en model. Deepgram is also offering a limited-time promotional rate on streaming speech-to-text.
Where is Flux Multilingual available?
Through Deepgram's cloud API, EU endpoints, and as a self-hosted deployment with SDK support.



IMPLICATOR