Voice AI's Three-Model Pipeline Is Dying. Its Replacement Fits on a Laptop.

NVIDIA's PersonaPlex 7B runs on a MacBook at 5.3 GB after an indie dev ported it to Swift/MLX. Full-duplex voice AI leaves the data center.

PersonaPlex Brings Voice AI From Cloud to Laptop via MLX

Ask Siri a question and three separate models wake up behind the curtain. One transcribes your voice into text. A second reads that text and writes a response. A third converts the response back into speech. Each handoff is a point of failure, another chance for the meaning of what you actually said to get lost in translation.
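To make the handoff problem concrete, here is a minimal sketch of a cascaded pipeline. Everything in it is a hypothetical stand-in: audio is represented as a plain string, and `toy_asr`, `toy_llm`, `toy_tts`, and `run_cascade` are illustrative names, not any vendor's API. The point is only that whatever the first stage drops, the later stages never see.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    run: Callable[[str], str]


def toy_asr(audio: str) -> str:
    # Hypothetical transcriber: keeps the words, loses tone and emphasis.
    return audio.rstrip("?!").lower()


def toy_llm(text: str) -> str:
    # Hypothetical language model: only ever sees the flattened transcript.
    return f"you said: {text}"


def toy_tts(text: str) -> str:
    # Hypothetical synthesizer: wraps the reply as pretend audio.
    return f"<audio:{text}>"


def run_cascade(audio: str, stages: List[Stage]) -> Tuple[str, List[Tuple[str, str]]]:
    """Pass the payload through each stage in order, recording every handoff."""
    trace = []
    payload = audio
    for stage in stages:
        payload = stage.run(payload)
        trace.append((stage.name, payload))
    return payload, trace


stages = [Stage("asr", toy_asr), Stage("llm", toy_llm), Stage("tts", toy_tts)]
result, trace = run_cascade("Really?!", stages)
for name, payload in trace:
    print(f"{name}: {payload}")
```

In this toy run the incredulous "?!" disappears at the first handoff, so the reply is generated from a flat "really" and the tone of the original question cannot be recovered downstream.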

NVIDIA's PersonaPlex 7B was supposed to change that. Released in January as an open-source, full-duplex speech-to-speech model, it collapses those three stages into one. Audio goes in. Audio comes out. It listens while it talks. Interruptions don't break it. It throws in "uh-huh" and "right" at the places a human would, and it stays in character the whole time. One model where three used to sit.
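The shape of "listens while it talks" can be sketched as a loop that reads one incoming frame and emits one outgoing frame at every step. This is a toy under stated assumptions, not PersonaPlex's actual interface: frames are single characters, `SILENCE` and `duplex_step_loop` are hypothetical names, and the "model" is just a scripted reply that gives up the floor when the user speaks.

```python
from typing import List

SILENCE = "."  # hypothetical marker for an empty audio frame


def duplex_step_loop(user_frames: List[str], reply: List[str]) -> List[str]:
    """Emit exactly one output frame per input frame.

    Toy behavior: keep speaking the scripted reply while the user is
    silent, and drop the rest of it the moment the user barges in.
    """
    out = []
    queue = list(reply)
    for frame in user_frames:
        if frame != SILENCE:
            queue.clear()        # user interrupted: abandon the remaining reply
            out.append(SILENCE)  # stay quiet on this step
        elif queue:
            out.append(queue.pop(0))
        else:
            out.append(SILENCE)
    return out


user = [".", ".", "x", ".", "."]  # the user speaks on the third frame
out = duplex_step_loop(user, ["a", "b", "c", "d"])
print("".join(out))
```

Because input and output advance in lockstep, an interruption is just another frame the loop reacts to on the next step, rather than an event that has to travel back through three separate models.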

That alone would be interesting, but it is not the story.
