Tencent released Hy3 preview on April 23, a 295-billion-parameter Hunyuan language model built after a three-month reconstruction led by chief AI scientist Yao Shunyu. The mixture-of-experts model activates 21 billion parameters per token, supports a 256K-token context window and is pitched for code, search and multi-step agents. Tencent says the release is open source, already live across Yuanbao, CodeBuddy, WorkBuddy and Tencent Cloud, and meant to pull user feedback into the official Hy3 version.

The headline number is smaller than the old flagship. HY 2.0 had more than 400 billion parameters. Hy3 preview drops the total count, then tries to make up the difference with product tuning, routing and lower inference cost. That is the tell.


Smaller is the pitch

Hy3 preview is Tencent's answer to a basic business problem: a model can be smart and still too expensive to use all day. Only 21 billion of its 295 billion parameters light up for each token, about 7 percent of the model. That math is the whole argument in one line.

Hugging Face's Transformers documentation describes Hy3 preview as a dense-MoE hybrid with 192 routed experts, one shared expert in each MoE layer, top-k routing and QK-Norm for training stability. In plain English, the system keeps a lot of capacity on the shelf and calls only part of it for each step.
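The mechanism can be sketched in a few lines. The snippet below is a toy illustration of top-k routing with a shared expert, not Tencent's implementation: four routed experts stand in for the 192, scalar functions stand in for expert networks, and the example logits are made up.

```python
import math

def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize
    their logits into mixing weights (toy sketch, not Tencent's code)."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    exps = [math.exp(router_logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_layer(x, experts, shared_expert, router_logits, k=2):
    """Combine the always-on shared expert with k routed experts.
    Each 'expert' here is just a scalar function for illustration."""
    y = shared_expert(x)
    for i, w in top_k_route(router_logits, k):
        y += w * experts[i](x)
    return y

# Toy demo: 4 routed experts instead of 192, scalars instead of tensors.
experts = [lambda x, m=m: m * x for m in (1.0, 2.0, 3.0, 4.0)]
shared = lambda x: 0.5 * x
out = moe_layer(1.0, experts, shared, router_logits=[0.1, 2.0, 0.3, 1.5], k=2)
```

The point of the structure is visible even in the toy: all four experts exist in memory, but only two run per token, so compute scales with k rather than with total expert count.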

For Tencent, that matters more than bragging rights. The company says overall reasoning efficiency rose 40 percent from the previous generation, while TokenHub pricing starts at 1.2 yuan per million input tokens, 0.4 yuan for cached input and 4 yuan per million output tokens. If you are buying API calls by the truckload, that is not footnote math. It is the product.
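At those list prices, workload cost is simple arithmetic. A minimal sketch; only the per-million rates come from Tencent's published TokenHub pricing, while the token volumes in the demo are hypothetical:

```python
# Hy3 preview TokenHub list prices, in yuan per million tokens
# (rates from Tencent's announcement; token volumes below are illustrative).
PRICE_INPUT_YUAN = 1.2    # fresh input tokens
PRICE_CACHED_YUAN = 0.4   # cached input tokens
PRICE_OUTPUT_YUAN = 4.0   # output tokens

def workload_cost_yuan(fresh_in_m, cached_in_m, out_m):
    """Total yuan cost for a workload, with token counts in millions."""
    return (fresh_in_m * PRICE_INPUT_YUAN
            + cached_in_m * PRICE_CACHED_YUAN
            + out_m * PRICE_OUTPUT_YUAN)

# Hypothetical agent workload: 50M fresh input, 200M cached input, 30M output.
cost = workload_cost_yuan(50, 200, 30)  # 60 + 80 + 120 = 260 yuan
```

Note the 3x discount on cached input: agent loops that re-read the same long prompt on every step are exactly the workloads that discount rewards.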

Yao's first report card

The release also gives Yao Shunyu his first public model scorecard at Tencent. Yao took the chief AI scientist role last year after work at OpenAI and is known in agent circles for ReAct, the reasoning-and-action framework that helped define tool-using language models.

SCMP reported that Tencent still sees Hy3 preview as behind the best U.S. systems from OpenAI and Google DeepMind, even while calling it competitive with top Chinese models. That caveat matters. Tencent is not claiming a crown. It is claiming a reset.

The company says Hunyuan rebuilt pre-training and reinforcement-learning infrastructure in February, then trained this first post-rebuild model in less than three months. It also says the team moved away from public leaderboards that can be gamed, using self-built tests, human review and product beta testing instead.

That sounds less glamorous than a leaderboard win. It may be more useful. Product teams do not feel pride when a model wins a benchmark and fails inside a customer-service chat. They feel heat.

Agents are the proving ground

Tencent is steering Hy3 preview into tools where failure has a stopwatch attached. The company says CodeBuddy and WorkBuddy saw first-token latency fall 54 percent, end-to-end duration fall 47 percent and success rates rise above 99.99 percent in internal testing. It also says Hy3 preview has driven agent workflows as long as 495 steps.

Those are Tencent's own figures, so treat them as company claims until outside tests catch up. Still, the product list is real: Tencent Cloud, Yuanbao, IMA, CodeBuddy, WorkBuddy, QQ, QQ Browser and Tencent Docs are already named. WeChat Official Accounts, Peacekeeper Elite, Tencent News and WeChat Reading are slated to follow.

The model is also visible outside Tencent's walls. OpenRouter lists Tencent Hy3 preview with a 262,144-token context window and positions it for agent workflows and configurable reasoning. That gives developers a public way to poke at the claim, not just read Tencent's launch copy.

The benchmark race got boring

Hy3 preview arrives in a Chinese model market crowded with Qwen, Kimi, DeepSeek and Zhipu. The noisy fight is still about which model tops which chart. Tencent's colder bet is that the chart matters less once a model sits inside messaging, games, documents and code.

That makes Hy3 preview a gear, not a trophy. Inside those apps, a failure is not a red cell on a benchmark. It is a bad tool call, a stale answer, a stalled task. Tencent can tune against that.

The risk is that product tuning can hide a weak base model for only so long. If Hy3 preview trails frontier systems on raw capability, Tencent's apps may polish the rough edges without removing them. Users will notice when an agent loses the thread in a 400-step workflow. Developers will notice sooner.

Still, the strategic message is sharp. Tencent is choosing fewer active parameters, cheaper inference and more product contact over the old romance of size. In the next round of agent software, the winning model may not be the one that looks largest in a chart. It may be the one already sitting at the desk.

Frequently Asked Questions

What is Tencent Hy3 preview?

Hy3 preview is a Hunyuan language model Tencent released on April 23. It uses a mixture-of-experts design with 295 billion total parameters, 21 billion active parameters per token and a 256K token context window.

Why is the 21 billion active parameter figure important?

It shows Tencent's cost argument. Hy3 preview keeps a much larger pool of parameters available, but only activates about 7 percent per token. That can lower inference cost if routing and product tuning hold up.

Where is Hy3 preview already being used?

Tencent says Hy3 preview is live in Yuanbao, IMA, CodeBuddy, WorkBuddy, QQ, QQ Browser, Tencent Docs and Tencent Cloud. More products, including WeChat Official Accounts and WeChat Reading, are slated to follow.

Did Tencent open source Hy3 preview?

Tencent says yes: its release describes Hy3 preview as open-sourced. Hugging Face documentation also lists a tencent/Hy3-preview model ID for Transformers loading, though developers will still need hardware capable of serving a large MoE model.

What does this mean for China's AI model race?

Tencent is trying to compete on product use and agent economics rather than raw model size. That may matter if enterprise users care more about cost, latency and task completion than leaderboard rank.

