Gemini 3.5 Flash Powers Google Agent Layer

Varun Mohan showed Google's agents dividing an operating-system build inside Antigravity on the I/O stage. TechCrunch reported that the agents worked on separate components before converging on a full OS. A few hours later, Google made Gemini 3.5 Flash available across the Gemini app, AI Mode in Search, Antigravity, AI Studio and enterprise products.

Google's argument is that agents become useful when the model is fast enough to sit inside existing surfaces: AI Mode in Search, Antigravity, Spark and Gemini Enterprise. Gemini 3.5 Flash is less a benchmark release than a distribution decision.

Key Takeaways

Google made Gemini 3.5 Flash available across AI Mode, Antigravity, Spark and enterprise products.
The model emphasizes speed for multi-agent work, including a 12x optimized Antigravity version claimed by Google.
Developers face higher Flash pricing: $1.50 input and $9 output per million tokens.
Gemini 3.5 Pro waits until June while Flash handles the worker layer.

AI-generated summary, reviewed by an editor. More on our AI guidelines.

The model ships where Google already has users

In Google's I/O transcript, Sundar Pichai used tokens as the scale marker. Google processed roughly 480 trillion tokens a month at last year's I/O and now processes more than 3.2 quadrillion. Model APIs process about 19 billion tokens per minute; more than 8.5 million developers use Google models monthly.

Consumer surfaces give the same argument. AI Mode has surpassed 1 billion monthly active users, while AI Overviews has more than 2.5 billion. The Gemini app has grown from 400 million monthly users at last year's I/O to more than 900 million.

Flash did not arrive as a standalone chatbot upgrade. Google said it is the default model for the Gemini app and AI Mode globally, and it powers Gemini Spark, the agent running on Google Cloud VMs. Spark starts with Google's tools, later adds third-party connections through MCP, and is designed to ask for approval before high-stakes actions.

Speed is the agent argument

Koray Kavukcuoglu, DeepMind's chief technologist, gave TechCrunch the pitch. "3.5 Flash offers an incredible combination of quality and low latency," he said, adding that it "outperforms our latest frontier model, 3.1 Pro, on nearly all the benchmarks." He said the model is four times faster than other frontier models, and Google says an optimized Antigravity version is 12 times faster at the same quality.

Ars Technica's Ryan Whitwam reported the practical number: nearly 300 output tokens per second, with benchmark scores near larger frontier models that produce output at about one-quarter of that speed. Tulsee Doshi, Google's senior director of product management for Gemini, put the cost in the user interface. "Certain things like UI control are expensive to do because the model has to search the page, it has to know where to click, it has to act through multiple steps," she told Ars. "I think Flash is able to do that well because of that combination of quality and cost."

Several agents wait together.

Get the AI strategy brief

Strategic AI news from San Francisco. No hype, no "AI will change everything" throat clearing. Just what moved, who won, and why it matters. Daily at 6am PST.

No spam. Unsubscribe anytime.

What developers pay

Simon Willison found the developer fine print. The model ID is gemini-3.5-flash; it supports a 1,048,576-token input window and a 65,536-token output ceiling. It lacks computer-use support, a constraint for developers comparing it with agents that operate a desktop.

Willison wrote that 3.5 Flash costs $1.50 per million input tokens and $9 per million output tokens, close to Gemini 3.1 Pro Preview's $2 and $12 starting tier. Google's current pricing rises to $4 and $18 above 200,000 tokens. Compared with earlier Flash models, he said, the release is three times the price of Gemini 3 Flash Preview and six times the price of 3.1 Flash-Lite. Artificial Analysis' bill for its high/reasoning configuration was $1,551.60, versus $892.28 for Gemini 3.1 Pro Preview.

Pichai gave the enterprise version of the same math: companies processing about 1 trillion tokens a day could save more than $1 billion a year if they shifted 80% of workloads from other frontier models to Flash. Willison's counterpoint came from the API bill. "Given the price increase it's interesting to see Google roll it out for so many of their own free-to-consumer products."

Pro waits while Flash does the work

Pichai told the I/O crowd that Gemini 3.5 Pro would arrive next month, and Business Insider reported that developers groaned. "Give us until next month to get it to you," he said. Google did not ship Pro at I/O. It did ship the Flash layer that TechCrunch says will handle much of the subagent work.

TechCrunch reported Doshi's view that Pro will become the orchestrator and planner while Flash handles subagent work. The missing Pro model leaves the worker layer exposed first. Antigravity 2.0 is now a standalone desktop app. Search gets information agents this summer, starting with Google AI Pro and Ultra subscribers. Spark began rolling out to trusted testers the week of May 19, with the beta planned for Ultra subscribers in the United States the following week.

One small detail from Willison's first API test says more than the keynote wanted to. He asked Gemini 3.5 Flash for an SVG of a pelican riding a bicycle and found a code comment reading Pelican Eye / Sunglasses (Cool Retro Aviators). The joke cost just under 13 cents. Google has put the same 3.5 Flash model family under products used by billions of people, though not necessarily with Willison's API settings. The next dated tests are Spark's Ultra beta and Gemini 3.5 Pro, both promised before June ends.

Frequently Asked Questions

What is Gemini 3.5 Flash?

Gemini 3.5 Flash is Google’s new fast model for coding, agentic workflows and consumer AI surfaces. Google released it at I/O 2026 across the Gemini app, AI Mode in Search, Antigravity, AI Studio and enterprise products.

Why is Google putting Gemini 3.5 Flash into so many products?

Google is using Flash as the worker layer for agents. Its pitch is that lower latency makes multi-step workflows practical inside Search, coding tools, Spark and enterprise products that already have large user bases.

How fast is Gemini 3.5 Flash?

Google says it is four times faster than other frontier models, and that an optimized Antigravity version is 12 times faster at the same quality. Ars Technica reported nearly 300 output tokens per second.

How much does Gemini 3.5 Flash cost?

Simon Willison reported pricing at $1.50 per million input tokens and $9 per million output tokens. That is near Gemini 3.1 Pro Preview’s starting tier and above earlier Flash-family pricing.

When does Gemini 3.5 Pro arrive?

Sundar Pichai told Google I/O attendees that Gemini 3.5 Pro would arrive next month. The current release puts Gemini 3.5 Flash into production first, with Pro expected to act as the higher-reasoning planner.

AI-generated summary, reviewed by an editor. More on our AI guidelines.

AI News

Marcus Schuler

San Francisco

Editor-in-Chief and founder of Implicator.ai. Former ARD correspondent and senior broadcast journalist with 10+ years covering tech. Writes daily briefings on policy and market developments. Based in San Francisco. E-mail: [email protected]