Google kicked off I/O 2026 today with the announcement of Gemini 3.5, a new model family that the company says marks a "major leap forward in building more capable, intelligent agents." The rollout begins with Gemini 3.5 Flash, which is now enabled as the default model across several Google services.

The release is a pointed statement from Google DeepMind: a Flash-tier model, built for speed and cost efficiency, now outperforms Gemini 3.1 Pro, the flagship Google launched only in February, across coding and agentic benchmarks.

The Numbers

CEO Sundar Pichai said at the keynote that Gemini 3.5 Flash delivers 289 tokens per second, four times faster than other frontier models. That speed advantage matters because it changes the economics of running AI agents at scale.

According to Google, 3.5 Flash outperforms 3.1 Pro on three key benchmarks: Terminal-Bench 2.1 for coding, GDPval-AA Elo for real-world agentic tasks, and MCP Atlas for scaled tool use. The GDPval-AA result is particularly notable. When Gemini 3.1 Pro launched, it scored 1,317 on this benchmark. Gemini 3.5 Flash scores 1,656, a 339-point gain that looks like a step change in agentic capability rather than incremental progress.
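To put the GDPval-AA gap in perspective, the short sketch below converts the two reported scores into an expected head-to-head score. It assumes the benchmark follows the conventional logistic Elo model with a 400-point scale, which the announcement does not confirm; the function name and the scale parameter are illustrative.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard logistic Elo model.

    Assumption: GDPval-AA Elo uses the conventional 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Scores as reported in the article.
flash_35 = 1656  # Gemini 3.5 Flash
pro_31 = 1317    # Gemini 3.1 Pro at launch

gap = flash_35 - pro_31
expected = elo_expected_score(flash_35, pro_31)

print(f"Elo gap: {gap} points")
print(f"Expected head-to-head score for 3.5 Flash: {expected:.2f}")
```

Under that assumption, a 339-point gap implies 3.5 Flash would be preferred in close to nine out of ten pairwise comparisons against 3.1 Pro.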

The model is purpose-built for agents and long-horizon tasks, capable of planning and reasoning across massive codebases, deploying subagents to work in parallel, and sustaining complex workflows over extended periods.

The Compression Problem

What makes this announcement interesting is not just raw capability but the speed at which Google is closing the gap between its model tiers. Benchmark results that required a Pro-class model in February are now exceeded by a Flash-class model roughly three months later. If that compression continues, the distinction between flagship and efficient tiers will keep narrowing.

Gemini 3.5 Flash is available today for everyone in the Gemini app and through AI Mode in Google Search. Google says it's already working on Gemini 3.5 Pro and plans to launch it next month.

Pricing

Gemini 3.5 Flash is priced at $1.50 per million input tokens and $9 per million output tokens, making it 3 to 20 times more expensive than prior Gemini Flash models. It still undercuts current competitors like Claude Sonnet 4.6, which costs $3 per million input tokens and $15 per million output tokens, positioning it as a mid-tier option rather than the budget play that earlier Flash models represented.

The pricing signals that Gemini 3.5 Flash is a stronger, more capable model than typical lightweight Flash tiers, with output costs aligned to rival mid-tier options rather than cheap fast models.
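For a rough sense of what those list prices mean per task, the sketch below prices a single agent run at both rates. The per-million-token prices are the ones quoted above; the workload size (200,000 input tokens and 20,000 output tokens) is a hypothetical chosen for illustration, and the `Pricing` class is not part of any vendor SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """List price in US dollars per million tokens."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Dollar cost of one request with the given token counts."""
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


# Prices as quoted in the article.
gemini_35_flash = Pricing(input_per_million=1.50, output_per_million=9.00)
claude_sonnet_46 = Pricing(input_per_million=3.00, output_per_million=15.00)

# Hypothetical agent run: an assumption, not a measured workload.
input_tokens, output_tokens = 200_000, 20_000

flash_cost = gemini_35_flash.cost(input_tokens, output_tokens)
sonnet_cost = claude_sonnet_46.cost(input_tokens, output_tokens)

print(f"Gemini 3.5 Flash:  ${flash_cost:.2f}")
print(f"Claude Sonnet 4.6: ${sonnet_cost:.2f}")
```

For this assumed workload the run costs about $0.48 on Gemini 3.5 Flash and $0.90 on Claude Sonnet 4.6. The gap shrinks as the share of output tokens grows, since the output price ratio ($9 to $15) is narrower than the input price ratio ($1.50 to $3).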

The Broader I/O Picture

Gemini 3.5 was not the only AI news at I/O. Google also announced Gemini Omni, a new multimodal model for generating polished video from any input. While Veo focused on turning text into video, Omni can take any combination of images, audio, video, and text to generate videos grounded in Gemini's real-world knowledge.

The company is clearly pushing Gemini hard as it competes with Anthropic's Claude and OpenAI's ChatGPT for mindshare. Google said Gemini has grown from 400 million users at last year's I/O to more than 900 million monthly users across more than 230 countries and 70 languages.

The competitive context here matters. Google's Gemini has already crossed 26% of global generative AI web traffic. Gemini 3.5 Flash, which is fast, capable, and priced to scale, is built to push that share further, particularly among developers building agent-driven applications.

Whether 3.5 Pro, expected next month, maintains the same lead over its Flash sibling that Pro models held in previous generations remains to be seen. If Google's recent cadence holds, the next few months will tell us whether the Pro tier is still a meaningful product category or just a temporary flag planted before the next Flash model arrives. For developers who have been waiting for Google's next major model update, today's announcement suggests the wait for 3.5 Pro will be short.