At Google Cloud Next, We're Off to See the AI Agents (And Huge Performance Gains)


Google's AI ambitions were on full display at its Cloud Next conference this week, but for me, two things stood out: It's now firmly promoting AI agents and is laser-focused on efficiency.

As usual, there was a lot of AI hype. "It's a unique moment," said Google Cloud CEO Thomas Kurian, with CEO Sundar Pichai adding that it can "enable us to rethink what's possible."

Pichai pointed to how Google helped recreate The Wizard of Oz for The Sphere in Las Vegas, converting it from a conventional movie format to one that works on a 160,000-square-foot screen. "Even a few years ago, such an undertaking would have been nearly impossible with conventional CGI," according to Google, with the team needing to "account for all the camera cuts in a traditional film that remove characters from parts of certain scenes, which wouldn’t work at the new, theatrical scale that was envisioned."

The AI-enhanced Wizard debuts on Aug. 28. But the "chance to improve lives and reimagine things is why Google has been investing in AI for more than a decade," Pichai said this week.

A Surprisingly Fast Ascent for Agents

AI agents were the real wizards at Cloud Next, however, with Kurian announcing several new interoperability-focused features for Google's AgentSpace platform.

Notably, the new Agent Development Kit within Vertex AI supports the Model Context Protocol (MCP), which allows agents to access and interact with various data sources and tools rather than requiring a custom integration for each plugin. MCP was announced by Anthropic just a few months ago, and now it seems that all the major AI software companies are supporting it.
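
To make the idea concrete, here is a minimal sketch of what an agent's request to an MCP tool can look like. MCP messages follow JSON-RPC 2.0, and tool invocations use the `tools/call` method; the tool name `search_documents` and its arguments are hypothetical, chosen only for illustration, and this is not taken from Google's Agent Development Kit.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool and arguments; a real server advertises its own tools.
request = build_tool_call(1, "search_documents", {"query": "Q1 sales"})
print(json.dumps(request, indent=2))
```

Because every tool is reached through the same message shape, an agent needs one client for the protocol rather than separate glue code for each data source.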

In addition, Google announced a new Agent2Agent protocol that allows agents to communicate with each other regardless of the underlying model and framework they were developed with.

Google offers its own purpose-built agents and tools for letting you build your own agents, but it now has a multi-cloud platform that "allows you to adopt AI agents while connecting them with your existing IT landscape, including your databases, your document stores, enterprise applications, and interoperating with models and agents from other providers," Kurian says.

Salesforce CEO Marc Benioff appeared via video to talk about how Salesforce is working with Google to develop and connect its agents.

The idea is that you could use Google's agents, create your own, or integrate with third-party agents. And of course, Kurian talked about how Google helps you create AI systems while addressing concerns about sovereignty, security, privacy, and compliance.

Among the agents Google is producing are Customer Agents for call centers with human-like speech and dialog in Google’s Customer Engagement Suite; Creative Agents for media production, marketing, advertising, and design teams; Data Agents for BigQuery; and a number of Security Agents. Google is also introducing an Agent Gallery, a no-code Agent Designer, an Idea Generation agent, and a Deep Research agent.

Meanwhile, agents in Google Workspace include a "Help Me Analyze" agent for Sheets, Workspace Flows to help automate tasks, and audio overviews, which turn Docs into audio summaries.

It's interesting to me how quickly the agents concept has evolved. It was only a year ago that companies started talking about building them, as opposed to chatbots, which just answered questions. To me, it seems like a lot of agents are chatbots connected to robotic process automation (RPA) tools, but that's fine if they can actually help businesses be more efficient. Now it seems like every major AI company is competing to create platforms that work with agents across software companies.

Gemini Goes Pro

AI isn't cheap; Google invests around $75 billion in capital expenditures, mostly for servers and data centers, Pichai says. The two most interesting areas here are the underlying models and the next-generation chips that will power them.

A few weeks ago, Google announced Gemini 2.5 Pro, which Pichai describes as "a thinking model that can reason through its thoughts before responding." Gemini 2.5 Pro is now Google's high-end model, available through its AI Studio, Vertex AI, and Gemini app.

At Google Cloud Next, Pichai announced that Gemini 2.5 Flash, a thinking model with low latency and the most cost-efficient performance, is coming soon.

In addition, Google announced improvements to a variety of other AI models for specific uses. Imagen 3, its image-generating model, now offers better detail, richer lighting, and fewer distracting artifacts, Kurian said. Veo 2, the latest version of its video-generation tool, creates watermarked 4K video and adds features such as "inpainting," or removing parts of images. Chirp 3 creates custom voices with just 10 seconds of input. And Lyria transforms text prompts into 30-second music clips.

With all these tools, "Google is the only company that offers generative media models across all modalities," Kurian says.

All these models are available on Google's Vertex AI platform, which now supports more than 200 models, including those from Google, third parties, and open-source ones. Other changes include Vertex AI Dashboards to help monitor usage, throughput, and latency, new training and tuning capabilities, and a Vertex AI Model Optimizer.

Strike While the Ironwood Is Hot

In the infrastructure area, the biggest announcement was Ironwood, Google's seventh-generation Tensor Processing Unit (TPU). Due later this year, this chip is said to offer twice the performance per watt of the current Trillium chip. Pichai says it has 3,600 times the performance of the first TPU Google introduced in 2013. In that time, Google has become 29 times more energy-efficient.

Amin Vahdat, Google's VP & GM for Machine Learning, Systems, and Cloud AI, says demand for AI compute has increased by more than 10x a year for more than eight years, by a factor of 100 million. Google's newest TPU Pods support over 9,000 TPUs per pod and 42.5 exaflops of compute performance. (The pods will be offered in two sizes, one with 256 TPUs and the other with 9,216.) Still, these chips are "just one piece of our overall infrastructure," Vahdat said.
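
These figures are easy to sanity-check with quick arithmetic. The sketch below uses only the numbers quoted above; the per-chip result is my own division, and it assumes the 42.5-exaflop figure refers to the larger, 9,216-chip pod, which the presentation did not spell out.

```python
# Back-of-the-envelope check of the figures quoted in the article.
GROWTH_PER_YEAR = 10         # "more than 10x a year"
YEARS = 8                    # "for more than eight years"
POD_EXAFLOPS = 42.5          # quoted pod compute performance
CHIPS_PER_LARGE_POD = 9_216  # assumption: 42.5 exaflops is for this pod size


def cumulative_growth(rate, years):
    """Total growth after compounding `rate` once per year."""
    return rate ** years


def petaflops_per_chip(pod_exaflops, chips):
    """Average compute per chip (1 exaflop = 1,000 petaflops)."""
    return pod_exaflops * 1_000 / chips


growth = cumulative_growth(GROWTH_PER_YEAR, YEARS)
per_chip = petaflops_per_chip(POD_EXAFLOPS, CHIPS_PER_LARGE_POD)

print(f"{growth:,}x growth over {YEARS} years")
print(f"{per_chip:.2f} petaflops per chip")
```

Ten-fold growth compounded over eight years works out to exactly the 100 million figure Vahdat cited, and the pod total implies roughly 4.6 petaflops per chip under the stated assumption.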

Instead, Kurian talked about building an "AI Hypercomputer" that involves multiple technologies. As part of this, Google also announced new compute instances with Nvidia's GPUs; a cluster director that lets users deploy and manage a large number of accelerator chips; new storage pools called "hyperdisk exopools"; an "anywhere cache" that keeps data close to the accelerators; and a zonal storage solution, which offers five times lower latency for random reads and writes compared with the fastest comparable cloud alternative.

In addition, the company announced new inference capabilities for the Google Kubernetes Engine and DeepMind's Pathways for multi-host inferencing with dynamic scaling.

Overall, Kurian claimed that putting all these things together means that Gemini 2.0 Flash powered by Google's AI Hypercomputer achieves 24 times higher intelligence per dollar compared to GPT-4o and five times higher than DeepSeek R1.

And, in partnership with Dell and Nvidia, Kurian announced that Gemini will now run on Google Distributed Cloud for local deployments, including those that need to be "air gapped" for particularly sensitive applications.

As part of the infrastructure push, Google announced that it is offering its global private network to customers. Pichai said the Cloud Wide Area Network (Cloud WAN) contains over 2 million miles of fiber and underlies Google's services, delivering "over 40% faster performance while reducing total cost of ownership by up to 40%."

I never take vendor performance numbers at face value, and obviously Google's competitors will have new offerings of their own. But it's interesting to see such a focus on not only performance but also cost. I know many CIOs who have been unpleasantly surprised by the cost of running AI models. This is a step in the right direction.

Google's AI has led to fun projects like reworking The Wizard of Oz for The Sphere in Vegas, but the real moneymakers are billion-dollar investments like its upcoming Ironwood TPU.

About Our Expert

Michael J. Miller

Former Editor in Chief
