<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/"><channel><title>Developers &amp; Practitioners</title><link>https://cloud.google.com/blog/topics/developers-practitioners/</link><description>Developers &amp; Practitioners</description><atom:link href="https://cloudblog.withgoogle.com/blog/topics/developers-practitioners/rss/" rel="self"></atom:link><language>en</language><lastBuildDate>Thu, 18 Jun 2026 22:03:38 +0000</lastBuildDate><image><url>https://cloud.google.com/blog/topics/developers-practitioners/static/blog/images/google.a51985becaa6.png</url><title>Developers &amp; Practitioners</title><link>https://cloud.google.com/blog/topics/developers-practitioners/</link></image><item><title>Scaling the Next Generation of Global Innovation: How Google Supports Top Startups Around the World</title><link>https://cloud.google.com/blog/topics/developers-practitioners/scaling-the-next-generation-of-global-innovation-how-google-supports-top-startups-around-the-world/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In the high-stakes world of tech entrepreneurship, the leap from a brilliant prototype to a scalable, market-defining business can be brutal. Founders need much more than capital; they need deep architectural guidance, sovereign-level policy alignment, and technical systems engineered to enable rapid growth. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;Joy’s Law&lt;/strong&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;states: &lt;/strong&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;"[N]o matter who you are, most of the smartest people work for someone else."&lt;/strong&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We recognize that true innovation inherently happens &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;“elsewhere.”&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This philosophy drives our active support of global accelerators across a diverse, geographic footprint of innovation markets to tap into this decentralized brilliance. For over a decade, our Google accelerator program has acted as a catalyst for this exact transition. By bridging the gap between raw entrepreneurial ambition and Google’s world-class engineering ecosystem, the program has quietly built one of the most resilient, high-performing startup portfolios on Earth.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The Power of the Network: A Decade by the Numbers&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While many startup accelerators struggle with significant failure rates, our accelerator program has set a high bar for long-term success. By pairing top-tier founders and CTOs with customized, deeply technical engagement from Google, along with learned industry best practices, the program has consistently helped build both highly valuable companies and products. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The scope of this global network is impressive:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt; &lt;/p&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;&lt;table border="1" style="border-collapse: collapse; width: 99.7931%; height: 335px;"&gt;
&lt;tbody&gt;
&lt;tr style="height: 33.9702px;"&gt;
&lt;td style="width: 28.304%; height: 33.9702px;"&gt;&lt;em&gt;&lt;strong&gt;Metric&lt;/strong&gt;&lt;/em&gt;&lt;/td&gt;
&lt;td style="width: 71.6829%; height: 33.9702px; text-align: left;"&gt;&lt;em&gt;&lt;strong&gt;Impact to Date&lt;/strong&gt;&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="height: 33.9702px;"&gt;
&lt;td style="width: 28.304%; height: 33.9702px;"&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;Global Footprint&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 71.6829%; height: 33.9702px;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;2,011&lt;/strong&gt; startups supported across 88 countries&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="height: 33.9702px;"&gt;
&lt;td style="width: 28.304%; height: 33.9702px;"&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;Program Experience&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 71.6829%; height: 33.9702px;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;144&lt;/strong&gt; cohorts graduated over 10 years&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="height: 33.9702px;"&gt;
&lt;td style="width: 28.304%; height: 33.9702px;"&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;Survival Rate&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 71.6829%; height: 33.9702px;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;93%&lt;/strong&gt; portfolio survival rate&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="height: 33.9702px;"&gt;
&lt;td style="width: 28.304%; height: 33.9702px;"&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;Financial Momentum&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 71.6829%; height: 33.9702px;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;$46.3B &lt;/strong&gt;in funding raised; $135.1B collective portfolio valuation&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="height: 33.9915px;"&gt;
&lt;td style="width: 28.304%; height: 33.9915px;"&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;Startup Job Creation&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 71.6829%; height: 33.9915px;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;305,900 &lt;/strong&gt;employees across the entire startup portfolio&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p role="presentation"&gt; &lt;/p&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;The Developer Value-Add:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; By design, this isn't a high-level business bootcamp. The founders of Accelerator startups identify a deeply technical problem that they then work on with bespoke support from Google to solve. These startups get access to Google engineers and product managers, along with access to our platforms and tools. From advising on architectures to optimizing AI model pipelines, Google experts work directly with the founding teams to help tackle some of their most complex technical hurdles.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Strategic Momentum: Geopolitics, Green Infrastructure, and Robotics&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The startup ecosystem is shifting rapidly, and our accelerator program is evolving along with it. This year, Google launched new initiatives  to support global economic development and explore and evolve critical environmental infrastructure. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Just a few examples:&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Sovereign-Level &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Policy &amp;amp; Strategic Wins&lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Australia:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Accelerator alumni have successfully anchored the Google AI stack directly into the country's national R&amp;amp;D strategy, engaging directly with Members of Parliament in Canberra.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Canada:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The Canadian Office of Innovation, Science, and Economic Development officially recognized and cited the impact of the Canada accelerator program in its formal report for the G7 Summit.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Cutting-Edge Frontier Programs&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This year marks a major expansion into specialized, frontier tech verticals:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The Google DeepMind Accelerator (Europe):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Dedicated strictly to hardening technical builds for AI-native robotics companies, effectively bridging the gap between lab prototyping and commercial market success.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;T&lt;/span&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;he GDM Accelerator (AI for Planet) in APAC&lt;/strong&gt;:&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; A joint initiative between Google DeepMind and Google's Sustainability teams. The program focuses heavily on biodiversity foundation models to position Google at the forefront of the critical ESG (Environmental, Social, and Governance) infrastructure market.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Japan Relaunch:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Marking a major strategic re-entry into one of Asia's most vital technology hubs.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;The hive mind opportunity&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To maximize the power of this unique network, earlier this year we successfully transitioned our disparate regional alumni networks into a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Unified Alumni Community&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. We now bring together more than 1,750 startups and 3,000 founders across 90+ countries through shared online channels and the opportunity to attend in-person events, where founders get access to Google senior leadership and our newest models and tech, opportunities to directly influence the development of new Google products to better support their businesses’ growth, and learn from and support each other. &lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Don't Miss It: Upcoming Demo Days&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The culmination of each of our intense accelerator journeys is &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Demo Day&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, where top-tier cohorts showcase their technical builds and new market-defining concepts. You can watch these milestones live streamed directly via the &lt;/span&gt;&lt;a href="https://www.youtube.com/@GoogleCloudEvents/featured" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Google for Startups events on YouTube&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Mark your calendar for the remaining 2026 showcases:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Summer &amp;amp; Fall 2026&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Africa Accelerator: June 19&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Middle East, North Africa, and Turkey Accelerator: June 26&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Korea Accelerator: July 15&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Brazil Accelerator: July 16&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Europe and Israel DeepMind Accelerator (Robotics): September 11&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;India: September 30&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Winter 2026&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;India Accelerator: November 4&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Southeast Asia Accelerator: November 13&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;North America Accelerator (Energy): November 19&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;South Africa Accelerator: December 11&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Europe and Israel (Energy): December 11&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Global Google.org Accelerator(Government Innovation): December 11&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Open &amp;amp; Upcoming Applications&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you are a founder or CTO looking to radically scale your technical infrastructure, optimize your product market-fit, and gain equity-free support from Google's global talent pool, applications are officially moving.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Applications Open Right Now:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;GFSA Southeast Asia (Leverage the newly launched AI Startup Innovation Corridor connecting SEA to Silicon Valley)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;GFSA China&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Google.org Accelerator: AI for Science&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Thu, 18 Jun 2026 12:51:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/scaling-the-next-generation-of-global-innovation-how-google-supports-top-startups-around-the-world/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_RoJ1zJA.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Scaling the Next Generation of Global Innovation: How Google Supports Top Startups Around the World</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_RoJ1zJA.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/scaling-the-next-generation-of-global-innovation-how-google-supports-top-startups-around-the-world/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Matt Thompson</name><title>Director, Developer Adoption</title><department></department><company></company></author></item><item><title>Agent Factory Recap:  100X engineering with AI agents in Google Antigravity 2.0</title><link>https://cloud.google.com/blog/topics/developers-practitioners/agent-factory-recap-100x-engineering-with-ai-agents-in-google-antigravity-20/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;In this episode of the Agent Factory, I sat down with Rody Davis, one of Google’s top agentic engineers. We dive into the massive shift from traditional IDEs to agent-first platforms, the reality of code reviews in an AI-driven world, and how to use "skills" to perform at a 100X level.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=Dk4MD6TNiWE"
      data-glue-modal-trigger="uni-modal-Dk4MD6TNiWE-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        &lt;img src="//img.youtube.com/vi/Dk4MD6TNiWE/maxresdefault.jpg"
             alt="Episode 6 of the Agent Factory."/&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-Dk4MD6TNiWE-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="Dk4MD6TNiWE"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=Dk4MD6TNiWE"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;This post guides you through the key ideas from our conversation. Use it to quickly recap topics or dive deeper into specific segments with links and timestamps.&lt;/p&gt;
&lt;h2&gt;Google Antigravity 2.0 - What is it?&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://antigravity.google/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity 2.0&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; has evolved from a simple agentic IDE into a full-scale agent-first platform. It now consists of four core pillars: a standalone desktop &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Manager&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; for orchestration, a robust &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;CLI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; for server-side work, an &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;SDK&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; for custom Python-based workflows, and a specialized &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;IDE&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. This unbundled approach allows developers to compose their own environment, managing multiple folders and complex project structures without being forced into a single-workspace layout.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;Rody Davis on 100X Engineering&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We explored the strategies elite engineers use to scale their impact and reduce the "cognitive toil" of daily development.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Scaling Impact and Reducing Toil&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=115s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;01:55&lt;/span&gt;&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Rody explains that AI isn't just about writing code; it's about accelerating the entire lifecycle. He uses agents to write richer test suites and prototype multiple versions of an app before committing to a framework. By offloading "toil", like building marketing sites, he can focus on high-level architecture and problem-solving.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Skills as "Context Cheat Sheets"&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=185s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;03:05&lt;/span&gt;&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;A core philosophy in Rody’s workflow is the use of "Skills." He views skills as a way to compress context for the model. "It’s literally a cheat sheet for the agent," Rody notes. By providing the agent with specific design systems or API documentation, the model becomes significantly faster and more accurate, avoiding the latency of searching through massive, unorganized docs.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Customizations, Skills, and MCP Servers&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=257s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;04:17&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/skills_better2.max-1000x1000.jpg"
        
          alt="skills_better2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;Rody walks us through the customizations tab in Antigravity 2.0, showing how to extend an agent's capabilities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Android CLI:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Building and deploying mobile apps directly from the command line.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Modern Web Guidance:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Grounding the agent in the latest CSS and accessibility standards.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;MCP Servers:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Using the Model Context Protocol to enable features like hot reloading for Flutter and Dart.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The Bonsai Approach to Code Review&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=327s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;05:27&lt;/span&gt;&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Rody compares maintaining a codebase to being a Bonsai artist: constantly pruning to keep things simple. He advocates for flat architectures where state, UI, and data are strictly separated. This makes it easier for a human to "steer" the agent; if the agent starts putting files in the wrong place, the architectural violation is immediately obvious.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/bonsai.max-1000x1000.jpg"
        
          alt="bonsai"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
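&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;The point from the Bonsai section, that a misplaced file should be immediately obvious in a flat architecture, can also be checked mechanically. What follows is a minimal Python sketch, not code from the episode: the folder names "state", "ui", and "data" and the helper "find_layout_violations" are assumptions chosen to mirror the state, UI, and data separation Rody describes.&lt;/p&gt;
&lt;pre&gt;
```python
from pathlib import PurePosixPath

# Hypothetical top-level folders for a flat layout (assumed names, not from the episode).
ALLOWED_LAYERS = ("state", "ui", "data")
EXPECTED_DEPTH = 2  # layer folder plus file name, for example "ui/note_list.py"


def find_layout_violations(paths, allowed=ALLOWED_LAYERS, depth=EXPECTED_DEPTH):
    """Return (path, reason) pairs for files that break the flat layout."""
    violations = []
    for raw in paths:
        parts = PurePosixPath(raw).parts
        if not parts or parts[0] not in allowed:
            violations.append((raw, "outside allowed layers"))
        elif len(parts) != depth:
            violations.append((raw, "not directly inside its layer"))
    return violations


if __name__ == "__main__":
    # Example list of files an agent touched (made-up paths).
    changed = ["ui/note_list.py", "data/notes_repo.py", "ui/helpers/db.py", "utils/misc.py"]
    for path, reason in find_layout_violations(changed):
        print(f"{path}: {reason}")
```
&lt;/pre&gt;
&lt;p&gt;A check like this could run over the files an agent changed before a human review, so the reviewer only has to look at the flagged paths. The allowed folders and depth are parameters, since a real project would likely use its own layer names.&lt;/p&gt;&lt;/div&gt;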
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;Do you review 100% of agent-generated code?&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=431s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;07:11&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Rody’s answer depends on the task. For a marketing site, he focuses on the visual output rather than the code. However, for backend logic, he cares deeply about API contracts and schemas. He recommends writing the first example yourself so the agent can simply "copy the pattern" for the rest of the codebase.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Building Extensions to Solve Daily Friction&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=545s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;09:05&lt;/span&gt;&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;To solve the problem of managing files across multiple Git projects, Rody used Antigravity to build a custom macOS Finder extension in Swift. This tool allows him to filter files by time boxes (today, last week, etc.), demonstrating how agents can build specialized utilities that reduce daily friction.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/extensionscroped.max-1000x1000.jpg"
        
          alt="extensionscroped"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
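&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;Rody's tool is a macOS Finder extension written in Swift, and its code is not shown in this recap. The Python sketch below only illustrates the time-box idea of grouping files by how recently they were modified. The bucket labels and the function names "time_box" and "group_by_time_box" are assumptions made for this example.&lt;/p&gt;
&lt;pre&gt;
```python
from datetime import datetime
from pathlib import Path

# Bucket labels are assumptions for this sketch, not the names used by Rody's tool.
LABELS = ("today", "last week", "older")


def time_box(modified, now):
    """Map a modification time to a coarse time box."""
    # Clamp at zero so files with a future timestamp count as "today".
    age_days = max((now.date() - modified.date()).days, 0)
    if age_days == 0:
        return "today"
    if age_days in range(1, 8):  # one to seven days old
        return "last week"
    return "older"


def group_by_time_box(root, now=None):
    """Group regular files under root by time box, based on filesystem mtime."""
    now = now or datetime.now()
    groups = {label: [] for label in LABELS}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            groups[time_box(modified, now)].append(path)
    return groups
```
&lt;/pre&gt;
&lt;p&gt;Calling "group_by_time_box" on a folder that contains several Git projects returns one list of paths per label, which a UI could then show as filters. Buckets are computed from calendar dates rather than elapsed hours, so a file edited late last night lands in "last week" and not in "today".&lt;/p&gt;&lt;/div&gt;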
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;Do AI engineers still write code by hand?&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=622s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;10:22&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"Oh yeah," Rody says. He still loves the syntax of languages like Go and the challenge of controlling computers. He believes it's vital to understand the building blocks deeply so that when you face a problem two years down the road, you know exactly which "old project" to reach back for.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Powering Personal Websites with Gemma 4&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=702s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;11:42&lt;/span&gt;&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Rody showcases his personal website, which uses &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/agent-factory-recap-how-gemma-4-taught-itself-physics?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemma 4&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and Embedding Gemma to provide dynamic content recommendations offline. By vectorizing post summaries at compile time, the site can suggest related content via a local vector database without needing a live backend server.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/gemma4websitecroped.max-1000x1000.jpg"
        
          alt="gemma4websitecroped"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;The Factory Floor&lt;/h2&gt;
&lt;p&gt;The Factory Floor is our segment for getting hands-on. Here, we moved from high-level concepts to practical code with live demos.&lt;/p&gt;
&lt;h3&gt;Multi-Agent Parallelism in Action&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=842s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;14:02&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this demo, Rody uses a single stream-of-thought voice prompt to build a full-stack application. We watched as Antigravity:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Spun up parallel sub-agents, including dedicated DevOps and QA engineers (see &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=1188s" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;19:48&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;).&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Built a multilingual note-taking app using Vite, Go, and SQLite.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Orchestrated the entire stack via Docker Compose.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Localized the app into five different languages simultaneously.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/subagentscropped.max-1000x1000.jpg"
        
          alt="subagentscropped"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;Unbundling the IDE Ecosystem&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/image2_FHRmWV2.max-1000x1000.png"
        
          alt="image2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=935s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;15:35&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We discussed why Google separated the IDE from the Agent Manager. Rody highlights that this unlocks different workflows: the CLI is perfect for SSH sessions on a Raspberry Pi, while the Agent Manager handles general knowledge work and orchestration across multiple folders.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Turning Documentation into Reusable Skills&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=1541s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;25:41&lt;/span&gt;&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Rody shares his process for turning documentation into skills. He wrote a Go CLI that parses websites into markdown, allowing him to install hundreds of skills for the sites he visits frequently. This ensures the agent always has access to the specific version of the docs he is using.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Rapid Fire: Future Tech Predictions&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/hotjob.max-1000x1000.png"
        
          alt="hotjob"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=1655s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;27:35&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We put Rody on the spot with some controversial takes:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Vibe Coding:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Rody believes a non-technical founder will launch a company using only vibe coding by 2026, but the real test will be maintaining it in years 2 through 5.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Production Failures:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Rody agrees that vibe coding will cause significant production failures, leading to a new hot job for software engineers: consulting to solve those failures.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Codebase Health:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Rody argues that poor codebase health, not context windows, is the biggest bottleneck in AI speed.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Grounding Yourself in a Changing Landscape&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=Dk4MD6TNiWE&amp;amp;t=1870s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;31:10&lt;/span&gt;&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Rody advises engineers to focus on &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;why&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; they were hired: to solve problems and engineer things that didn't exist before. He suggests using AI to provide better communication handoffs between colleagues, making artifacts so easy to approve that they are "ready to sign off" the moment they are handed over.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Conclusion&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The era of agentic engineering is here, but as Rody Davis demonstrated, it requires more architectural discipline, not less. By treating your codebase like a bonsai tree and your agents like an orchestra, you can move past the "toil" and focus on building the frameworks of the future.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Your turn to build&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Are you ready to build anything? We’ve officially launched the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;#NapkinChallenge&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. Take a handwritten sketch of an app idea, use Antigravity 2.0 to build it, and share your creation on social media.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Try Antigravity 2.0:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://goo.gle/4fnXilj" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;antigravity.google&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Join the Challenge:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://goo.gle/4e0AGF6" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;Napkin Challenge Details&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Rody’s personal &lt;/strong&gt;&lt;a href="https://rodydavis.com/" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;website&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://github.com/rodydavis/rodydavis" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repo&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://github.com/rodydavis/skills" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;skills&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Connect with us&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Rody Davis&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; → &lt;/span&gt;&lt;a href="https://goo.gle/Rody-on-X" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://goo.gle/Rody-on-LinkedIn" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Shir Meir Lador&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; → &lt;/span&gt;&lt;a href="https://goo.gle/Shir-on-X" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://goo.gle/Shir-on-LinkedIn" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Thu, 18 Jun 2026 07:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/agent-factory-recap-100x-engineering-with-ai-agents-in-google-antigravity-20/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_with_tree.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Agent Factory Recap:  100X engineering with AI agents in Google Antigravity 2.0</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_with_tree.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/agent-factory-recap-100x-engineering-with-ai-agents-in-google-antigravity-20/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Shir Meir Lador</name><title>Head of AI Engineering, Google Cloud Developer Relations</title><department></department><company></company></author></item><item><title>Cloud Network Insights: end-to-end observability for the Cross-Cloud Network</title><link>https://cloud.google.com/blog/products/networking/cloud-network-insights-end-to-end-cross-cloud-observability/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In today’s digital landscape, the network is no longer confined to a single data center or even a single cloud provider. Enterprises are increasingly adopting cross-cloud strategies, connecting Google Cloud workloads to on-premises environments, other clouds like AWS and Azure, and a vast array of internet-facing applications. While this flexibility drives innovation, it can also introduce significant operational complexity. 
When a user experiences degradation in application performance, the critical question remains: Is it the network, the application, or something else?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are excited to announce the general availability of &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/network-intelligence-center/docs/cloud-network-insights/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Network Insights&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, an out-of-the-box, Google Cloud-native solution that provides comprehensive visibility into network and digital experience performance across complex multi-cloud and hybrid environments.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Closing the visibility gap with active monitoring&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Network Insights, offered in &lt;/span&gt;&lt;a href="https://investors.broadcom.com/news-releases/news-release-details/broadcom-expands-collaboration-google-cloud-cloud-network" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;partnership with Broadcom AppNeta&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, expands your observability beyond Google Cloud to your entire global deployment. By utilizing active synthetic probing, the solution monitors network routes even when no user traffic is present, allowing teams to be proactive rather than reactive.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether the source of degradation is in the cloud, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;on-premises data centers, internet applications,&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; ISPs, or last-mile connectivity, Cloud Network Insights helps you pinpoint the exact location of the bottleneck.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Network Insights integrates directly into the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/stackdriver/docs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Observability suite&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, bringing sophisticated network intelligence into the tools you already use. With Cloud Network Insights, you get:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;End-to-end network path visibility:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Gain a hop-by-hop visualization of the network path between your sources and destinations. Monitor critical metrics like round-trip time (RTT), packet loss, and jitter across networks you don’t directly manage.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Digital experience insights:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Go beyond the network layer to monitor &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;digital experience for web applications&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. Measure DNS resolution times, HTTP response codes, and full browser page-load times to identify whether an application's degradation is due to the network or the application itself.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Proactive detection and alerting:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Use synthetic testing to identify performance dips before they impact your customers. Alarms are integrated with Cloud Monitoring and Cloud Logging, enabling alerting via email, Slack, or PagerDuty.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;SLA validation:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Arm your team with the data needed to verify whether ISPs and service providers are meeting their performance commitments.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Rapid root-cause analysis: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Quickly differentiate between network problems, application-level issues, or browser performance impacts.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Integrated monitoring:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Access metrics and logs directly within Google Cloud, using Cloud Monitoring and Cloud Logging for dashboards and alerting. You can also draw on Google Cloud's open partner ecosystem and on support for the OpenTelemetry protocol for metrics and logs, which allows direct ingestion by OTel SDKs and collectors.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agentic workload monitoring:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Use synthetic testing to monitor network performance and help ensure optimal connectivity to your agents and tools.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/1-network_paths_low_res.gif"
        
          alt="1-network paths low res"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="8nkv4"&gt;Network performance and multi-path routes to/from Google Cloud, AWS, and Azure in one view&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;How it works: active synthetic probing&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Network Insights uses active synthetic probing technology that consists of three main components: &lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Monitoring Points:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You deploy lightweight software agents, called Monitoring Points, into critical network segments, such as a central VPC, a remote branch, or an on-premises data center. These can be deployed as containers or virtual machines.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Synthetic probes:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; These Monitoring Points send small, frequent bursts of synthetic traffic (simulating a user or application) to a target destination. This allows you to monitor performance 24/7, even when no real users are on the network.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Data synchronization:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The Monitoring Points send real-time performance telemetry to a central backend service. This data is then synchronized back to Google Cloud, with metrics exported to Cloud Monitoring, and alarms and events sent to Cloud Logging.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Core capabilities&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Network Insights&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; supports two primary types of monitoring to give you a full picture of your infrastructure:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;1. Network performance monitoring (Layers 3 and 4)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This provides a hop-by-hop visualization of the network between a source and a destination, including:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Metrics captured:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Round-trip time (RTT), packet loss, jitter, and path changes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Single-ended mode:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The Monitoring Point probes an external target (such as a URL, an IP address, or an API endpoint) that doesn't have a Monitoring Point installed.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Dual-ended mode:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The Monitoring Point probes another Monitoring Point. This provides richer data, including precise one-way latency and the ability to detect asymmetric routing (when data takes a different path going out than it does coming back).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
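&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of how the metrics listed above can be derived from a burst of probes, the sketch below summarizes a list of round-trip samples. It is not the product's implementation: the &lt;code&gt;summarize_probes&lt;/code&gt; name, the input shape, and the jitter definition (mean absolute difference between consecutive replies) are assumptions chosen for simplicity, and other jitter definitions exist.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
import statistics


def summarize_probes(rtts_ms):
    """Summarize one burst of probes (hypothetical helper, for illustration only).

    `rtts_ms` holds one entry per probe sent: the round-trip time in
    milliseconds, or None when no reply came back (counted as lost).
    """
    if not rtts_ms:
        raise ValueError("at least one probe is required")
    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    avg_rtt = statistics.mean(received) if received else None
    # Assumed jitter definition: mean absolute difference between
    # consecutive replies that were actually received.
    deltas = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter = statistics.mean(deltas) if deltas else 0.0
    return {"avg_rtt_ms": avg_rtt, "loss_pct": loss_pct, "jitter_ms": jitter}


if __name__ == "__main__":
    # Five probes sent, one lost.
    print(summarize_probes([20.0, 22.0, None, 21.0, 25.0]))
```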
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_d8twiu8.max-1000x1000.png"
        
          alt="2"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="21qbk"&gt;Network path metrics in Google Cloud console&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;2. Digital experience monitoring (Layer 7)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With digital experience monitoring, you can track the end-to-end experience of a web application. Here, you can choose from:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Browser mode:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Uses a real browser, driven by Selenium, to load full web pages, execute JavaScript, and render content. It measures complete page-load times to validate the actual user experience.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;HTTP mode:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Sends synthetic HTTP/S requests to a URL or API endpoint. This is a lightweight check for server availability, response time, and DNS/TLS performance.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3_VbaHlX5.max-1000x1000.png"
        
          alt="3"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Intelligence and automation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Network Insights also offers a variety of monitoring and troubleshooting capabilities. &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Proactive alarms: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Network Insights leverages auto-baselining to establish dynamic performance thresholds based on your historical metric data. If a metric deviates from your defined parameters, the system instantly triggers an event in Google Cloud, routing alerts directly to your team via email, Slack, or PagerDuty.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Monitoring policies:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You can automate monitoring setups across large-scale environments by defining policies that dynamically create or remove paths based on custom tags. For instance, you can automatically track a core web application's performance from specific geographic regions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Root-cause analysis:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Because Cloud Network Insights extends visibility into traditionally "unwatched" areas like ISPs and transit networks, it instantly pinpoints whether a slowdown is occurring within Google Cloud, at the ISP level, or inside another cloud environment like AWS or Azure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;AI-driven insights:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; With integration to Gemini Cloud Assist, you can use natural language to interrogate Cloud Network Insights telemetry alongside your broader infrastructure data. Rather than manually pivoting between dashboards, ask Gemini to cross-reference specific Cloud Network Insights metrics against other Google Cloud metrics, reducing mean time to resolution (MTTR).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
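&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The auto-baselining idea behind proactive alarms can be illustrated with a short Python sketch. It assumes a very simple baseline (historical mean plus a multiple of the standard deviation); the actual algorithm used by Cloud Network Insights is not described here, and the function names and the factor &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;k&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
from statistics import mean, stdev


def dynamic_threshold(history: list[float], k: float = 3.0) -> float:
    """Upper threshold: historical mean plus k sample standard deviations."""
    if not len(history) >= 2:
        raise ValueError("need at least two samples to estimate a baseline")
    return mean(history) + k * stdev(history)


def breaches_baseline(history: list[float], latest: float, k: float = 3.0) -> bool:
    """Return True when the latest sample exceeds the dynamic threshold."""
    return latest > dynamic_threshold(history, k)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With latency samples as &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;history&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, a new sample that breaches the threshold would be the point at which an alert is raised. Because the threshold is recomputed from recent data, it follows the normal behavior of each path instead of relying on one fixed number.&lt;/span&gt;&lt;/p&gt;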
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What customers are saying&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are seeing strong interest from customers looking to simplify their cross-cloud operations. Organizations like Sabre and Pexip are already using Cloud Network Insights to gain visibility into their hybrid environments.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"In an environment as complex and high-scale as Sabre’s, total visibility isn't just a luxury — it's a requirement for operational resilience. Cloud Network Insights will enable us to further shift our posture towards proactive optimization. By providing granular, real-time telemetry across our global cloud footprint, it helps eliminate the traditional 'black box' of the network, allowing our teams to resolve bottlenecks before they impact the traveler experience." &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;- Alfredo Rodriguez, VP of Cloud and Infrastructure, Sabre&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“Cloud Network Insights closes the 'visibility gap' between the private corporate network and the public cloud, empowering our joint customers to pinpoint performance bottlenecks in seconds rather than hours.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; - Alan Davidson, CIO, Broadcom&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=MR6dUJKFU4I"
      data-glue-modal-trigger="uni-modal-MR6dUJKFU4I-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_TJbxQsH.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Pexip improves network health with Cloud Network Insights&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-MR6dUJKFU4I-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="MR6dUJKFU4I"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=MR6dUJKFU4I"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started today&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Navigating complex digital ecosystems shouldn't mean sacrificing visibility. Cloud Network Insights bridges the gap across multi-cloud and hybrid environments by combining deep network performance metrics with digital experience monitoring. Coupled with direct integrations into Google Cloud Observability and Gemini Cloud Assist, your teams are empowered with intelligent alerting, robust SLA validation, and rapid root-cause analysis. We look forward to helping you gain a clearer, unified view of your Cross-Cloud Network.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can get started in the Google Cloud &lt;/span&gt;&lt;a href="https://console.cloud.google.com/net-intelligence/cloud-network-insights/onboarding"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;console&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; today. To learn more:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Explore our&lt;/span&gt;&lt;a href="https://docs.cloud.google.com/network-intelligence-center/docs/cloud-network-insights/overview"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;product documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for deep dives into deploying Monitoring Points and configuring policies.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Check out the latest&lt;/span&gt;&lt;a href="https://docs.cloud.google.com/network-intelligence-center/docs/release-notes"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;release notes&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to stay updated on new features.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Watch the &lt;/span&gt;&lt;a href="https://youtu.be/KJ_Qrztildw?si=XKqpAM9yL44HqsR5" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;overview video&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Hear more about the partnership between Google Cloud and Broadcom: &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;ul&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://youtu.be/XNaFAI5JWnU?si=yLk9SaSK7BbUIxJb" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Alan Davidson, CIO, Broadcom talks with Rob Enns, VP/GM, Google Cloud Networking&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://youtu.be/nBdUPRbEFYw?si=BOJx67Lulrl5QDVR" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Michel Melillo, Head of Network Observability, Broadcom chats with Raj Gulani, Director of Product Management, Google Cloud&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Wed, 17 Jun 2026 19:30:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/cloud-network-insights-end-to-end-cross-cloud-observability/</guid><category>Infrastructure Modernization</category><category>Hybrid &amp; Multicloud</category><category>Developers &amp; Practitioners</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Cloud Network Insights: end-to-end observability for the Cross-Cloud Network</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/cloud-network-insights-end-to-end-cross-cloud-observability/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Poonam Yadav</name><title>Product Manager</title><department></department><company></company></author></item><item><title>Build and Deploy a Remote MCP Server to GKE in 30 Minutes</title><link>https://cloud.google.com/blog/topics/developers-practitioners/build-and-deploy-a-remote-mcp-server-to-gke-in-30-minutes/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;Build and Deploy a Remote MCP Server to GKE in 30 Minutes&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Integrating context from tools and data sources into LLMs can be challenging, which impacts the ease of development for AI agents. To address this challenge, Anthropic introduced the &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/introduction" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol (MCP)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which standardizes how applications provide context to these models. Developers often want to build an MCP server for their APIs to make them available to fellow developers, allowing them to use it as context in their own applications. Google Kubernetes Engine (GKE) provides a scalable, reliable, and secure environment to deploy these remote MCP servers.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;This guide shows the straightforward process of setting up a secure remote MCP server on GKE.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;MCP transports&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Model Context Protocol follows a client-server architecture. It initially only supported running the server locally using the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;stdio&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; transport. The protocol has since evolved and now supports remote access transports, specifically &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/specification/latest/basic/transports#streamable-http" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Streamable HTTP&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;With Streamable HTTP, the server operates as an independent process that can handle multiple client connections. This transport uses HTTP POST and GET requests. The server must provide a single HTTP endpoint path that supports both POST and GET methods, such as &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;https://example.com/mcp&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. You can learn more about the different transports in the &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/docs/concepts/architecture#transport-layer" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;official documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
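&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of what travels over Streamable HTTP, the Python sketch below builds (but does not send) a JSON-RPC 2.0 &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;tools/call&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; request for the single MCP endpoint. The helper name &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;build_tool_call&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and the tool name &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;add&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are assumptions for this example. A real client first performs the initialization handshake and handles sessions, which a client library normally does for you.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
import json
import urllib.request

MCP_ENDPOINT = "https://example.com/mcp"  # example endpoint path from the text


def build_tool_call(endpoint: str, name: str, arguments: dict,
                    request_id: int = 1) -> urllib.request.Request:
    """Build a JSON-RPC 2.0 tools/call request for an MCP endpoint."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",  # client-to-server messages are sent with POST
        headers={
            "Content-Type": "application/json",
            # The client accepts either a plain JSON reply or an SSE stream
            "Accept": "application/json, text/event-stream",
        },
    )


request = build_tool_call(MCP_ENDPOINT, "add", {"a": 5, "b": 3})
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The same endpoint path also accepts GET, which a client can use to open a stream for server-initiated messages.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;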
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;Benefits of running an MCP server on GKE&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Running an MCP server remotely on GKE provides several architecture benefits:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Scalability:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; GKE Autopilot is built to handle highly variable traffic. When an MCP server is stateless, like the one in this guide, GKE can scale it horizontally to absorb spikes in demand.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Centralized access:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Teams can share access to a centralized MCP server, allowing developers to connect from local machines, agents, or pipelines instead of running redundant local servers. Updates to the central server immediately benefit everyone.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enhanced security:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The Kubernetes Gateway API combined with SSL certificates provides an easy way to enforce encrypted traffic. Clients can reach the MCP server only over TLS, which protects requests and responses in transit.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;Prerequisites&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before starting, ensure the following tools are installed:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Python 3.10 or higher&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;uv (for package and project management, see the &lt;/span&gt;&lt;a href="https://docs.astral.sh/uv/getting-started/installation/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;installation documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud SDK (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;kubectl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; command-line tool&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;Installation&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Prepare environment variables&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;export PROJECT_ID=$(gcloud config get-value project)
export REGION=us-central1&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a folder, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;mcp-on-gke&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, to store the code for the server and deployment.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;mkdir mcp-on-gke &amp;amp;&amp;amp; cd mcp-on-gke&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now configure the Google Cloud credentials and set the active project.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;gcloud auth login
gcloud config set project $PROJECT_ID&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Start creating the GKE Autopilot cluster in the background. This process takes a few minutes, so starting it now allows the cluster to provision while you complete the rest of the setup. Make sure to use an Autopilot version in which &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-compute-classes" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cost-Optimized Compute (CCOP)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is enabled, which allows fast autoscaling.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;gcloud container clusters create-auto mcp-cluster \
    --region $REGION \
    --release-channel rapid \
    --async&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;uv&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to create a project, which will generate a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;pyproject.toml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;uv init&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Next, create the additional files needed: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;server.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for the MCP server code, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;test_mcp_server.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for testing, and a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Dockerfile&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for the container deployment.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;Math MCP server&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Large language models are excellent at non-deterministic tasks, such as generating text, summarizing ideas, and reasoning about concepts. However, they can be unreliable for deterministic tasks like math operations. To solve this, developers can create tools that provide valuable context. Using &lt;/span&gt;&lt;a href="https://gofastmcp.com/getting-started/welcome" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;FastMCP&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a framework for building MCP servers in Python, it is possible to create a simple math server with two tools: add and subtract.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;First, add FastMCP as a dependency.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;uv add fastmcp&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Copy the following code into &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;server.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to create the server.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code class="lang-py"&gt;import asyncio
import logging

from fastmcp import FastMCP
from starlette.requests import Request
from starlette.responses import PlainTextResponse

logger = logging.getLogger(__name__)
logging.basicConfig(format="[%(levelname)s]: %(message)s", level=logging.INFO)

# Port the server listens on
mcp_port = 3000

# Initialize the FastMCP server
server = FastMCP(
    "Math Server",
)

@server.tool()
def add(a: int, b: int) -&amp;gt; int:
    """Add two numbers together."""
    return a + b

@server.tool()
def subtract(a: int, b: int) -&amp;gt; int:
    """Subtract the second number from the first."""
    return a - b

@server.custom_route("/healthz", methods=["GET"])
async def health_check(request: Request) -&amp;gt; PlainTextResponse:
    """Simple health check endpoint that returns a 200 OK response"""
    return PlainTextResponse("OK")

if __name__ == "__main__":
    logger.info(f"Starting MCP server on port {mcp_port}")
    # The 'sse' transport is also an option. host="0.0.0.0" makes the server
    # listen on all interfaces so it is reachable from outside the container.
    asyncio.run(
        server.run_async(
            transport="streamable-http",
            host="0.0.0.0",
            port=mcp_port
        )
    )&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This example uses the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;streamable-http&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; transport, which is recommended for remote servers. The script encapsulates the logic needed to run a scalable MCP endpoint.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Testing the MCP server locally&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;test_mcp_server.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; script, which connects to the MCP server and exercises its tools. Use it to verify the server before deploying it to GKE.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code class="lang-py"&gt;import asyncio

from fastmcp import Client

# Connect to the MCP server running locally (plain HTTP, no TLS)
client = Client("http://localhost:3000/mcp")

async def test_local_server():
    async with client:
        # Basic server interaction
        await client.ping()

        # List available operations
        tools = await client.list_tools()
        print(f"Available tools: {tools} \n")

        # Execute add operation
        result = await client.call_tool("add", {"a": 5, "b": 3})
        print(f"Result of addition: {result} \n")

        # Execute subtract operation
        result = await client.call_tool("subtract", {"a": 5, "b": 3})
        print(f"Result of subtraction: {result} \n")

if __name__ == "__main__":
    asyncio.run(test_local_server())&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Run the MCP server locally to test the connection:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;uv run server.py&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Then execute the test script in a new terminal to verify the connection.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;uv run test_mcp_server.py&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The output should list the available tools and the results of invoking the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;add&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;subtract&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; tools, confirming that the MCP server is functional.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;Building the container image&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To speed up the deployment process, build the container image while the cluster is still being created.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;First, prepare the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Dockerfile&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;FROM python:3.10-slim
COPY --from=ghcr.io/astral-sh/uv:0.4.15 /uv /bin/uv
WORKDIR /app
COPY pyproject.toml .
COPY server.py .
RUN uv sync
CMD ["uv", "run", "server.py"]&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, set up an Artifact Registry repository and build the container image.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;Set up Artifact Registry&lt;/h2&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;gcloud artifacts repositories create mcp-repo \
    --repository-format=docker \
    --location=$REGION&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;Build and push the image in parallel&lt;/h2&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gcloud builds submit --tag $REGION-docker.pkg.dev/$PROJECT_ID/mcp-repo/math-mcp-server:latest&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857edbc70&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once the image build is complete, verify that the cluster is ready and retrieve its credentials. If the cluster's status is not "RUNNING", wait until it is before fetching the credentials.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gcloud container clusters list\r\ngcloud container clusters get-credentials mcp-cluster --region $REGION&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857edb400&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
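&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The readiness check can also be scripted. The following is a minimal Python sketch, assuming gcloud is installed and authenticated. The helper names cluster_status and list_clusters_json are hypothetical, and the sketch relies on the name and status fields in the JSON listing produced by the --format=json flag.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
import json
import subprocess


def cluster_status(clusters_json, name):
    """Return the status string of the named cluster, or None if it is absent."""
    for cluster in json.loads(clusters_json):
        if cluster.get("name") == name:
            return cluster.get("status")
    return None


def list_clusters_json():
    """Return the JSON listing from gcloud (needs an authenticated gcloud CLI)."""
    result = subprocess.run(
        ["gcloud", "container", "clusters", "list", "--format=json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Calling cluster_status(list_clusters_json(), "mcp-cluster") in a loop with a short sleep gives a simple way to wait for "RUNNING" before fetching credentials.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;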
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;Deploying to GKE with Gateway API and SSL&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The next step is to deploy the server workloads and expose them securely using the &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/how-to/gatewayclass-capabilities" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Kubernetes Gateway API&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; rather than the legacy Ingress. The Gateway terminates TLS with a Google-managed SSL certificate, so traffic between clients and the load balancer is encrypted.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deployment.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file to define the Kubernetes Deployment and Service. Replace the placeholders with your actual project ID and region.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;apiVersion: apps/v1\r\nkind: Deployment\r\nmetadata:\r\n  name: mcp-server\r\nspec:\r\n  replicas: 2\r\n  selector:\r\n    matchLabels:\r\n      app: mcp-server\r\n  template:\r\n    metadata:\r\n      labels:\r\n        app: mcp-server\r\n    spec:\r\n      containers:\r\n      - name: mcp-server\r\n        image: $REGION-docker.pkg.dev/$PROJECT_ID/mcp-repo/math-mcp-server:latest\r\n        ports:\r\n        - containerPort: 3000\r\n        resources:\r\n          requests:\r\n            memory: &amp;quot;256Mi&amp;quot;\r\n            cpu: &amp;quot;250m&amp;quot;\r\n          limits:\r\n            memory: &amp;quot;512Mi&amp;quot;\r\n            cpu: &amp;quot;500m&amp;quot;\r\n        livenessProbe:\r\n          httpGet:\r\n            path: /healthz\r\n            port: 3000\r\n          initialDelaySeconds: 15\r\n          periodSeconds: 20\r\n        readinessProbe:\r\n          httpGet:\r\n            path: /healthz\r\n            port: 3000\r\n          initialDelaySeconds: 5\r\n          periodSeconds: 10\r\n---\r\napiVersion: v1\r\nkind: Service\r\nmetadata:\r\n  name: mcp-service\r\nspec:\r\n  selector:\r\n    app: mcp-server\r\n  ports:\r\n  - port: 80\r\n    targetPort: 3000&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857edbd30&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
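&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Note that kubectl does not expand shell variables inside a manifest, so the $REGION and $PROJECT_ID placeholders must be filled in before the file is applied. One way to do that is sketched below in Python, assuming both values are known. The helper name render_manifest is hypothetical.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
from string import Template


def render_manifest(text, values):
    """Fill $NAME style placeholders in a manifest string.

    Template.substitute raises KeyError when a placeholder has no value,
    so a forgotten variable fails loudly instead of reaching the cluster.
    """
    return Template(text).substitute(values)


# Example with made-up values; in practice pass os.environ after exporting
# REGION and PROJECT_ID in your shell.
line = "image: $REGION-docker.pkg.dev/$PROJECT_ID/mcp-repo/math-mcp-server:latest"
rendered = render_manifest(line, {"REGION": "us-central1", "PROJECT_ID": "my-project"})
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Reading deployment.yaml, passing its contents through render_manifest, and writing the result back produces a file that kubectl can apply as-is.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;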
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Apply this configuration to the cluster:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;kubectl apply -f deployment.yaml&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f486a3ab580&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Check that the pods are up and running:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;kubectl get pods&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f486a3ab790&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To confirm that our remote MCP Server is accessible, let's try to reach it with a port-forward:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;kubectl port-forward svc/mcp-service 8080:80&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f48696118e0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Run the test script to verify the connection. Make sure to edit the MCP Server URL in the test script to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;http://localhost:8080/mcp&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;uv run test_mcp_server.py&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4869611ee0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
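&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If the test script fails, a quick probe of the health endpoint helps separate connectivity problems from MCP-level ones. The following is a minimal sketch, assuming the server answers /healthz with HTTP 200 (as the probes in deployment.yaml imply) and that the port-forward is still running. The helper name is_healthy is hypothetical.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
import urllib.error
import urllib.request


def is_healthy(base_url, path="/healthz", timeout=5.0):
    """Return True if a GET on base_url + path answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx responses.
        return False
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With the port-forward active, is_healthy("http://localhost:8080") should return True once the pods report ready.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;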
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now let's secure the connection. To do so, we'll use a Google-managed SSL certificate and attach it to a Gateway API resource. First, reserve a static IP address for your load balancer:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gcloud compute addresses create mcp-server-ip --global\r\nexport MCP_SERVER_IP=$(gcloud compute addresses describe mcp-server-ip --global --format=&amp;quot;value(address)&amp;quot;)\r\necho &amp;quot;Your IP: $MCP_SERVER_IP&amp;quot;&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4869611c70&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Point your domain's DNS &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;A&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; record at &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$MCP_SERVER_IP&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. Example: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;mcp.yourdomain.com&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a Google-Managed Certificate. Replace &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;mcp.yourdomain.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with your actual domain.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gcloud compute ssl-certificates create mcp-cert --domains mcp.yourdomain.com --global&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4869611700&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
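&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A Google-managed certificate only finishes provisioning once the domain resolves to the load balancer's IP address, so it is worth confirming the DNS record first. The sketch below is one way to check this from Python. The helper names resolve_ipv4 and points_at are hypothetical, and the expected address is the value stored in $MCP_SERVER_IP.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
import socket


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses that hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return {info[4][0] for info in infos}


def points_at(hostname, expected_ip, resolver=resolve_ipv4):
    """Return True if expected_ip is among the addresses for hostname."""
    try:
        return expected_ip in resolver(hostname)
    except socket.gaierror:
        # The name does not resolve yet (record missing or not propagated).
        return False
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The resolver parameter exists only so the check can be exercised without real DNS. In normal use, call points_at with your domain and the reserved IP address.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;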
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gateway.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file to provision the load balancer and configure Transport Layer Security (TLS) termination.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# Gateway: HTTPS load balancer with the managed certificate and static IP\r\napiVersion: gateway.networking.k8s.io/v1beta1\r\nkind: Gateway\r\nmetadata:\r\n  name: mcp-gateway\r\nspec:\r\n  gatewayClassName: gke-l7-global-external-managed\r\n  listeners:\r\n  - name: https\r\n    protocol: HTTPS\r\n    port: 443\r\n    tls:\r\n      mode: Terminate\r\n      options:\r\n        networking.gke.io/pre-shared-certs: mcp-cert\r\n  addresses:\r\n  - type: NamedAddress\r\n    value: mcp-server-ip\r\n---\r\n# HTTPRoute: forward traffic to the MCP Server\r\napiVersion: gateway.networking.k8s.io/v1\r\nkind: HTTPRoute\r\nmetadata:\r\n  name: mcp-route\r\nspec:\r\n  parentRefs:\r\n  - name: mcp-gateway\r\n  hostnames:\r\n  - &amp;quot;mcp.yourdomain.com&amp;quot;\r\n  rules:\r\n  - matches:\r\n    - path:\r\n        type: PathPrefix\r\n        value: /mcp\r\n    backendRefs:\r\n    - name: mcp-service\r\n      port: 80\r\n---\r\n# The GCPBackendPolicy is used to configure session affinity and other backend settings.\r\n# Since MCP Servers are stateful, we enable session affinity. This ensures that\r\n# requests from the same client are sent to the same backend.\r\napiVersion: networking.gke.io/v1\r\nkind: GCPBackendPolicy\r\nmetadata:\r\n  name: mcp-backend-policy\r\nspec:\r\n  default:\r\n    sessionAffinity:\r\n      type: CLIENT_IP\r\n  targetRef:\r\n    group: &amp;quot;&amp;quot;\r\n    kind: Service\r\n    name: mcp-service\r\n---\r\n# The HealthCheckPolicy is used to configure custom health probes for the MCP Server.\r\napiVersion: networking.gke.io/v1\r\nkind: HealthCheckPolicy\r\nmetadata:\r\n  name: mcp-health\r\n  namespace: default\r\nspec:\r\n  default:\r\n    checkIntervalSec: 15\r\n    timeoutSec: 5\r\n    healthyThreshold: 1\r\n    unhealthyThreshold: 2\r\n    logConfig:\r\n      enabled: false\r\n    config:\r\n      type: HTTP\r\n      httpHealthCheck:\r\n        port: 3000\r\n        requestPath: /healthz\r\n  targetRef:\r\n    group: &amp;quot;&amp;quot;\r\n    kind: Service\r\n    name: mcp-service&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f48696111f0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Deploying this configuration creates the infrastructure required to route external traffic securely to the MCP server.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;kubectl apply -f gateway.yaml&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4869611970&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Wait a few minutes for the load balancer to become active and the certificate to provision. You can check the status using &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;kubectl get gateway mcp-gateway&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now try to reach the remote MCP Server by running the test script again. Make sure to edit the MCP Server URL in the test script to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;https://mcp.yourdomain.com/mcp&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;uv run test_mcp_server.py&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4869611670&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;Cleanup&lt;/h2&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;kubectl delete -f deployment.yaml\r\nkubectl delete -f gateway.yaml\r\ngcloud compute addresses delete mcp-server-ip --global\r\ngcloud compute ssl-certificates delete mcp-cert --global\r\ngcloud artifacts repositories delete mcp-repo --location=$REGION\r\ngcloud container clusters delete mcp-cluster --region $REGION&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4869611070&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;Continue reading&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Deploying Model Context Protocol servers to Kubernetes enables new use cases for integrated agents and AI workflows. To dive deeper into these capabilities, explore the following resources:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://modelcontextprotocol.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol documentation&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/gateway-api" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Gateway API documentation&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/jlowin/fastmcp" rel="noopener" target="_blank"&gt;FastMCP Repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Wed, 17 Jun 2026 00:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/build-and-deploy-a-remote-mcp-server-to-gke-in-30-minutes/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Gemini_Generated_Image_33hpsi33hpsi33hp.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Build and Deploy a Remote MCP Server to GKE in 30 Minutes</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Gemini_Generated_Image_33hpsi33hpsi33hp.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/build-and-deploy-a-remote-mcp-server-to-gke-in-30-minutes/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Abdelfettah Sghiouar</name><title>Cloud Developer Advocate</title><department>Google Cloud</department><company></company></author></item><item><title>How customer collaboration is shaping the future of GenAI security with Model Armor</title><link>https://cloud.google.com/blog/topics/developers-practitioners/how-customer-collaboration-is-shaping-the-future-of-genai-security-with-model-armor/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google Cloud, we believe that the best products are built in partnership with our customers. Their feedback and real-world experiences are invaluable in helping refine our services and deliver solutions that truly meet our customers’ needs. In January 2026, our Google Cloud Developer Advocacy team participated in a high-velocity technical sprint with a major Google Cloud customer and a leader in the telecommunications industry.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This collaborative engagement provided us with deep insights, leading to significant enhancements to the information experience of &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Model Armor&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, our service for runtime security for generative and agentic AI.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;Accelerating GenAI adoption through "radical empathy"&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The objective of this engagement was to support the productionization of a next-generation GenAI customer support platform built using Google Cloud's &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Development Kit (ADK)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Platform&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. By sitting directly with the customer's developers and security specialists, we gained a unique opportunity to observe how developers interact with Gemini Enterprise Agent Platform in a live, complex environment.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;This experience provided something traditional documentation cycles cannot replicate: &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;radical empathy&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. By logging friction points as developers worked, we translated functional blockers into technical insights in real time, identifying exactly where developers were hindered by ambiguous configuration guidance or a lack of granular detail.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/image_1_ir5Nrkw.max-1000x1000.png" alt="image_1"&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;Key discoveries from the front lines&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By observing the development workflow firsthand, we identified four critical friction points:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Search-first workflows:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Developers rarely navigate through documentation hierarchies; instead, they rely on search to jump straight to specific code examples. A lack of comprehensive, copy-pasteable snippets for common use cases—like PII redaction—was a primary point of friction.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Balancing confidence levels:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Finding the right balance between comprehensive threat detection and minimizing disruptive false positives proved challenging. For instance, using aggressive settings like "low and above" often caused a high volume of false positives that interrupted legitimate customer support flows.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;The need for granular guidance:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; While the core concepts of Model Armor were understood, developers needed more detail on how different enforcement methods function in practice to balance security with usability.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Integration roadblocks (the 403 error):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; When integrating Model Armor with other services like &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Apigee&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, developers frequently encountered &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;403 PERMISSION_DENIED&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; errors. This indicated a gap in our documentation regarding necessary cross-service IAM roles and permissions.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;Turning insights into action&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The insights gained from this partnership were immediately channeled into a comprehensive overhaul of Model Armor’s documentation and guidance:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Tested, copy-pasteable code samples:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; We have added numerous tested, ready-to-use code samples throughout the documentation to support search-first workflows.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;The confidence level matrix:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; We introduced a new technical reference to help users understand the trade-offs between different filter levels. We now explicitly recommend "High" or "Medium" thresholds for general content to minimize false positives, reserving "Low and above" for high-security threats like prompt injection and jailbreak detection.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Explicit integration guides:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; We updated our integration guides, with a focus on &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Apigee, Gemini Enterprise Agent Platform, and GKE&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. These now clearly outline the specific IAM roles required (&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;such as &lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;code&gt;&lt;span style="vertical-align: baseline;"&gt;roles/modelarmor.user&lt;/span&gt;&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) to ensure smooth, error-free deployments.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Deeper technical documentation:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; We have enhanced the documentation to provide in-depth explanations of enforcement methods and their real-world applications.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
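&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the confidence level trade-off concrete, here is a small illustrative Python sketch. It is not the Model Armor API: the Confidence enum, the threshold names, and the is_flagged helper are assumptions chosen to mirror the settings described above. It only shows why a "low and above" threshold flags more content than a "high" one.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Confidence a detector assigns to a single finding (illustrative)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical threshold names mapped to the lowest confidence they accept.
THRESHOLDS = {
    "LOW_AND_ABOVE": Confidence.LOW,
    "MEDIUM_AND_ABOVE": Confidence.MEDIUM,
    "HIGH": Confidence.HIGH,
}


def is_flagged(finding, threshold):
    """A finding is flagged when its confidence meets or exceeds the threshold."""
    return finding >= THRESHOLDS[threshold]


# The same low-confidence finding is flagged under one setting and not the other.
strict = is_flagged(Confidence.LOW, "LOW_AND_ABOVE")
relaxed = is_flagged(Confidence.LOW, "HIGH")
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Under the strictest setting every finding is flagged, including uncertain ones, which is where false positives in ordinary support conversations tend to come from.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;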
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;The power of partnership&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Getting "in the room" with our customers allowed us to bridge the gap between technical accuracy and operational utility. This journey of co-innovation ensures that Model Armor serves as a genuine catalyst for your success. We encourage you to explore the updated documentation and share your feedback as we continue to build the most secure platform for your GenAI workloads.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started:&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Explore the updated &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/model-armor/overview" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Armor documentation&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt; &lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 16 Jun 2026 07:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/how-customer-collaboration-is-shaping-the-future-of-genai-security-with-model-armor/</guid><category>Developers &amp; Practitioners</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>How customer collaboration is shaping the future of GenAI security with Model Armor</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/how-customer-collaboration-is-shaping-the-future-of-genai-security-with-model-armor/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Darshana Bhangare</name><title>Technical Writer</title><department>Google Cloud</department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Leonid Yankulin</name><title>Senior Developer Relations Engineer</title><department></department><company></company></author></item><item><title>How I learned Go in a Day with Antigravity 2.0 and How You Can Do the Same</title><link>https://cloud.google.com/blog/topics/developers-practitioners/how-i-learned-go-in-a-day-with-antigravity-20-and-how-you-can-do-the-same/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I have been exploring how&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; to reclaim my software stack from NPM dependency overhead and replace my resource-intensive Node.js runtime with a compiled, single-binary Go CLI. 
The result of my efforts is &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;skl&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a fast tool we use for managing Agent Skills that launches in 2 ms and uses only 11 MB of memory.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;But how exactly did I do it?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Put simply, I set the architectural goals and audited the logic, while Antigravity handled the mechanical work of code translation, test generation, and platform path mappings. This post walks through our migration workflow step by step to help you build yours.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 0: Seed personal learning goals&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before writing any code, start by defining the boundaries of your project. In our case, I wanted a zero-dependency core that used minimal external packages. I decided that our CLI tool needed to be fast and that our security model had to be zero-trust wherever appropriate. In the process, my agent added specific constraints: sanitizing all of our inputs, blocking path traversals, and enforcing depth limits on our folder scans to prevent CPU hangs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I began by prompting Gemini to audit alternative stacks and help us weigh their tradeoffs.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;Research online and identify 3-5 CLI tool building alternatives to use over TS and explain why (focus on performance and security) with specific example and links&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8520&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Here are some alternatives we considered:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Rust&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; was exceptionally performant, but navigating its borrow checker rules and managing its lifetime annotations added too much friction for our simple symlinking tool.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;If you choose &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Python&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, you will have to distribute a runtime interpreter and manage virtual environments, dragging in packaging overhead via &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;pip&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; that we wanted to avoid.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Zig&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; offered excellent low-level memory controls and compiling speed, but it lacked high-level standard library abstractions for HTTP operations and archive extraction out of the box.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Compiled &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Swift&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; provided clean scripting on macOS, but its cross-platform compilation capabilities for Windows and Linux were less suited for our multi-platform requirements.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For us, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Go&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; struck the right balance: it gave us synchronous, linear code, fast compilation, and a rich standard library.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To avoid duplicating work that someone had already done, I kicked off the project by asking directly:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;I want to port the `npx skills` to go. Did anyone do this before?&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8040&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The agent researched the web and verified that there was no official Go port of the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;vercel-labs/skills&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; repository. It confirmed that while the official CLI is TypeScript-based and distributed via &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;npm&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, the Agent Skills specification itself is open and language-agnostic. This meant we were free to build a compiled Go port from scratch.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;And since I want to learn in the process, I also asked for Go-specific tips, tricks, and traps:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;Identify 3-5 patterns on how to / how NOT to use GO and explain them to me&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8310&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 1: It's about Skills&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To follow best practices in a language that I'm not familiar with, I decided to find the most popular, well-received Agent Skill (instructions that guide AI coding assistants) and install it before writing any code or even starting to plan. Grounding the environment first ensures that any code written or planned afterwards conforms to the community's consensus style.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Skill search prompt&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I asked the agent what community agent skills were available for Go:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;what are the top community agent skills for `go`?&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8100&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once the agent suggested &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;samber/cc-skills-golang&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, I directed it to install the skill pack:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;add all skills from samber/cc-skills-golang&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8430&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once installed, I manually verified that the skill was discovered and ready by typing &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;/golang-&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to invoke autocompletion.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 2: Gap analysis and planning&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I initialized the architectural goals by providing the agent with the following instruction:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;Plan 100% functionality port of `npx skills` to Go, focusing on safety, best practices, and with 90% unit test coverage. Pull the repo and map things out. Ask me any questions.&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd87c0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our first task was the dynamic onboarding flow. When asked what the default should be, I suggested prompting to install &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;antigravity-cli&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; if no agent is found. I also defined the fallback to the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;universal&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory when multiple active agents are detected:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;quot;For the MVP, we target Antigravity 2 support as default and fallback to universal through the standards-compliant &amp;#x27;.agents&amp;#x27; directory (if multiple agents detected).&amp;quot;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8880&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Implementation&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;After I approved the plan, Antigravity systematically converted all 51+ agent configuration records from TypeScript to Go, mapping distinct directories for Aider, Claude Code, Cursor, Zed, and others so that every environment was covered. I hadn't explicitly asked for all of this, but the AI correctly judged the task simple enough to include in the MVP scope.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The core structures are conveniently located in one file &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/blob/main/src/skl/types.go" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;types.go&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;type AgentType string\r\n\r\ntype AgentConfig struct {\r\n\tName                string\r\n\tDisplayName         string\r\n\tSkillsDir           string\r\n\tGlobalSkillsDir     string\r\n\tShowInUniversalList bool\r\n\tDetectInstalled     func(home, configHome, cwd string) bool\r\n}\r\n\r\n...&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8790&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This mapping works well. For example, the detection logic for Zed handles Linux (Flatpak), macOS, and Windows configurations dynamically in just a few lines:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;&amp;quot;zed&amp;quot;: {\r\n\tName:        &amp;quot;zed&amp;quot;,\r\n\tDisplayName: &amp;quot;Zed&amp;quot;,\r\n\tSkillsDir:   &amp;quot;.agents/skills&amp;quot;,\r\n\tGlobalSkillsDir: filepath.Join(home, &amp;quot;.agents/skills&amp;quot;),\r\n\tDetectInstalled: func(h, c, w string) bool {\r\n\t\treturn exists(filepath.Join(c, &amp;quot;zed&amp;quot;)) ||\r\n\t\t\t(zedAppDataHome != &amp;quot;&amp;quot; &amp;amp;&amp;amp; exists(filepath.Join(zedAppDataHome, &amp;quot;Zed&amp;quot;))) ||\r\n\t\t\t(zedFlatpakConfigHome != &amp;quot;&amp;quot; &amp;amp;&amp;amp; exists(filepath.Join(zedFlatpakConfigHome, &amp;quot;zed&amp;quot;)))\r\n\t},\r\n}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8610&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
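&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the selection rules from the planning step concrete, here is a minimal, self-contained sketch of how detection results could be turned into an install target. It is not the code from the repository: the helpers &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;detectAgents&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;chooseTarget&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, the body of &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;exists&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, the trimmed-down struct, and the simplified Zed detector are assumptions for illustration, as is the rule that exactly one detected agent becomes the target.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// AgentConfig is a trimmed-down version of the struct shown earlier.
type AgentConfig struct {
	Name            string
	SkillsDir       string
	DetectInstalled func(home, configHome, cwd string) bool
}

// exists reports whether a path is present on disk (assumed helper body).
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// detectAgents returns the sorted names of all agents whose detector fires.
// Hypothetical helper, not taken from the skl repository.
func detectAgents(agents map[string]AgentConfig, home, configHome, cwd string) []string {
	var found []string
	for name, cfg := range agents {
		if cfg.DetectInstalled(home, configHome, cwd) {
			found = append(found, name)
		}
	}
	sort.Strings(found)
	return found
}

// chooseTarget applies the MVP rules: no agent means the Antigravity default,
// several agents mean the universal directory. Treating exactly one detected
// agent as the target is an assumption of this sketch.
func chooseTarget(found []string) string {
	switch len(found) {
	case 0:
		return "antigravity"
	case 1:
		return found[0]
	default:
		return "universal"
	}
}

func main() {
	// Use a throwaway directory as the config home so the demo is repeatable.
	configHome, err := os.MkdirTemp("", "skl-demo-")
	if err != nil {
		fmt.Println("temp dir:", err)
		return
	}
	defer os.RemoveAll(configHome)

	agents := map[string]AgentConfig{
		"zed": {
			Name:      "zed",
			SkillsDir: ".agents/skills",
			// Simplified detector: only checks the config home.
			DetectInstalled: func(h, c, w string) bool {
				return exists(filepath.Join(c, "zed"))
			},
		},
	}

	// Nothing installed yet, so the default applies.
	fmt.Println(chooseTarget(detectAgents(agents, "", configHome, ""))) // prints "antigravity"

	if err := os.Mkdir(filepath.Join(configHome, "zed"), 0o755); err != nil {
		fmt.Println("mkdir:", err)
		return
	}
	// Exactly one agent is now detected.
	fmt.Println(chooseTarget(detectAgents(agents, "", configHome, ""))) // prints "zed"
}
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Keeping detection (which touches the filesystem) separate from the decision (a pure function over a list of names) makes the none, one, and many cases easy to cover with table-driven tests.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;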
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Next, I noticed that the Antigravity user onboarding code was intermingled with the automated mapping. A default like this one is a personal user choice and is better suited for isolation in its own file: &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/blob/main/src/skl/agy-onboarding.go" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;agy-onboarding.go&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;move default Antigravity 2 prompting to agy-onboarding.go&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8640&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With version zero scaffolded, it was time to test.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 3: Enforcing a quality assurance (QA) loop&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To guarantee that the Go port behaved identically to the original TypeScript CLI, we adopted a Test-Driven Development (TDD) loop. I kicked it off with this prompt:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;Apply TDD principles and https://preslav.me/2026/05/19/10-golang-error-handling-commandments/&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8850&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This initiated the TDD process. Rather than explicitly prompting the agent to use skills, I guided it to fetch the third-party best-practice blog post, which reminded the agent about relevant Agent Skills (&lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/blob/main/.agents/skills/golang-how-to/SKILL.md" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;golang-how-to&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/blob/main/.agents/skills/golang-testing/SKILL.md" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;golang-testing&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/blob/main/.agents/skills/golang-error-handling/SKILL.md" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;golang-error-handling&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/blob/main/.agents/skills/golang-cli/SKILL.md" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;golang-cli&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;). Because Antigravity has a sandbox, it parsed these skills and automatically started executing the QA loop. It will keep re-applying these TDD principles throughout the current trajectory whenever it is about to change functional code.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Test-first frontmatter parsing&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For frontmatter parsing, the agent wrote &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/blob/main/src/skl/frontmatter_test.go" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;frontmatter_test.go&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; first using Go's table-driven test pattern (which was a delightful new pattern for me to discover):&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;func TestParseFrontmatter(t *testing.T) {\r\n\ttests := []struct {\r\n\t\tname        string\r\n\t\traw         string\r\n\t\twantData    map[string]interface{}\r\n\t\twantContent string\r\n\t}{\r\n\t\t{\r\n\t\t\tname:        &amp;quot;valid frontmatter&amp;quot;,\r\n\t\t\traw:         &amp;quot;---\\nname: my-skill\\n---\\n# Content\\n&amp;quot;,\r\n\t\t\twantData:    map[string]interface{}{&amp;quot;name&amp;quot;: &amp;quot;my-skill&amp;quot;},\r\n\t\t\twantContent: &amp;quot;# Content\\n&amp;quot;,\r\n\t\t},\r\n\t}\r\n\tfor _, tt := range tests {\r\n\t\tt.Run(tt.name, func(t *testing.T) {\r\n\t\t\tgotData, gotContent, err := ParseFrontmatter(tt.raw)\r\n\t\t\t// assert results...\r\n\t\t})\r\n\t}\r\n}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8760&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When Antigravity ran &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;go test&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, it failed cleanly as we expected. My agent then generated &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/blob/main/src/skl/frontmatter.go" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;frontmatter.go&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, implementing a linear string scanning loop that splits the document and unmarshals its YAML metadata. By using simple linear scanning instead of complex regular expressions, we hardened our tool against Regular Expression Denial of Service (ReDoS) vulnerabilities that could crash the application. Including &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;safety&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; as a goal (in my initial prompt) resulted in safer code, even though the original Node implementation was using regular expressions.&lt;/span&gt;&lt;/p&gt;
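&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The linear scanning idea can be sketched with the standard library alone. The version below is an illustration, not the contents of the repository's frontmatter.go: the function names &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;splitFrontmatter&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;parseKeyValues&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are hypothetical, and instead of unmarshalling real YAML it only reads flat key/value lines so that the example needs no third-party package.&lt;/span&gt;&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitFrontmatter separates a leading "---" delimited block from the body by
// walking the lines once. No regular expressions are involved, so the cost
// stays linear in the input size.
func splitFrontmatter(raw string) (meta string, content string, err error) {
	lines := strings.SplitAfter(raw, "\n")
	if len(lines) == 0 || strings.TrimRight(lines[0], "\r\n") != "---" {
		// No opening delimiter: the whole document is content.
		return "", raw, nil
	}
	for i, line := range lines[1:] {
		if strings.TrimRight(line, "\r\n") == "---" {
			// i indexes lines[1:], so the closing delimiter is lines[i+1].
			meta = strings.Join(lines[1:i+1], "")
			content = strings.Join(lines[i+2:], "")
			return meta, content, nil
		}
	}
	return "", "", errors.New("frontmatter: missing closing delimiter")
}

// parseKeyValues reads flat "key: value" lines. This is a deliberate
// simplification standing in for a real YAML unmarshal step.
func parseKeyValues(meta string) map[string]string {
	data := make(map[string]string)
	for _, line := range strings.Split(meta, "\n") {
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		data[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return data
}

func main() {
	meta, content, err := splitFrontmatter("---\nname: my-skill\n---\n# Content\n")
	if err != nil {
		fmt.Println("parse:", err)
		return
	}
	fmt.Printf("name=%q content=%q\n", parseKeyValues(meta)["name"], content)
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because every line is visited at most once, a malicious or malformed document cannot trigger the catastrophic backtracking that some regular expression engines suffer from.&lt;/span&gt;&lt;/p&gt;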
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Grounding via error commandments&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Since we're talking about error handling, I'll cover here how we aligned our error structures with Preslav Rachev's &lt;/span&gt;&lt;a href="https://preslav.me/2026/05/19/10-golang-error-handling-commandments/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;10 Golang Error Handling Commandments&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Go requires you to return error values explicitly rather than catching them as exceptions. By integrating these rules, I directed the agent to check its errors immediately at every level (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;if err != nil&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) and wrap them with contextual detail (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;fmt.Errorf("action: %w", err)&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) before it propagates them up our call stack. While doing a final review of the generated code, I realized Antigravity forgot about this best practice, so I reminded it:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;quot;shorten error messages in all files, remove &amp;#x27;failed to&amp;#x27; prefixes, etc. See the 10 golang commandments&amp;quot;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8160&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It promptly &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/commit/59822bc69464a5fce961231ef56ac0e775855aeb" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;fixed&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; them across the codebase.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Are unit tests enough?&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The short answer is &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;No&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To ensure that the AI did not introduce subtle bugs or hallucinations during the translation process, I performed code reviews rather than blindly trusting passing test suites.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When I audited the generated tests, I realized that passing green checks alone weren't enough: We were missing tests for that long list of installation locations and the various combinations of having no agents, a single agent, or multiple agents active at the same time. Since this was a complete rewrite, I wanted end-to-end integration coverage for these journeys. To address this gap, I prompted Antigravity with a set of targeted scenarios:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;Add integration tests:\r\n1. no agents installed: verify that it installs to antigravity and outputs the agy-cli onboarding tip.\r\n2. support for all agents but one\r\n3. exactly one agent installed, including cases where the same path might be attributed to multiple agents\r\n4. support for non-parametrized agents&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8550&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;Note&lt;/strong&gt;: Non-parameterized agents like Claude Code or Codex define their configuration paths globally when the package loads (or via environment variables) instead of scanning the active workspace folder at runtime.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/commit/02f170e" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;changelist&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; that added these tests didn't touch any production files: the logic was already solid. But I didn't want to leave this to luck. If you care about a specific feature or workflow, you have to be explicit about it. Taking five minutes to verify your end-to-end coverage and define a few solid tests protects your users from a broken release down the line.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 4: Parallel subagents for CLI commands&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When you port a full suite of CLI commands (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;init&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;add&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;list&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;remove&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;find&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;update&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, and so on) along with their sub-options, you face a large surface area. Rather than migrating them sequentially, it can pay to parallelize the work. In our case, it was a good choice: each subagent could focus on its specific topic instead of keeping the entire tool in mind, which helped spot a few gaps.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;However, subagents are not always the best choice; you should only prioritize parallel execution on voluminous, independent tasks that are clearly bounded. When done right, parallel subagents won't consume significantly more tokens than a single long-running thread, but they protect the main coordinator agent from hitting context compression limits under the weight of a massive codebase. Most simple projects do not require this level of scale. A good rule of thumb is to reserve subagents for workloads equivalent to tens of features with tens of subfeatures.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In previous steps, I ran a single agent to quickly and efficiently build an MVP. But I was not sure whether it fully ported the code. So I asked it directly:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;did you cover 100% of the original CLI? \r\nhave subagents research each option individually and each test and fill in the gaps&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8220&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It turned out this was the right call. The subagents conducted an in-depth audit of the commands, catching several option gaps and missing tests that were subsequently integrated in this &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/commit/b9467b6783bbbadbb4236bbde5f49aab7224bd78" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;audit commit&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/mermaid_chart.max-1000x1000.jpg"
          alt="mermaid_chart"&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Each subagent worked on exactly one command. They analyzed flag permutations like &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;-g/--global&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--copy&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, drafted table-driven unit tests, and verified their code compiled cleanly. Once they reported back, the main coordinator integrated their changes, resolved any conflicts, and validated that the entire combined project compiled successfully.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The Elephant and the Goldfish&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To keep our agent focused during this migration, we used the Elephant and Goldfish metaphor, an architectural pattern documented in Google Research's &lt;/span&gt;&lt;a href="https://research.google/pubs/elephants-goldfish-and-the-new-golden-age-of-software-engineering" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Elephants, Goldfish, and the New Golden Age of Software Engineering&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This relies on two distinct roles: the Elephant (the long-term coordinator session holding design rules and codebase memory) and the Goldfish (transient, clean subagents that you spawn to run a single task without background history).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While Antigravity does use automated session compression to manage its context size, you might want to actively manage your context window by maintaining your own checklists and partitioning your work across isolated, transient subagents for cases where &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;less (context) is more (clarity)&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 5: Package structure, compilation, and CI/CD&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Through some back-and-forth with the agent, I learned how Go packages are structured and identified the limitations I needed to consider. I now had a cleanly structured, well-documented package with its entry point in &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/blob/main/main.go" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;main.go&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; that supported native installation:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;go install github.com/alexastrum/skl@latest&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8b50&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I prompted the agent to capture the implementation details and document them for future reference:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;summarize findings for humans in README.md, considerations for agents in AGENTS.md&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8e50&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To verify the build, auto-run tests, and make sure it works on other machines as well, I asked the agent to:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;make sure it builds on all supported platforms&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8b20&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Antigravity set up the &lt;/span&gt;&lt;a href="https://github.com/alexastrum/skl/blob/main/.github/workflows/ci.yml" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ci.yml&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; workflow to run a matrix build, which had a surprising dependency:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;env:\r\n  FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: &amp;quot;true&amp;quot; # HMMMMMM ???\r\njobs:\r\n  test:\r\n    strategy:\r\n      matrix:\r\n        os: [ubuntu-latest, macos-latest, windows-latest]\r\n# ...&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f4857fd8cd0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Unexpected caveats&lt;/span&gt;&lt;/h3&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Paradoxically, even though we migrated from Node to Go, our &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;GitHub pipeline still depends&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; on Node for standard GitHub Actions helpers like &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;actions/checkout&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;actions/setup-go&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The tool is fully ready to be compiled and run locally. However, to distribute pre-compiled binaries to other users, we would need to configure code signing for macOS and Windows.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Since building a custom action with code signing is a complex process, it is best reserved for another time.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 6: Create an Agent Skill&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It was time to document the process itself. To codify this workflow, we &lt;a href="https://github.com/alexastrum/skl/blob/main/.agents/skills/cli-to-go-migration/SKILL.md" rel="noopener" target="_blank"&gt;created a reusable Agent Skill&lt;/a&gt;&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I started by asking the agent to plan a skill creation prompt that included the most important steps:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;Review the current trajectory (including my specific prompts that generated accepted results) and let's plan to create a `/cli-to-go-migration` skill. What steps should the skill follow?&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I got a draft prompt which I iterated upon. After some back-and-forth, I anchored my final instructions on five core rules (though yours might be different). Here's the final prompt I used:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;Review the current trajectory (including my specific prompts that generated accepted results) and let's plan to create a `/cli-to-go-migration` skill. Rules:

#### 1. Goals
The agent must start with research before proposing code. It identifies broader user goals, reviews multiple stack alternatives, and checks for prior work to lock in on one target language and research its idioms.

#### 2. Setup
Before modifying any files, the agent verifies or initializes a Git repository to keep a clean history. Later, it must also report download failures directly and fail gracefully once all independent work is finished, rather than falling back to placeholders or non-terminating loops.

#### 3. Importing existing knowledge
If required grounding skills (like `golang-cli` or `golang-testing`) are missing but are explicitly named in a prompt, the agent blocks execution and offers to install them automatically after asking for confirmation, rather than printing instructions for the developer to follow.

#### 4. Breakpoints
The skill establishes hard halts for known AI pain points. The agent stops for human or algorithmic validation when encountering specific problems and anytime confusion sets in.

#### 5. Alignment checks
Whenever we see signs of misalignment, we need to set explicit rules. For example, when I noticed that the agent was over-editing some docs and missing others, I set the rule that the agent should only apply the `/humanizer` skill to human-facing files, like the `README.md` or help docs, while leaving structured developer context, like `AGENTS.md`, clean of style edits so that other agents can parse its metadata accurately.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;There isn't a one-size-fits-all approach, but asking the agent to create a skill and anchoring it on a few guardrails is a good start. In practice, you will likely polish your prompts over several rounds until the agent's responses align with your goals. Then ask the AI for a proofread, and finish with a human review of the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;SKILL.md&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; contents.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Conclusion&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Rebuilding &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;skl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in Go was a fun, educational experience that solved a personal tooling need. It worked, so I decided to document the process. Looking back through that lens, I realized that &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;the journey itself was the reward&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. You grow as an engineer by codifying your architectural choices into reusable skills and personal experience, while the compiled binary is the physical proof that your process worked.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Surprisingly, the most significant shift I experienced during this migration was behavioral.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Pulling away from an IDE (integrated development environment) and using Antigravity 2.0 made it easier for me to keep a high-level view, and it kept me from diving in to fix the issues that arose during the migration myself. Instead, it pushed me to understand why the issues occurred and to learn Go-specific details.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a traditional IDE, the moment your assistant encounters an issue, your instinct is to grab your keyboard and debug. Operating without an editor forces you to remain the architect, steering the machine from the navigation deck rather than fighting the engine room fires yourself. That's exactly how we learn to manage agents at scale.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Mon, 15 Jun 2026 09:29:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/how-i-learned-go-in-a-day-with-antigravity-20-and-how-you-can-do-the-same/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_Pfswm9P.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>How I learned Go in a Day with Antigravity 2.0 and How You Can Do the Same</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/1_Pfswm9P.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/how-i-learned-go-in-a-day-with-antigravity-20-and-how-you-can-do-the-same/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Alex "Sandu" Astrum</name><title>Developer Relations, Antigravity</title><department></department><company></company></author></item><item><title>10 Indispensable Prompts Our Team Refuses to Build Without</title><link>https://cloud.google.com/blog/topics/developers-practitioners/10-indispensable-prompts-our-team-refuses-to-build-without/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Look at any builder's prompt history and you'll see a collection of highly specific, sometimes chaotic, one-off prompts. 
We use AI to debug a single error message, refactor a messy email, or generate a quick boilerplate.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you sit down with people who consistently ship high-quality work, you'll find something interesting. They aren't just improvising. They have a set of go-to prompts they have tweaked and improved over time and used on nearly every project.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I asked some of my peers and leaders a simple question: "What prompt do you use most often, and why?"&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;What they shared wasn't just a list of arbitrary commands. Here's the unfiltered look at the prompts our team refuses to ship without, and more importantly, why they use them.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Build a spec&lt;/span&gt;&lt;/h3&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Maja Bilić&lt;/span&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Senior Outbound Product Manager • Engineering&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Follow on &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/mbilic/" rel="noopener" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Prompt:&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;Act as a cynical Principal Architect and Technical PM. I want to build a [product] that allows [user] to do [action]. Do not write code. Analyze this concept and list the top 5 technical, UX and architectural considerations. Then ask me key questions for each of the 5 considerations so we can work together on building the spec. Once you have all the answers, create a PRD doc and implementation plan. Don't over-engineer or oversimplify the design or implementation plan.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Why? &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;I have written bad product requirements documents (PRDs), and I have read many bad PRDs. This prompt puts the model in the persona of a cynical Architect / PM who helps distill the idea, critique the approach and concept, and collaborate on defining the most important pieces. This way I work through the plan with an agent's help while also developing the product design idea further. I also love the guardrail against over-engineering or oversimplifying things; AI tends to do both at times, especially when writing product design docs.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Widget tests&lt;/span&gt;&lt;/h3&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Andrew Brogdon&lt;/strong&gt;&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Staff Developer Relations Engineer • Engineering&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Follow on &lt;/span&gt;&lt;a href="https://x.com/redbrogdon" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/redbrogdon/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Prompt:&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;I'd like to partner with you on increasing the robustness of this project by creating widget tests. If you haven't already, please read the Flutter team's skill for creating widget tests (https://github.com/flutter/skills/tree/main/skills/flutter-add-widget-test). Then, let's do these things:

* Examine my application's codebase to identify areas of the UI/UX that are not being tested properly.
* Determine if the existing code is written in a testable way (are dependencies injected? Are domains loosely or tightly coupled? Etc.).
* Determine which domains require more rigor than others.
* Create an overall testing plan for the application.
* Determine which areas of functionality are already aligned with that plan, and which are missing tests.
* Create a plan to implement those tests.
* Execute that plan.

Do not proceed from one step to another unless you are completely confident about your reasoning. You are encouraged to ask as many questions as needed.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Why? &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;My favorite use of agentic coding tools is to actually &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;do&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; all the things I used to feel guilty about not doing in my projects. Proper testing is definitely on that list. The official skills from the Dart/Flutter team do a great job of instructing agents on what good widget tests look like, so combining them with this prompt (which essentially just fits those steps into my own coding workflow) helps me reduce the toil required to maintain reliable, guilt-free codebases.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Find all the tests / Clean-up commit&lt;/span&gt;&lt;/h3&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Aja Hammerly&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Director of Builder Relations • Engineering&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Follow on &lt;/span&gt;&lt;a href="https://x.com/the_thagomizer" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/ajahammerly/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Prompt:&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;Run all the tests and identify any missing tests and write them. Pay special attention to edge cases and race conditions.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;Find any unused code, embarrassing comments, comment to code inconsistencies, unresolved TODOs, or other things in this commit that shouldn't be in there.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Why? &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;I find that when I'm working on code, I'll often get extremely focused on the "happy path", the main path I want a user to take through the code. While I'm focused on that, I'll put in TODO or FIX comments on edge cases I don't want to think about yet. I'll also forget to update comments and sometimes leave debugging comments in. And while I try to follow test-driven development, I don't always get tests in for all the edge cases. I run these two prompts, usually in a new conversation without the development context, as a first round of code review before submitting to an AI or human reviewer for the next step. This ensures that what I've built is in good shape for others to review and use.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Check for correct and compliant permissions&lt;/span&gt;&lt;/h3&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Rich Hyndman&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Head of Antigravity Developer Relations • Engineering &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Follow on &lt;/span&gt;&lt;a href="https://x.com/geekyouup" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/richardhyndman/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Prompt:&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;Run a comprehensive check on this Android project to ensure all permissions are correct and compliant. Perform the following steps:
1. Locate and analyze all 'AndroidManifest.xml' files (including main, debug, and flavor-specific manifests), and extract a master list of declared &amp;lt;uses-permission&amp;gt; tags.
2. Cross-reference these declared permissions against the codebase to verify where they are actually used. Identify any bloatware or unused permissions that can be safely removed.
3. Check the Kotlin/Java source files to ensure that all runtime permissions implement the dynamic runtime permission request flow: 'checkSelfPermission' and 'onRequestPermissionsResult', or the Activity Result API.
4. Verify that any hardware features associated with the permissions (like android.hardware.camera) are correctly declared.
Output your findings as a Markdown report. Provide file paths and suggested code diffs for any fixes. Do not make any file edits until I approve the plan.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Why? &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Antigravity, with Gemini 3.5 Flash and the Android plugin, is an excellent Android development partner! Checking for the correct permissions can keep your app running smoothly and help avoid delays when uploading to the Play Store.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
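&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make step 2 of the prompt concrete, here is a minimal Go sketch of the cross-reference itself: a set difference between the permissions a manifest declares and the ones the sources were seen to reference. The function name and the sample inputs are hypothetical. In a real audit the two lists would come from the merged manifests and from a search of the Kotlin/Java sources, which this sketch does not attempt.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```go
package main

import (
	"fmt"
	"sort"
)

// unusedPermissions returns every declared permission that never appears in
// the list of permissions the code was observed to use, sorted for stable output.
func unusedPermissions(declared, used []string) []string {
	inUse := make(map[string]bool, len(used))
	for _, p := range used {
		inUse[p] = true
	}
	var unused []string
	for _, p := range declared {
		if !inUse[p] {
			unused = append(unused, p)
		}
	}
	sort.Strings(unused)
	return unused
}

func main() {
	// Hypothetical inputs: in a real audit these would come from the manifests
	// and from a search of the Kotlin/Java sources.
	declared := []string{
		"android.permission.CAMERA",
		"android.permission.INTERNET",
		"android.permission.READ_CONTACTS",
	}
	used := []string{"android.permission.INTERNET", "android.permission.CAMERA"}
	for _, p := range unusedPermissions(declared, used) {
		fmt.Println("declared but not referenced:", p)
	}
}
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Treat the result as a list of candidates for review rather than a removal list. A permission can be required by a library dependency without ever being named in your own sources, which is one reason the prompt asks for a report before any edits.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;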
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Conduct code review&lt;/span&gt;&lt;/h3&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Shir Meir Lador&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Head of AI, Developer Relations • Engineering&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Follow on &lt;/span&gt;&lt;a href="https://x.com/shirmeir86" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/shirmeirlador/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Prompt:&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;Act as a strict, highly analytical Principal Engineer conducting a pre-production code review. You have incredibly high standards and zero tolerance for fragile, "happy-path" code. Your goal is to guide me to write bulletproof, production-ready systems.
Grade my uncommitted changes on an A-to-F scale for production readiness.
Do not award an "A" unless my code is exceptionally robust. Specifically, analyze the changes for:
1. Efficiency: Redundant API calls, wasteful database queries, or un-cached resource leaks.
2. Resilience: Silent failure points, lack of explicit error boundaries, and missing rate-limit fallbacks.
3. Architecture: Tight coupling and lack of clear separation of concerns.
For every issue, explain pragmatically where the code is vulnerable to real-world production failures. Then, provide the exact git diffs needed to upgrade my code and earn that "A."&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt; &lt;span style="vertical-align: baseline;"&gt;If you ask an LLM to review your code, it almost always defaults to being polite. It tells you your naming is clean, suggests a few docstrings, and hands you a green checkmark. But polite reviews don't prevent production outages. I like this prompt because it completely cuts through that AI fluff. By forcing the model to grade your work on a harsh scale and demanding a working git diff to fix it, you turn it into a real partner. It stops guessing and starts actually reading your network calls and database queries to find where the code is going to break. It’s like having an uncompromising senior dev sitting over your shoulder, pointing out exactly where you got lazy, and then handing you the exact code to fix it.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;Explain trade-offs to aid decision-making&lt;/h3&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;James O'Reilly&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Staff Developer Relations Engineer • Engineering&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Follow on &lt;/span&gt;&lt;a href="https://x.com/JamesOR" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/jamesor" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Prompt:&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;Explain the pros and cons of executing your suggested Implementation Plan. Be specific about the trade-offs we're making related to performance, cost, security and maintainability so I can make an informed decision on how to proceed.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Why? &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;I force AI to stress-test its own logic. By asking it about the trade-offs being made, I find the AI will rethink its strategy, stay hyper-focused on our specific implementation, and avoid giving vague, hand-wavy responses. I also find this approach prevents AI from acting like the final authority and keeps me in control of the decision-making.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Improve AI-generated code through research&lt;/span&gt;&lt;/h3&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Emma Twersky&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Head of Flutter &amp;amp; Dart Developer Relations • Engineering&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Follow on &lt;/span&gt;&lt;a href="https://x.com/twerske" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/emmatwersky/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Prompt:&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;Research online, focusing on X threads, StackOverflow, GitHub issues and tech blogs for common security pitfalls, architectural misalignments, and subtle logic errors found in AI-generated INSERT_TECH_YOU'RE_USING_HERE code. Based on these findings, generate a manual review checklist specifically for auditing high-risk areas like platform channel validation, deep link routing, and sensitive data logging in crash reports.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Why? &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;While AI can write code 10x faster, it often produces slop: code that looks reasonable but is conceptually buggy because it makes incorrect assumptions about unspecified details. Research shows that up to 40% of AI-generated code contains vulnerabilities, and developers often trust it more than their own, which creates a dangerous mismatch. I use this prompt to generate a targeted checklist that protects against 'rubber-stamping' verbose AI changes and ensures my human judgment focuses on the high-risk 'seams' where models typically fail. Use AI to generate the tasks, but still keep a human in the loop where it matters most.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;Find problems through iteration&lt;/h3&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Fred Sauer&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Head of Frameworks &amp;amp; Languages Developer Relations • Engineering&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Follow on &lt;/span&gt;&lt;a href="https://x.com/fredsa" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/fredsa/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Prompt:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Simplified, my "last" (series of) prompt(s) looks something like:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;- Code review the uncommitted changes.

I prefer being less specific as oversteering can lead to blind spots.
I prefer a new chat session for a fresh set of "eyes".
I iterate until the results returned are boring and I'm satisfied.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If I come into this last phase with an opinion (e.g., the change feels too complex), or I feel I don't have good insight into how "good" the change is, then I might challenge the model with this prompt:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;- Code review the uncommitted changes. Identify any unhandled corner cases. Assess performance. Summarize findings.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Then, having received 5 findings:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;- Fix 1, 3 and 5.&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Why? &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;I don't have ONE last prompt I send. It's more that my change goes through stages. The earliest stage is often about discovery (find the needle or thread to pull on). Then I move on to an existence proof, i.e. I just want the model to prove that the thing I want to do can be done. Then I evaluate: is the PoC reasonable? Too complex? Does it make changes entirely in the wrong place(s)? I then iterate and try to make the solution elegant, both in how it's implemented and in where the changes are made. Once I have something I'm happy with, meaning I'd be happy if I had written it myself, I move on to that last phase, which is code review. This is about finding problems or identifying opportunities to make the change even better. I'm often surprised by the insights the model comes up with.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Review every pull request&lt;/span&gt;&lt;/h3&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Remigiusz Samborski&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Lead Developer Relations Engineer • Engineering&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Follow on &lt;/span&gt;&lt;a href="https://x.com/RemikSamborski" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/remigiusz-samborski/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Prompt:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I use the following prompt embedded in GitHub Actions for most of my engineering projects:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;## Role

You are a world-class autonomous code review agent. You operate within a secure GitHub Actions environment. Your analysis is precise, your feedback is constructive, and your adherence to instructions is absolute. You do not deviate from your programming. You are tasked with reviewing a GitHub Pull Request.


## Primary Directive

Your sole purpose is to perform a comprehensive code review and post all feedback and suggestions directly to the Pull Request on GitHub using the provided tools. All output must be directed through these tools. Any analysis not submitted as a review comment or summary is lost and constitutes a task failure.

[...]&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Full prompt: &lt;/span&gt;&lt;a href="https://github.com/google-github-actions/run-gemini-cli/blob/main/examples/workflows/pr-review/gemini-review.toml" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;link&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Why? &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Using an automated Gemini CLI review in PRs helps catch issues and improvement opportunities during the review process. Additionally, as more code is generated by AI agents and development speed increases, reviews are becoming the bottleneck. By ensuring every PR gets reviewed automatically, human reviewers can focus on the higher-level architectural and conceptual review of the proposed change.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Apply directed acyclic graph analysis for tests&lt;/span&gt;&lt;/h3&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Karl Weinmeister&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Director, Developer Relations • Engineering&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Follow on &lt;/span&gt;&lt;a href="https://x.com/kweinmeister" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/karlweinmeister/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Prompt:&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;Analyze the application workflow as a directed acyclic graph. Identify impactful tests for components, seams across components, and across the system. Present your findings in a markdown table as a prioritized gap analysis.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Why?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Most application workflows aren't linear. When you ask an LLM to suggest tests, you typically get a generic checklist that could apply to any project.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;However, when you force it to think about your system as a Directed Acyclic Graph (DAG) with nodes and edges, it starts reasoning structurally about where things can break.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I’ve also asked it to consider the “seams”, a term from Michael Feathers' Working Effectively with Legacy Code. The term points the model toward the boundaries between components, which are often under-tested.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, I’ve asked the model to summarize the results as a prioritized table of opportunities. This gives your agent a clear roadmap for making your app more resilient.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
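&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the DAG framing concrete, here is a minimal Python sketch of the kind of structure the prompt asks the model to reason about. It is not part of the original prompt: the &lt;code&gt;WORKFLOW&lt;/code&gt; graph, the &lt;code&gt;seam_report&lt;/code&gt; helper, and the fan-in ranking heuristic are all hypothetical illustrations. Only &lt;code&gt;graphlib.TopologicalSorter&lt;/code&gt; comes from the Python standard library.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
WORKFLOW = {
    "parse_request": set(),
    "load_profile": {"parse_request"},
    "fetch_prices": {"parse_request"},
    "build_quote": {"load_profile", "fetch_prices"},
    "send_email": {"build_quote"},
}


def seam_report(graph):
    """Return (components in dependency order, seams ranked by fan-in)."""
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    order = list(TopologicalSorter(graph).static_order())
    # Every edge (upstream, downstream) is a seam between two components.
    seams = [(dep, node) for node in order for dep in sorted(graph.get(node, ()))]
    # Assumed heuristic: seams feeding a node with many inputs come first.
    seams.sort(key=lambda edge: len(graph[edge[1]]), reverse=True)
    return order, seams


if __name__ == "__main__":
    components, seams = seam_report(WORKFLOW)
    print("| Priority | Upstream | Downstream |")
    print("| --- | --- | --- |")
    for rank, (upstream, downstream) in enumerate(seams, start=1):
        print(f"| {rank} | {upstream} | {downstream} |")
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this sketch the nodes are candidates for component tests, the edges are the seams, and a full path from the first node to the last is a candidate for a system test. A real gap analysis would also weigh risk and existing coverage, which this toy ranking ignores.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;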
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Conclusion&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The thread connecting all of these prompts is &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;de-risking human assumptions&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. Whether it's hunting for obscure edge cases, translating developer speak for end-users, or stress testing an architecture before code is written, our team uses AI as an adversarial thinker designed to ask the hard questions we might overlook when we're deep in the weeds.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By building these "must-run" prompts into our daily workflows, we don't just ship faster, we ship with a level of confidence that used to require entire committees to achieve.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Thu, 11 Jun 2026 07:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/10-indispensable-prompts-our-team-refuses-to-build-without/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/10-indispensable-prompts.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>10 Indispensable Prompts Our Team Refuses to Build Without</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/10-indispensable-prompts.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/10-indispensable-prompts-our-team-refuses-to-build-without/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>James O'Reilly</name><title>Staff Developer Relations Engineer</title><department>Google Cloud</department><company></company></author></item><item><title>Choosing your surface: Antigravity 2.0, Antigravity CLI, Antigravity IDE, or Antigravity SDK</title><link>https://cloud.google.com/blog/topics/developers-practitioners/choosing-your-surface-antigravity-20-antigravity-cli-antigravity-ide-or-antigravity-sdk/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity 2.0:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A desktop app to orchestrate multiple autonomous agents working in parallel across independent projects.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity CLI:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A terminal interface designed for command-line workflows and headless execution.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity IDE:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; An editor for developers who want to write code directly alongside an agent.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity SDK:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A Python library for building and deploying your own custom agents that use the Antigravity Harness.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;Quick Comparison&lt;/h4&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;&lt;table border="1" style="border-collapse: collapse; width: 100%; height: 67.1952px;"&gt;
&lt;tbody&gt;
&lt;tr style="height: 22.3984px;"&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Feature&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity 2.0&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity CLI&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity IDE&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity SDK&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="height: 22.3984px;"&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Interface&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;span style="vertical-align: baseline;"&gt;Desktop App&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;span style="vertical-align: baseline;"&gt;Terminal (TUI)&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;span style="vertical-align: baseline;"&gt;Desktop App&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;span style="vertical-align: baseline;"&gt;Python Code&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="height: 22.3984px;"&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Best For&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;span style="vertical-align: baseline;"&gt;Multiple simultaneous tasks&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;span style="vertical-align: baseline;"&gt;Command-line / Headless&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;span style="vertical-align: baseline;"&gt;Directly editing code&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 18.1727%; height: 22.3984px;"&gt;&lt;span style="vertical-align: baseline;"&gt;Building custom agents&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;The Four Surfaces of Antigravity&lt;/h2&gt;
&lt;h3&gt;1. Antigravity 2.0&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The default recommendation. Manages tasks across multiple projects at the same time.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/antigravity-new-chat.max-1000x1000.png" alt="antigravity-new-chat"&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;Antigravity 2.0 is a standalone desktop application. It is designed to let you run multiple tasks without blocking your main workspace. You can easily switch between and monitor different projects from one screen. You can also schedule recurring tasks, for example to check code quality or find outdated packages.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;2. Antigravity CLI&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For terminal workflows and headless execution.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/antigravity-cli.max-1000x1000.png" alt="antigravity-cli"&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;Built in Go for speed, the Antigravity CLI is for those who prefer to work in the terminal with fast, keyboard-driven navigation and simple shortcuts. You can start background agents using terminal commands without locking up your active command-line window. Choose the CLI if you need headless execution (such as working over SSH or inside remote containers).&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;3. Antigravity IDE&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For developers who want to see and edit the code directly.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/antigravity-ide.max-1000x1000.png" alt="antigravity-ide"&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;The IDE surface puts agents directly inside your current workspace. This is the best choice if you want to see exactly what code the agent is editing and accept or reject changes line-by-line. With built-in debugging, the agent can see runtime errors and offer a one-click fix right in your editor.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;4. Antigravity SDK (Python)&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Best for: Writing custom agent logic and automated pipelines.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre class="language-python"&gt;&lt;code&gt;import asyncio
from google.antigravity import Agent, LocalAgentConfig

async def main():
    config = LocalAgentConfig(
        system_instructions="You are an expert assistant for codebase navigation.",
        # api_key="your_api_key_here",
    )
    # The agent is used as an async context manager for the whole chat.
    async with Agent(config) as agent:
        response = await agent.chat("What files are in the current directory?")
        print(await response.text())

if __name__ == "__main__":
    asyncio.run(main())&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://antigravity.google/docs/sdk-overview" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Antigravity SDK&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is a Python library that lets you build your own custom agents from scratch. Because it runs on the same shared harness, you get direct access to the exact same tools and rules that power Google’s official Antigravity tools. You can write an agent locally and deploy it to Google Cloud with zero code changes.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;Summary&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While each interface looks different, they all run on the same underlying agent harness. No matter which of the Antigravity surfaces you choose, you get support for &lt;/span&gt;&lt;a href="https://antigravity.google/docs/plugins" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;plugins&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://antigravity.google/docs/skills" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;skills&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and more. Your agents have access to the same core logic, so pick the one that works best for your project.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;For guides and documentation, visit &lt;/span&gt;&lt;a href="https://antigravity.google" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;antigravity.google&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and when you’re ready to get started, visit the &lt;/span&gt;&lt;a href="https://antigravity.google/download" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity Download Page&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 10 Jun 2026 07:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/choosing-your-surface-antigravity-20-antigravity-cli-antigravity-ide-or-antigravity-sdk/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/gemini_v2_antigravity-surfaces-cover_image.max-600x600.png" 
width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Choosing your surface: Antigravity 2.0, Antigravity CLI, Antigravity IDE, or Antigravity SDK</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/gemini_v2_antigravity-surfaces-cover_image.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/choosing-your-surface-antigravity-20-antigravity-cli-antigravity-ide-or-antigravity-sdk/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Alex "Sandu" Astrum</name><title>Developer Relations, Antigravity</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Luke Schlangen</name><title>Developer Advocate, Google Cloud</title><department></department><company></company></author></item><item><title>Scaling AI Agents: A Step-by-Step Guide to Deploying ADK on GKE Autopilot</title><link>https://cloud.google.com/blog/topics/developers-practitioners/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While building AI agents locally using Google’s Agent Development Kit (ADK) is an excellent way to prototype, production-ready agents require a robust, scalable infrastructure. For developers looking to move beyond simple instances and into the world of managed container orchestration, Google Kubernetes Engine (GKE) Autopilot offers the perfect balance of flexibility and ease of use.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this tutorial, I will walk you through building a technical agent with ADK and deploying it to GKE Autopilot. We will use Gemini on Vertex AI as the core model and follow security best practice by using Workload Identity for permission management.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Understanding the GKE ADK Architecture&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Deploying an ADK agent on GKE Autopilot involves more than just running a container. We leverage GKE's native capabilities to handle scaling and security. Our architecture consists of an ADK-based Python application packaged as a Docker image and stored in Artifact Registry. This container runs as a Deployment on GKE Autopilot, where it communicates securely with Vertex AI using Workload Identity, which maps a Kubernetes Service Account to a Google Cloud IAM Service Account.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To expose the agent to the world, we use the Kubernetes Gateway API, the modern successor to Ingress, which provides a cleaner separation of concerns and native support for Google Cloud Load Balancing.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Prerequisites&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before we begin, ensure you have the following tools and accounts ready:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;Python 3.10 or higher.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;uv&lt;/code&gt; for package management.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud SDK (&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;) installed and configured.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;A Google Cloud project with billing enabled.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;kubectl&lt;/code&gt; command-line tool.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;jq&lt;/code&gt; for parsing JSON responses.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;The following APIs enabled: Kubernetes Engine, Artifact Registry, and Vertex AI.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
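&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you want to confirm the local tooling before continuing, a small preflight script can help. The following is a minimal sketch that uses only the Python standard library. The &lt;code&gt;missing_prerequisites&lt;/code&gt; helper is a hypothetical name, not part of ADK or the Google Cloud SDK, and it only checks that the command-line tools are on your &lt;code&gt;PATH&lt;/code&gt; and that the interpreter is recent enough. It does not verify billing, enabled APIs, or authentication.&lt;/span&gt;&lt;/p&gt;

```python
import shutil
import sys

# Command-line tools named in the prerequisites list above.
REQUIRED_TOOLS = ("uv", "gcloud", "kubectl", "jq")
MIN_PYTHON = (3, 10)


def missing_prerequisites(tools=REQUIRED_TOOLS, which=shutil.which,
                          version=sys.version_info):
    """Return a list of human-readable problems; an empty list means ready."""
    # shutil.which returns None when an executable is not found on PATH.
    problems = [f"{tool} not found on PATH" for tool in tools
                if which(tool) is None]
    # Compare only (major, minor) against the minimum supported version.
    if not tuple(version[:2]) >= MIN_PYTHON:
        problems.append("Python 3.10 or higher is required")
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites()
    for issue in issues:
        print(issue)
    if not issues:
        print("All local prerequisites found.")
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;code&gt;which&lt;/code&gt; and &lt;code&gt;version&lt;/code&gt; parameters exist only so the check can be exercised without changing your machine; in normal use you would call the function with no arguments.&lt;/span&gt;&lt;/p&gt;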
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 0: Configuring Google Cloud and Authentication&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before interacting with Google Cloud services, you must authenticate your environment and set the active project. This ensures that both the &lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt; CLI and your local Python environment can access Vertex AI.&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;Login to Google Cloud SDK&lt;/strong&gt;:&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud auth login&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Set your active project&lt;/strong&gt;:&lt;span style="vertical-align: baseline;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud config set project [PROJECT_ID]&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Set up Application Default Credentials (ADC)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The ADK library needs these credentials to authenticate with Vertex AI during local testing.&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud auth application-default login&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Define Environment Variables&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: To ensure we can easily reuse our configuration in subsequent steps, let's export our project, region, and cluster name as environment variables. &lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;export PROJECT_ID=$(gcloud config get-value project)
export REGION=us-central1
export CLUSTER_NAME=adk-cluster&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;Step 1: Provisioning GKE Autopilot&lt;/h3&gt;
&lt;p&gt;GKE Autopilot is the recommended way to run Kubernetes without managing nodes. It allows you to focus on your agent deployment while Google manages the infrastructure. Starting the cluster creation now allows it to provision in the background while we build the agent.&lt;br/&gt;&lt;br/&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud container clusters create-auto $CLUSTER_NAME --region $REGION&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While the cluster is provisioning, we can move on to building our agent.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 2: Building the Agent with ADK&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, let's create our agent. Start by creating a folder for the agent code:&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;mkdir adk-agent
cd adk-agent&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Initialize a new Python project with uv:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv init&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Add the ADK dependency:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv add google-adk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a new agent using the ADK CLI:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv run adk create weather_agent&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The CLI prompts you for a few settings. Choose &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemini-2.5-flash&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (option 1) as the model for the root agent, then choose &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Vertex AI&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (option 2) as the backend. Finally, enter your Google Cloud project ID and a region, for example &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;us-central1&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The previous command scaffolded a new directory &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;weather_agent&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with the following structure:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;weather_agent/
├── .env
├── __init__.py
└── agent.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;ADK expects the agent code to live in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;agent.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file. Let's edit &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;agent.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to add a simple tool for the agent.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;from google.adk import Agent


# Define a simple tool for the agent
def get_weather(city: str) -&amp;gt; str:
    """Returns the current weather in a city."""
    return f"The weather in {city} is 90 degrees Fahrenheit and sunny."


# Define the root agent with Gemini and register the tool
# (the model matches the one chosen when scaffolding the agent)
root_agent = Agent(
    name="weather_agent",
    model="gemini-2.5-flash",
    tools=[get_weather]
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;agent.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file is the entry point for the agent. It is used to define the agent and its tools. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;get_weather&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; function is a simple tool that returns the current weather in a city. For the purpose of this tutorial, we are using a hardcoded value for the weather. In a real-world scenario, you would use an API to get the current weather.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
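&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough sketch of where the tool could go next, the variant below replaces the hardcoded string with a lookup and returns a structured result with a status field, so the model can tell a hit from a miss. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;FAKE_WEATHER&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; table, its values, and the dictionary layout are assumptions made for illustration; a real tool would call a weather API of your choice instead.&lt;/span&gt;&lt;/p&gt;

```python
# Hypothetical in-memory data standing in for a real weather API.
FAKE_WEATHER = {
    "new york": {"temp_f": 90, "conditions": "sunny"},
    "london": {"temp_f": 64, "conditions": "cloudy"},
}


def get_weather(city: str) -> dict:
    """Returns the current weather for a city, or an error if it is unknown."""
    # Normalize the name so "New York" and " new york " hit the same entry.
    record = FAKE_WEATHER.get(city.strip().lower())
    if record is None:
        return {"status": "error", "message": f"No weather data for {city}."}
    report = (
        f"The weather in {city} is {record['temp_f']} degrees Fahrenheit "
        f"and {record['conditions']}."
    )
    return {"status": "success", "report": report}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because the tool is a plain Python function, you can call it directly to check its behavior before passing it to the agent in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;tools&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; list.&lt;/span&gt;&lt;/p&gt;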
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 3: Testing the Agent Locally&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before deploying the agent to GKE Autopilot, we need to test it locally to ensure it works as expected. Run the following command to start the agent in debug mode with the web UI:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv run adk web&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Open &lt;/span&gt;&lt;a href="http://localhost:8000" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;http://localhost:8000&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in your browser and you should see the ADK web UI. You can then interact with your agent by typing messages in the chat interface.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If the agent returns a message like "The weather in [CITY] is 90 degrees Fahrenheit and sunny.", congratulations: your ADK agent is working. You can now proceed to the next step.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 4: Preparing for GKE Autopilot&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The ADK CLI has a built-in command to deploy the agent to GKE Autopilot. However, its default settings are not suitable for a production environment. For example, they do not use Workload Identity to authenticate with Vertex AI, and they expose the web UI through a load balancer on port 80.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We will instead manage the lifecycle of the container ourselves. First we need to containerize the agent.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;.dockerignore&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;adk-agent&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory to prevent your local virtual environment from being copied into the image:&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;.venv
.adk
__pycache__
*.pyc
.env&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Dockerfile&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for your agent in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;adk-agent&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory. We will use a multi-stage build to keep the final production image lightweight and secure:&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;# Stage 1: Build the virtual environment
FROM python:3.10-slim AS builder

# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Set working directory
WORKDIR /app

# Force uv to use the system Python and use copy instead of symlinks
ENV UV_PYTHON_PREFERENCE=only-system
ENV UV_LINK_MODE=copy
ENV UV_COMPILE_BYTECODE=1
ENV UV_PYTHON=/usr/local/bin/python3

# Install dependencies
# We copy only files needed for installation to maximize cache
COPY pyproject.toml uv.lock ./
# Note: We don't use --frozen yet as the host lock file might be slightly out of sync
# but sync will update it in the builder stage.
RUN uv sync --no-install-project --no-dev --no-cache

# Copy the agent code
COPY . .
# Sync the project itself
RUN uv sync --no-dev --no-cache

# Stage 2: Runtime image
FROM python:3.10-slim

WORKDIR /app

# Copy the pre-built environment from the builder
COPY --from=builder /app/.venv /app/.venv
# Copy the application code (including weather_agent folder)
COPY . .

# Add the environment to the PATH
ENV PATH="/app/.venv/bin:$PATH"
ENV PYTHONUNBUFFERED=1

# Run the ADK API server
# "." is the agents directory (/app), which contains the weather_agent folder
CMD ["adk", "api_server", ".", "--host", "0.0.0.0", "--port", "8080"]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Build and push the image to Artifact Registry:&lt;br/&gt;&lt;br/&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Create repository
gcloud artifacts repositories create adk-repo --repository-format=docker --location=$REGION

# Build and push
gcloud builds submit --tag $REGION-docker.pkg.dev/$PROJECT_ID/adk-repo/gke-agent:latest&lt;/code&gt;&lt;/pre&gt;
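&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The same image path appears again in the Deployment manifest, so it can help to see how it is put together. The helper below is only an illustrative sketch; the function name &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;image_uri&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is hypothetical, and the defaults simply mirror the repository and image names used in this tutorial.&lt;/span&gt;&lt;/p&gt;

```python
def image_uri(project_id: str, region: str,
              repo: str = "adk-repo", image: str = "gke-agent",
              tag: str = "latest") -> str:
    """Builds the Artifact Registry path used by the build and the Deployment."""
    # Docker repositories in Artifact Registry are addressed as
    # REGION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG
    return f"{region}-docker.pkg.dev/{project_id}/{repo}/{image}:{tag}"
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Passing the values of &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;PROJECT_ID&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;REGION&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; exported earlier yields the tag given to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud builds submit&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;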
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 5: Implementing Workload Identity for Security&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Security is paramount. Instead of hardcoding API keys, we use Workload Identity to grant the GKE pod permission to access Vertex AI.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;1. Create an IAM Service Account&lt;/strong&gt;:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud iam service-accounts create adk-gke-sa&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;2. Grant Vertex AI permissions&lt;/strong&gt;:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;3. Allow the Kubernetes Service Account to impersonate the IAM SA&lt;/strong&gt;:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud iam service-accounts add-iam-policy-binding adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:$PROJECT_ID.svc.id.goog[default/adk-ksa]"&lt;/code&gt;&lt;/pre&gt;
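&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The two identifiers in these commands are easy to mistype, so here is a small sketch of how each one is composed. The function names are hypothetical; the defaults reuse the service account names and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;default&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; namespace from this tutorial.&lt;/span&gt;&lt;/p&gt;

```python
def gsa_email(project_id: str, gsa_name: str = "adk-gke-sa") -> str:
    """Email address of the IAM service account created in step 1."""
    return f"{gsa_name}@{project_id}.iam.gserviceaccount.com"


def workload_identity_member(project_id: str, namespace: str = "default",
                             ksa_name: str = "adk-ksa") -> str:
    """Member string that identifies a Kubernetes service account in IAM."""
    # Format: serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The namespace and Kubernetes service account name in the member string must match the ServiceAccount that the pods run as; otherwise the binding has no effect for them.&lt;/span&gt;&lt;/p&gt;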
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 6: Deploying the Agent to GKE&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, we define the Kubernetes resources. Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deployment.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; that includes the Service Account annotation for Workload Identity. Because &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;kubectl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; does not expand shell variables inside manifests, replace &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$PROJECT_ID&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$REGION&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with your actual project ID and region.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;apiVersion: v1
kind: ServiceAccount
metadata:
  name: adk-ksa
  annotations:
    iam.gke.io/gcp-service-account: adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: adk-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: adk-agent
  template:
    metadata:
      labels:
        app: adk-agent
    spec:
      serviceAccountName: adk-ksa
      containers:
      - name: adk-agent
        image: $REGION-docker.pkg.dev/$PROJECT_ID/adk-repo/gke-agent:latest
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits: 
            cpu: "1"
            memory: "1Gi"
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: adk-service
spec:
  selector:
    app: adk-agent
  ports:
  - port: 80
    targetPort: 8080&lt;/code&gt;&lt;/pre&gt;
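&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you would rather not edit the placeholders by hand, one option is to render the manifest with a few lines of Python. This is a minimal sketch that relies only on the standard library; the function name &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;render_manifest&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;SAMPLE&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; excerpt are illustrative assumptions, not part of ADK or kubectl.&lt;/span&gt;&lt;/p&gt;

```python
from string import Template


def render_manifest(text: str, project_id: str, region: str) -> str:
    """Replaces $PROJECT_ID and $REGION, leaving any other text untouched."""
    # safe_substitute ignores placeholders it does not know instead of raising.
    return Template(text).safe_substitute(PROJECT_ID=project_id, REGION=region)


# Two lines taken from the manifest above, used here as sample input.
SAMPLE = (
    "image: $REGION-docker.pkg.dev/$PROJECT_ID/adk-repo/gke-agent:latest\n"
    "iam.gke.io/gcp-service-account: "
    "adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com\n"
)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Read &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deployment.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, pass its text through the function, and write the result back before applying it.&lt;/span&gt;&lt;/p&gt;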
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Apply the configuration:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl apply -f deployment.yaml&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Check the status of the deployment:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl get pods -w&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Once the pods are running, you can use kubectl port-forward to access the agent locally:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl port-forward svc/adk-service 8080:80&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because we deployed the API server without the web UI, there is no chat interface at &lt;/span&gt;&lt;a href="http://localhost:8080" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;http://localhost:8080&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. However, we can still interact with the agent through its HTTP API using &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;curl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a new terminal, run the following commands:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Create a new session
curl -X POST http://localhost:8080/apps/weather_agent/users/u_123/sessions/s_123

# Run a message
curl -s -X POST http://localhost:8080/run \
-H "Content-Type: application/json" \
-d '{
"appName": "weather_agent",
"userId": "u_123",
"sessionId": "s_123",
"newMessage": {
    "role": "user",
    "parts": [{
    "text": "Hey whats the weather in new york today"
    }]
}
}' | jq .&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;curl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; command returns the response as JSON, and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;jq&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; pretty-prints it. You should see a response along these lines (the exact fields may vary with your ADK version):&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;{
    "sessionId": "s_123",
    "messages": [
        {
            "role": "assistant",
            "parts": [
                {
                    "text": "The weather in New York today is sunny with a high of 90 degrees Fahrenheit."
                }
            ]
        }
    ]
}&lt;/code&gt;&lt;/pre&gt;
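&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The same two calls can be made from Python with only the standard library. The sketch below mirrors the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;curl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; commands above and assumes the port-forward is still running; the helper names are hypothetical, and the response is returned as decoded JSON without assuming a particular shape.&lt;/span&gt;&lt;/p&gt;

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumes the port-forward above is running


def session_url(app: str, user: str, session: str, base: str = BASE_URL) -> str:
    """URL that creates a session when it receives a POST."""
    return f"{base}/apps/{app}/users/{user}/sessions/{session}"


def run_payload(app: str, user: str, session: str, text: str) -> dict:
    """Request body for the /run endpoint, mirroring the curl example."""
    return {
        "appName": app,
        "userId": user,
        "sessionId": session,
        "newMessage": {"role": "user", "parts": [{"text": text}]},
    }


def post_json(url: str, body=None):
    """Sends a POST (with a JSON body if one is given) and decodes the reply."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {} if body is None else {"Content-Type": "application/json"}
    request = urllib.request.Request(url, data=data, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To reproduce the example, call &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;post_json&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; once with the session URL and no body, then again with the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;/run&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; URL and the payload.&lt;/span&gt;&lt;/p&gt;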
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;(Optional) Step 7: Exposing via Gateway API and HTTPS load balancer&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, we expose the agent using the GKE Gateway API with a Google-managed TLS certificate. This is the recommended, production-grade approach: Google automatically provisions and renews the certificate for your domain.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;NB: GKE supports other options to provision certificates. You can use Let's Encrypt with cert-manager, pre-shared certificates, or any other certificate authority. You can check the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/secure-gateway#secure-using-ssl-certificate"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for more details.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, reserve a static IP address for your load balancer:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud compute addresses create adk-agent-ip --global
export AGENT_IP=$(gcloud compute addresses describe adk-agent-ip --global --format="value(address)")
echo "Your IP: $AGENT_IP"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Point the DNS &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;A&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; record for your domain at &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$AGENT_IP&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. Example: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;adk.yourdomain.com&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a Google-managed certificate. Replace &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;adk.yourdomain.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with your actual domain:&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud compute ssl-certificates create adk-cert --domains adk.yourdomain.com --global&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gateway.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with the following content:&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;# Gateway: HTTPS load balancer with the managed certificate and static IP
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: adk-gateway
spec:
  gatewayClassName: gke-l7-global-external-managed
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      options:
        networking.gke.io/pre-shared-certs: adk-cert
  addresses:
  - type: NamedAddress
    value: adk-agent-ip
---
# HTTPRoute: forward traffic to the ADK service
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: adk-route
spec:
  parentRefs:
  - name: adk-gateway
  hostnames:
  - "adk.yourdomain.com"
  rules:
  - backendRefs:
    - name: adk-service
      port: 80
---
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: adk-health
  namespace: default
spec:
  default:
    checkIntervalSec: 15
    timeoutSec: 5
    healthyThreshold: 1
    unhealthyThreshold: 2
    logConfig:
      enabled: false
    config:
      type: HTTP
      httpHealthCheck:
        port: 8080
        requestPath: /health
  targetRef:
    group: ""
    kind: Service
    name: adk-service&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Apply the configuration:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl apply -f gateway.yaml&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Certificate provisioning can take up to 20 minutes. Monitor the status with:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud compute ssl-certificates describe adk-cert --global&lt;/code&gt;&lt;/pre&gt;
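&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you want to script the wait, you could check the certificate state programmatically. The sketch below assumes you run the describe command with &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--format=json&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and that the output carries the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;managed.status&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;managed.domainStatus&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; fields; the function name is hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
import json


def certificate_ready(describe_json: str) -> bool:
    """True when the managed certificate and all of its domains are ACTIVE."""
    cert = json.loads(describe_json)
    managed = cert.get("managed", {})
    # domainStatus maps each domain name to its own provisioning state.
    domains = managed.get("domainStatus", {})
    return managed.get("status") == "ACTIVE" and all(
        state == "ACTIVE" for state in domains.values()
    )
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Feed the function the JSON text printed by the describe command and repeat the check every minute or so until it returns &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;True&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;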
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once the status shows &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ACTIVE&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, your agent is live at &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;https://adk.yourdomain.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. You can test it with:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Create a new session
curl -X POST https://adk.yourdomain.com/apps/weather_agent/users/u_124/sessions/s_124

# Run a message
curl -s -X POST https://adk.yourdomain.com/run \
-H "Content-Type: application/json" \
-d '{
"appName": "weather_agent",
"userId": "u_124",
"sessionId": "s_124",
"newMessage": {
    "role": "user",
    "parts": [{
    "text": "Hey whats the weather in new york today"
    }]
}
}' | jq .&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Conclusion &amp;amp; Looking Ahead&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By following these steps, you have successfully deployed a production-ready AI agent built with ADK onto GKE Autopilot that invokes Gemini on Vertex AI with Workload Identity for authentication. This setup ensures that your agent can scale horizontally to meet demand while maintaining a high security posture.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As you look ahead, consider integrating more complex tools or leveraging GKE's multi-cluster capabilities for even greater resilience. For more details on the technologies used here, explore the official &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and the &lt;/span&gt;&lt;a href="https://github.com/google/adk" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ADK repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To avoid ongoing charges, remember to delete the resources created in this tutorial when you are finished. If you skipped the optional Step 7, omit the Gateway, IP address, and certificate commands:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl delete -f gateway.yaml
kubectl delete -f deployment.yaml
gcloud compute addresses delete adk-agent-ip --global
gcloud compute ssl-certificates delete adk-cert --global
gcloud container clusters delete $CLUSTER_NAME --region $REGION
gcloud artifacts repositories delete adk-repo --location $REGION&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</description><pubDate>Thu, 04 Jun 2026 07:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Blog_Hero_Image_Resizing.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Scaling AI Agents: A Step-by-Step Guide to Deploying ADK on GKE Autopilot</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Blog_Hero_Image_Resizing.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Abdel Sghiouar</name><title>Senior Cloud Developer Advocate</title><department></department><company></company></author></item><item><title>Connecting AI agents with unstructured data using Google Cloud Storage MCP Servers</title><link>https://cloud.google.com/blog/topics/developers-practitioners/build-ai-agents-faster-with-gcs-google-cloud-storage-mcp-server/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud Storage (GCS) is a foundational component of the modern agentic tech stack and the preferred home for unstructured data at scale. As enterprises deploy agents in production, the critical focus has shifted to turning data into context and building secure, standardized integrations to access context. This is the core of smart storage: making unstructured data inherently agent-ready by turning passive objects into rich context for reasoning. 
Whether it’s automating complex financial workflows or diagnosing system failures in seconds, AI success now depends on how seamlessly agents can leverage this intelligence to make smart, high-stakes decisions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this blog, we will share &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;three&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; examples of agents built by customers using GCS, and then share how you can securely and reliably connect your agents to GCS using &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (MCP). Combined with smart storage features like auto annotations and object contexts, GCS MCP server makes the whole agent deployment process easy and simple.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Real-world agent success on Google Cloud Storage&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are seeing incredible innovation from customers leveraging MCP and Google’s agentic tech stack to solve complex business problems:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Palo Alto Networks&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; built the Strata Co-Pilot agent, a screen-aware AI assistant that guides network security administrators through complex configuration flows—either by highlighting steps or executing them directly. The agent is powered by the Gemini Live API, with GCS serving as its “historical memory” connected via the GCS MCP server.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Airwallex &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;developed an AI Assistant that understands user context, answers questions, and executes workflows on their behalf. For example, it can smartly analyze expense policy documents and generate detailed approval workflows - a task that would normally take hours to do manually. GCS and GCS metadata are used by the agent to store documents and the extracted information, respectively.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=F0Kw_eD5Y04"
      data-glue-modal-trigger="uni-modal-F0Kw_eD5Y04-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_zruL8XX.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Introducing Airwallex AI Assistant: Your concierge for effortless global finance&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-F0Kw_eD5Y04-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="F0Kw_eD5Y04"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=F0Kw_eD5Y04"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Snap's &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Job Optimization Agent analyzes Flink and Spark job specs, metadata, and historical metrics stored on GCS across thousands of jobs to find optimization opportunities, generate cost estimates, and tune configurations. Using this agent, Snap is already seeing investigation time reduced from 30 minutes to 30 seconds!&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In all these three agents, the GCS MCP server handles data operations as well as enforces standard RBAC and access policies. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Connecting agents to GCS using MCP &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;MCP has rapidly emerged as the &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;universal &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;standard for connecting agents to data sources, but building custom servers from scratch is often a slow, distracting process that diverts focus from innovation. This path introduces significant development overhead and risk, as it forces you to manage everything from authentication and error handling to keeping pace with GCS’s evolving capabilities. To solve this, GCS offers two powerful MCP server options — Remote and Local — allowing you to offload the foundational plumbing and focus on creating value.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;1. Remote MCP server: Fully-managed &lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Connecting your agents to the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/use-cloud-storage-mcp"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Storage MCP server&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; requires zero infrastructure deployment. By simply pointing your agent configuration to the managed endpoint, you gain immediate access to your unstructured data on GCS, allowing you to scale your agentic workloads effortlessly without the burden of operational overhead. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because the Cloud Storage MCP server follows the open MCP standard, it works seamlessly with major agentic frameworks like ADK and is compatible with MCP clients. You can easily connect clients like Google Antigravity and Anthropic’s Claude by adding a Custom Connector in the settings. Simply point it to your Cloud Storage MCP endpoint, and you are ready to start building — no complex configuration files required.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/image1_9FCB2cO.gif"
        
          alt="image1"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Connecting an agent to storage requires robust security and governance. GCS MCP server is built on Google Cloud's standard identity, observability, and security frameworks:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Identity-first security&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Authentication is handled entirely through Identity and Access Management (IAM) rather than shared keys. This ensures agents can only access data (buckets and objects) explicitly authorized by the user.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Full observability&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: To track agent activity, every request and action taken via these MCP servers is logged in Cloud Audit Logs. This provides security teams with a record of every interaction, maintaining visibility alongside ease of access.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;MCP security - content scanning&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: You can optionally configure the MCP endpoint with Google’s content security service, Google Cloud Model Armor. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;This allows you to implement security controls against common MCP attack vectors—such as direct and indirect prompt injection attacks, MCP Tool poisoning attacks, and malicious URL/SQL injections—as well as prevent the leakage of sensitive data.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Storage MCP servers are perfect for most production use cases; however, as with all remote servers, you lose the capability to fully customize your MCP tools.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;2. Local MCP Server: Self-managed for controlled customization&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;While the Remote server handles standard data access, Local MCP is the right choice when you need to build &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;custom tools&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; specific to your business logic. For example, if your agent needs to perform specialized data transformations—such as redacting PII or adding context from another internal system—whenever it reads a file from GCS, a Local MCP server allows you to define those unique capabilities&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The GCS Local MCP server is an open-source &lt;/span&gt;&lt;a href="https://github.com/googleapis/gcloud-mcp/tree/main/packages/storage-mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; of Google-maintained tools that provides you with a reliable bridge to your data. Here are a few tips to keep in mind while designing custom tools:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Provide precise, clear descriptions to minimize incorrect invocations by the models&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Implement model-friendly error handling for models to understand their mistakes and self-correct&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
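&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of these two tips, here is a minimal Python sketch of a custom tool. It does not use the GCS Local MCP server or any MCP SDK: the function &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;read_object_redacted&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;fetch&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; callable, the in-memory &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;FAKE_BUCKET&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; store, and the error fields are all hypothetical names chosen for this example. The docstring plays the role of the tool description, and failures come back as structured messages with a hint instead of a bare exception.&lt;/span&gt;&lt;/p&gt;

```python
import re

# Hypothetical in-memory stand-in for a bucket; a real tool would read from GCS.
FAKE_BUCKET = {
    "reports/q1.txt": "Contact alice@example.com about the Q1 expense report.",
}

# Deliberately simple email matcher, used here as the only PII rule.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_pii(text):
    """Replace email addresses with a placeholder."""
    return EMAIL_PATTERN.sub("[REDACTED_EMAIL]", text)


def read_object_redacted(object_name, fetch=FAKE_BUCKET.get):
    """Read one text object and return its content with email addresses redacted.

    Use this tool only for reading a single object by its exact name.
    It does not list objects and does not accept wildcards.
    """
    if not object_name or "*" in object_name:
        # Model-friendly error: say what went wrong and how to fix the call.
        return {
            "ok": False,
            "error": "invalid_object_name",
            "hint": "Pass one exact object name, for example 'reports/q1.txt'.",
        }
    content = fetch(object_name)
    if content is None:
        return {
            "ok": False,
            "error": "object_not_found",
            "hint": "Check the name for typos; names are case-sensitive.",
        }
    return {"ok": True, "content": redact_pii(content)}


if __name__ == "__main__":
    print(read_object_redacted("reports/q1.txt"))
    print(read_object_redacted("reports/*.txt"))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Returning a hint alongside the error code gives the model something concrete to act on when it retries the call.&lt;/span&gt;&lt;/p&gt;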
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The GCS Local MCP is now also a part of the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/pre-built-tools-with-mcp-toolbox"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;MCP Toolbox for Databases&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a single open-source repository containing connectors for major data services such as GCS, BigQuery, AlloyDB, Spanner, and Cloud SQL, making it easier to monitor and manage your data ecosystem. The Toolbox offers simplified development with reduced boilerplate code, enhanced security through OAuth2 and OIDC, and end-to-end observability with OpenTelemetry integration.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Get started&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you are optimizing an existing process like Snap or automating workflow creations like Airwallex, your unstructured data is one of your agent's greatest assets.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Explore the generally available &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/use-cloud-storage-mcp"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GCS Remote MCP Server&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Check out our GCS Local MCP&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;a href="https://github.com/googleapis/gcloud-mcp/tree/main/packages/storage-mcp" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repository&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to start building custom tools today, or use it as part of &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/pre-built-tools-with-mcp-toolbox"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;MCP Toolbox for Databases&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:storage-ai@google.com"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Reach out&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to us to discuss your Agent use case with GCS data.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Tue, 02 Jun 2026 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/build-ai-agents-faster-with-gcs-google-cloud-storage-mcp-server/</guid><category>Storage &amp; Data Transfer</category><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Hero-image.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Connecting AI agents with unstructured data using Google Cloud Storage MCP Servers</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Hero-image.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/build-ai-agents-faster-with-gcs-google-cloud-storage-mcp-server/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Himanshu Kohli</name><title>Product Manager, Google Cloud</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Manjul Sahay</name><title>Product Manager, Google Cloud</title><department></department><company></company></author></item><item><title>Experimenting with TPUs, GKE Managed DRANET, and Multi-cluster Inference Gateway</title><link>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-tpus-gke-managed-dranet-and-multi-cluster-inference-gateway/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;What happens when your workload fails in one region but you need access to service? This is a common case for availability and uptime. 
With recent enhancements to the Kubernetes ecosystem and capabilities like &lt;/span&gt;&lt;a href="https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Dynamic Resource Allocation (DRA)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://gateway-api-inference-extension.sigs.k8s.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Inference Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;span style="vertical-align: baseline;"&gt;I decided to experiment with these capabilities in Google Cloud, using an AI inference workload as a simple test.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this blog, we will explore this setup and you can also jump straight into the detailed configs in this codelab &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/codelabs/gke-inference-gateway-multi-cluster-tpus-dranet#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Build multi-cluster GKE Inference Gateway, with TPUs , Cloud Storage FUSE and managed DRANET.&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Building blocks &lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To build out this experiment, use the following products, features, and tools:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Google Kubernetes Engine &lt;/span&gt;&lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;(GKE) managed DRANET&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: This is a managed feature that lets you request and share resources among Pods. This supports &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#use-rdma-interfaces-gpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GPUs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#use-non-rdma-interfaces-tpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;TPUs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. In this test TPUs were used in two different regions with networking assigned using managed DRANET.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-multi-cluster-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;Multi-cluster GKE Inference gateway&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: Load balances your AI/ML inference workloads across multiple GKE clusters. This works in a failover situation which is what my experiment intended to test. The type which supports this is the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/gateway-api#gatewayclass"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Multi-cluster Cross-region internal Application Load Balancer&lt;/span&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;code style="vertical-align: baseline;"&gt;gke-l7-cross-regional-internal-managed-mc&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/cloud-storage-fuse/overview"&gt;&lt;strong&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Storage FUSE&lt;/span&gt;&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: Provides a way to store data, models, checkpoints, and logs directly in Cloud Storage. To speed up the deployment, an open source gemma model was downloaded to this storage for retrieval. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Virtual private Cloud (VPC)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The foundational global network providing isolated, secure communication for the internal load balancers and compute nodes&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/fleets-overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Fleets&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: Fleets group the separate regional clusters under a unified management control plane&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/tpu/docs/v6e"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;TPU v6e&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Google's custom AI accelerators that provide the high-performance compute required to serve the model. The VM family type used was the  &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ct6e-standard-4t&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/tpu/docs/v6e#configurations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;2x2 Slice&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Design pattern example&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The aim is to deploy a LLM model (Gemma 3) onto 2 GKE clusters in different regions. Each cluster will use 4 TPU v6e chips. The model should be stored in Cloud Storage. The workload is served using GKE Inference Gateway which supports multi-clusters. The traffic should be routed to the region closest to the user and failover to the other region if one region fails.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1-build.max-1000x1000.png"
        
          alt="1-build"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;div data-draftjs-conductor-fragment='{"blocks":[{"key":"ct469","text":"Putting it together","type":"header-three","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"a673f","text":"To get access to the TPUs for your project in two regions you have to ensure you have the necessary quota in those regions.","type":"unstyled","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":90,"length":15,"key":0}],"data":{}},{"key":"8ufpl","text":"Begin: Set up the environment","type":"unstyled","depth":0,"inlineStyleRanges":[{"offset":0,"length":6,"style":"BOLD"}],"entityRanges":[],"data":{}},{"key":"3hun0","text":"Create a standard VPC, with firewall rules and subnet in the same zone as the reservation.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":9,"length":12,"key":1}],"data":{}},{"key":"afkbe","text":"Create a proxy-only subnet this will be used with the Internal regional application load balancer attached to the GKE inference gateway.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":9,"length":17,"key":2}],"data":{}},{"key":"23sv0","text":"Set up firewall rules allowing traffic and health checks.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"b83on","text":"Reserve static internal IP addresses in both regions for the Gateway.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"5sqev","text":"Provision a Cloud Storage FUSE bucket and configure a dedicated IAM Service Account. 
Bind this to a Kubernetes Workload Identity so your pods can securely mount the bucket and read the model weights directly.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"65eu0","text":"Next: Create standard GKE clusters and node pools","type":"unstyled","depth":0,"inlineStyleRanges":[{"offset":0,"length":49,"style":"BOLD"}],"entityRanges":[],"data":{}},{"key":"3nj2n","text":"Deploy two separate GKE clusters in your chosen regions configured.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"d6395","text":"Enable the Gateway API (--gateway-api=standard) and the Cloud Storage FUSE CSI driver (--addons GcsFuseCsiDriver) during cluster creation.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[{"offset":24,"length":22,"style":"CODE"},{"offset":87,"length":25,"style":"CODE"},{"offset":24,"length":22,"style":"ITALIC"},{"offset":87,"length":25,"style":"ITALIC"}],"entityRanges":[{"offset":55,"length":30,"key":3}],"data":{}},{"key":"37hd5","text":"Create dedicated TPU v6e node pools (ct6e-standard-4t) for both clusters.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[{"offset":37,"length":16,"style":"CODE"},{"offset":37,"length":16,"style":"ITALIC"}],"entityRanges":[],"data":{}},{"key":"e6o1h","text":"Enable managed DRANET on these TPU node pools by setting the flags\n ---accelerator-network-profile=auto, and\n --node-labels=cloud.google.com/gke-networking-dra-driver=true.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[{"offset":68,"length":35,"style":"CODE"},{"offset":110,"length":61,"style":"CODE"},{"offset":68,"length":35,"style":"ITALIC"},{"offset":110,"length":62,"style":"ITALIC"}],"entityRanges":[{"offset":31,"length":14,"key":4}],"data":{}},{"key":"e6iod","text":"Next: Establish the global mesh via Fleet 
Registration","type":"unstyled","depth":0,"inlineStyleRanges":[{"offset":0,"length":54,"style":"BOLD"}],"entityRanges":[],"data":{}},{"key":"8nj7o","text":"Register both GKE clusters to a unified GKE Fleet by following the fleet creation and registration setup.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[{"offset":66,"length":38,"key":5}],"data":{}},{"key":"6f71o","text":"Enable Multi-Cluster Service Discovery and Multi-Cluster Ingress on your fleet.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"cbent","text":"Designate your primary region as the configuration hub to act as the control plane for routing rules across both regions.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[],"entityRanges":[],"data":{}},{"key":"2k3c3","text":"Next: Deploy the AI Workload","type":"unstyled","depth":0,"inlineStyleRanges":[{"offset":0,"length":28,"style":"BOLD"}],"entityRanges":[],"data":{}},{"key":"b56k8","text":"Use a temporary Kubernetes job to download the Gemma 3 (gemma-3-27b-it) model weights directly into your Cloud Storage bucket.","type":"unordered-list-item","depth":0,"inlineStyleRanges":[{"offset":56,"length":14,"style":"CODE"},{"offset":56,"length":14,"style":"ITALIC"}],"entityRanges":[],"data":{}},{"key":"lihp","text":"Define a ResourceClaimTemplate that explicitly requests the managed DRANET device class (deviceClassName: netdev.google.com) with the allocation mode set to 
\"All\".","type":"unordered-list-item","depth":0,"inlineStyleRanges":[{"offset":9,"length":21,"style":"CODE"},{"offset":89,"length":34,"style":"CODE"},{"offset":9,"length":21,"style":"ITALIC"},{"offset":89,"length":34,"style":"ITALIC"}],"entityRanges":[],"data":{}}],"entityMap":{"0":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://docs.cloud.google.com/kubernetes-engine/docs/how-to/tpus#ensure-quota-od-spot"}},"1":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://docs.cloud.google.com/vpc/docs/create-modify-vpc-networks#create-custom-network"}},"2":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://docs.cloud.google.com/load-balancing/docs/proxy-only-subnets#proxy_only_subnet_create"}},"3":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://cloud.google.com/kubernetes-engine/docs/concepts/cloud-storage-fuse-csi-driver"}},"4":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#enable-dra-driver-tpu"}},"5":{"type":"LINK","mutability":"MUTABLE","data":{"url":"https://cloud.google.com/kubernetes-engine/docs/how-to/creating-fleets"}}}}'&gt;
&lt;h2 class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="czag4-0-0"&gt;&lt;span data-offset-key="czag4-0-0"&gt;Putting it together&lt;/span&gt;&lt;/h2&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="4apjo" data-offset-key="31jqe-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="31jqe-0-0"&gt;&lt;span data-offset-key="31jqe-0-0"&gt;To use TPUs in two regions, first make sure your project has the &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/tpus#ensure-quota-od-spot" role="button"&gt;&lt;span data-offset-key="31jqe-1-0"&gt;necessary quota&lt;/span&gt;&lt;/a&gt;&lt;span data-offset-key="31jqe-2-0"&gt; in both regions.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="4apjo" data-offset-key="9e8ff-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9e8ff-0-0"&gt; &lt;/div&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9e8ff-0-0"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Begin:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Set up the environment. &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/create-modify-vpc-networks#create-custom-network"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;standard VPC&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; with firewall rules and a subnet in the same region as the reservation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/load-balancing/docs/proxy-only-subnets#proxy_only_subnet_create"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;proxy-only subnet&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This subnet is used by the internal Application Load Balancer attached to the GKE Inference Gateway.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Set up firewall rules allowing traffic and health checks.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Reserve static internal IP addresses in both regions for the Gateway.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Provision a Cloud Storage FUSE bucket and configure a dedicated IAM Service Account. Bind this to a Kubernetes Workload Identity so your pods can securely mount the bucket and read the model weights directly.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
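&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The networking steps above could be scripted roughly as follows. This is a minimal sketch, not the codelab's exact commands: the resource names (other than the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemma-gateway-ip-REGION&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; address names, which match the Gateway manifest later in this post), the IP ranges, and the default proxy-only subnet purpose are all assumptions. The purpose defaults to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GLOBAL_MANAGED_PROXY&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; because the Gateway below uses a cross-regional class; a purely regional load balancer would use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;REGIONAL_MANAGED_PROXY&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. The Cloud Storage bucket and Workload Identity bindings are not shown.&lt;/span&gt;&lt;/p&gt;

```shell
# Hypothetical names; replace them with your own values.
PROJECT_ID="${PROJECT_ID:-my-project}"
NETWORK="${NETWORK:-gemma-vpc}"
# Assumption: cross-region internal load balancing. Check the linked
# proxy-only subnet documentation for the purpose your setup needs.
PROXY_PURPOSE="${PROXY_PURPOSE:-GLOBAL_MANAGED_PROXY}"

# Create the custom-mode VPC once; it spans both regions.
create_network() {
  gcloud compute networks create "$NETWORK" \
    --project="$PROJECT_ID" --subnet-mode=custom
}

# Create the per-region pieces: node subnet, proxy-only subnet,
# firewall rule, and the static internal IP for the Gateway.
# Usage: create_region_network REGION NODE_RANGE PROXY_RANGE
create_region_network() {
  region="$1"; node_range="$2"; proxy_range="$3"

  gcloud compute networks subnets create "gemma-subnet-$region" \
    --project="$PROJECT_ID" --network="$NETWORK" \
    --region="$region" --range="$node_range"

  gcloud compute networks subnets create "gemma-proxy-$region" \
    --project="$PROJECT_ID" --network="$NETWORK" \
    --region="$region" --range="$proxy_range" \
    --purpose="$PROXY_PURPOSE" --role=ACTIVE

  # Allow load balancer proxies and Google health-check probes
  # to reach the model server port (8000 in this post).
  gcloud compute firewall-rules create "gemma-allow-lb-$region" \
    --project="$PROJECT_ID" --network="$NETWORK" \
    --allow=tcp:8000 \
    --source-ranges="$proxy_range,130.211.0.0/22,35.191.0.0/16"

  gcloud compute addresses create "gemma-gateway-ip-$region" \
    --project="$PROJECT_ID" --region="$region" \
    --subnet="gemma-subnet-$region"
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;After authenticating with gcloud, you would call &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;create_network&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; once and then &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;create_region_network&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; once per region, for example &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;create_region_network us-east5 10.0.0.0/20 10.129.0.0/23&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (example ranges only).&lt;/span&gt;&lt;/p&gt;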
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Create standard GKE clusters and node pools.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy two separate GKE clusters, one in each of your chosen regions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable the Gateway API (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--gateway-api=standard&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) and the&lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/cloud-storage-fuse-csi-driver"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Storage FUSE CSI driver&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--addons GcsFuseCsiDriver&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) during cluster creation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create dedicated TPU v6e node pools (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ct6e-standard-4t&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) for both clusters.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable managed DRANET on these &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#enable-dra-driver-tpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;TPU node pools&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; by setting the flags &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--accelerator-network-profile=auto&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--node-labels=cloud.google.com/gke-networking-dra-driver=true&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
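&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough sketch, the cluster and node pool steps above might translate into the gcloud calls below. The flags named in the list are taken from this post; the cluster, pool, network, and subnet names, the single-node pool size, and the Workload Identity pool setting are assumptions added for illustration.&lt;/span&gt;&lt;/p&gt;

```shell
# Hypothetical names; replace them with your own values.
PROJECT_ID="${PROJECT_ID:-my-project}"
NETWORK="${NETWORK:-gemma-vpc}"

# Create a Standard cluster with the Gateway API and the
# Cloud Storage FUSE CSI driver enabled.
# Usage: create_cluster NAME REGION SUBNET
create_cluster() {
  gcloud container clusters create "$1" \
    --project="$PROJECT_ID" --location="$2" \
    --network="$NETWORK" --subnetwork="$3" \
    --workload-pool="$PROJECT_ID.svc.id.goog" \
    --gateway-api=standard \
    --addons=GcsFuseCsiDriver
}

# Create a TPU v6e node pool with managed DRANET enabled.
# Usage: create_tpu_pool CLUSTER REGION ZONE
create_tpu_pool() {
  gcloud container node-pools create "tpu-v6e-pool" \
    --project="$PROJECT_ID" --cluster="$1" --location="$2" \
    --node-locations="$3" \
    --machine-type=ct6e-standard-4t \
    --num-nodes=1 \
    --accelerator-network-profile=auto \
    --node-labels=cloud.google.com/gke-networking-dra-driver=true
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Run both functions once per region, choosing a zone where your TPU quota or reservation applies.&lt;/span&gt;&lt;/p&gt;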
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Establish the global mesh via Fleet Registration.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Register both GKE clusters to a unified GKE Fleet by following the&lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/how-to/creating-fleets"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;fleet creation and registration setup&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable Multi-Cluster Service Discovery and Multi-Cluster Ingress on your fleet.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Designate your primary region as the configuration hub to act as the control plane for routing rules across both regions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
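&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The fleet steps above could look something like this sketch. The membership names are hypothetical, and any extra IAM bindings that the linked fleet documentation requires are not shown.&lt;/span&gt;&lt;/p&gt;

```shell
# Hypothetical project ID; replace it with your own.
PROJECT_ID="${PROJECT_ID:-my-project}"

# Register one GKE cluster as a fleet membership.
# Usage: register_cluster MEMBERSHIP REGION CLUSTER
register_cluster() {
  gcloud container fleet memberships register "$1" \
    --project="$PROJECT_ID" \
    --gke-cluster="$2/$3" \
    --enable-workload-identity
}

# Enable Multi-Cluster Services and Multi-Cluster Ingress, using the
# primary region's membership as the configuration cluster.
# Usage: enable_multicluster CONFIG_REGION CONFIG_MEMBERSHIP
enable_multicluster() {
  gcloud container fleet multi-cluster-services enable \
    --project="$PROJECT_ID"
  gcloud container fleet ingress enable \
    --project="$PROJECT_ID" \
    --config-membership="projects/$PROJECT_ID/locations/$1/memberships/$2"
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Register both clusters first, then call &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;enable_multicluster&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with the region and membership of the cluster you picked as the configuration hub.&lt;/span&gt;&lt;/p&gt;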
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy the AI workload.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Use a temporary Kubernetes job to download the Gemma 3 (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemma-3-27b-it&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) model weights directly into your Cloud Storage bucket.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Define a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ResourceClaimTemplate&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; that explicitly requests the managed DRANET device class (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deviceClassName: netdev.google.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) with the allocation mode set to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;All&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: all-netdev
  namespace: default
spec:
  spec:
    devices:
      requests:
      - name: req-netdev
        exactly:
          deviceClassName: netdev.google.com
          allocationMode: All&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy your inference server (for example, vLLM) on the TPU nodes in both regions. Ensure the pod spec uses node selectors for the 2x2 TPU topology, requests exactly 4 TPUs, and references the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;netdev&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; claim, so that your pods use the dedicated accelerator networking alongside standard Ethernet.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
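&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the pod spec requirements above concrete, here is a hedged sketch of a Deployment manifest, printed by a small shell function so it can be piped to kubectl. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;app: gemma-server&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; label, port 8000, and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;all-netdev&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; template name come from the manifests in this post. The image, bucket, service account, and mount path are placeholders, the node selector label values are assumptions to check against the GKE TPU documentation, and the server arguments are left out entirely.&lt;/span&gt;&lt;/p&gt;

```shell
# Placeholders; replace them with your own values.
IMAGE="${IMAGE:-REPLACE_WITH_VLLM_TPU_IMAGE}"
BUCKET="${BUCKET:-my-model-bucket}"
KSA="${KSA:-gemma-ksa}"

# Print a Deployment manifest for the model server to stdout.
render_gemma_deployment() {
  printf '%s\n' \
    "apiVersion: apps/v1" \
    "kind: Deployment" \
    "metadata:" \
    "  name: gemma-server" \
    "  namespace: default" \
    "spec:" \
    "  replicas: 1" \
    "  selector:" \
    "    matchLabels:" \
    "      app: gemma-server" \
    "  template:" \
    "    metadata:" \
    "      labels:" \
    "        app: gemma-server" \
    "      annotations:" \
    "        gke-gcsfuse/volumes: \"true\"" \
    "    spec:" \
    "      serviceAccountName: $KSA" \
    "      nodeSelector:" \
    "        cloud.google.com/gke-tpu-accelerator: tpu-v6e-slice" \
    "        cloud.google.com/gke-tpu-topology: 2x2" \
    "      resourceClaims:" \
    "      - name: netdev" \
    "        resourceClaimTemplateName: all-netdev" \
    "      containers:" \
    "      - name: vllm" \
    "        image: $IMAGE" \
    "        ports:" \
    "        - containerPort: 8000" \
    "        resources:" \
    "          requests:" \
    "            google.com/tpu: \"4\"" \
    "          limits:" \
    "            google.com/tpu: \"4\"" \
    "        volumeMounts:" \
    "        - name: model-weights" \
    "          mountPath: /data" \
    "          readOnly: true" \
    "      volumes:" \
    "      - name: model-weights" \
    "        csi:" \
    "          driver: gcsfuse.csi.storage.gke.io" \
    "          readOnly: true" \
    "          volumeAttributes:" \
    "            bucketName: $BUCKET"
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You could apply it to each cluster with &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;render_gemma_deployment | kubectl --context YOUR_CONTEXT apply -f -&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. The claim is declared at the pod level here; whether a container also needs to list it under its resources depends on the driver, so treat that as something to verify.&lt;/span&gt;&lt;/p&gt;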
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Configure the Multi-Cluster Inference Gateway.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Install the necessary Custom Resource Definitions (CRDs) so Kubernetes can process specialized routing objects like the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferenceObjective&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy an &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;AutoscalingMetric&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to track hardware utilization, such as KV cache usage.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Use Helm to group the independent AI deployments from both regions into a single, logical &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferencePool&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy the Cross-Region Gateway and its associated &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;HTTPRoute&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to manage incoming global traffic.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Apply health checks and backend policies to the pool to ensure load balancing relies on your custom hardware metrics.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Configure an &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferenceObjective&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to instruct the gateway to route prompts to the region with the highest availability, avoiding overloaded TPUs.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: cross-region-gateway
  namespace: default
spec:
  gatewayClassName: gke-l7-cross-regional-internal-managed-mc
  addresses:
  - type: networking.gke.io/named-address-with-region
    value: &amp;quot;regions/europe-west4/addresses/gemma-gateway-ip-europe-west4&amp;quot;
  - type: networking.gke.io/named-address-with-region
    value: &amp;quot;regions/us-east5/addresses/gemma-gateway-ip-us-east5&amp;quot;
  listeners:
  - name: http
    protocol: HTTP
    port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: gemma-route
  namespace: default
spec:
  parentRefs:
  - name: cross-region-gateway
    kind: Gateway
  rules:
  - backendRefs:
    - group: networking.gke.io
      kind: GCPInferencePoolImport
      name: gemma-pool
      port: 8000
---
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: gemma-health-check
  namespace: default
spec:
  targetRef:
    group: networking.gke.io
    kind: GCPInferencePoolImport
    name: gemma-pool
  default:
    config:
      type: HTTP
      httpHealthCheck:
        requestPath: /health
        port: 8000
---
apiVersion: networking.gke.io/v1
kind: GCPBackendPolicy
metadata:
  name: gemma-backend-policy
  namespace: default
spec:
  targetRef:
    group: networking.gke.io
    kind: GCPInferencePoolImport
    name: gemma-pool
  default:
    timeoutSec: 100
    balancingMode: CUSTOM_METRICS
    trafficDuration: LONG
    customMetrics:
      - name: gke.named_metrics.tpu-cache
        dryRun: false
        maxUtilizationPercent: 60
---
apiVersion: autoscaling.gke.io/v1beta1
kind: AutoscalingMetric
metadata:
  name: tpu-cache
  namespace: default
spec:
  selector:
    matchLabels:
      app: gemma-server
  endpoints:
  - port: 8000
    path: /metrics
    metrics:
    - name: vllm:kv_cache_usage_perc
      exportName: tpu-cache
---
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: gemma-objective
  namespace: default
spec:
  priority: 10
  poolRef:
    name: gemma-pool
    group: &amp;quot;inference.networking.k8s.io&amp;quot;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;div&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="czag4-0-0"&gt;
&lt;h3 class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="czag4-0-0"&gt;&lt;span data-offset-key="czag4-0-0"&gt;Testing the Failover&lt;/span&gt;&lt;/h3&gt;
&lt;/div&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="9un4f-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9un4f-0-0"&gt;&lt;span data-offset-key="9un4f-0-0"&gt;Verify the highly available architecture by simulating an outage in the primary region. Once the primary deployment is taken offline, the Gateway detects the failure and reroutes subsequent user requests to the active secondary cluster, so the service stays available.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
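&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr"&gt;&lt;span&gt;One way to run this test is sketched below, assuming the model server runs as a Deployment named gemma-server in the default namespace and that you probe from a machine inside the VPC, since the Gateway is internal. The kubectl context name, the deployment name, and the /v1/models probe path (part of vLLM's OpenAI-compatible API) are assumptions; adjust them to your setup.&lt;/span&gt;&lt;/div&gt;

```shell
# Hypothetical names; replace them with your own values.
PRIMARY_CONTEXT="${PRIMARY_CONTEXT:-primary-cluster}"
DEPLOYMENT="${DEPLOYMENT:-gemma-server}"

# Take the primary deployment offline by scaling it to zero replicas.
simulate_primary_outage() {
  kubectl --context "$PRIMARY_CONTEXT" --namespace default \
    scale deployment "$DEPLOYMENT" --replicas=0
}

# Send COUNT requests to the Gateway and print one HTTP status per line.
# Usage: probe_gateway GATEWAY_IP [COUNT]
probe_gateway() {
  gateway_ip="$1"; count="${2:-5}"; i=0
  while [ "$i" -lt "$count" ]; do
    curl --silent --output /dev/null --max-time 30 \
      --write-out '%{http_code}\n' \
      "http://$gateway_ip/v1/models"
    i=$((i + 1))
  done
}
```

&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr"&gt;&lt;span&gt;Probe the Gateway before and after calling simulate_primary_outage and compare the status codes. Scale the deployment back up when you are done.&lt;/span&gt;&lt;/div&gt;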
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="ef2kc-0-0"&gt; &lt;/div&gt;
&lt;h2 class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="ef2kc-0-0"&gt;&lt;span data-offset-key="ef2kc-0-0"&gt;Next Steps&lt;/span&gt;&lt;/h2&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="1r2f1-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="1r2f1-0-0"&gt;&lt;span data-offset-key="1r2f1-0-0"&gt;For a hands-on codelab and more information on these features, review the following resources:&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;ul class="public-DraftStyleDefault-ul" data-offset-key="6fjff-0-0"&gt;
&lt;li class="Draftail-block--unordered-list-item public-DraftStyleDefault-unorderedListItem public-DraftStyleDefault-reset public-DraftStyleDefault-depth0 public-DraftStyleDefault-listLTR" data-block="true" data-editor="cl1on" data-offset-key="6fjff-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="6fjff-0-0"&gt;&lt;span data-offset-key="6fjff-0-0"&gt;Hands-on Codelab: &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://codelabs.developers.google.com/codelabs/gke-inference-gateway-multi-cluster-tpus-dranet" rel="noopener" role="button" target="_blank"&gt;&lt;span data-offset-key="6fjff-1-0"&gt;Build multi-cluster GKE Inference Gateway with TPUs, Cloud Storage FUSE and managed DRANET&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li class="Draftail-block--unordered-list-item public-DraftStyleDefault-unorderedListItem public-DraftStyleDefault-depth0 public-DraftStyleDefault-listLTR" data-block="true" data-editor="cl1on" data-offset-key="9ku8e-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9ku8e-0-0"&gt;&lt;span data-offset-key="9ku8e-0-0"&gt;Document set: &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/config-auto-net-for-accelerators" role="button"&gt;&lt;span data-offset-key="9ku8e-1-0"&gt;DRANET&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li class="Draftail-block--unordered-list-item public-DraftStyleDefault-unorderedListItem public-DraftStyleDefault-depth0 public-DraftStyleDefault-listLTR" data-block="true" data-editor="cl1on" data-offset-key="3fjdr-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="3fjdr-0-0"&gt;&lt;span data-offset-key="3fjdr-0-0"&gt;Documentation: &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://docs.cloud.google.com/ai-hypercomputer/docs/overview" role="button"&gt;&lt;span data-offset-key="3fjdr-1-0"&gt;AI Hypercomputer&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="f0ecg-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="f0ecg-0-0"&gt;&lt;span data-offset-key="f0ecg-0-0"&gt;Want to ask a question, find out more, or share a thought? Please connect with me on &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://www.linkedin.com/in/ammett/" rel="noopener" role="button" target="_blank"&gt;&lt;span data-offset-key="f0ecg-1-0"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span data-offset-key="f0ecg-2-0"&gt;.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/div&gt;</description><pubDate>Tue, 02 Jun 2026 07:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-tpus-gke-managed-dranet-and-multi-cluster-inference-gateway/</guid><category>Networking</category><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/0-hero-dra.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Experimenting with TPUs, GKE Managed DRANET, and Multi-cluster Inference Gateway</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/0-hero-dra.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-tpus-gke-managed-dranet-and-multi-cluster-inference-gateway/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ammett Williams</name><title>Developer Relations Engineer</title><department></department><company></company></author></item><item><title>Developer's guide to Gemini Enterprise and A2UI integration</title><link>https://cloud.google.com/blog/topics/developers-practitioners/guide-to-gemini-enterprise-and-a2ui-integration/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you've built a chatbot, you know this conversation:&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;User:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "Book a table for two tomorrow at 7pm." &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "Okay, for what day?" &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;User:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "Tomorrow." &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "What time?"&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A date picker would have ended this in one tap. But until recently, agents had no standard way to render a date picker — or a map, or a multi-select list — inside the chat surface they live in. In general, they could only return text or Markdown. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we're walking through how to fix that with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, an open protocol for agent-driven user interfaces, and how to integrate an A2UI-enabled agent with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Enterprise (GE)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; so your agent renders rich and interactive UI natively in the GE chat surface — and in your own custom frontend if you want one. We'll use a working restaurant-finder agent — built with the Google Agent Development Kit (ADK), the A2A protocol, and Gemini — as the reference. The full source is on &lt;/span&gt;&lt;a href="https://github.com/wadave/agent-a2ui-demo" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and there's a &lt;/span&gt;&lt;a href="https://youtu.be/_5AaYwyqVio" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;2-minute demo video.&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The problem: agents speak text, but users want UI&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Most agent frameworks today return strings. That's fine for short answers, but it breaks down quickly:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Multi-turn slot filling&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (date, time, party size) burns turns and patience.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Choices among options&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (which restaurant? which insurance plan?) become long bulleted lists the user has to copy-paste back.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Spatial information&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (locations, routes, floor plans) is reduced to addresses.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Developers have tried to patch this by sending HTML or JavaScript fragments, but that introduces real risks: cross-site scripting, UI injection from a remote agent you don't fully control, and visual drift from the host app's design system. What's needed is a way to transmit UI that's &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;safe like data and expressive like code&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;What A2UI is&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://a2ui.org/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;A2UI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is an open protocol, &lt;/span&gt;&lt;a href="https://developers.googleblog.com/introducing-a2ui-an-open-project-for-agent-driven-interfaces/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;introduced by Google&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and co-developed with the Flutter team and product teams behind Gemini Enterprise. Instead of returning text or HTML, an agent returns a JSON payload that describes a UI: a tree of &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;components&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (Card, Text, Button, ChoicePicker, Image, …) and a separate &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;data model&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; holding the values those components display.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Three properties make this useful in practice:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Declarative, not executable.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The payload is data. The client only renders components from a pre-approved &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;catalog&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, so a remote agent can't inject arbitrary code or steal credentials through a UI widget.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Streaming-friendly.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The format is a flat list of small JSON messages, so the LLM can emit them incrementally and the client can paint as they arrive.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Framework-agnostic.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The same agent response renders through Lit, Angular, Flutter, or native mobile. The agent doesn't know — or care — what's on the other end.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
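&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The "declarative, not executable" property can be sketched as a client-side allow-list check. This is a minimal illustration in Python, not the A2UI wire format: the nested dict shape and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;type&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;children&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; field names are assumptions made for the example.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
# Hypothetical allow-list; a real client would load its own catalog.
APPROVED_CATALOG = frozenset({"Card", "Text", "Button", "ChoicePicker", "Image"})


def unapproved_components(node, catalog=APPROVED_CATALOG):
    """Return the component types in the tree that the catalog does not list."""
    rejected = []
    if node.get("type") not in catalog:
        rejected.append(node.get("type"))
    for child in node.get("children", []):
        rejected.extend(unapproved_components(child, catalog))
    return rejected


payload = {
    "type": "Card",
    "children": [
        {"type": "Text", "text": "Pick a restaurant"},
        {"type": "Script", "src": "https://example.invalid/x.js"},
    ],
}

# Render only when nothing falls outside the catalog.
rejected = unapproved_components(payload)
print("render" if not rejected else f"refuse: {rejected}")
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because the check runs on plain data before anything is drawn, an unknown component type is refused rather than executed.&lt;/span&gt;&lt;/p&gt;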
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A2UI is also &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;transport-agnostic&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. The messages ride inside whatever pipe you already use: A2A JSON-RPC, AG-UI, WebSockets, SSE. In our reference implementation, A2UI rides inside the &lt;/span&gt;&lt;a href="https://a2aprotocol.ai/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;A2A protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; as &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;DataPart&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; objects with the MIME type &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;application/json+a2ui&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Where A2UI sits in the stack&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A2UI is one piece of a four-layer stack. Confusion usually comes from conflating these layers — they're each doing a different job:&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;div align="left"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;&lt;table&gt;&lt;colgroup&gt;&lt;col/&gt;&lt;col/&gt;&lt;col/&gt;&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope="col" style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p style="text-align: left;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Layer&lt;/strong&gt;&lt;/p&gt;
&lt;/th&gt;
&lt;th scope="col" style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p style="text-align: left;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Owns&lt;/strong&gt;&lt;/p&gt;
&lt;/th&gt;
&lt;th scope="col" style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p style="text-align: left;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Examples&lt;/strong&gt;&lt;/p&gt;
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;App experience&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Client shell and conversation state — chat window, input box, message history&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;CopilotKit, AG-UI&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Pixel drawing&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Turning component descriptions into actual rendered UI&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Lit, Flutter, Angular&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Conversation pipeline&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Client–server transport — sending messages, receiving responses&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A2A Protocol&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Cargo (data format)&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The thing flowing through the pipeline that &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;describes&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; the UI&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A2UI&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Read top to bottom: &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;CopilotKit/AG-UI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; owns the app experience. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Lit/Flutter/Angular&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; own the rendering. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;While &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;CopilotKit&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;AG-UI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; provide valuable abstractions, they remain strictly optional for implementing &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;In this architecture, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2A&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; serves as the underlying conversation pipeline, while &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; represents the structured cargo that actually traverses that pipe.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That separation is why the same A2UI payload renders identically in three very different deployment shapes:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Bespoke web app&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — a custom client shell (like the reference repo's Lit &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;frontend/&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;) plus a custom A2UI renderer.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;CopilotKit / AG-UI app&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — CopilotKit owns the chat shell, an A2UI renderer is registered inside it for rich cards.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Enterprise&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — GE &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;is&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; the shell, the renderer, and the transport client. You only build the agent.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;So for the GE path, the stack collapses to two layers you control: the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2A endpoint&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (your agent) and the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI cargo&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; it emits. The other two layers are GE's responsibility. CopilotKit and AG-UI are great if you're building a standalone product UI elsewhere — they're just out of scope for embedding an agent inside Gemini Enterprise.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Pattern revisions&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The protocol evolves quickly, and different clients support different revisions. Two patterns are common today:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Inline pattern&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — the agent sends a component tree with the data baked into each component (the pattern Gemini Enterprise renders today).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Decoupled pattern&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — the agent sends the component tree and the data model as separate messages, so subsequent turns can update one without re-sending the other. This reduces tokens and latency for long-running conversations and is the direction the protocol is heading.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
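&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the difference concrete, here is a small Python sketch of the two shapes. The message and field names (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;components&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dataModel&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;optionsPath&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) are hypothetical and chosen only to show the idea; they are not the names the A2UI spec uses.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
import json


def inline_tree(options):
    """Inline pattern: the values are baked into the component itself."""
    return {"type": "ChoicePicker", "label": "Restaurants", "options": list(options)}


def decoupled_messages(options):
    """Decoupled pattern: one message for the tree, one for the data model."""
    tree = {"type": "ChoicePicker", "label": "Restaurants", "optionsPath": "/restaurants"}
    data = {"path": "/restaurants", "value": list(options)}
    return [{"components": tree}, {"dataModel": data}]


def data_only_update(options):
    """A later decoupled turn: the tree is unchanged, so only data is re-sent."""
    return [{"dataModel": {"path": "/restaurants", "value": list(options)}}]


print(json.dumps(inline_tree(["Sushi", "Tacos"])))
print(json.dumps(decoupled_messages(["Sushi", "Tacos"])))
print(json.dumps(data_only_update(["Sushi", "Tacos", "Ramen"])))
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In the inline shape every change means re-sending the whole component; in the decoupled shape the component points at a path in the data model, so a follow-up turn can carry data alone.&lt;/span&gt;&lt;/p&gt;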
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The reference repo serves &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;both&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; patterns from one backend, picking which to emit per request based on the client's &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;X-A2A-Extensions&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; header. As new revisions ship, you add another catalog and the same negotiation pattern keeps working.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;How A2UI works inside Gemini Enterprise&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Gemini Enterprise ships with a built-in A2UI renderer. For the developer, that means the integration story is short:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Build your A2A agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, embedding an A2UI catalog and example payloads alongside the regular tool definitions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Register the agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with Gemini Enterprise as an A2A endpoint. (Use &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;make register-gemini-enterprise&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; in the reference repo.)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A GE admin shares the agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with employees, just like any other agent in the GE catalog.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At runtime, the flow looks like this:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The user types a request in the GE chat. GE calls your agent's A2A endpoint and sends along &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;GE's own A2UI catalog&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — the list of UI components GE knows how to render.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Your agent decides whether a UI widget is the right response. If yes, it emits an A2UI JSON message (e.g., a &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;ChoicePicker&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; of restaurant options). If no, it falls back to text. Both can coexist in the same response.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;GE receives the JSON, validates it against its catalog, and renders the widget natively in &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;GE's own design language&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — so it visually matches the rest of the chat surface.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;When the user interacts with the widget (selects three options, picks a date), GE serializes the interaction back into JSON and sends it to your agent as the next turn. Your agent processes structured input, not free-form text.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
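&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The runtime flow above can be sketched as a single turn handler. This is a simplified Python illustration with made-up dict shapes (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;kind&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;catalog&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;selected&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;); a real agent would use its framework's request and response types, and an LLM would make the widget-or-text decision.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
def handle_turn(turn):
    """Return the parts for one reply: text, a widget, or both."""
    # Follow-up turn: the client sent back a structured selection, not prose.
    if turn.get("kind") == "ui_event":
        chosen = ", ".join(turn["selected"])
        return [{"kind": "text", "text": f"Checking availability for: {chosen}"}]

    # First turn: offer a picker only if the client's catalog can render it.
    if "ChoicePicker" in turn.get("catalog", []):
        widget = {"type": "ChoicePicker", "options": ["Sushi", "Tacos", "Ramen"]}
        return [
            {"kind": "text", "text": "Here are some places nearby:"},
            {"kind": "a2ui", "payload": widget},
        ]

    # Fallback: the client cannot render the widget, so answer in text.
    return [{"kind": "text", "text": "Options: Sushi, Tacos, Ramen."}]


reply = handle_turn({"kind": "message", "catalog": ["Text", "ChoicePicker"]})
print([part["kind"] for part in reply])
```
&lt;/pre&gt;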
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One thing worth flagging: because your agent doesn't ship its own renderer for GE, you don't need to choose a frontend framework to start. Your A2A endpoint can run anywhere — Cloud Run, GKE, on-prem — and GE handles the rendering.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;High-level architecture example&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The reference implementation is an ADK backend on Cloud Run designed to plug seamlessly into Gemini Enterprise.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1-overview.max-1000x1000.jpg"
        
          alt="1-overview"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Gemini Enterprise connects directly to your agent using standard A2A JSON-RPC calls.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The agent serves the inline message pattern expected by the Gemini Enterprise managed UI.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Custom components like &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;GoogleMap&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; render via Google Maps Embed iframes, with the API key injected server-side so the LLM never sees it.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
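&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One way to keep the key out of the model's view is to add it in a post-processing step on the server. The Python sketch below assumes a hypothetical &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GoogleMap&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; component dict with a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;query&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; field and a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;MAPS_API_KEY&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; environment variable; neither name comes from the reference repo.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
import os
from urllib.parse import urlencode

# Base URL of the Maps Embed API "place" mode.
EMBED_BASE = "https://www.google.com/maps/embed/v1/place"


def inject_map_key(component, api_key):
    """Return a copy of a GoogleMap component with a keyed embed URL attached."""
    if component.get("type") != "GoogleMap":
        return component
    query = urlencode({"key": api_key, "q": component["query"]})
    keyed = dict(component)  # copy, so the model-facing dict is never mutated
    keyed["src"] = f"{EMBED_BASE}?{query}"
    return keyed


# The model only ever produces the query; the key comes from the server's
# environment (hypothetical variable name, with a dummy fallback for the demo).
from_model = {"type": "GoogleMap", "query": "ramen near Union Square"}
served = inject_map_key(from_model, os.environ.get("MAPS_API_KEY", "demo-key"))
print(sorted(served))
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The key exists only in the outgoing payload, so it never appears in the prompt, the model output, or the conversation history.&lt;/span&gt;&lt;/p&gt;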
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The following demonstration illustrates how Google Maps functions as a live, interactive component within Gemini Enterprise rather than a static image. Leveraging A2UI's streaming-friendly architecture, the agent updates the map view in real-time—dropping pins and adjusting coordinates incrementally as results arrive from the Maps API.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2-maps-ge.max-1000x1000.png"
        
          alt="2-maps-ge"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;See it running, then build your own&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Detailed implementation guide&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://github.com/wadave/agent-a2ui-demo/blob/main/docs/implementation_details_guide.md" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here.&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Demo video&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (2 minutes, end-to-end with both the Lit shell and Gemini Enterprise): &lt;/span&gt;&lt;a href="https://youtu.be/_5AaYwyqVio" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://youtu.be/_5AaYwyqVio&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI spec and component reference&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;a href="https://a2ui.org/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;a2ui.org&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Enterprise updates&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, including the A2UI renderer: &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/whats-new-in-gemini-enterprise"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;What's new in Gemini Enterprise&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI generative UI announcement&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;a href="https://developers.googleblog.com/a2ui-v0-9-generative-ui/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Introducing A2UI generative UI&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you're already building agents on Google Cloud, the fastest path is to clone the reference repo, run &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;make local-backend&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; for a local smoke test, and then &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;make register-gemini-enterprise&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; to wire it into GE. From there, swap in your own catalog, your own tools, and your own domain. The next time a user asks your agent for "a table for two tomorrow at 7pm," the answer can be a date picker instead of another question.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 29 May 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/guide-to-gemini-enterprise-and-a2ui-integration/</guid><category>AI &amp; Machine Learning</category><category>Developers &amp; Practitioners</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Developer's guide to Gemini Enterprise and A2UI integration</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/guide-to-gemini-enterprise-and-a2ui-integration/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Dave Wang</name><title>Forward Deployed Engineer, Google Cloud</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Yuan Tian</name><title>Software Engineer, Google Cloud AI</title><department></department><company></company></author></item><item><title>A Guide to AI Cold Starts on Cloud Run</title><link>https://cloud.google.com/blog/topics/developers-practitioners/a-guide-to-ai-cold-starts-on-cloud-run/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I saw a 
developer asking on Reddit if there &lt;/span&gt;&lt;a href="https://www.reddit.com/r/googlecloud/comments/1s8yzn1/is_there_a_sane_way_to_manage_cloud_run_cold/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;was any “sane way” to manage Cloud Run cold starts for AI across multiple regions.&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; They were experiencing startup latencies of up to 20 seconds, a frustrating gap where the infrastructure is spinning up while the user waits for a response.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The discussion was full of developers who had almost given up on serverless GPUs, with some even migrating back to GKE just to escape the latency. I decided it was time to dive deep into the Mechanics of AI Cold Starts and see if we could find that "sane way."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;During my research into &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/tutorials/gpu-gemma-with-ollama"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;hosting models like Gemma 4 on Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, I had the privilege of co-presenting at Google Cloud Next '26 with Oded Shahar (Senior Engineering Manager for Cloud Run) and our guest speaker Ajay Nair (Global VP of Platform at Elastic). &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In our session,&lt;/span&gt; "Build AI architectures with custom models on Cloud Run&lt;span style="vertical-align: baseline;"&gt;," Ajay shared the production-hardened strategies that allow Elastic to serve millions of daily requests across 17+ model variants, all while maintaining the 'scale-to-zero' efficiency of Cloud Run. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=7L5gQHcinzE"
      data-glue-modal-trigger="uni-modal-7L5gQHcinzE-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        &lt;img src="//img.youtube.com/vi/7L5gQHcinzE/maxresdefault.jpg"
             alt="A YouTube video that discusses the production-hardened strategies that allow Elastic to serve millions of daily requests across 17+ model variants, all while maintaining the &amp;#x27;scale-to-zero&amp;#x27; efficiency of Cloud Run."/&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
      &lt;figcaption class="article-video__caption h-c-page"&gt;
        
          &lt;h4 class="h-c-headline h-c-headline--four h-u-font-weight-medium h-u-mt-std"&gt;Build AI architectures with custom models on Cloud Run&lt;/h4&gt;
        
        
      &lt;/figcaption&gt;
    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-7L5gQHcinzE-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="7L5gQHcinzE"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=7L5gQHcinzE"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ajay showed us that the secret isn't just in the model, but in treating GPUs as fungible compute rather than infrastructure to manage.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I realized then that minimizing cold start latency isn't just about the model, it's about the infrastructure patterns and architectural decisions that keep it fast, scalable, and secure.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The anatomy of an AI cold start&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu-best-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;official Google Cloud GPU best practices&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; explain, an AI cold start is a shift from standard web microservices. You aren't just booting code, you're moving gigabytes of weights into a specialized physical accelerator.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Think of it as a four-phase race. If you don't optimize each step, you're going to lose your users.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 1: Infrastructure Provisioning (~5s)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Run allocates the physical GPU and injects pre-installed NVIDIA drivers. Since Google manages the drivers for you, you don't have to bloat your Dockerfile.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 2: Block-Level Container Image Streaming (1-2s)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Run uses "image streaming," meaning it pulls only the blocks needed to boot. Your 15GB CUDA image can actually start as fast as a tiny Node.js app!&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 3: Engine Initialization (5-15s)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is where your inference engine (vLLM, Ollama) warms up. This is a massive CPU-heavy task, and it's where most people get throttled without realizing it.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 4: Model Loading &amp;amp; VRAM Transfer &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is the final hurdle - moving those model weights from storage into the GPU memory. Unlike standard web apps where CPU is king, GPU memory is your primary constraint here. If your &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/decoding-high-bandwidth-memory-a-practical-guide-to-gpu-memory-for-fine-tuning-ai-models/?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;model’s weights don’t fit entirely within the GPU memory&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, performance degrades significantly as it swaps to slower system RAM.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Best practices to handling AI cold starts&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To build a "sane" production environment, here are a few crucial levers you can pull, informed by the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu-best-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;official Google Cloud documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; on AI inference with GPUs.&lt;/span&gt;&lt;/p&gt;
&lt;h2 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Optimize Phase 4&lt;/span&gt;&lt;/h2&gt;
&lt;h3 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Pick the Right Deployment Option&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 4 is the "final hurdle" where you move gigabytes of weights from storage into GPU memory. Your &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu-best-practices#loading-storing-models-tradeoff"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;choice of storage&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; determines how fast this transfer happens:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Storage (Concurrent Download) - Fastest:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Using the Google Cloud CLI &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;(&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud storage cp&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; allows you to download model files in parallel. This is the &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/codelabs/cloud-run/cloud-run-gpu-rtx-pro-6000?content_ref=can%20complete%20the%20steps%20within%20limited%20storage%20environments%20like%20cloud%20shell%20this%20codelab%20demonstrates%20how%20to%20load%20the%20model%20concurrently%20from%20cloud%20storage%20during%20container%20startup#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;recommended method&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for massive weights because it maximizes network throughput and drastically reduces transfer time. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Storage (FUSE) - Easiest:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This requires zero code changes because it mounts a bucket as a local file system. However, because it does not parallelize the initial download, it is significantly slower for large model weights.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Container Image - Best for &amp;lt;10GB: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Baking weights into your image is efficient for smaller models thanks to Cloud Run's Image Streaming. For models over 10GB, however, the import and streaming overhead can become a bottleneck.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Internet: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Avoid this. It is the slowest and least predictable path for production inference.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
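&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the gap between parallel and sequential transfer concrete, here is a minimal Python sketch that fans per-file downloads out across a thread pool. It only illustrates the idea and is not how &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud storage cp&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is implemented. The names &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;download_all&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;fetch_one&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are hypothetical; in a real container you would pass a function that copies one object from your bucket to local disk.&lt;/span&gt;&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def download_all(
    names: Iterable[str],
    fetch_one: Callable[[str], int],
    max_workers: int = 8,
) -> Dict[str, int]:
    """Run fetch_one for every name in parallel; return bytes fetched per name.

    fetch_one is a hypothetical callable that downloads a single file and
    returns its size in bytes. Any exception it raises propagates to the caller.
    """
    name_list = list(names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each name with its result.
        sizes = list(pool.map(fetch_one, name_list))
    return dict(zip(name_list, sizes))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Threads suit this workload because each download spends most of its time waiting on the network, so several shards can be in flight at once.&lt;/span&gt;&lt;/p&gt;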
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Model Format &amp;amp; Size&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Optimizing your model's format and size is a direct "hack" to shorten Phase 4 (Model Loading &amp;amp; VRAM Transfer). Because this phase is constrained by how fast you can move gigabytes of data into VRAM, smaller and more efficient files are critical.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;4-bit Quantization: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;This is the ultimate cold start hack. Smaller weights mean fewer gigabytes to pull from storage, which directly accelerates the download and transfer portion of Phase 4.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Fast Formats: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Pick a model format with fast load times like &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;GGUF&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to minimize startup time. For the fastest performance, move away from Python "pickle" files and use Safetensors for zero-copy loading.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Ensure VRAM Fit: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Use quantized models to ensure the weights fit entirely within the GPU memory. If the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/decoding-high-bandwidth-memory-a-practical-guide-to-gpu-memory-for-fine-tuning-ai-models/?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;model exceeds VRAM&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, Phase 4 will stall as the system swaps to significantly slower RAM. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
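&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough back-of-the-envelope check on how much quantization shrinks Phase 4, the sketch below estimates weight size from parameter count and bits per weight. The function name and the 8-billion-parameter figure are illustrative assumptions, and the estimate ignores format metadata and quantization scales, so real files will be somewhat larger.&lt;/span&gt;&lt;/p&gt;

```python
def estimate_weight_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights in gigabytes (10**9 bytes).

    Ignores file metadata, quantization scales, and tensors kept at higher
    precision, so treat the result as a lower bound.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1e9


# Hypothetical 8-billion-parameter model at 16-bit and at 4-bit precision.
fp16_gb = estimate_weight_gb(8e9, 16)
int4_gb = estimate_weight_gb(8e9, 4)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Under these assumptions the 16-bit weights come to about 16 GB and the 4-bit weights to about 4 GB, which is four times less data to download and copy into VRAM.&lt;/span&gt;&lt;/p&gt;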
&lt;h3 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Optimize Phases 3 &amp;amp; 4: Infrastructure &amp;amp; Network Levers&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These infrastructure settings provide the necessary resources to accelerate the most demanding parts of the startup process.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/cpu#startup-boost"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Startup CPU Boost (Accelerates Phase 3) &lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This feature temporarily doubles your CPU power during startup. A 1 vCPU instance boosts to 2 vCPUs for the duration of startup and the first 10 seconds of serving. It is essential for Phase 3, as engine initialization is a massive CPU-heavy task.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/networking-best-practices?content_ref=for%20the%20best%20networking%20performance%20for%20cloud%20run%20services%20use%20the%20second%20generation%20execution%20environment%20when%20routing%20traffic%20with%20direct%20vpc%20egress#direct-vpc-throughput"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Direct VPC Egress &amp;amp; PGA (Accelerates Phase 4)&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Utilizing&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; Direct VPC Egress&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Private Google Access (PGA)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; ensures your model weight traffic stays on Google’s internal high-speed backbone. This optimizes the network path to shorten the time spent moving gigabytes of weights into VRAM.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Concurrency Tuning (Cold Start Avoidance)&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In Cloud Run, "&lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/about-instance-autoscaling?content_ref=request%20concurrency%20calculates%20the%20number%20of%20instances%20by%20averaging%20the%20request%20concurrency%20per%20second%20over%20a%201%20minute%20and%2010%20minute%20period%20and%20divides%20this%20by%20the%20maximum%20concurrency"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;concurrency&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;" refers to the maximum number of requests a single instance can handle before the platform scales out to start a new one. For AI workloads, you must tune this setting in tandem with your model engine's internal parallelism flags (e.g., &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--max-num-seqs &lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for vLLM or &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;OLLAMA_NUM_PARALLEL&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for Ollama). &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Use the official &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu-best-practices#max-concurrent-requests"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud formula&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to find your ideal Cloud Run concurrency:&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: center;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;(number of model instances × parallel queries per model) + (number of model instances × ideal batch size)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Example: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;If your instance loads 3 model instances onto the GPU, and each model instance can handle 4 parallel queries with an ideal batch size of 4, you would set your Cloud Run maximum concurrent requests to 24: &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;(3×4)+(3×4) = 12+12 = 24.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;How the math works: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;The goal is to keep the GPU fully saturated while ensuring users aren't stuck in a long queue. In this example, the total of 24 concurrent requests is split into two functional groups:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Active Processing (12 requests): &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Calculated as (3 instances×4 queries), this represents the total number of requests the GPU can actively process at any given moment.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The "Next Batch" Buffer (12 requests): &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Calculated as (3 instances×4 batch size), these are the requests waiting "on deck" inside the container. As soon as the GPU finishes the first batch, it immediately picks up these waiting requests.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By tuning this value as high as your VRAM allows (usually 10-20 users), one warm instance can serve many requests without triggering a new scale-out event and the cold start that comes with it.&lt;/span&gt;&lt;/p&gt;
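&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The formula is simple enough to keep next to your deployment config. The following Python sketch encodes it directly; the function and parameter names are illustrative choices, not part of any Cloud Run API.&lt;/span&gt;&lt;/p&gt;

```python
def max_concurrent_requests(
    model_instances: int,
    parallel_queries_per_model: int,
    ideal_batch_size: int,
) -> int:
    """Cloud Run concurrency from the formula above.

    active:   requests the GPU processes at the same moment
    buffered: requests waiting inside the container for the next batch
    """
    active = model_instances * parallel_queries_per_model
    buffered = model_instances * ideal_batch_size
    return active + buffered


# The worked example from the text: 3 model instances, 4 parallel queries
# per model, ideal batch size of 4.
example = max_concurrent_requests(3, 4, 4)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For the worked example this returns 24, split evenly between active and buffered requests.&lt;/span&gt;&lt;/p&gt;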
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Scaling Controls (Tuning the Threshold)&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While the formula above defines your maximum capacity, you can also tune when Cloud Run decides to start the next instance. Cloud Run's autoscaler typically targets 60% utilization, but for long-running AI cold starts, you can increase this threshold to 80% or 90% via &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/scaling-controls"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Scaling Controls&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Concurrency Target&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Increasing this allows you to "pack" more requests into a single warm instance before triggering a scale-out.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;CPU Target&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Increasing the CPU target prevents the platform from starting a new instance just because initialization or high-intensity inference spiked the CPU utilization.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Scaling &amp;amp; Reliability Strategies&lt;/span&gt;&lt;/h2&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/Gemini_Generated_Image_nc8hjhnc8hjhnc8h.max-1000x1000.png"
        
          alt="Gemini_Generated_Image_nc8hjhnc8hjhnc8h"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="znx9k"&gt;Sometimes the best way to handle a cold start is to avoid it entirely or manage it proactively.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The Single-Region "Always-On" Tradeoff&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you are deploying globally, the cost of keeping minimum instances set to 1 in every region adds up. Instead, consider an 'always-on' service in just one region. A 100ms global network delay is a much better user experience than a 20s local cold start.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;The 15-Minute Grace Period: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;A common question is 'How long will my instance stay warm after a request?' Cloud Run generally keeps instances alive for &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;15 minutes &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;after they become idle (processing zero requests). If your traffic is predictable and comes in every 10–12 minutes, you might not even need an 'always-on' service: the platform’s default shutdown policy will keep a warm instance ready for your next user.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;Note&lt;/strong&gt;: While this idle time is "free" for standard request-based services, remember that GPU services require instance-based billing, so you will be billed for the duration the instance remains warm between requests.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The "Wake-Up Call" Strategy &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Sometimes the best way to handle a cold start is to proactively mask it. If your UI can predict an upcoming request (for example, when a user clicks "New Chat" or begins hovering over a text area), you can send a lightweight health check to your service immediately. By the time the user finishes typing their prompt, the first two phases of the cold start (Infrastructure Provisioning and Container Image Streaming) are already finished in the background. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Pro-Tip: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Use &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Non-Inference Endpoints&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. To make this "wake-up call" as fast as possible, call one of them rather than sending a dummy prompt like "hi". &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Why it’s faster:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Non-inference endpoints (like &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;/v1/models&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for vLLM or &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;/api/tags &lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;for Ollama) are handled by the container’s web server the moment it starts. They don’t have to wait for the slow "Phase 4" model loading and VRAM transfer to complete before sending a success response.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;No Chat Pollution: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Because these endpoints don't trigger the model's completion logic, they won't interfere with the user's actual chat history or accidentally trigger session creation in your backend.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Recommended Endpoints:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;vLLM: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GET /health&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GET /v1/models&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Ollama: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GET /api/tags&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GET /api/version&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
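&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A wake-up call can be a few lines of code. The Python sketch below assumes the call is sent from a backend or script (a browser UI would do the equivalent with its own HTTP client) and that the service either allows unauthenticated requests or you add your own auth headers. The helper names and the base URL are hypothetical; the paths are the ones listed above.&lt;/span&gt;&lt;/p&gt;

```python
import urllib.request

# Non-inference endpoints named in this article, keyed by engine.
WARMUP_PATHS = {"vllm": "/health", "ollama": "/api/version"}


def warmup_url(base_url: str, engine: str) -> str:
    """Build the wake-up URL; raises KeyError for an unknown engine."""
    return base_url.rstrip("/") + WARMUP_PATHS[engine]


def send_wakeup(base_url: str, engine: str, timeout_s: float = 2.0) -> bool:
    """Send a GET to the warm-up endpoint; return True if it answered 200.

    A timeout or connection error returns False. After a timeout the request
    may still have reached the platform and started an instance.
    """
    try:
        url = warmup_url(base_url, engine)
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The short timeout is deliberate: the caller only needs to trigger the startup, not wait for it to finish.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;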
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Tune Startup Probes for VRAM &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI models take significant time to move gigabytes of weights from storage into GPU memory (Phase 4). If your startup check fails too many times, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/healthchecks?content_ref=prevents+the+containers+from+being+shut+down+prematurely+before+the+containers+are+up+and+running"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run will assume your container is broken and kill it&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To prevent this:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Increase the Failure Threshold&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Use a high &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;failureThreshold&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (e.g., 60 or more). Since the total allowed startup time is &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;failureThreshold × periodSeconds&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, a threshold of 60 with a 5-second period gives your model a 300-second (5-minute) window to load.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Utilize the 30-Minute Maximum&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: While standard services are limited to 4 minutes, Cloud Run supports a total startup time of up to 30 minutes (1,800 seconds) for intensive workloads.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Avoid False Positives (The Ollama Fix)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Be careful with engines like Ollama, which may open a TCP port as soon as the service starts, but &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;before&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; the model is actually in VRAM. Always ensure you are &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;preloading models&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; during the container's entrypoint script to ensure the startup probe only passes once the model is truly ready for inference.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
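&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The probe arithmetic from step 1 can be sketched in Python as follows. The function names are illustrative, and the load time you feed in should come from your own measurements of Phase 4.&lt;/span&gt;&lt;/p&gt;

```python
import math


def startup_window_seconds(failure_threshold: int, period_seconds: int) -> int:
    """Total time the startup probe allows before the container is killed."""
    return failure_threshold * period_seconds


def failure_threshold_for(load_seconds: float, period_seconds: int) -> int:
    """Smallest failureThreshold whose window covers load_seconds."""
    return math.ceil(load_seconds / period_seconds)


# The example from the text: 60 failures at a 5-second period.
window = startup_window_seconds(60, 5)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Here &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;window&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is 300 seconds. Working backwards from a measured load time, it is worth adding headroom before rounding up, since load times vary between cold starts.&lt;/span&gt;&lt;/p&gt;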
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Lessons from Elastic’s strategy&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In our NEXT ’26 session, Ajay Nair highlighted three architectural decisions that allowed Elastic to treat GPUs as fungible compute, rather than infrastructure to manage:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Bypass the Compilation Tax: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;By setting &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;enforce_eager=True&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in vLLM, they traded a tiny bit of throughput for cold starts that finish in less than a minute rather than multiple minutes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Standalone Checkpoints: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;They avoided the latency of runtime adapter-switching by pre-merging each LoRA variant into a standalone checkpoint.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;One Workload, One Service:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Each independently-scalable workload — defined by model, task adapter, and traffic shape — is deployed as its own Cloud Run service. This produces 30+ services across ~15 model families, with some models split by task (e.g., v5 retrieval vs. clustering) or by query/passage role.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to get started?&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Optimizing the cold start process&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; is the difference between a hobby project and a production-ready application. The best part? Cloud Run handles the NVIDIA driver and CUDA installation for you, starting the instance in about 5 seconds.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For a deeper dive, the official documentation is your best friend:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu-best-practices"&gt;Best practices: AI inference on Cloud Run with GPUs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu"&gt;Configure GPU for Cloud Run services&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/cpu#startup-boost" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Startup CPU boost for Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For the full technical breakdown, I highly recommend watching the recording of the &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=7L5gQHcinzE" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;session&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; from Google Cloud Next '26. It provides the most comprehensive blueprint for hosting high-performance open models on serverless infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Happy building!&lt;/span&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;Special thanks to Sara Ford and Shane Ouchi from the Cloud Run team and to Zac Li from Elastic for the helpful review and feedback on this article.&lt;/span&gt;&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 27 May 2026 17:23:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/a-guide-to-ai-cold-starts-on-cloud-run/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/cold_start.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>A Guide to AI Cold Starts on Cloud Run</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/cold_start.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/a-guide-to-ai-cold-starts-on-cloud-run/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Shir Meir Lador</name><title>Head of AI Engineering, Google Cloud Developer Relations</title><department></department><company></company></author></item><item><title>Shipping features to production just got easier with new feature flags in AppLifecycle Manager</title><link>https://cloud.google.com/blog/products/application-development/new-feature-flags-in-applifecycle-manager/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Many development teams are familiar with the hesitation that comes right before pushing a new feature live. As AI helps developers write code faster, the gap between rapid code generation and safe production deployment continues to grow.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Feature flags offer a practical way to manage this risk by separating the act of deploying code from the act of releasing a feature to users. Instead of a single, high-risk launch event that affects all users simultaneously, teams can ship code to production in a controlled manner, with new features hidden by default.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To help teams adopt this workflow, we are announcing the public preview of AppLifecycle Manager Feature Flags (ALM FF). This service provides a rule-based solution to manage software behavior across Google Cloud, helping you support rapid development without sacrificing production stability.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Read on to learn four ways these feature flags will help accelerate your deployment.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;1. Decouple for safety and velocity&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The core mission of ALM FF is to increase development velocity by decoupling your feature releases from your code deployments. Traditionally, releasing a feature required a binary deployment, a high-risk event that affected all users simultaneously.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With ALM FF, you can ship code to production with new features disabled by default. This allows your team to move faster, deploying code continuously while choosing the exact moment to enable a feature via a toggle. If an issue is detected, the flag acts as an instant kill switch, disabling the problematic feature immediately without the need for a full, time-consuming code rollback.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_YPP5fhI.max-1000x1000.png"
        
          alt="1"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;2. Gradual enablement with precise targeting&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Safety is about precision. ALM FF leverages the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Common Expression Language (CEL)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to implement sophisticated logic for gradual feature enablement.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Percentage feature enablement:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Instead of a global launch, you can ramp up a feature to 1%, 5%, or 50% of your traffic. This allows you to monitor system health and performance metrics incrementally, ensuring stability before reaching your entire user base.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Precise allowlisting:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You can target specific internal teams, beta testers, or early-access customers by allowlisting their identifiers. This ensures that only the intended audience sees a feature during its initial validation phase.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
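&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To show what percentage enablement means in practice, here is a generic Python sketch of deterministic hash bucketing, a common way to implement percentage rollouts. It is not the ALM FF, CEL, or flagd implementation; the function name, flag key, and user identifiers are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
import hashlib


def in_rollout(flag_key: str, user_id: str, percent: int) -> bool:
    """Return True if user_id falls inside the first `percent` of 100 buckets.

    Hashing flag_key together with user_id keeps each user's answer stable
    across calls and makes bucket assignment differ from flag to flag.
    """
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    # range(percent) is empty at 0 and covers every bucket at 100.
    return bucket in range(percent)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because a user's bucket never changes, anyone enabled at 5% stays enabled when the rollout widens to 50%, so users do not flip back and forth as the percentage grows.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;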
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_AT9I4Zt.max-1000x1000.png"
        
          alt="2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;3. Dynamic configuration for the AI era&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Beyond simple toggles, ALM FF offers a dynamic way to inject configuration into your applications. By using string-type flags, you can update application behavior, such as system prompts for LLM integrations, in real time. This allows product managers and business owners to tweak AI responses and application logic without requiring any code changes or infrastructure rollouts.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
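&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When a prompt comes from a flag, the application should still behave sensibly if the flag is missing or malformed. The Python sketch below shows one defensive pattern, assuming the flag values have already been fetched into a plain dictionary. The flag key, the default prompt, and the length limit are hypothetical, and the code does not call any ALM FF or OpenFeature API.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
DEFAULT_SYSTEM_PROMPT = "You are a concise, helpful assistant."
MAX_PROMPT_CHARS = 4000  # Arbitrary guard against an accidental oversized value.


def resolve_system_prompt(flag_values, flag_key="assistant-system-prompt"):
    """Return the prompt stored in a string flag, or the built-in default."""
    value = flag_values.get(flag_key)
    if not isinstance(value, str):
        return DEFAULT_SYSTEM_PROMPT
    value = value.strip()
    if not value or len(value) > MAX_PROMPT_CHARS:
        return DEFAULT_SYSTEM_PROMPT
    return value


if __name__ == "__main__":
    print(resolve_system_prompt({"assistant-system-prompt": "Answer in one sentence."}))
    print(resolve_system_prompt({}))
```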
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3_7li94CT.max-1000x1000.png"
        
          alt="3"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;4. Built on open standards&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We believe safety should not mean lock-in. ALM FF is built on the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;OpenFeature&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; standard, utilizing industry-standard SDKs and the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;flagd&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; evaluation engine. This ensures your feature management patterns are portable and follow best practices without adding Google-specific dependencies to your core application code.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Get started&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;ALM FF is now in public preview. To take control of your releases, you can:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Review the docs:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/saas-runtime/docs/flags/flags-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Public Documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Onboard today:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/saas-runtime/docs/flags/flags-quickstart"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Quickstart Guide&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Give us feedback:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Help us &lt;/span&gt;&lt;a href="https://forms.gle/boGXCgKyoB7Lr6yd9" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;shape the future of feature management&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Thu, 21 May 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/application-development/new-feature-flags-in-applifecycle-manager/</guid><category>Developers &amp; Practitioners</category><category>Application Development</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Shipping features to production just got easier with new feature flags in AppLifecycle Manager</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/application-development/new-feature-flags-in-applifecycle-manager/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Erol-Valeriu Chioasca</name><title>Product Manager, Google Cloud</title><department></department><company></company></author></item><item><title>Securing Your Gemini and Google API Keys</title><link>https://cloud.google.com/blog/topics/developers-practitioners/api-keys-are-open-secrets/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, AI services rely heavily on API keys. To run AI agents, users provide API keys that signify paid tokens, subscriptions, or paid accounts. While API keys are easy to use, it is just as easy to use them unsafely. The result of a hijacked key is a compromised environment that is misused or abused by perpetrators.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I decided to write this blog post after seeing a thread in the r/googlecloud subreddit asking for a tutorial so users can go and protect themselves. &lt;strong&gt;In this post, you will find a few simple steps you can take to reduce your risks and improve the security of API keys created by Google&lt;/strong&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You use Google API keys to access Gemini and other Google AI products as well as Google Cloud APIs. In fact, a Gemini API key is a standard Google API key behind the scenes. While I focus on Google API key security here, you can apply some of these recommendations to API keys and product tokens created elsewhere.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 1: Generate a New API Key&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Regardless of where you start, you end up creating a new API key in one of your Google Cloud projects. You will probably use &lt;/span&gt;&lt;a href="https://console.cloud.google.com/apis/credentials"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Credentials&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; under the "APIs &amp;amp; Services" menu in the Cloud console.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/api_services_credentials.max-1000x1000.png"
        
          alt="api_services_credentials"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Alternatively, you can use the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud services api-keys create&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/gcloud/reference/services/api-keys/create"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;command&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, or another interface that creates a new Google Cloud API key. Whichever path and interface you choose, do the following:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create the key in a standalone project that is not used for any other purpose.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Restrict API access and client applications for the new API key.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These steps limit the potential reach of the key and greatly simplify troubleshooting activities &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;if something goes wrong&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;API Restrictions&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;API restrictions define exactly which services can be accessed using a specific API key. To keep your environment secure, always limit this list to the absolute minimum set of services required. While the Google Cloud console now prevents the creation of entirely unrestricted keys, it can still be tempting to add extra APIs to "future-proof" or speed up development. However, we strongly advise against this. By strictly adhering to the principle of least privilege, you significantly reduce the potential damage (or "blast radius") if a key is ever accidentally exposed or hijacked.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It is also important to audit keys generated automatically through integrated developer tools. For example, creating an API key in Firebase restricts the use to 24 APIs including Datastore, Firestore, Cloud SQL Admin and others.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/api_key_restrictions.max-1000x1000.png"
        
          alt="api_key_restrictions"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you use Firebase only to host your website, you will probably not use most of them. When you create an API key to use with AI Studio, restrict it to "Gemini API" only.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Attention points:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;If you search for an API that you want to select but it is missing, this API is probably not enabled in the Google Cloud project that you use. Go to the &lt;/span&gt;&lt;a href="https://console.cloud.google.com/apis/library"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;API Library&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in your Cloud console, find the API by name and enable it first.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;You can perform all of these actions using the Cloud console or the gcloud CLI. Other interfaces (e.g. Firebase) may not give you access to all API key parameters.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Application Restrictions&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Just as API restrictions limit which services your key can be used for, application restrictions limit which applications can use the key. For example, if you create an API key only for use with Google AI Studio, restricting it to the website "&lt;/span&gt;&lt;a href="https://aistudio.google.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://aistudio.google.com/&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;" prevents automations that call Gemini from using your key to consume a high volume of tokens.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can set up one or more restrictions of one of the following types:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Website&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;/&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Web application&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; using the list of URLs&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Services&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; using the list of IPv4 or IPv6 addresses or subnets&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;iOS applications&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; using the list of Bundle IDs&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Android applications&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; using the list of pairs of the package name and certificate fingerprint&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Note that you can restrict the key to a single application type only. Create a designated API key for each application type. Having a key per application type helps when observing the key usage and investigating potentially compromised keys.&lt;/span&gt;&lt;/p&gt;
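&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you have many keys, checking both kinds of restrictions by hand gets tedious. Below is a minimal Python sketch that audits one key description against an expected list of services. It assumes the key is a dictionary shaped like the JSON printed by &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud services api-keys list --format=json&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, with field names such as &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;apiTargets&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;browserKeyRestrictions&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; taken from the API Keys API; verify them against your own output before relying on it. The function name and the sample key are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
# Assumed field names for the four application restriction types.
APP_RESTRICTION_FIELDS = (
    "browserKeyRestrictions",
    "serverKeyRestrictions",
    "androidKeyRestrictions",
    "iosKeyRestrictions",
)


def audit_key(key, allowed_services):
    """Return a list of findings for one API key description."""
    findings = []
    restrictions = key.get("restrictions") or {}
    services = {t.get("service") for t in restrictions.get("apiTargets", [])}
    if not services:
        findings.append("no API restrictions")
    extra = sorted(s for s in services - set(allowed_services) if s)
    if extra:
        findings.append("unexpected APIs: " + ", ".join(extra))
    if not any(field in restrictions for field in APP_RESTRICTION_FIELDS):
        findings.append("no application restrictions")
    return findings


if __name__ == "__main__":
    sample_key = {
        "displayName": "ai-studio-key",
        "restrictions": {
            "apiTargets": [{"service": "generativelanguage.googleapis.com"}],
            "browserKeyRestrictions": {
                "allowedReferrers": ["https://aistudio.google.com/"]
            },
        },
    }
    print(audit_key(sample_key, ["generativelanguage.googleapis.com"]))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;An empty list means the key matches the expected services and has an application restriction of some type; anything else is worth a closer look in the Cloud console.&lt;/span&gt;&lt;/p&gt;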
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 2: Store the API Key&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I want to reiterate that the API key is not paired with your identity. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;ANYONE&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; can use it. So, storing the key securely is as important as restricting the key use in Step 1.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The rule is simple: NEVER EVER store the key where it can be easily seen.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;If you use an API key in your application&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, store it in &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/secret-manager/docs/best-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Secret Manager&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or a similar secret management service. Secret Manager allows you to inject your API key into &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/secrets"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/secret-manager/docs/secret-manager-managed-csi-component"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; environments easily. However, for stronger key protection, you may want to read the key in your code at runtime instead. See the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/secret-manager/docs/samples/secretmanager-get-secret"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for an example.&lt;/span&gt;&lt;/p&gt;
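&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of reading the key in code, the Python sketch below fetches a secret version with the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google-cloud-secret-manager&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; client library. It assumes the library is installed, that the runtime identity may access the secret, and that the secret payload is the API key as UTF-8 text. The function names, project ID, and secret ID are placeholders.&lt;/span&gt;&lt;/p&gt;

```python
def secret_version_name(project_id, secret_id, version="latest"):
    """Build the resource name that Secret Manager expects."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


def read_api_key(project_id, secret_id, version="latest"):
    """Fetch the API key at runtime instead of storing it in config or code."""
    # Imported lazily so the helper above works without the client library.
    from google.cloud import secretmanager

    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(
        request={"name": secret_version_name(project_id, secret_id, version)}
    )
    return response.payload.data.decode("UTF-8")


if __name__ == "__main__":
    # Placeholder identifiers; replace them with your own project and secret.
    print(secret_version_name("my-project", "gemini-api-key"))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Keep the returned value in memory only, and avoid writing it to logs or error messages.&lt;/span&gt;&lt;/p&gt;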
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;If you use an API key with an external application&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; that asks you to type in the key, take extra steps to explore how the application manages your key. Find out how the key is stored and how it is used in requests. For web applications, you can use browser developer tools to inspect application traffic and ensure that the key is never sent over an unencrypted channel. For example, Google AI Studio uses encrypted local storage and sends the key via a TLS-encrypted channel.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;If Something Goes Wrong&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;What should you do if you suspect that your key is compromised?&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Treat it the same way as a compromised credit card: first, delete the key. You can do it in the Cloud console or with the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud services api-keys delete&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/gcloud/reference/services/api-keys/delete"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;command&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. If it turns out to be a false alarm, you can &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/gcloud/reference/services/api-keys/undelete"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;undelete&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; the key within the next 30 days.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;What if you do not know which key is compromised? &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;In that case you need to do a two-step investigation:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Find all API keys in your organization or project(s)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Check the API consumption graph for the APIs that each key is allowed to access&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Find all your API keys&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;There is more than one way to find your API key resources. You can use &lt;/span&gt;&lt;a href="https://console.cloud.google.com/iam-admin/asset-inventory/dashboard"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Asset Inventory&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in the Cloud console and filter the dashboard by &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Resource type&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, selecting &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;apikeys.Key&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. If you do not see this resource type, click "View more…" to expand the resource type list. Note that the list shows deleted API keys as well.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you prefer the CLI and know the specific project(s), you can use the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud services api-keys list&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/gcloud/reference/services/api-keys/list"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;command&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To see all active keys in your organization, you will need to use the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud asset search-all-resources&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/gcloud/reference/asset/search-all-resources"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;command&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and query its JSON output to filter out deleted keys:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;gcloud asset search-all-resources \
  --scope='organizations/123456789012' \
  --asset-types='apikeys.googleapis.com/Key' \
  --read-mask="name,displayName,versionedResources" \
  --format=json \
  --order-by='createTime' \
| jq '.[] | select(.versionedResources | all(.resource.data.deleteTime == null))'&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
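&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you would rather post-process the JSON in a script than in jq, the following Python sketch applies roughly the same filter. It assumes the input is the JSON array printed by the command with &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--format=json&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and the same field layout that the jq expression relies on. The function names are hypothetical, and unlike jq, the sketch treats an asset with no versioned resources as active.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
import json


def is_active(asset):
    """True when no versioned resource of the asset carries a deleteTime."""
    for version in asset.get("versionedResources") or []:
        data = (version.get("resource") or {}).get("data") or {}
        if data.get("deleteTime") is not None:
            return False
    return True


def active_keys(search_output):
    """Parse the JSON text from gcloud and keep only keys that are not deleted."""
    return [asset for asset in json.loads(search_output) if is_active(asset)]


if __name__ == "__main__":
    import sys

    # Example: gcloud asset search-all-resources ... --format=json | python this_script.py
    for asset in active_keys(sys.stdin.read()):
        print(asset.get("displayName"), asset.get("name"))
```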
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Find out API consumption&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can track the usage of an API key with the Cloud Monitoring &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/apis/docs/monitoring#expandable-1"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;metric&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;serviceruntime.googleapis.com/api/request_count&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. This metric shows the number of times different services have been invoked. To see the number of requests made with a particular API key, filter the metric by its &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;credential_id&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; label, using the unique ID of the API key. You can view the metric data in &lt;/span&gt;&lt;a href="https://console.cloud.google.com/monitoring/metrics-explorer"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Metrics explorer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or query the Monitoring API with the following &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/monitoring/promql"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;PromQL&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; expression:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;sum(
  rate({
    "__name__"="serviceruntime.googleapis.com/api/request_count",
    "monitored_resource"="consumed_api",
    "credential_id"="apikey:00000000-0000-0000-0000-000000000000"
  }[${__interval}])
)&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can further filter this metric by the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;service_name&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; label using the API name (e.g. &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;mapstools.googleapis.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To find the API key ID, use one of the following methods:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Using the Cloud console, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;open the &lt;/span&gt;&lt;a href="https://console.cloud.google.com/apis/credentials"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Credentials&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; page and select the API key that you want. Inspect the URL of the API key page in the browser, which looks like: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;https://console.cloud.google.com/apis/credentials/key/[KEY_ID]?project=[PROJECT_ID]&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. Copy the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;[KEY_ID]&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; part.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Using the gcloud CLI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, run the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud services api-keys list --format='value(displayName,uid)'&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; command and find the key by its display name. Copy the UID next to the display name.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;An abnormally high level of API invocations usually indicates that the API key was compromised and is being used by a malicious party to access the API.&lt;/span&gt;&lt;/p&gt;
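&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;What counts as "abnormally high" depends on your workload. As a rough starting point, the Python sketch below flags the most recent sample when it exceeds a multiple of the median of the earlier samples. The function name, the factor of 5, and the minimum baseline are arbitrary assumptions; for production use, a Cloud Monitoring alerting policy is the more robust choice.&lt;/span&gt;&lt;/p&gt;

```python
from statistics import median


def is_abnormal(request_counts, factor=5.0, min_baseline=1.0):
    """Flag the latest sample when it exceeds factor times the median of the rest."""
    if len(request_counts) in (0, 1):
        return False  # Not enough history to compare against.
    *history, latest = request_counts
    # The floor keeps a quiet key from alerting on its first few requests.
    baseline = max(median(history), min_baseline)
    return latest > factor * baseline


if __name__ == "__main__":
    hourly_requests = [10, 12, 9, 11, 200]
    print(is_abnormal(hourly_requests))
```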
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 3: API key management hygiene&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you are an engineer, an experienced cloud user, or just here to experiment, proper API key hygiene is important to keep your environment from being hijacked.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you already use Google API keys, do the following right now:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Find all the API keys that you have&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Delete all keys that you no longer use or do not recognize (do not worry, you can restore them within the next 30 days)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Restrict API keys to only APIs that you intend to use. Narrow the list of clients that can use the APIs if you can&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;If you administer your Google Cloud projects or organization, consider setting up a custom organization policy for the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/api-keys/docs/custom-constraints"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;apikeys.googleapis.com/Key&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; resource to reduce the manual effort of managing API keys&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Consider periodically rotating (refreshing) your API keys by replacing them with newly created ones that share the exact same restrictions. Before deleting an existing key, track down and update every place where it is used, so that you do not unexpectedly break your application or lose access to a service.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Wrapping up&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;Securing API keys is a vital step in protecting your cloud ecosystem. Implementing strict API and application restrictions, utilizing secure storage, and proactively monitoring consumption are highly effective ways to prevent unauthorized access. These practices safeguard your development environment from exploitation and prevent unexpected billing charges.&lt;/p&gt;
&lt;p&gt;To help you implement these practices, here are a few practical tools and resources you can explore next:&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Learn more about APIs:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Review &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/docs/authentication/api-keys-best-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Best practices for managing API keys&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and practice with the &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/search-for-and-select-google-apis#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Search for and use Google APIs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; codelab.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Watch a quick tutorial:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Check out the Google Cloud Tech video &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=JIE89dneaGo&amp;amp;t=91s" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Manage your Cloud Run secrets securely with Secret Manager&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to see secure storage concepts in action.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Get hands-on with a Codelab:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Practice fetching credentials safely in a guided environment by trying the Secret Manager codelab for &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/codelabs/secret-manager-python#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Python&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/codelabs/cloud-spring-cloud-gcp-secret-manager#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Spring Boot&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Dive deeper into the docs:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Learn how to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/monitoring/charts/metrics-selector"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;select metrics&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/monitoring/charts/metrics-explorer"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;create charts&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/monitoring/alerts"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;set up alerts&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to observe your API consumption.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt; &lt;/p&gt;&lt;/div&gt;</description><pubDate>Thu, 21 May 2026 10:19:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/api-keys-are-open-secrets/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_image_aJLug1s.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Securing Your Gemini and Google API Keys</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_image_aJLug1s.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/api-keys-are-open-secrets/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Leonid Yankulin</name><title>Senior Developer Relations Engineer</title><department></department><company></company></author></item><item><title>What Google I/O '26 means for developing agents on Google Cloud</title><link>https://cloud.google.com/blog/topics/developers-practitioners/io26-news-for-agent-developers-on-google-cloud/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google I/O, we introduced a unified development toolkit featuring &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity 2.0&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Managed Agents API&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, giving developers better ways to build locally and deploy securely to the cloud on a shared protocol layer. 
In this blog, we’re going to show you how Gemini Enterprise Agent Platform and the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/innovations-from-google-io-26-on-google-cloud"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;new developer tools shared at I/O&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; fit together, unpack the spectrum of choice for building, and share what we’d actually try first.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Following the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;evolution&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; of Vertex AI into the Gemini Enterprise Agent Platform – &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;a comprehensive platform to build, scale, govern, and optimize agents&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; with new features like session memory and centralized governance – we are now extending these capabilities directly into your local development tools. Our goal is to bridge the gap between high-speed prototyping and secure, compliant corporate deployment, offering a modular approach where you can choose between quick-start workflows or full production control to fit your stack's specific needs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Here’s how those pieces now lay out across the entire spectrum of choice.&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The four rungs: The spectrum of how to build agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We like to think of the agent development ecosystem as four rungs on a ladder, designed to give you a clear slider between out-of-the-box configuration and complete code-first control. They're deliberately additive, meaning that starting fast on the lower rungs never locks you out of graduating to the deeper customization of the rungs above. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Underneath all four rungs is the &lt;/span&gt;&lt;a href="https://google.github.io/A2A/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;A2A protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This interoperability ensures that an agent built on the first rung can be called as a sub-agent on the fourth rung, allowing your entire architecture to scale seamlessly on the same infrastructure.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/blog-ladder-1.max-1000x1000.png"
        
          alt="blog-ladder-1"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Rung one: Agent Studio (low code)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A visual workspace inside Agent Platform. You discover models in Model Garden, engineer prompts, wire up tools, and ship an agent without writing code. Best for business-facing teams and rapid prototyping. The agent you build here runs on the exact same runtime as everything below it.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Rung two: Managed Agents API&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;New at I/O, the&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/managed-agents"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Managed Agents API&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is for technical teams who want to &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“manage the mission, not the machine.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; You define the agentic behavior and Google Cloud handles the heavy lifting, acting as an agent-as-a-service with no infrastructure for you to manage.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You use the Managed Agents API to &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;configure&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; your agent, and the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Interactions API&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;invoke&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; it. You package your instructions, skills, and tools, POST them, and Gemini builds and runs the agent.&lt;/span&gt;&lt;/p&gt;
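&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the “package and POST” step concrete, here is a minimal sketch that only assembles a JSON request body and sends nothing. The function name build_agent_config, the field names (instructions, skills, tools), and the example values are all hypothetical placeholders for illustration, not the actual Managed Agents API schema, so check the Managed Agents API documentation for the real request format.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
import json


def build_agent_config(instructions: str, skills: list[str], tools: list[str]) -> str:
    """Bundle an agent definition into a JSON string ready to POST.

    The field names are illustrative assumptions, not the real API schema.
    """
    if not instructions.strip():
        raise ValueError("instructions must not be empty")
    payload = {
        "instructions": instructions,
        "skills": sorted(set(skills)),  # de-duplicate for a stable payload
        "tools": sorted(set(tools)),
    }
    return json.dumps(payload, indent=2)


# Hypothetical example values; nothing is sent over the network here.
body = build_agent_config(
    instructions="Answer billing questions politely.",
    skills=["invoice-lookup"],
    tools=["search", "search"],
)
print(body)
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Keeping configuration as plain data like this mirrors the split described above: one call defines the agent, and a separate call invokes it.&lt;/span&gt;&lt;/p&gt;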
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;What makes this deployable is the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Cloud sandbox,&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; which is secure by design. The agent harness runs on &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;our&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; servers, and each agent has its own ephemeral sandbox provisioned with your skills, Model Context Protocol (MCP) servers, and server-side tools. Full integration with A2A and Agent Platform governance and security are coming soon.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=eFot-mAWwiw"
      data-glue-modal-trigger="uni-modal-eFot-mAWwiw-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-2_VJ69eVE.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Agents API over A2A with Gemini Enterprise&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-eFot-mAWwiw-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="eFot-mAWwiw"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=eFot-mAWwiw"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Rung three: Antigravity and friends&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://antigravity.google/blog/introducing-google-antigravity-2-0" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;is our primary solution for developers looking to leverage AI for coding tasks and agent orchestration, enabling teams to transform how apps are built and deployed. We've consolidated our developer-facing coding strategy into this single, powerful harness shared across multiple surfaces.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It’s co-optimized with the Gemini family of models, offering high efficiency to speed up development cycles and reduce costs. Skills you develop with Antigravity are intended to be portable across different surfaces.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/antigravity_-_3.max-1000x1000.png"
        
          alt="antigravity - 3"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is for development teams who want to utilize Google's advanced reasoning capabilities within their coding workflows, implement custom development loops, and transform how they build, deploy, and manage applications.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we are expanding this with new tools:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity 2.0:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A new standalone desktop application providing a centralized workspace to steer, customize, and orchestrate coding agents. Developers can use this to manage complex tasks, such as orchestrating agents to refactor code, generate unit tests, or even scaffold new service components based on a specification. Agents can spin up subagents from a single prompt, while multi-agent orchestration allows tasks to run in parallel. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity CLI:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This brings the full Antigravity experience to the command line: same harness, same agent, same quality of intelligence as Antigravity 2.0, with a product experience tailored for the terminal. It's optimized for speed and lower overhead, and adapts entirely to you. The CLI is tightly integrated with the desktop app, sharing authentication, context, skills, and configurations, providing a consistent experience across both interfaces. Use the Antigravity SDK to build your own runtime.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Enterprise security and compliance:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Google Cloud customers can now use Antigravity 2.0 and Antigravity CLI with their Gemini Enterprise Agent Platform project. All you have to do is log in with Cloud OAuth and set your Agent Platform project ID and region. This ensures that all agent inference runs via Agent Platform models within your secure cloud boundary, inheriting Google Cloud’s standard data privacy protections and Terms of Service. Your customer data stays in your control, and you can use regional model endpoints.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/antigravity_-4.max-1000x1000.png"
        
          alt="antigravity -4"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Integrating other coding agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While Antigravity is our recommended agentic coding solution, Google Cloud is designed to work well with any coding agent you choose. Our platform is open, and we provide tools to ensure flexibility:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agents CLI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Development Kit (ADK)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; allow you to build and interact with agents from various sources, including tools like &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Claude Code&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. This means developers can often keep their preferred interfaces while running the underlying AI inference on Google Cloud, so their workflows benefit from Google Cloud's security, compliance, and infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Our &lt;/span&gt;&lt;a href="https://github.com/google/skills" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Skills for Google products&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, launched at Next, are designed to be compatible with multiple coding tools, enabling you to enhance different agents with a consistent set of capabilities.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This flexibility allows teams to integrate their existing favorite tools and models, ensuring seamless and compliant operation within their established workflows. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Rung four: Agent Development Kit (ADK 2.0)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Code-first, low floor, high ceiling. If Managed Agents are configuration-first, ADK is &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;engineering-first.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This is for software engineers who want to build custom agent meshes from the ground up - any architecture, any model, unconstrained.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://adk.dev" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;ADK&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; enhancements launched at Google Cloud Next are now available for everyone.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; ADK 2.0 introduces a unified graph-based engine that gives you a slider from dynamic, model-led reasoning to strict, deterministic workflows. The framework handles the heavy lifting of multi-agent coordination, managing how control and data pass between sub-agents and tools.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Collaborative workflows (Python v2.0.0):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Previously called the Task-based Agent Collaboration API, this is how you build self-managing agent teams. A coordinator delegates to subagents using explicit operating modes:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;chat&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;: Full user interaction with a manual return to the parent. This is the “hand off the conversation to a sub-agent” mode.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;task&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;: User interaction for clarifications only, with an automatic return to the parent. This new “collaborate for this assignment” mode combines the best of the other two.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;single-turn&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;: No user interaction, parallel execution, and an automatic return to the parent. This is the “agent as tool” mode.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Dynamic workflows:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Dynamic workflows in ADK allow you to put aside graph-based path structures and use the full power of your chosen programming language to build workflows. With Dynamic workflows, you can create workflows with simple decorators, invoke workflow nodes as functions, and build complex routing logic.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;ADK Kotlin (Beta):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "ADK for Android." Kotlin support joins Python, Go, and Java, increasing language coverage so your on-device mobile agents can seamlessly coordinate with your backend Python agents.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
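&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One way to keep the three operating modes straight is to write their trade-offs down as data. The sketch below is plain Python and does not use the ADK API: the names Mode, ModeTraits, and pick_mode are hypothetical, and the selection heuristic is an assumption made for illustration. Only the traits of each mode (how much user interaction it allows, and whether control returns to the parent automatically) come from the list above.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class ModeTraits:
    user_interaction: str   # "full", "clarifications", or "none"
    automatic_return: bool  # does control return to the parent on its own?


class Mode(Enum):
    CHAT = ModeTraits("full", False)            # hand off the conversation
    TASK = ModeTraits("clarifications", True)   # collaborate for this assignment
    SINGLE_TURN = ModeTraits("none", True)      # agent as tool


def pick_mode(needs_user_input: bool, open_ended: bool) -> Mode:
    """Choose a delegation mode for a sub-agent (illustrative heuristic only)."""
    if not needs_user_input:
        return Mode.SINGLE_TURN
    return Mode.CHAT if open_ended else Mode.TASK


for mode in Mode:
    print(mode.name, mode.value.user_interaction, mode.value.automatic_return)
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For example, pick_mode(needs_user_input=True, open_ended=False) selects the task mode: the sub-agent may ask clarifying questions, then hands control back to the coordinator on its own.&lt;/span&gt;&lt;/p&gt;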
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agents CLI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; packages Google's expert skills for ADK, eval, deploy, observability, and publishing - turning any AI coding agent (like Antigravity, Gemini CLI, Claude Code, or Cursor) into an expert at &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;agent app building&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; as well as &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;agent ops&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. It gives your AI agent the skills to understand the Google Cloud agent stack, turning an expansive ecosystem into a seamless assembly line for developers iterating on their agent builds. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=GDd-Mhm2gcc"
      data-glue-modal-trigger="uni-modal-GDd-Mhm2gcc-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-3_QY4p08W.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Agents CLI speedrun&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-GDd-Mhm2gcc-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="GDd-Mhm2gcc"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=GDd-Mhm2gcc"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;What we'd actually try first&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If we were starting today, here's the order we'd reach for things:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Start with the &lt;/strong&gt;&lt;a href="https://antigravity.google/blog/introducing-google-antigravity-2-0" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity 2.0 desktop app&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Explore the interface, add a pre-built agent, and interact with it to understand the core functionality. This provides a more intuitive entry point before diving into API specifics.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Build a mesh: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Feel free to explore Managed Agents API through the &lt;/span&gt;&lt;a href="https://github.com/google/skills/tree/main/skills/cloud/gemini-agents-api" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agents API skill&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://github.com/google/skills/tree/main/skills/cloud/gemini-interactions-api" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Interactions API skill&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;When you start hitting routing decisions you want to make explicit, or need complex multi-agent orchestration, port your logic to &lt;/span&gt;&lt;a href="http://adk.dev" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ADK&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; 2.0. The graph model is worth the learning curve as soon as you have more than two branching paths. Don't worry about stringing together a bunch of separate pieces to make this happen - this is exactly where the &lt;/span&gt;&lt;a href="https://github.com/google/agents-cli" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agents CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; shines. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Govern and reuse shared domain logic: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Check out &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/skill-registry"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Skill Registry&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;(public preview):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A centralized catalog to govern and promote the reuse of packaged domain logic. Skills are accessible via the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/managed-agents"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Managed Agents API&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Platform &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;SDK, and ADK (via &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;SkillToolset&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;). Skill Registry will be part of &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-registry/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Registry&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; shortly.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Evaluate:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Use the Gemini Enterprise Agent Platform &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/optimize/evaluation/agent-evaluation"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;evaluation suite&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to move beyond basic text-matching vibe checks. Leverage synthetic user simulation to auto-generate multi-turn testing scenarios and safely mock API environments to pressure-test tool resilience. Finally, utilize its LLM-based autoraters and trace logging to evaluate complex logic, group failures, and continuously optimize your agent.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secure the pipeline:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Leverage Gemini Enterprise Agent Platform governance capabilities like &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/agent-identity-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Identity&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/gateways/agent-gateway-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, Agent Security, and Agent Registry to secure your deployment. Once CodeMender releases, add it to your CI/CD to proactively secure the code your human (and AI) developers are pushing.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
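&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To illustrate what “routing decisions you want to make explicit” can look like in step two, here is a small graph-style router in plain Python. It is a sketch of the general idea and not ADK code: the node names, the state keys, and the run helper are hypothetical, and a real ADK 2.0 workflow would use the framework's own graph primitives.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
from typing import Callable, Optional

# Each node receives the shared state and returns the name of the next node,
# or None to stop. All node names and state keys here are hypothetical.
Node = Callable[[dict], Optional[str]]


def classify(state: dict) -> Optional[str]:
    """Routing node: pick one of three branches from the request text."""
    text = state["request"].lower()
    if "refund" in text:
        return "billing"
    if "error" in text:
        return "support"
    return "general"


def billing(state: dict) -> Optional[str]:
    state["answer"] = "Routed to billing."
    return None


def support(state: dict) -> Optional[str]:
    state["answer"] = "Routed to support."
    return None


def general(state: dict) -> Optional[str]:
    state["answer"] = "Routed to the general assistant."
    return None


GRAPH: dict[str, Node] = {
    "classify": classify,
    "billing": billing,
    "support": support,
    "general": general,
}


def run(request: str, start: str = "classify") -> dict:
    """Walk the graph from the start node, recording the path taken."""
    state: dict = {"request": request, "path": []}
    current: Optional[str] = start
    while current is not None:
        state["path"].append(current)
        current = GRAPH[current](state)
    return state


print(run("I need a refund")["path"])
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With three branches the routing already lives in one inspectable place, which is the property that makes a graph model easier to test and evolve than routing buried in prompts.&lt;/span&gt;&lt;/p&gt;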
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Note: You can do this whole loop on a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/docs/starter-tier"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Starter Tier&lt;/span&gt;&lt;/a&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; account without a billing account attached. The first two app deployments are on us.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;We’re excited and hope you are, too&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The agent space is evolving rapidly. Agent Platform offers a secure and adaptable foundation. Core components like the Agent Gateway, identity management, and the Skill Registry work together to ensure a robust and controlled environment for your agents, enabling you to innovate flexibly without vendor lock-in.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Pick the rung that fits the project. Bring whatever coding agent your team prefers. The platform you graduate to is the same one either way, and the data stays inside your Cloud project the whole time.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you only read one set of docs after this post, make it the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agents overview in the Agent Platform documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. If you build something interesting, show us: the best examples will land in the next round of templates.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We can’t wait to see what you build!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 19 May 2026 17:45:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/io26-news-for-agent-developers-on-google-cloud/</guid><category>AI &amp; Machine Learning</category><category>Developers &amp; Practitioners</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>What Google I/O '26 means for developing agents on Google Cloud</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/io26-news-for-agent-developers-on-google-cloud/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Addy Osmani</name><title>Director, Google Cloud AI</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Alan Blount</name><title>Product Manager, Google Cloud</title><department></department><company></company></author></item><item><title>Gemini Live Agent Challenge: Announcing the winners and highlights</title><link>https://cloud.google.com/blog/topics/developers-practitioners/winners-and-highlights-of-the-gemini-live-agent-challenge/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Gemini Live Agent Challenge is officially in the books! We challenged developers worldwide to break out of the traditional 'text box' paradigm by building next-generation AI agents. 
From our &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/training-certifications/join-the-gemini-live-agent-challenge?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;initial announcement&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to amassing 11,878 participants and &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;1,536&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; submitted projects from &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;151 &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;countries, the results were nothing short of spectacular.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The mission was to seamlessly integrate multimodal capabilities by building agents that help you see, hear, speak, and create in real time, using the Gemini Live API, the Agent Development Kit (ADK), and the robust infrastructure of Google Cloud. Participants pushed the boundaries of interactive AI across three distinct categories: The Live Agent, The Creative Storyteller, and The UI Navigator.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Congratulations to the builders who took home the top prizes! These winning teams combined technical precision with bold imagination, completely redefining how users can interact with and experience agents. Two of these standout developers were even recognized in person at Google Cloud Next 2026. Here’s a look at their experience, alongside the complete list of winning agents.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Celebrating our category winners at Google Cloud Next ’26&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Category winners Jeremiah Somoine and Bryen Param were invited to attend Google Cloud Next 2026 in Las Vegas, where they shared their experiences and insights with the broader developer community. Both winners presented Lightning Talks at the Developer Theatre on the expo floor and sat down for exclusive interviews in the Creator Studio Pod at the GDE and Certified Lounge. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;During his time at the event, Bryen discussed the core inspiration behind &lt;/span&gt;&lt;a href="https://devpost.com/software/drone-copilot" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;drone-copilot&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. He explained that his project was driven by the question of "what if a model could interact with the real world?", showcasing how multimodal capabilities can bridge the gap between AI and physical environments. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/bryen.max-1000x1000.jpg"
        
          alt="Bryen Param"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Jeremiah, currently a college student, reflected on the development process behind &lt;/span&gt;&lt;a href="https://devpost.com/software/sankofa-y47f9p" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;Sankofa&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, noting that "the best response to a technical limitation was a creative one." When asked what advice he would give to other students looking to build the next generation of AI applications, he emphasized the importance of jumping at any opportunity to get hands-on with the technology. "The best way to learn is by doing," he said, encouraging aspiring developers to simply dive in and start building.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/jeremiah_edited.max-1000x1000.jpg"
        
          alt="Jeremiah Somoine"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Winners&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Grand Prize winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/orion-operating-room-intelligent-orchestration-node" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ORION - Operating Room Intelligent Orchestration Node&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Aditya Shukla&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;ORION, or Operating Room Intelligent Orchestration Node, is a voice-directed surgical co-pilot for robotic surgery. Surgeons can speak naturally and instantly receive answers, live data on display, and real-time visual assistance, all without breaking scrub.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=AnxII9COzjo"
      data-glue-modal-trigger="uni-modal-AnxII9COzjo-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_0lhMev0.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Orion - Voice Directed Surgical AI Assistant | Gemini Live Agent Hackathon&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-AnxII9COzjo-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="AnxII9COzjo"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=AnxII9COzjo"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;The Live Agent winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/drone-copilot" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;drone-copilot&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Bryen Param&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Drone-copilot transforms how users interact with hardware by enabling natural, real-time conversations with a drone instead of using a joystick or complex menus. Simply by speaking, users can instruct the drone to navigate, perform autonomous visual inspections, or describe its surroundings, while the drone verbally responds and confirms its actions in real time.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=_FCgmYjGCVs"
      data-glue-modal-trigger="uni-modal-_FCgmYjGCVs-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_C6lpyed.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Drone Copilot: Voice-Controlled Drone + Autonomous Inspection with Gemini Live API&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-_FCgmYjGCVs-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="_FCgmYjGCVs"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=_FCgmYjGCVs"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Creative Storyteller winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/sankofa-y47f9p" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Sankofa&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Jeremiah Somoine&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Sankofa acts as a multimodal AI "griot"—a traditional West African storyteller—transforming fragmented family histories into deeply immersive narratives. Based on just a few user details, it weaves together rich voice narration, watercolor imagery, and ambient soundscapes into a historical story, allowing users to engage in a real-time voice conversation with the storyteller to explore their roots further.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=urV3ckRYRC8"
      data-glue-modal-trigger="uni-modal-urV3ckRYRC8-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-1_1ApjCQc.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Sankofa Demo Video&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-urV3ckRYRC8-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="urV3ckRYRC8"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=urV3ckRYRC8"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;UI Navigator winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/moonwalk-tojsay" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Moonwalk&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Enaiho Uwas Paul and Aman Kumar Sah&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Moonwalk is a conversational, hands-free desktop assistant that helps users intuitively navigate their computer and complete complex tasks using just their voice. By remembering personal preferences and past interactions, it acts as an intelligent co-pilot that can seamlessly control your mouse and keyboard to execute everyday workflows—like booking flights or managing spreadsheets—while you simply sit back and speak.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=u3QoaT3pIMs"
      data-glue-modal-trigger="uni-modal-u3QoaT3pIMs-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-2_djltYYE.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Moonwalk Demo Video #geminiliveagentchallenge&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-u3QoaT3pIMs-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="u3QoaT3pIMs"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=u3QoaT3pIMs"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Best multimodal integration and user experience winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/wand-a-live-agent-that-sees-browses-and-clicks-with-you" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Wand&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: David Li&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Wand is a voice-first, pointer-aware browser assistant that helps you seamlessly navigate and interact with any website using a combination of natural speech and hand gestures. When you point at your screen and speak (for example, "play this video" or "zoom in here"), this live agent helps you instantly execute clicks, searches, and commands without ever needing to touch a mouse or keyboard.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=t9dyesmxlH8"
      data-glue-modal-trigger="uni-modal-t9dyesmxlH8-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-3_EsDTsNv.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Wand -- A live agent that sees, browses, and clicks with you&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-t9dyesmxlH8-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="t9dyesmxlH8"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=t9dyesmxlH8"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Best technical execution and agent architecture winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/johnkeats-ai" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;JohnKeats.AI&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Matthew Keats&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;JohnKeats.AI is a voice-first emotional companion designed to actively listen and hold space for users without rushing to offer solutions. By processing subtle vocal cues like pitch, pacing, and tone, it reacts naturally to a user's emotional state in real time to provide a deeply reflective and empathetic conversational experience.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=zNKhR3e2ym4"
      data-glue-modal-trigger="uni-modal-zNKhR3e2ym4-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-4_DmxDSNY.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;JohnKeats.AI — The First AI Agent Built to Know When to Shut Up&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-zNKhR3e2ym4-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="zNKhR3e2ym4"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=zNKhR3e2ym4"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Best innovation and thought leadership winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/rayan-memory" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Rayan Memory&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Yusuf Elnady&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Rayan Memory tackles the universal problem of forgetting by turning your daily learnings into a fully explorable 3D "memory palace." A background agent passively listens to your real-world audio to extract important ideas as physical artifacts, allowing you to walk through themed virtual rooms and converse with a dedicated AI companion to easily retrieve your exact memories.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=G05WfE5Zcsg"
      data-glue-modal-trigger="uni-modal-G05WfE5Zcsg-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-5_rlthVRd.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Rayan - A 3D memory palace that listens, remembers, and speaks back&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-G05WfE5Zcsg-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="G05WfE5Zcsg"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=G05WfE5Zcsg"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Honorable mention: &lt;/span&gt;&lt;a href="https://devpost.com/software/nagardrishti" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;NagarDrishti&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Nikita Dongre and Omkar Dongre&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;NagarDrishti tackles dangerous road conditions by allowing citizens to safely report potholes and waterlogging using a hands-free voice assistant while driving. These real-time reports instantly populate an interactive dashboard, where city officials can use natural language to easily identify hazard hotspots and manage critical repairs.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=Rn7eJxBdWe4"
      data-glue-modal-trigger="uni-modal-Rn7eJxBdWe4-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-6_LY4Wry4.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;NagarDrishti&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-Rn7eJxBdWe4-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="Rn7eJxBdWe4"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=Rn7eJxBdWe4"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Honorable mention: &lt;/span&gt;&lt;a href="https://geminiliveagentchallenge.devpost.com/submissions/970955-ekaette" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Ekaette&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Bassey John&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ekaette revolutionizes customer service by replacing frustrating hold queues with a conversational, multimodal AI assistant that operates across live phone calls and text messaging. Customers can speak naturally with the agent over a standard phone line while seamlessly sharing photos, reviewing product options, or completing payments via WhatsApp.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=0BeLDppNGks"
      data-glue-modal-trigger="uni-modal-0BeLDppNGks-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-7_WUG5wng.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Ekaette - A multimodal AI Voice and Messaging Assistant&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-0BeLDppNGks-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="0BeLDppNGks"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=0BeLDppNGks"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Honorable mention: &lt;/span&gt;&lt;a href="https://geminiliveagentchallenge.devpost.com/submissions/949057-vibecat" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;VibeCat&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Sejun Kim and Michael Chang&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;VibeCat is a proactive macOS desktop companion that continuously watches your screen, understands your context, and suggests helpful actions before you even ask. Instead of waiting for a command, it speaks up first — like offering to fix a missing line of code or execute a terminal command — and completes the task only after receiving your permission.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=j1zzfoDr7qA"
      data-glue-modal-trigger="uni-modal-j1zzfoDr7qA-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-8_FyBBOlB.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;vibeCat - Your proactive desktop companion&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-j1zzfoDr7qA-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="j1zzfoDr7qA"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=j1zzfoDr7qA"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Honorable mention: &lt;/span&gt;&lt;a href="https://geminiliveagentchallenge.devpost.com/submissions/945801-call-my-parts" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Call My Parts&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Sugam Palav, Nikhil Lohar, Siddhant Panday, and Vishal Parekh&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Call My Parts automates the tedious, time-consuming process of sourcing used vehicle parts by doing the research and vendor outreach for you. Users simply speak their part request, and the AI agent autonomously searches vendor websites, calls suppliers to check pricing and inventory, and compiles the best options into a ranked, easy-to-read dashboard.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=8pcRbVBRMqw"
      data-glue-modal-trigger="uni-modal-8pcRbVBRMqw-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-9.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Call My Parts AI Tool : Hackathon Gemini Live 2026&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-8pcRbVBRMqw-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="8pcRbVBRMqw"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=8pcRbVBRMqw"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Honorable mention: &lt;/span&gt;&lt;a href="https://geminiliveagentchallenge.devpost.com/submissions/967879-relay-real-time-voice-vision-lab-tutor-for-electronics" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Relay&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Faith Ogundimu&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Relay is an interactive AI lab partner that uses your webcam to watch and guide your physical electronics projects in real time. It provides step-by-step voice instructions to help you build circuits, catches wiring mistakes before they happen, and reinforces your skills with a built-in 3D simulation sandbox and adaptive quizzes.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=lTwos-2TW_A"
      data-glue-modal-trigger="uni-modal-lTwos-2TW_A-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-10.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Relay — Real-Time Voice &amp;amp; Vision AI Tutor for Electronics | Gemini Live API + Google Cloud&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-lTwos-2TW_A-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="lTwos-2TW_A"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=lTwos-2TW_A"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Keep the momentum going&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Inspired by these incredible projects? Start building and stay connected with the community through our latest programs and events:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Join &lt;/span&gt;&lt;a href="https://developers.google.com/program/gear?utm_source=cgc-blog&amp;amp;utm_medium=blog&amp;amp;utm_campaign=FY-26-Q2-GEAR-sign-up&amp;amp;utm_content=hackathon-winner-promo&amp;amp;utm_term=-" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Ready (GEAR)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, designed to help developers and decision-makers build and deploy production-ready AI agents.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Catch up on Google Cloud Next 2026: We just wrapped up an amazing Google Cloud Next! If you weren't able to join us in person — or simply want to relive the energy — take a look at our &lt;/span&gt;&lt;a href="https://www.instagram.com/reels/DXxFTSjiTmM/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;social&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=N7N0TU9tkzw" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;livestream&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; recaps to catch up on some of the exciting developer activations straight from the expo floor.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Tune in on Tuesdays: Want to be the first to hear about new tools, product updates, and upcoming hackathons? Join us for our &lt;/span&gt;&lt;a href="https://goo.gle/GoogleCloudTech" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;weekly livestream&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; every Tuesday at 9:00 A.M. PDT / 12:00 P.M. EDT for the latest in all things Google Cloud.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Congratulations again to all of our winners and participants. We can't wait to see what you build next!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 15 May 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/winners-and-highlights-of-the-gemini-live-agent-challenge/</guid><category>AI &amp; Machine Learning</category><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Landscape_16x9_rxRY4RH.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Gemini Live Agent Challenge: Announcing the winners and highlights</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Landscape_16x9_rxRY4RH.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/winners-and-highlights-of-the-gemini-live-agent-challenge/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Dilasha Panigrahi</name><title>Product Marketing Manager, Google Cloud</title><department></department><company></company></author></item><item><title>Ship code within minutes with the Gemini CLI DevOps Extension</title><link>https://cloud.google.com/blog/topics/developers-practitioners/ship-code-within-minutes-with-the-gemini-cli-devops-extension/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With AI coding tools like Antigravity and Claude Code, I can build a working web app in record time. But deploying it? That's where I'd historically lose the rest of the afternoon to Dockerfiles, IAM bindings, and YAML. So I'd take the shortcut most developers take: I just wouldn't do it. The app would stay on my laptop, and my work would never ship.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is the classic tension between the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/application-development/transform-your-developer-experience-with-google-cloud"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;inner loop&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (the fast, local cycle of writing and testing code) and the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/application-development/richard-seroter-on-shifting-down-vs-shifting-left"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;outer loop&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (containerization, CI/CD pipelines, and production infrastructure). Most developers are productive in one but not the other, and the gap between them is where projects stall.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://github.com/gemini-cli-extensions/cicd" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI Extension for CI/CD&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; bridges this gap. It handles both quick deployments and full pipeline generation from a single terminal interface. Let me show you how.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Building the Cosmic Guestbook App&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To demonstrate this workflow, we need an app. Let's start from an empty directory and use our agent to "vibe code" a brand new project: the &lt;a href="https://github.com/kweinmeister/cosmic-guestbook" rel="noopener" target="_blank"&gt;Cosmic Guestbook&lt;/a&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We want a full-stack architecture: a React frontend and a Node.js Express backend API. Instead of scaffolding this by hand, we can ask our agent to jumpstart the app:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;&amp;quot;Build a &amp;#x27;Cosmic Guestbook&amp;#x27; web app. I need a dynamic Node.js Express backend and a React frontend utilizing Vite. Make the frontend look like a beautiful, glassmorphic sci-fi interface.&amp;quot;&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Within moments, our agent scaffolds the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;backend/&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory with &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;server.js&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;frontend/&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory with a fully styled React app. We now have a functioning, two-tier web app sitting on our laptop.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/guest_book.max-1000x1000.png"
        
          alt="guest_book"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Installing the Extension&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;But code on a laptop isn't shipping. To get this guestbook online, we need to equip our chosen environment with the CI/CD extension. Regardless of your setup, first make sure the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/docs/install-sdk"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;gcloud CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is installed, then authenticate using Application Default Credentials: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud auth application-default login&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, install the extension in your preferred development environment:&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;For Gemini CLI&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Run the following command directly in your terminal:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;gemini extensions install https://github.com/gemini-cli-extensions/cicd&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;For Claude Code&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Add the marketplace and install the plugin directly from the terminal:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# 1. Add the Marketplace
claude plugin marketplace add https://github.com/gemini-cli-extensions/cicd.git

# 2. Install the Plugin
claude plugin install cicd&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;For Antigravity and agents supported by npx skills&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can enable the extension's MCP server as a custom MCP server and add its skills to your workspace:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# Add the Skills
npx skills add https://github.com/gemini-cli-extensions/cicd --global --all --agent antigravity&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;How It Works&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The CI/CD extension is a three-part system that translates your intent into secure, production-ready infrastructure in any of these agent environments:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Skills&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Specialized AI skills like &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google-cicd-deploy&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google-cicd-pipeline-design&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are defined in the extension. These instruct your AI agent (Gemini CLI, Claude Code, or Antigravity) on how to think—helping it analyze your code, ask the right questions, and handle errors gracefully.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;CI/CD MCP server&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Running in the background is a specialized Go-based &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol (MCP)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; server. This server provides a suite of tools that gives your agent the hands it needs to actually manipulate Google Cloud: everything from scanning for secrets to provisioning Cloud Run services.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Local knowledge base&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: To ensure the most accurate answers, the system includes a pre-indexed retrieval-augmented generation (RAG) database containing verified architecture patterns, which lets the agent ground its design decisions in the source of truth.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Your chosen AI assistant orchestrates these tools and patterns into a cohesive deployment lifecycle.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The Inner Loop&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When you're building a prototype or testing a new feature, you don't need a massive, multi-environment CI/CD pipeline. You just need a public URL to test your webhook or show a stakeholder. This is the inner loop, and it needs to be fast.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The traditional approach involves manually writing a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Dockerfile&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, authenticating with a container registry, building the image, pushing it, and finally deploying it. The CI/CD extension turns this into a single natural language prompt: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemini "Deploy this application to Google Cloud using the google-cicd-deploy skill"&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. If you're using Claude Code, you can prompt it exactly the same way via &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;claude -p "Deploy this application..."&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, and in Antigravity, simply type your deployment request.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When you run this prompt, your AI agent analyzes your local workspace to figure out the best deployment approach.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 1: Pre-Deployment Security Scan&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Leaked secrets are one of the most common and expensive security failures in software. GitGuardian's &lt;/span&gt;&lt;a href="https://www.gitguardian.com/state-of-secrets-sprawl-report-2025" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;2025 State of Secrets Sprawl&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; report found 23.8 million new credentials exposed on public GitHub in a single year; 70% of secrets that were leaked in 2022 are still active today. It happens fast: you hardcode a database password during local testing, forget to remove it, and push.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The extension catches this before it becomes a problem. Before any code leaves your machine, it runs a secret check across your workspace. If it finds a Stripe API key or a database credential sitting in your source, the agent halts the deployment and warns you. No secrets ship to the cloud by accident.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Catching a leaked credential on your machine, before anything is deployed, is what true &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/application-development/richard-seroter-on-shifting-down-vs-shifting-left"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;shift-left&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; security looks like in practice.&lt;/span&gt;&lt;/p&gt;
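&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the idea concrete, here is a minimal Python sketch of a pre-deployment secret gate. It is not the extension's actual scanner: the two regular expressions, the skipped directories, and the function names are assumptions chosen for illustration, and real scanners ship far larger rule sets.&lt;/span&gt;&lt;/p&gt;

```python
import re
from pathlib import Path

# Illustrative rules only (assumptions, not the extension's rule set).
SECRET_PATTERNS = {
    "stripe_live_key": re.compile(r"sk_live_[0-9a-zA-Z]{16,}"),
    "hardcoded_password": re.compile(
        r"(?i)(?:password|passwd)\s*[:=]\s*['\"][^'\"]{4,}['\"]"
    ),
}

# Hypothetical list of directories that are not worth scanning.
SKIP_DIRS = {".git", "node_modules", "dist"}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every suspicious line."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


def scan_workspace(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every readable text file under root, skipping vendored directories."""
    results = {}
    for path in Path(root).rglob("*"):
        if not path.is_file() or SKIP_DIRS.intersection(path.parts):
            continue
        try:
            findings = scan_text(path.read_text(encoding="utf-8"))
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file
        if findings:
            results[str(path)] = findings
    return results


def safe_to_deploy(root: str) -> bool:
    """Gate: allow the deployment only when the scan finds nothing."""
    return not scan_workspace(root)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Calling &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;safe_to_deploy(".")&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; before any upload mirrors the ordering described above: the scan runs first, and a single finding is enough to stop the deployment.&lt;/span&gt;&lt;/p&gt;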
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 2: Application Analysis &amp;amp; Containerization&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Next, your agent checks your &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;package.json&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;go.mod&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to figure out your framework. It automatically decides whether to use &lt;/span&gt;&lt;a href="https://cloud.google.com/storage"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Storage&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (for static sites) or &lt;/span&gt;&lt;a href="https://cloud.google.com/run"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (for dynamic services).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you're building a dynamic service that doesn't have a Dockerfile, the extension leverages &lt;/span&gt;&lt;a href="https://cloud.google.com/docs/buildpacks/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud's buildpacks&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to containerize it for you on the fly.&lt;/span&gt;&lt;/p&gt;
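&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The static-versus-dynamic decision can be pictured with a small Python sketch. The heuristic below is an assumption for illustration (the list of server packages and the start-script check are invented here), not the logic the extension actually uses.&lt;/span&gt;&lt;/p&gt;

```python
import json

# Hypothetical heuristic: packages that imply a long-running server process.
SERVER_PACKAGES = {"express", "fastify", "koa", "next"}


def choose_target(package_json_text: str) -> str:
    """Return "cloud-run" for dynamic services and "cloud-storage" for static sites."""
    manifest = json.loads(package_json_text)
    dependencies = set(manifest.get("dependencies", {}))
    if dependencies.intersection(SERVER_PACKAGES):
        return "cloud-run"
    # A start script that launches Node also suggests a server.
    start_script = manifest.get("scripts", {}).get("start", "")
    if start_script.startswith("node "):
        return "cloud-run"
    return "cloud-storage"


backend = '{"dependencies": {"express": "^4.19.0"}, "scripts": {"start": "node server.js"}}'
frontend = '{"dependencies": {"react": "^18.3.0"}, "scripts": {"build": "vite build"}}'
print(choose_target(backend), choose_target(frontend))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Under these assumptions, an Express backend is routed to Cloud Run, while a frontend that only produces a static build is a candidate for Cloud Storage.&lt;/span&gt;&lt;/p&gt;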
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 3: Clarification and Deployment&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Conversational agents don't blindly guess how you want your app configured. Before making any changes to your Google Cloud environment, the assistant pauses to ask clarifying questions based on what it found during its analysis.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For our Cosmic Guestbook, I didn't specify a region or whether the service should be public. The assistant paused and asked me about it:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"I have completed the analysis and prepared the application for a unified deployment to Cloud Run. Since the Node.js buildpack requires a single entry point, I created a root &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;package.json&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with a custom build script that builds the React frontend and allows the Express backend to serve it statically.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Additionally, the secret scan completed successfully, and no sensitive credentials were found.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before we deploy using the Cloud Run source deployment option (buildpacks), I need to clarify a few deployment parameters:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Region/Location:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The gcloud defaults do not specify a location. Which Google Cloud region would you like to deploy to (e.g., &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;us-central1&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;europe-west1&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;)?&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Access:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Should this application be publicly accessible or private?&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Service Name:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; I propose naming the service &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cosmic-guestbook&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. Does that sound good to you?"&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This conversational pause ensures that even in the fast inner loop, you retain complete control over your cloud architecture. After you confirm the details, the agent pushes the code live and returns the public URL:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# Final Output:
Your application is now live and publicly accessible at the following URL:
https://cosmic-guestbook-xxxxxxxx-uc.a.run.app&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Behind the scenes, the deployment is handled automatically via &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cloudrun.deploy_to_cloud_run_from_source&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
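&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For comparison, a source deployment done by hand comes down to one &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud run deploy --source&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; command. The Python sketch below only assembles that command from the three answers the assistant asked for. It does not show what the MCP tool runs internally, and the region and public-access values are assumed answers for this example.&lt;/span&gt;&lt;/p&gt;

```python
import shlex


def build_deploy_command(service: str, region: str, public: bool) -> list[str]:
    """Assemble a `gcloud run deploy --source` invocation as an argv list."""
    command = [
        "gcloud", "run", "deploy", service,
        "--source", ".",
        "--region", region,
    ]
    # Public services accept unauthenticated requests; private ones do not.
    command.append("--allow-unauthenticated" if public else "--no-allow-unauthenticated")
    return command


# Assumed answers to the assistant's three questions.
command = build_deploy_command("cosmic-guestbook", "us-central1", public=True)
print(shlex.join(command))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building an argument list rather than a single string keeps each answer a separate argument, so the list can be handed to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;subprocess.run&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; without shell quoting problems.&lt;/span&gt;&lt;/p&gt;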
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The Outer Loop&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A scrappy deployment prompt is perfect for a Tuesday afternoon prototype, but you can't run a production system from your laptop. Eventually, you need the rigors of the outer loop: automated testing, source control integration, and formal continuous deployment.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Writing &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cloudbuild.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; files and provisioning the necessary infrastructure (like &lt;/span&gt;&lt;a href="https://cloud.google.com/artifact-registry"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Artifact Registry&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; repositories or GitHub connections through &lt;/span&gt;&lt;a href="https://cloud.google.com/developer-connect/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Developer Connect&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;) is notoriously tedious and error-prone. With the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google-cicd-pipeline-design&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; skill, your AI agent acts as your personal platform engineering consultant.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Instead of writing YAML from scratch, you have a conversation. Your agent will ask you about your testing strategy and where you want to deploy, and then it autonomously provisions the required Google Cloud infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 1: Architectural Design &amp;amp; Feedback&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You start the process directly in your conversational interface:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# Prompt your agent to kick off the design process:
gemini &amp;quot;Design a CI/CD pipeline using the google-cicd-pipeline-design skill&amp;quot;
# OR
claude -p &amp;quot;Design a CI/CD pipeline using the google-cicd-pipeline-design skill&amp;quot;&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Your assistant doesn't work in a black box. It retrieves common CI/CD patterns from its knowledge base. With the most relevant knowledge in hand, it proposes a concrete plan in YAML for you to review.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 2: Infrastructure Provisioning&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;After you approve the plan, the assistant works sequentially through the required infrastructure steps. For example, it might first create a registry for your containers.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;// Example MCP call to provision the registry
{
  &amp;quot;name&amp;quot;: &amp;quot;create_artifact_repository&amp;quot;,
  &amp;quot;arguments&amp;quot;: {
    &amp;quot;repository_id&amp;quot;: &amp;quot;demo-app-repo&amp;quot;,
    &amp;quot;location&amp;quot;: &amp;quot;us-central1&amp;quot;,
    &amp;quot;format&amp;quot;: &amp;quot;DOCKER&amp;quot;
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It might then set up a Git connection so that &lt;/span&gt;&lt;a href="https://cloud.google.com/build"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Build&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; can read your source code.&lt;/span&gt;&lt;/p&gt;
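&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;On the wire, MCP tool invocations are JSON-RPC 2.0 requests that use the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;tools/call&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; method. As a rough sketch, the registry call from this step could be wrapped as follows. The helper name and the request ID counter are illustrative assumptions, and the agent's MCP client normally builds this envelope for you.&lt;/span&gt;&lt;/p&gt;

```python
import json
from itertools import count

# Hypothetical ID source; JSON-RPC only requires IDs to be unique per session.
_request_ids = count(1)


def tool_call_request(name: str, arguments: dict) -> dict:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call_request(
    "create_artifact_repository",
    {"repository_id": "demo-app-repo", "location": "us-central1", "format": "DOCKER"},
)
print(json.dumps(request, indent=2))
```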
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 3: Pipeline Generation &amp;amp; Trigger&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, the agent generates the actual &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cloudbuild.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file that defines the pipeline stages (test, build, deploy). Here's a snippet of a generated configuration from the repository that highlights the initial build steps:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;steps:
  # Step 1: Install tools (like the linter) and clean the cache.
  - name: &amp;#x27;golang:1.24&amp;#x27;
    id: &amp;#x27;Install Tools&amp;#x27;
    entrypoint: &amp;#x27;sh&amp;#x27;
    args:
      - &amp;#x27;-c&amp;#x27;
      - |
        set -e
        export PATH=/workspace/bin:$$PATH
        echo &amp;quot;Installing golangci-lint...&amp;quot;
        go install github.com/golangci/golangci-lint/cmd/golangci-lint@v1.64.8
        echo &amp;quot;Cleaning module cache...&amp;quot;
        go clean -modcache
    env:
      - &amp;#x27;GOPATH=/workspace&amp;#x27;
    dir: &amp;#x27;devops-mcp-server&amp;#x27;&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With the pipeline defined, we need a way to execute it automatically. The agent finishes by creating a &lt;/span&gt;&lt;a href="https://cloud.google.com/build/docs/automating-builds/create-manage-triggers"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Build trigger&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. The trigger acts as the glue between your GitHub repository and Cloud Build, ensuring that every push to the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;main&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; branch automatically fires off the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cloudbuild.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; steps.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;// Example MCP call setting the trigger
{
  &amp;quot;name&amp;quot;: &amp;quot;create_build_trigger&amp;quot;,
  &amp;quot;arguments&amp;quot;: {
    &amp;quot;trigger_name&amp;quot;: &amp;quot;main-branch-deploy&amp;quot;,
    &amp;quot;filename&amp;quot;: &amp;quot;cloudbuild.yaml&amp;quot;,
    &amp;quot;branch_pattern&amp;quot;: &amp;quot;^main$&amp;quot;
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Security and Control&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI-assisted infrastructure generation sounds incredible, but it's reasonable to ask: is it safe?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The extension operates strictly within the permissions of your local &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/docs/authentication/application-default-credentials"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Application Default Credentials (ADC)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. It can't do anything that you can't do. Because it uses the &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol (MCP)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, every action that it takes, from creating an Artifact Registry to modifying a Cloud Build trigger, runs through strongly typed, verifiable tools.&lt;/span&gt;&lt;/p&gt;
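&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make "strongly typed, verifiable" a little more concrete, here is a minimal sketch of what checking a tool call before it runs could look like. The tool name and arguments are taken from the registry example earlier in this post, and the envelope follows the JSON-RPC 2.0 shape that MCP uses for &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;tools/call&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; requests. The schema and the helper functions &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;validate_arguments&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;build_tool_call&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are assumptions made for illustration, not the extension's actual code.&lt;/span&gt;&lt;/p&gt;

```python
import json

# Hypothetical input schema for one tool: argument name -> expected Python type.
# The real extension defines its own schemas; this one is only for illustration.
CREATE_REPO_SCHEMA = {"repository_id": str, "location": str, "format": str}


def validate_arguments(arguments, schema):
    """Return a list of problems; an empty list means the call is well-typed."""
    problems = []
    for name in sorted(set(schema) - set(arguments)):
        problems.append(f"missing argument: {name}")
    for name in sorted(set(arguments) - set(schema)):
        problems.append(f"unexpected argument: {name}")
    for name in sorted(set(schema).intersection(arguments)):
        if not isinstance(arguments[name], schema[name]):
            problems.append(f"wrong type for {name}")
    return problems


def build_tool_call(request_id, tool_name, arguments):
    """Wrap a tool invocation in a JSON-RPC 2.0 tools/call envelope."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


arguments = {
    "repository_id": "demo-app-repo",
    "location": "us-central1",
    "format": "DOCKER",
}
problems = validate_arguments(arguments, CREATE_REPO_SCHEMA)
request = build_tool_call(1, "create_artifact_repository", arguments)
print(json.dumps(request, indent=2) if not problems else problems)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because every call has a declared shape, a malformed or unexpected argument can be rejected before anything touches your project.&lt;/span&gt;&lt;/p&gt;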
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you don't like a step in the proposed pipeline, you tell your agent to change it. You're always the "Editor-in-Chief" of your infrastructure. We strongly recommend that you adhere to the &lt;/span&gt;&lt;a href="https://en.wikipedia.org/wiki/Principle_of_least_privilege" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;principle of least privilege&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for both your local ADC and any service accounts that are used by the generated pipelines.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;When Dev and Ops Converge&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The friction between wanting to write code and needing to ship it is finally dissolving. We're moving past the era where deep expertise in YAML formatting was a prerequisite for putting an app on the internet.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By handling the boilerplate of both the scrappy inner loop and the automated outer loop, conversational AI lets developers focus on the business logic that actually matters.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Next Steps&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you want to experience this convergence yourself, here are your immediate next steps:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Get the tools&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Install the &lt;/span&gt;&lt;a href="https://github.com/gemini-cli-extensions/cicd" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;CI/CD Extension for Gemini CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Deploy the inner loop&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Take an existing side project (or ask your chosen agent to scaffold a new one like our &lt;a href="https://github.com/kweinmeister/cosmic-guestbook" rel="noopener" target="_blank"&gt;Cosmic Guestbook&lt;/a&gt;) and prompt it to deploy to Google Cloud to instantly see it live on Cloud Run or Cloud Storage.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Automate the outer loop&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Run a design command against a repository that you're ready to productionize, and watch your agent generate your &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cloudbuild.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and provision your infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Stop wrestling with configuration files and start shipping. Let me know what you build by reaching out on &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/karlweinmeister/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://x.com/kweinmeister" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, or &lt;/span&gt;&lt;a href="https://bsky.app/profile/kweinmeister.bsky.social" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Bluesky&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 08 May 2026 19:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/ship-code-within-minutes-with-the-gemini-cli-devops-extension/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/gemini_cli_devops_final.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Ship code within minutes with the Gemini CLI DevOps Extension</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/gemini_cli_devops_final.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/ship-code-within-minutes-with-the-gemini-cli-devops-extension/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Karl Weinmeister</name><title>Director, Developer 
Relations</title><department></department><company></company></author></item><item><title>How BASF manages thousands of supply chain decisions with AlphaEvolve’s agentic algorithms</title><link>https://cloud.google.com/blog/products/ai-machine-learning/how-basf-manages-thousands-of-supply-chain-decisions-with-alphaevolve/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The agricultural and crop protection supply chain is one of the most intricate networks in the world. It takes up to two years to turn active ingredients into the final products farmers need, and a single change in weather or regulations can disrupt everything. Planners at &lt;/span&gt;&lt;a href="https://agriculture.basf.com/global/en" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;BASF Agricultural Solutions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; navigate this reality daily across 180 production sites. To understand how local decisions ripple across their entire global network, BASF turned to &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/alphaevolve-on-google-cloud"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AlphaEvolve on Google Cloud&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to build a digital twin of their supply chain.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Planning across a two-year lead time&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;BASF Agricultural Solutions manages a network with over 5,000 distinct value chains. Creating a single end product requires a bill of materials that can be over 30 levels deep, moving across different production sites and regions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Currently, human planners make thousands of local decisions every day. They decide what to produce, when to produce it, and how much safety stock to hold. Because the network is so large, a planner can’t easily see how a localized decision affects the rest of the global supply chain. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This scale can lead to additional working capital and inventory, to production imbalances, or to both. Traditional mathematical models struggle to capture the dynamic reality of the network that planners navigate based on years of experience.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Building a foundation for decision support&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://deepmind.google/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AlphaEvolve&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is an evolutionary coding agent that generates and refines algorithms autonomously. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;In collaboration with Google Cloud and prognostica GmbH&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;,&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; BASF’s objective was not to replace human decision-making, but to establish a new model for decision support that helps planners handle the real-world complexity of the production network.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The team gave AlphaEvolve a foundational "seed" program. This initial code established a standard planning logic that translated demand forecasts into production schedules, serving as a functional baseline before introducing dynamic, network-wide coordination. From there, they fed the model three years of historical data, including inventory levels, market demand, and actual production outputs. AlphaEvolve then generated variations of the code, mutating the logic to see if it could simulate a supply chain that matched the real-world historical data.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Measuring what good looks like in initial tests&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For AlphaEvolve to improve, it needed a specific goal. The evaluation function scored every new piece of generated code on one primary metric: how closely the simulated inventory levels and production decisions matched the actual historical reality recorded by BASF.&lt;/span&gt;&lt;/p&gt;
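&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The seed, mutate, and score cycle described above can be sketched in a few lines. This is a toy stand-in, not AlphaEvolve: AlphaEvolve rewrites program code, whereas this sketch only nudges two numeric parameters of a fixed planning rule. The history values, the parameter names &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;safety_factor&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;buffer&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, and every function name are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
import random

# Hypothetical history: (demand, inventory the planners actually held).
HISTORY = [(100, 130), (80, 104), (120, 156), (90, 117)]


def simulate(params, demand):
    """Candidate planning rule: demand times a safety factor plus a fixed buffer."""
    return params["safety_factor"] * demand + params["buffer"]


def score(params):
    """Negative mean absolute error against the history; higher is better."""
    errors = [abs(simulate(params, demand) - actual) for demand, actual in HISTORY]
    return -sum(errors) / len(errors)


def mutate(params, rng):
    """Copy the parent and perturb one randomly chosen parameter."""
    child = dict(params)
    key = rng.choice(sorted(child))
    child[key] += rng.gauss(0.0, 0.1 if key == "safety_factor" else 2.0)
    return child


def evolve(seed, generations=300, rng_seed=0):
    """Keep a candidate only when it matches the history better than the current best."""
    rng = random.Random(rng_seed)
    best, best_score = seed, score(seed)
    for _ in range(generations):
        candidate = mutate(best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


seed = {"safety_factor": 1.0, "buffer": 0.0}
best, best_score = evolve(seed)
print("seed score:", score(seed), "evolved score:", best_score)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The important design choice is the evaluation function: candidates are rewarded only for reproducing what actually happened, so the search is pulled toward the rules planners really follow.&lt;/span&gt;&lt;/p&gt;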
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The latest AlphaEvolve runs delivered more than 80% relative improvement in accuracy compared to the initial seed model. With further adjustments, the team expects to push performance even higher — bringing the model to a level of accuracy not achieved with other approaches and making it actionable for operational use.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;The results&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The evolved planning logic delivered immediate, measurable improvements over the initial seed model. The final algorithm successfully mirrored the actual historical performance of the supply chain, significantly reducing the error rate compared to the initial seed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“We had several attempts to build a digital twin for our complex supply network using deterministic models, and all of them failed,” said Dr. Goetz Krabbe, vice president for global supply chain at BASF. “By using AlphaEvolve, we can not only map the complex network based on system data, but at the same time understand and copy the human decisions that drive our daily operations. This gives us a highly accurate and easy to maintain data driven digital twin of the entire network. Using it we can optimize our inventory levels and respond to market volatility with confidence while avoiding stockouts.”&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What the evolved algorithm actually does&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By running thousands of experiments, AlphaEvolve developed a clear, human-readable algorithm that explains how the BASF network truly operates. It automatically discovered factually correct, domain-specific supply chain rules that explain the observed production outputs and inventory levels for the tested product value chain:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Production consolidation:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The algorithm learned to group production amounts together, accurately mapping how planners optimize plant time.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Dynamic safety stocks:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; It introduced safety stock parameters to handle volatile and seasonal demand patterns, helping to strictly manage capital costs while preventing out-of-stock situations.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Network-wide coordination:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The model successfully mapped the dependencies between different production tiers, providing a clear foundation for optimizing asset utilization globally.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What's next&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The initial simulations showed that evolutionary AI can accurately model large-scale, dynamic supply chains. BASF’s objective is to create a digital twin of their entire global production network as a new foundation for simulation, decision support, scenario forecasting and optimization. This will allow the team to continuously simulate operations, identify hidden bottlenecks before they affect throughput, and optimize asset utilization across all global facilities.&lt;/span&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;sub&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;This project was a collaboration between the BASF SE team including: Benjamin Priese, Michael Arlt, &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Debora Morgenstern and Tobias Hausen as well as Manuel Doerr and Thomas Christ from Prognostica GmbH Würzburg, &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;and the AI for Science team at Google Cloud including (but not limited to): Kartik Sanu, Laurynas Tamulevičius, Nicolas Stroppa, Chris Page, Srikanth Soma, John Semerdjian, Skandar Hannachi, Vishal Agarwal and Anant Nawalgaria as well as &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Christoph Tittelbach from&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; the Google account team and &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;partners at Google DeepMind&lt;/span&gt;&lt;/sub&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Thu, 07 May 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/how-basf-manages-thousands-of-supply-chain-decisions-with-alphaevolve/</guid><category>Data Analytics</category><category>Customers</category><category>Developers &amp; Practitioners</category><category>Google Cloud in Europe</category><category>AI &amp; Machine Learning</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_BFm5ksn.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>How BASF manages thousands of supply chain decisions with AlphaEvolve’s agentic 
algorithms</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_BFm5ksn.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/how-basf-manages-thousands-of-supply-chain-decisions-with-alphaevolve/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Benjamin Priese</name><title>Senior Digital SC Manager, BASF Agricultural Solutions</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Anant Nawalgaria</name><title>Group Product Manager &amp; AI engineer, Google</title><department></department><company></company></author></item><item><title>Pioneering AI-assisted code migration: How Google achieved 6x faster migration from TensorFlow to JAX</title><link>https://cloud.google.com/blog/topics/developers-practitioners/6x-faster-migration-from-tensorflow-to-jax/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI coding agents are rapidly becoming ubiquitous across the software industry, fundamentally changing how developers write, test, and debug daily code. While these tools excel at localized, self-contained tasks, applying them to massive, systemic codebase migrations requires an entirely new approach.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google is already addressing this challenge by incorporating AI into many migration workflows: &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/systems/using-ai-and-automation-to-migrate-between-instruction-sets"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;x86 to ARM&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (enabling workloads on Google Axion processors); &lt;/span&gt;&lt;a href="https://dl.acm.org/doi/10.1145/3696630.3728542" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;int32 to int64&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; identifiers (to avoid running out of ids); &lt;/span&gt;&lt;a href="https://arxiv.org/abs/2501.06972" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;JUnit3 to JUnit4&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (for testing); and &lt;/span&gt;&lt;a href="https://arxiv.org/abs/2501.06972" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Joda-Time to java.time&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (a modern time library). However, AI model migration represents a whole new level of complexity that requires even more advanced methods for AI-assisted migration. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Translating a production-grade machine learning model from one framework to another, for example, from TensorFlow (TF) to JAX, is not a simple syntax update. It is a long-horizon task that requires untangling thousands of lines of code, managing complex states across multiple files, and preserving precise mathematical equivalence. Generic, single-agent coding assistants typically struggle under this weight — they frequently lose context over long workflows, hallucinate APIs, or fail to produce buildable code across an entire repository.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google’s AI and Infrastructure team has pioneered a new approach to this industry-wide problem. The result is 6x faster model migration, a milestone Sundar highlighted in the recent &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=11PBno-cJ1g&amp;amp;t=384s" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Next keynote&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. In this post, we share how we deployed specialized, multi-agent AI systems to migrate some of Google’s largest-scale production models from TF to JAX.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Accelerating the transition from TF to JAX&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For many teams at Google — and across the industry — the future of scalable machine learning is being built on JAX. Designed around a functional, stateless paradigm, JAX is heavily optimized for modern Tensor Processing Unit (TPU) infrastructure and XLA compilation, making it the bedrock of the modern AI stack.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Evolving to this future presents a monumental challenge. Thousands of production models are built on TensorFlow, a framework characterized by object-oriented, stateful layer initialization and static execution graphs. Manually migrating these models to JAX requires a fundamental rethinking of how layers interact, and how state is explicitly managed. Across large organizations, this type of migration alone represents hundreds (if not thousands) of software engineering (SWE) years — time better spent on researching new architectures and driving product innovation.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Overcoming this challenge with AI started as an ambitious experiment within Google’s AI and Infrastructure team, but has evolved into a repeatable blueprint for addressing complex engineering problems across the company.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Moving beyond single-agent coding&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our early experiments with agentic code translation showed promise for simple models. However, when faced with the realities of a Google-scale migration — complex, production-grade models spanning multiple files and thousands of lines of code — generic, single-agent setups struggled. They could not balance high-level structural rules with low-level execution details, resulting in a variety of failures, such as overwriting critical files or skipping necessary functionality. To overcome these common challenges inherent to enterprise migrations, we developed a highly specialized multi-agent architecture that consists of:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The Planner agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Using deterministic, compiler-based static analysis, the Planner maps out the codebase's entire dependency tree. It then works alongside other agents to break the migration down into a discrete, step-by-step plan, helping ensure the migration happens logically from the "leaf nodes" (layers without unmigrated dependencies) upward.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The Orchestrator agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This agent acts as the project manager. It dynamically groups plan steps into manageable chunks to keep the context window focused, injects the necessary domain knowledge, and handles failure recovery if a step doesn't build.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The Coder agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Built as a reasoning and acting agent, the Coder is the workhorse. Integrated directly into our internal IDE tools, it can read files, write code, run builds, and execute unit tests. Crucially, it operates in a "test-and-fix" loop, self-correcting until it produces a compilable, verifiable component in the target language.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
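&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The "leaf nodes upward" ordering that the Planner produces, and the chunking that the Orchestrator applies to it, can be illustrated with a small sketch. It uses a topological sort from Python's standard library as a stand-in for the compiler-based analysis. The layer names, the dependency map, and the chunk size are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each layer lists the layers it depends on.
DEPENDENCIES = {
    "ranking_model": {"tower", "loss_head"},
    "tower": {"embedding", "dense_block"},
    "loss_head": {"dense_block"},
    "dense_block": set(),
    "embedding": set(),
}


def migration_order(dependencies):
    """Order layers so that every dependency is migrated before its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


def chunk(steps, size):
    """Group plan steps into small batches to keep each working context focused."""
    return [steps[i:i + size] for i in range(0, len(steps), size)]


order = migration_order(DEPENDENCIES)
for batch in chunk(order, 2):
    print(batch)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Migrating in this order means that each step can be built and tested against dependencies that already exist in the target framework.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;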
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_-_System_diagram.max-1000x1000.jpg" alt="2 - System diagram"&gt;
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="013zu"&gt;Figure: Multi-agent AI system for complex code migrations. Process diagram describing the multi-agent system used to migrate legacy model code to JAX. Image generated with Gemini Nano Banana 2.&lt;/p&gt;&lt;/figcaption&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Scalable validation and dynamic Playbooks&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Generative AI models are only as good as the context they are provided. Because source and target architectures rarely map 1-to-1, we engineered a scalable, hierarchical system of Playbooks.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These Playbooks range from general repository instructions to highly specific "golden examples" distilled from successful manual migrations. By feeding the Orchestrator a client-specific Playbook (for instance, one tailored to YouTube's unique ranking model infrastructure), the system avoids generic hallucinations and strictly adheres to internal coding standards. This Playbook architecture is framework-agnostic, meaning it can be adapted to guide migrations between any two programming languages or frameworks.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Furthermore, we instituted rigorous quality metrics to ensure the generated code is actually production-ready:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Quantitative verification:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; For each unit of code, we verify correctness mathematically. In the case of the TF-to-JAX migration, the system utilizes algorithmic gradient ascent to find the maximum error between the original TF layer and the new JAX layer, mathematically verifying functional equivalence.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Qualitative evaluation:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; We also evaluate the migrated code against a set of qualitative standards. In the case of the TF-to-JAX migration, we deploy a blind-audit LLM Judge that scores the migrated code against a framework-agnostic architectural checklist, so that critical, domain-specific logic is completely captured.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
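&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of the quantitative check, the sketch below searches for the input where two implementations of the same layer disagree the most. It is a simplified stand-in rather than our production system: the two "layers" are scalar GELU functions (an exact form and the common tanh approximation) chosen only for the example, the gradient is estimated by finite differences, and the update uses only the sign of the gradient. A real setup would typically rely on automatic differentiation over full tensors. All function names here are hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
import math


def reference_layer(x):
    """Stand-in for the original layer: exact GELU."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def migrated_layer(x):
    """Stand-in for the migrated layer: tanh approximation of GELU."""
    c = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + math.tanh(c * (x + 0.044715 * x ** 3)))


def abs_error(x):
    return abs(reference_layer(x) - migrated_layer(x))


def max_error_by_gradient_ascent(start, steps=200, step_size=0.05, eps=1e-4):
    """Climb the error surface from one start point and return (input, error)."""
    x = start
    best_x, best_err = x, abs_error(x)
    for _ in range(steps):
        # Central finite difference as a cheap gradient estimate.
        grad = (abs_error(x + eps) - abs_error(x - eps)) / (2.0 * eps)
        if grad == 0.0:
            break
        # Sign-only step, so tiny error magnitudes do not stall the search.
        x += step_size * math.copysign(1.0, grad)
        err = abs_error(x)
        if err > best_err:
            best_x, best_err = x, err
        step_size *= 0.98
    return best_x, best_err


STARTS = [-3.0, -1.0, 0.5, 2.0]
worst_x, worst_err = max(
    (max_error_by_gradient_ascent(s) for s in STARTS), key=lambda r: r[1]
)
print(f"largest error found: {worst_err:.2e} at x = {worst_x:.3f}")
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Searching for the worst case, instead of sampling a few random inputs, is what turns a spot check into a bound that can be compared against a tolerance.&lt;/span&gt;&lt;/p&gt;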
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Redefining migration velocity&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By deploying this multi-agent system, we dramatically alter the economics of software migration.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In our evaluations on real-world, highly complex YouTube models (featuring thousands of lines of code, hundreds of layers, and deep metric dependencies), the multi-agent system achieved a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;6.4x to 8x speedup&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; over performing the migration manually. What traditionally took several SWE-months can now be reduced to only a few weeks of AI-assisted code generation, followed by expert human review.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The system effectively handles the boilerplate, identifies target idioms, maps the dependencies, and generates the unit tests, allowing engineers to act as reviewers and architects rather than manual translators.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Looking ahead into the AI-assisted era&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI is transforming the pace of technological innovation. Organizations that do not use AI to accelerate large-scale migrations will find it increasingly difficult to adopt the latest breakthroughs and maintain the security, reliability, and performance of their systems.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our work migrating machine learning implementations from one ML framework to another demonstrates that by combining deterministic static analysis, strict testing loops, and specialized multi-agent architectures, we can safely automate some of the most complex software engineering challenges in the industry. A detailed description of the process is published in our &lt;/span&gt;&lt;a href="https://arxiv.org/abs/2603.27296" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;technical paper&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;sub&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;This work is the result of collaboration across Google. We thank key contributors: Stoyan Nikolov, Niyati Parameswaran, Bernhard Konrad, Moritz Gronbach, Niket Kumar, Ann Yan, Varun Singh, Yaning Liang, Antoine Baudoux, Xevi Miró Bruix, Daniele Codecasa, Madhura Dudhgaonkar, Elian Dumitru, Alex Ivanov, Christopher Milne-O’Grady, Ahmed Omran, Ivan Petrychenko, Assaf Raman, Stefan Schnabl, Yurun Shen, Maxim Tabachnyk, Niranjan Tulpule, Amin Vahdat, and Jeff Zhou.&lt;/span&gt;&lt;/em&gt;&lt;/sub&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 06 May 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/6x-faster-migration-from-tensorflow-to-jax/</guid><category>AI &amp; Machine Learning</category><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_-_Hero_Image.max-600x600_4hJcig4.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Pioneering AI-assisted code migration: How Google achieved 6x faster migration from TensorFlow to JAX</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/1_-_Hero_Image.max-600x600_4hJcig4.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/6x-faster-migration-from-tensorflow-to-jax/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Jamie Rogers</name><title>Head of Product, Domain Applied Machine Learning, AI and Infrastructure</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Parthasarathy Ranganathan</name><title>Google Fellow &amp; Vice President, AI and Infrastructure</title><department></department><company></company></author></item></channel></rss>