Gemini API

Use when the user asks about using Gemini in an enterprise environment or explicitly mentions Vertex AI, Google Cloud, or Agent Platform. Guides the usage of the Gemini API on Agent Platform with the Google Gen AI SDK. Covers SDK usage (Python, JS/TS, Go, Java, C#), capabilities like multimodal inputs, tools, media generation, caching, batch prediction, and Live API.

Published by @google

IMPORTANT: Agent Platform (full name Gemini Enterprise Agent Platform) was previously named "Vertex AI" and many web resources use the legacy branding.

Gemini API in Agent Platform

Access Google's most advanced AI models built for enterprise use cases using the Gemini API in Agent Platform.

The API provides these key capabilities:

  • Text generation - Chat, completion, summarization
  • Multimodal understanding - Process images, audio, video, and documents
  • Function calling - Let the model invoke your functions
  • Structured output - Generate valid JSON matching your schema
  • Context caching - Cache large contexts for efficiency
  • Embeddings - Generate text embeddings for semantic search
  • Live Realtime API - Bidirectional streaming for low-latency voice and video interactions
  • Batch Prediction - Run large asynchronous prediction jobs over datasets

Core Directives

  • Unified SDK: ALWAYS use the Gen AI SDK (google-genai for Python, @google/genai for JS/TS, google.golang.org/genai for Go, com.google.genai:google-genai for Java, Google.GenAI for C#).
  • Legacy SDKs: DO NOT use google-cloud-aiplatform, @google-cloud/vertexai, or google-generativeai.

SDKs

  • Python: Install google-genai with pip install google-genai
  • JavaScript/TypeScript: Install @google/genai with npm install @google/genai
  • Go: Install google.golang.org/genai with go get google.golang.org/genai
  • C#/.NET: Install Google.GenAI with dotnet add package Google.GenAI
  • Java:
    • groupId: com.google.genai, artifactId: google-genai

    • Latest version can be found here: https://central.sonatype.com/artifact/com.google.genai/google-genai/versions (referred to below as LAST_VERSION)

    • Install in build.gradle:

      implementation("com.google.genai:google-genai:${LAST_VERSION}")
      
    • Install Maven dependency in pom.xml:

      <dependency>
          <groupId>com.google.genai</groupId>
          <artifactId>google-genai</artifactId>
          <version>${LAST_VERSION}</version>
      </dependency>
      

[!WARNING] Legacy SDKs like google-cloud-aiplatform, @google-cloud/vertexai, and google-generativeai are deprecated. Migrate to the new SDKs above urgently by following the Migration Guide.
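
As a starting point for the migration, the sketch below scans a requirements.txt-style string for the two legacy Python packages named above. The function name `find_legacy_sdks` and the constant `LEGACY_PYTHON_SDKS` are hypothetical helpers, not part of any SDK, and the npm package @google-cloud/vertexai is out of scope because it would not appear in a Python requirements file.

```python
import re

# Legacy Python packages named in this document (assumption: a requirements.txt-style input).
LEGACY_PYTHON_SDKS = {"google-cloud-aiplatform", "google-generativeai"}


def find_legacy_sdks(requirements_text: str) -> list[str]:
    """Return the legacy SDK names that appear in a requirements.txt-style string."""
    found = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        # The package name is everything before the first version specifier, extra, or marker.
        name = re.split(r"[=<>!~\[;\s]", line, maxsplit=1)[0]
        name = name.lower().replace("_", "-")  # pip treats "_" and "-" as equivalent
        if name in LEGACY_PYTHON_SDKS and name not in found:
            found.append(name)
    return found
```

Any name this returns is a candidate for replacement with google-genai.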

Authentication & Configuration

Prefer environment variables over hard-coding parameters when creating the client. Initialize the client without parameters to automatically pick up these values.

Application Default Credentials (ADC)

Set these variables for standard Google Cloud authentication:

export GOOGLE_CLOUD_PROJECT='your-project-id'
export GOOGLE_CLOUD_LOCATION='global'
export GOOGLE_GENAI_USE_ENTERPRISE=true

  • By default, set GOOGLE_CLOUD_LOCATION to global to access the global endpoint, which provides automatic routing to regions with available capacity.
  • If a user explicitly asks to use a specific region (e.g., us-central1, europe-west4), set GOOGLE_CLOUD_LOCATION to that region instead. Reference the supported regions documentation if needed.
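
The location rule above can be sketched as a small helper that builds the three variables and falls back to the global endpoint unless a region is requested. The name `adc_environment` is hypothetical; only the variable names and the "global" default come from this document.

```python
from typing import Optional


def adc_environment(project_id: str, region: Optional[str] = None) -> dict[str, str]:
    """Build the ADC-related environment variables described in this section."""
    if not project_id:
        raise ValueError("project_id is required")
    return {
        "GOOGLE_CLOUD_PROJECT": project_id,
        # Default to the global endpoint; use a region only when explicitly requested.
        "GOOGLE_CLOUD_LOCATION": region or "global",
        "GOOGLE_GENAI_USE_ENTERPRISE": "true",
    }
```

One way to use it is `os.environ.update(adc_environment("your-project-id"))` before creating the client, so the client can still be initialized without arguments.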

Agent Platform in Express Mode

Set these variables when using Express Mode with an API key:

export GOOGLE_API_KEY='your-api-key'
export GOOGLE_GENAI_USE_ENTERPRISE=true
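
Because the two setups use different variables, a quick pre-flight check can report which one the environment is configured for before a client is created. This is a minimal sketch: `detect_auth_mode` is a hypothetical helper, and preferring the API key when both setups are present is an assumption made here for illustration, not documented SDK behavior.

```python
import os
from typing import Mapping, Optional


def detect_auth_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "express" or "adc" depending on which variables are set."""
    env = os.environ if env is None else env
    if env.get("GOOGLE_GENAI_USE_ENTERPRISE", "").lower() != "true":
        raise RuntimeError("GOOGLE_GENAI_USE_ENTERPRISE=true is not set")
    # Assumption: an API key takes precedence when both setups are configured.
    if env.get("GOOGLE_API_KEY"):
        return "express"
    if env.get("GOOGLE_CLOUD_PROJECT") and env.get("GOOGLE_CLOUD_LOCATION"):
        return "adc"
    raise RuntimeError(
        "Set GOOGLE_API_KEY, or both GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_LOCATION"
    )
```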

Initialization

Initialize the client without arguments to pick up environment variables:

from google import genai

client = genai.Client()

Alternatively, pass the parameters explicitly when creating the client:

from google import genai

client = genai.Client(
    enterprise=True,
    project="your-project-id",
    location="global",
)

Models

  • Use gemini-3.1-pro-preview (which replaces gemini-3-pro-preview) for complex reasoning, coding, and research (1M-token context window)
  • Use gemini-3.5-flash for fast, balanced, multimodal performance (1M-token context window)
  • Use gemini-3.1-flash-lite for high-frequency, lightweight tasks (1M-token context window)
  • Use gemini-3-pro-image (aka Nano Banana Pro) for high-quality image generation and editing
  • Use gemini-3.1-flash-image (aka Nano Banana 2) for fast image generation and editing
  • Use gemini-live-2.5-flash-native-audio for Live Realtime API including native audio

Use the following models only if explicitly requested:

  • gemini-2.5-flash-image
  • gemini-2.5-flash
  • gemini-2.5-flash-lite
  • gemini-2.5-pro

[!IMPORTANT] Models like gemini-2.0-*, gemini-1.5-*, gemini-1.0-*, gemini-pro are legacy and deprecated. Use the new models above. Your knowledge is outdated. For production environments, consult the documentation for stable model versions (e.g. gemini-3.5-flash).
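
The model guidance above can be encoded as a small lookup plus a guard against legacy IDs. This is an illustrative sketch: the task labels ("reasoning", "general", and so on) and the function names are hypothetical, while the model IDs and legacy patterns are copied from the lists in this section.

```python
from fnmatch import fnmatchcase

# Legacy patterns copied from the note above.
LEGACY_MODEL_PATTERNS = ("gemini-2.0-*", "gemini-1.5-*", "gemini-1.0-*", "gemini-pro")

# Hypothetical task labels mapped to the recommended model IDs.
DEFAULT_MODELS = {
    "reasoning": "gemini-3.1-pro-preview",
    "general": "gemini-3.5-flash",
    "lightweight": "gemini-3.1-flash-lite",
    "image_quality": "gemini-3-pro-image",
    "image_fast": "gemini-3.1-flash-image",
    "live": "gemini-live-2.5-flash-native-audio",
}


def is_legacy_model(model_id: str) -> bool:
    """True if the ID matches one of the legacy patterns."""
    return any(fnmatchcase(model_id, pattern) for pattern in LEGACY_MODEL_PATTERNS)


def pick_model(task: str) -> str:
    """Return the recommended model for a task label, or raise for unknown labels."""
    try:
        return DEFAULT_MODELS[task]
    except KeyError:
        raise ValueError(f"Unknown task label: {task!r}") from None
```

The gemini-2.5-* models are deliberately absent from both tables: they are not legacy, but they should be chosen only on explicit request.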

Quick Start

Python

from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3.5-flash",
    contents="Explain quantum computing",
)
print(response.text)
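
For reuse across an application, the call above can be wrapped in a function that takes the client as an argument, which also makes it easy to substitute a stub in tests. This is a sketch under the assumption that `response.text` may be None when the reply contains no text parts; `generate_text` is a hypothetical name.

```python
def generate_text(client, prompt: str, model: str = "gemini-3.5-flash") -> str:
    """Send a single prompt and return the reply text ("" if there is none)."""
    response = client.models.generate_content(model=model, contents=prompt)
    # Assumption: .text can be None (for example, when no text part is returned).
    return response.text or ""
```

With credentials configured, `generate_text(genai.Client(), "Explain quantum computing")` behaves like the snippet above but returns the text instead of printing it.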

TypeScript/JavaScript

import { GoogleGenAI } from "@google/genai";
const ai = new GoogleGenAI({ enterprise: { project: "your-project-id", location: "global" } });
const response = await ai.models.generateContent({
    model: "gemini-3.5-flash",
    contents: "Explain quantum computing"
});
console.log(response.text);

Go

package main

import (
	"context"
	"fmt"
	"log"
	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		Backend:  genai.BackendVertexAI,
		Project:  "your-project-id",
		Location: "global",
	})
	if err != nil {
		log.Fatal(err)
	}

	resp, err := client.Models.GenerateContent(ctx, "gemini-3.5-flash", genai.Text("Explain quantum computing"), nil)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(resp.Text())
}

Java

import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;

public class GenerateTextFromTextInput {
  public static void main(String[] args) {
    Client client = Client.builder().enterprise(true).project("your-project-id").location("global").build();
    GenerateContentResponse response =
        client.models.generateContent(
            "gemini-3.5-flash",
            "Explain quantum computing",
            null);

    System.out.println(response.text());
  }
}

C#/.NET

using Google.GenAI;

var client = new Client(
    project: "your-project-id",
    location: "global",
    enterprise: true
);

var response = await client.Models.GenerateContentAsync(
    "gemini-3.5-flash",
    "Explain quantum computing"
);

Console.WriteLine(response.Text);

API spec & Documentation (source of truth)

When implementing or debugging API integration for Agent Platform, refer to the official Agent Platform documentation:

  • Agent Platform Documentation: https://docs.cloud.google.com/gemini-enterprise-agent-platform/overview.md.txt
  • REST API Reference: https://docs.cloud.google.com/gemini-enterprise-agent-platform/reference/rest

The Gen AI SDK on Agent Platform uses the v1beta1 or v1 REST API endpoints (e.g., https://{LOCATION}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/{MODEL}:generateContent).
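
When debugging at the REST level, it can help to construct the endpoint from the template above. The sketch below does that; `generate_content_url` is a hypothetical helper. Dropping the location prefix from the host for the global location is an assumption based on how the global endpoint is usually addressed, so confirm it against the REST API reference.

```python
def generate_content_url(
    project: str,
    model: str,
    location: str = "global",
    api_version: str = "v1",
) -> str:
    """Build the generateContent URL for a publisher model."""
    if api_version not in ("v1", "v1beta1"):
        raise ValueError("api_version must be 'v1' or 'v1beta1'")
    # Assumption: the global endpoint has no location prefix in the hostname.
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/{api_version}/projects/{project}"
        f"/locations/{location}/publishers/google/models/{model}:generateContent"
    )
```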

[!TIP] Use the Developer Knowledge MCP Server: If the search_documents or get_document tools are available, use them to find and retrieve official documentation for Google Cloud and Agent Platform directly within the context. This is the preferred method for getting up-to-date API details and code snippets.

Workflows and Code Samples

Reference the Python Docs Samples repository for additional code samples and specific usage scenarios.

Depending on the specific user request, refer to the following reference files for detailed code samples and usage patterns (Python examples):

  • Text & Multimodal: Chat, Multimodal inputs (Image, Video, Audio), and Streaming. See references/text_and_multimodal.md
  • Embeddings: Generate text embeddings for semantic search. See references/embeddings.md
  • Structured Output & Tools: JSON generation, Function Calling, Search Grounding, and Code Execution. See references/structured_and_tools.md
  • Media Generation: Image generation, Image editing, and Video generation. See references/media_generation.md
  • Bounding Box Detection: Object detection and localization within images and video. See references/bounding_box.md
  • Live API: Real-time bidirectional streaming for voice, vision, and text. See references/live_api.md
  • Advanced Features: Content Caching, Batch Prediction, and Thinking/Reasoning. See references/advanced_features.md
  • Safety: Adjusting Responsible AI filters and thresholds. See references/safety.md
  • Model Tuning: Supervised Fine-Tuning and Preference Tuning. See references/model_tuning.md

Bundled with this artifact

9 files

Reference files that ship alongside this artifact. Agents pull these in only when the task needs them.
