Version: 2026 R1

Google Vertex AI

Google Vertex AI provides access to the Gemini family of models: advanced multimodal models that handle text, images, audio, and video. Gemini models offer large context windows (up to 2M tokens in some versions), fast performance, and competitive pricing.

When to choose Google Vertex AI

Large documents and long context:

  • Analysis of multi-page documents
  • Processing long conversations and histories
  • Context spanning entire knowledge bases

Multimedia processing:

  • Image analysis and object detection
  • Audio transcription and analysis
  • Video processing

Cost optimization:

  • Gemini Flash models offer the best price-to-performance ratio
  • Lower costs for large query volumes

GCP integration:

  • You already use Google Cloud Platform
  • You need RAG features with Vertex AI Search

Requirements

  • Google Cloud Platform (GCP) account
  • GCP project with the Vertex AI API enabled
  • Service Account Key (JSON file)
  • Google Cloud Storage Bucket (for file processing)

Step 1: Google Cloud preparation

1. Create Service Account

  1. Go to Google Cloud Console
  2. Select a project or create a new one
  3. Go to IAM & Admin > Service Accounts
  4. Click Create Service Account
  5. Assign a name (e.g., aiproxy-vertex)
  6. Assign roles:
    • Vertex AI User
    • Storage Object Admin (for bucket)
  7. Open the new service account, go to Keys > Add key > Create new key, select JSON, and download the file

2. Enable APIs

  1. Go to APIs & Services > Library
  2. Enable the following APIs:
    • Vertex AI API
    • Cloud Storage API

3. Create Storage Bucket

  1. Go to Cloud Storage > Buckets
  2. Click Create bucket
  3. Assign a name (e.g., aiproxy-files)
  4. Select region (e.g., us-central1)
  5. Click Create

Bucket is required

A Storage Bucket is needed so that Gemini models can process files (images, audio, documents).

Step 2: AI Proxy Configuration

Example aiconfiguration.json

{
  "ProviderConnections": {
    "GoogleVertex": {
      "Description": "Google Vertex AI Connection",
      "Type": "Gemini",
      "ProviderConfiguration": {
        "ApiKey": "your-google-api-key-if-available",
        "ServiceAccount": "{\"type\":\"service_account\",\"project_id\":\"your-project\",\"private_key_id\":\"...\",\"private_key\":\"-----BEGIN PRIVATE KEY-----\\n...\\n-----END PRIVATE KEY-----\\n\",\"client_email\":\"aiproxy-vertex@your-project.iam.gserviceaccount.com\",\"client_id\":\"...\",\"auth_uri\":\"https://accounts.google.com/o/oauth2/auth\",\"token_uri\":\"https://oauth2.googleapis.com/token\"}",
        "ProjectId": "your-gcp-project-id",
        "Region": "us-central1",
        "BucketName": "aiproxy-files"
      }
    }
  },
  "ProviderModels": [
    {
      "ConnectionName": "GoogleVertex",
      "Priority": 100,
      "Name": "Gemini Flash",
      "Description": "",
      "TextModel": {
        "ModelName": "gemini-2.0-flash-exp"
      },
      "ImageModel": {
        "ModelName": "imagen-3.0-fast-generate-001"
      },
      "AudioModel": {
        "ModelName": "gemini-2.0-flash-exp"
      },
      "EmbeddingModel": {
        "ModelName": "text-embedding-004"
      }
    }
  ],
  "MethodTypesConfiguration": {
    "ConciergePrompt": [ "Gemini Flash" ],
    "ConciergeExecuteTool": [ "Gemini Flash" ]
  }
}

Important

  • ServiceAccount - paste the entire contents of the downloaded JSON file as a string (with escaped quotes)
  • ProjectId - project ID from Google Cloud
  • Region - region where you have Vertex AI enabled (e.g., us-central1, europe-west1)
  • BucketName - name of the created Storage Bucket
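
Escaping the key file by hand is error-prone, so the ServiceAccount value can also be produced with a short script. The following is a minimal sketch, assuming Python 3 is available; the function name embed_service_account and the placeholder data are hypothetical and only stand in for the real aiconfiguration.json and the downloaded key file.

```python
import json


def embed_service_account(config: dict, key: dict, connection: str = "GoogleVertex") -> dict:
    """Return a copy of config with the key stored as a JSON string."""
    updated = json.loads(json.dumps(config))  # simple deep copy of JSON data
    provider = updated["ProviderConnections"][connection]["ProviderConfiguration"]
    # Serialize the key to compact JSON text. When the whole configuration is
    # serialized again, the quotes and backslashes inside it are escaped.
    provider["ServiceAccount"] = json.dumps(key, separators=(",", ":"))
    return updated


# Placeholder data (hypothetical values, not a real key).
config = {
    "ProviderConnections": {
        "GoogleVertex": {"ProviderConfiguration": {"ServiceAccount": ""}}
    }
}
key = {
    "type": "service_account",
    "project_id": "your-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
}

result = embed_service_account(config, key)
print(json.dumps(result, indent=2))
```

With real files, config and key would be loaded with json.load from aiconfiguration.json and the downloaded key file, and the result written back with json.dump. Running the script locally also keeps the private key off third-party services.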

Gemini Models

The recommended models come from the Gemini Flash and Flash-Lite lines (for example, gemini-2.0-flash-exp or gemini-2.0-flash-lite). They are multimodal (text, images, audio) and fast enough for most use cases.

Example docker-compose.yml

name: aiproxy_containers
services:
  ai-proxy:
    image: webconbps/aiproxy:1.0.0.235
    container_name: ai-proxy
    restart: unless-stopped
    ports:
      - "5298:8080"
      - "7033:8081"
    environment:
      - ASPNETCORE_ENVIRONMENT=Production
      - AppConfiguration__SelfHosted__Certificate__Path=/app/https/certificate.pem
      - Logging__LogLevel__Default=Information
      - Logging__LogLevel__Microsoft=Warning
    volumes:
      - ./certificates/certificate.pem:/app/https/certificate.pem:ro
      - ./aiconfiguration.json:/app/aiconfiguration.json:ro

Step 3: Startup

# Make sure you have prepared files:
# - ./certificates/certificate.pem
# - ./aiconfiguration.json (with full Service Account JSON contents)

# Run container
docker-compose up -d

# Check logs
docker-compose logs -f ai-proxy
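
Before starting the container, it can help to confirm that the names used in aiconfiguration.json line up. The sketch below assumes, based only on the example configuration, that each ConnectionName must match a key in ProviderConnections and that each entry in MethodTypesConfiguration must match the Name of a model in ProviderModels. The function find_reference_problems is a hypothetical helper, not part of AI Proxy.

```python
import json


def find_reference_problems(config: dict) -> list[str]:
    """List names that are referenced in the configuration but never defined."""
    problems = []
    connections = set(config.get("ProviderConnections", {}))
    model_names = set()
    for model in config.get("ProviderModels", []):
        model_names.add(model.get("Name"))
        if model.get("ConnectionName") not in connections:
            problems.append(
                f"model {model.get('Name')!r} uses unknown connection "
                f"{model.get('ConnectionName')!r}"
            )
    for method, names in config.get("MethodTypesConfiguration", {}).items():
        for name in names:
            if name not in model_names:
                problems.append(f"{method} refers to unknown model {name!r}")
    return problems


# Trimmed copy of the example configuration, parsed from text as the file would be.
sample = json.loads("""
{
  "ProviderConnections": {"GoogleVertex": {"Type": "Gemini"}},
  "ProviderModels": [{"ConnectionName": "GoogleVertex", "Name": "Gemini Flash"}],
  "MethodTypesConfiguration": {
    "ConciergePrompt": ["Gemini Flash"],
    "ConciergeExecuteTool": ["Gemini Flash"]
  }
}
""")
print(find_reference_problems(sample))  # prints []
```

A successful json.loads call also confirms that the file is syntactically valid JSON, which rules out one common cause of startup failures.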

Troubleshooting

Error: Permission denied / 403 Forbidden

Causes:

  • Service Account doesn't have required permissions
  • APIs are not enabled in the project

Solution:

# Check if Service Account has roles:
# - Vertex AI User
# - Storage Object Admin

# Check if APIs are enabled:
# - Vertex AI API
# - Cloud Storage API

# Restart container
docker-compose restart ai-proxy

Error: Invalid Service Account JSON

Causes:

  • Invalid JSON format in ServiceAccount
  • Quotes are not escaped

Solution:

# ServiceAccount must be a JSON string with escaped quotes
# Example of correct format:
# "ServiceAccount": "{\"type\":\"service_account\",\"project_id\":\"my-project\"...}"

# Escape the JSON string with a local tool rather than an online service, because the file contains a private key
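
To narrow down which of the causes above applies, the ServiceAccount value can be inspected with a small script. This is a sketch under stated assumptions: diagnose_service_account is a hypothetical helper, and the list of expected fields is taken from the example key shown in this guide, not from an AI Proxy specification.

```python
import json

# Fields that appear in the example key in this guide (an assumption, not a spec).
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def diagnose_service_account(value) -> list[str]:
    """Return a list of likely problems with a ServiceAccount value."""
    if not isinstance(value, str):
        return ["ServiceAccount must be a string, not a nested JSON object"]
    try:
        key = json.loads(value)
    except json.JSONDecodeError as exc:
        return [f"ServiceAccount is not valid JSON: {exc.msg} at position {exc.pos}"]
    if not isinstance(key, dict):
        return ["ServiceAccount must decode to a JSON object"]
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if name not in key]
    # After decoding, the PEM text should contain real line breaks. A literal
    # backslash followed by "n" suggests the value was escaped once too often.
    if "\\n" in str(key.get("private_key", "")):
        problems.append("private_key contains a literal backslash-n sequence")
    return problems


# Placeholder key (hypothetical values), serialized the way it is stored in the config.
value = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "aiproxy-vertex@my-project.iam.gserviceaccount.com",
    "token_uri": "https://oauth2.googleapis.com/token",
})
print(diagnose_service_account(value))
print(diagnose_service_account('{"type": "service_account"'))
```

The value passed in is what json.load returns for the ServiceAccount field after reading aiconfiguration.json, so the outer layer of escaping has already been removed by the parser.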

Error: Bucket not found

Causes:

  • Bucket with the given name doesn't exist
  • Service Account doesn't have access to bucket

Solution:

# Check if bucket exists in Cloud Storage
# Make sure Service Account has Storage Object Admin role
# Check if BucketName in configuration is correct
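
A quick format check can catch typos in BucketName before looking at permissions. The sketch below is a simplified approximation of the Cloud Storage naming rules and covers only the common case of names up to 63 characters; it assumes that BucketName expects the bare bucket name, as in the example configuration. The helper bucket_name_problems is hypothetical.

```python
import re

# Common-case Cloud Storage rule: 3-63 characters made of lowercase letters,
# digits, dashes, underscores, and dots, starting and ending with a letter or digit.
BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def bucket_name_problems(name: str) -> list[str]:
    """Return a list of likely problems with a BucketName value."""
    problems = []
    if name != name.strip():
        problems.append("leading or trailing whitespace")
    candidate = name.strip()
    if candidate.startswith("gs://") or "/" in candidate:
        problems.append("looks like a URI or path; use the bare bucket name")
    elif not BUCKET_NAME.fullmatch(candidate):
        problems.append("does not match the Cloud Storage naming rules")
    if candidate.startswith("goog") or "google" in candidate:
        problems.append("names starting with 'goog' or containing 'google' are not allowed")
    return problems


for name in ("aiproxy-files", "gs://aiproxy-files", "AIProxy-Files"):
    print(name, bucket_name_problems(name))
```

A name that passes this check can still be wrong, for example when it refers to a bucket in another project, so the existence and role checks listed above still apply.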

Recommended models for use with AI Proxy:

Text/multimodal models:

  • gemini-2.0-flash-exp - newest, fast, multimodal (text, images, audio)
  • gemini-2.0-flash-lite - lighter version, still multimodal
  • gemini-1.5-flash - proven version, multimodal
  • gemini-1.5-pro - larger model, more capabilities

Embedding models:

  • text-embedding-004 - newest embedding model
  • text-multilingual-embedding-002 - multi-language support

Image models:

  • imagen-3.0-fast-generate-001 - fast image generation
  • imagen-3.0-generate-001 - higher quality

Multimodal models

Models from the gemini-flash family are multimodal: they support text, images, and audio in one model, which simplifies configuration.