Google Vertex AI
Google Vertex AI provides access to the Gemini family of models: advanced multimodal models supporting text, images, audio, and video. Gemini models stand out for their large context windows (up to 2M tokens in some versions), fast performance, and competitive pricing.
When to choose Google Vertex AI
Large documents and long context:
- Analysis of multi-page documents
- Processing long conversations and histories
- Context spanning entire knowledge bases
Multimedia processing:
- Image analysis and object detection
- Audio transcription and analysis
- Video processing
Cost optimization:
- Gemini Flash models offer the best price-to-performance ratio
- Lower costs for large query volumes
GCP integration:
- Already using Google Cloud Platform
- Need RAG features with Vertex AI Search
Requirements
- Google Cloud Platform (GCP) account
- GCP project with the Vertex AI API enabled
- Service Account Key (JSON file)
- Google Cloud Storage Bucket (for file processing)
Step 1: Google Cloud preparation
1. Create Service Account
- Go to Google Cloud Console
- Select a project or create a new one
- Go to IAM & Admin > Service Accounts
- Click Create Service Account
- Assign a name (e.g., aiproxy-vertex)
- Assign roles: Vertex AI User and Storage Object Admin (for the bucket)
- Click Create key > JSON and download the file
2. Enable APIs
- Go to APIs & Services > Library
- Enable the following APIs:
- Vertex AI API
- Cloud Storage API
3. Create Storage Bucket
- Go to Cloud Storage > Buckets
- Click Create bucket
- Assign a name (e.g., aiproxy-files)
- Select a region (e.g., us-central1)
- Click Create
The Storage Bucket is required so Gemini models can process files (images, audio, documents).
Step 2: AI Proxy Configuration
Example aiconfiguration.json
{
"ProviderConnections": {
"GoogleVertex": {
"Description": "Google Vertex AI Connection",
"Type": "Gemini",
"ProviderConfiguration": {
"ApiKey": "your-google-api-key-if-available",
"ServiceAccount": "{\"type\":\"service_account\",\"project_id\":\"your-project\",\"private_key_id\":\"...\",\"private_key\":\"-----BEGIN PRIVATE KEY-----\\n...\\n-----END PRIVATE KEY-----\\n\",\"client_email\":\"aiproxy-vertex@your-project.iam.gserviceaccount.com\",\"client_id\":\"...\",\"auth_uri\":\"https://accounts.google.com/o/oauth2/auth\",\"token_uri\":\"https://oauth2.googleapis.com/token\"}",
"ProjectId": "your-gcp-project-id",
"Region": "us-central1",
"BucketName": "aiproxy-files"
}
}
},
"ProviderModels": [
{
"ConnectionName": "GoogleVertex",
"Priority": 100,
"Name": "Gemini Flash",
"Description": "",
"TextModel": {
"ModelName": "gemini-2.0-flash-exp"
},
"ImageModel": {
"ModelName": "imagen-3.0-fast-generate-001"
},
"AudioModel": {
"ModelName": "gemini-2.0-flash-exp"
},
"EmbeddingModel": {
"ModelName": "text-embedding-004"
}
}
],
"MethodTypesConfiguration": {
"ConciergePrompt": [ "Gemini Flash" ],
"ConciergeExecuteTool": [ "Gemini Flash" ]
}
}
- ServiceAccount - paste the entire contents of the downloaded JSON file as a string (with escaped quotes)
- ProjectId - project ID from Google Cloud
- Region - region where you have Vertex AI enabled (e.g., us-central1, europe-west1)
- BucketName - name of the created Storage Bucket
Recommended models are from the Gemini Flash family (e.g., gemini-2.0-flash-exp or gemini-2.0-flash-lite) - they are multimodal (text, images, audio) and performant enough for most use cases.
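Escaping the Service Account key by hand is error-prone. A minimal Python sketch of how to produce the escaped string (the inline dict is an abbreviated placeholder - a real key file contains more fields, including the private key):

```python
import json

# Abbreviated sample of a downloaded service-account key.
# Placeholder values only - the real file has more fields.
service_account = {
    "type": "service_account",
    "project_id": "your-project",
    "client_email": "aiproxy-vertex@your-project.iam.gserviceaccount.com",
}

# Serializing the already-serialized key yields the quoted, escaped
# string that goes after "ServiceAccount": in aiconfiguration.json.
escaped = json.dumps(json.dumps(service_account))
print(escaped)
```

In practice, load the real downloaded key with `json.load(open("key.json"))` instead of the inline sample, then paste the printed value into the configuration.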
Example docker-compose.yml
name: aiproxy_containers
services:
ai-proxy:
image: webconbps/aiproxy:1.0.0.235
container_name: ai-proxy
restart: unless-stopped
ports:
- "5298:8080"
- "7033:8081"
environment:
- ASPNETCORE_ENVIRONMENT=Production
- AppConfiguration__SelfHosted__Certificate__Path=/app/https/certificate.pem
- Logging__LogLevel__Default=Information
- Logging__LogLevel__Microsoft=Warning
volumes:
- ./certificates/certificate.pem:/app/https/certificate.pem:ro
- ./aiconfiguration.json:/app/aiconfiguration.json:ro
Step 3: Startup
# Make sure you have prepared files:
# - ./certificates/certificate.pem
# - ./aiconfiguration.json (with full Service Account JSON contents)
# Run container
docker-compose up -d
# Check logs
docker-compose logs -f ai-proxy
Troubleshooting
Error: Permission denied / 403 Forbidden
Causes:
- Service Account doesn't have required permissions
- APIs are not enabled in the project
Solution:
# Check if Service Account has roles:
# - Vertex AI User
# - Storage Object Admin
# Check if APIs are enabled:
# - Vertex AI API
# - Cloud Storage API
# Restart container
docker-compose restart ai-proxy
Error: Invalid Service Account JSON
Causes:
- Invalid JSON format in ServiceAccount
- Quotes are not escaped
Solution:
# ServiceAccount must be a JSON string with escaped quotes
# Example of correct format:
# "ServiceAccount": "{\"type\":\"service_account\",\"project_id\":\"my-project\"...}"
# Escape the string locally (e.g., with a short script) - the file
# contains a private key, so avoid pasting it into online tools
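The escaping can be sanity-checked programmatically. A minimal Python sketch with an inline sample value (a local check, not the proxy's own validation logic):

```python
import json

# A fragment of aiconfiguration.json as raw text (sample values).
config_text = r'''
{
  "ProviderConfiguration": {
    "ServiceAccount": "{\"type\":\"service_account\",\"project_id\":\"my-project\"}"
  }
}
'''

config = json.loads(config_text)
# The ServiceAccount value must itself parse as JSON once unescaped;
# if json.loads raises here, the quotes were not escaped correctly.
sa = json.loads(config["ProviderConfiguration"]["ServiceAccount"])
assert sa["type"] == "service_account"
```

Running the same two `json.loads` calls against your real aiconfiguration.json reproduces (or rules out) this error before the container starts.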
Error: Bucket not found
Causes:
- Bucket with the given name doesn't exist
- Service Account doesn't have access to bucket
Solution:
# Check if bucket exists in Cloud Storage
# Make sure Service Account has Storage Object Admin role
# Check if BucketName in configuration is correct
Popular Gemini models
Recommended models for use with AI Proxy:
Text/multimodal models:
- gemini-2.0-flash-exp - newest, fast, multimodal (text, images, audio)
- gemini-2.0-flash-lite - lighter version, still multimodal
- gemini-1.5-flash - proven version, multimodal
- gemini-1.5-pro - larger model, more capabilities
Embedding models:
- text-embedding-004 - newest embedding model
- text-multilingual-embedding-002 - multi-language support
Image models:
- imagen-3.0-fast-generate-001 - fast image generation
- imagen-3.0-generate-001 - higher quality
Models from the gemini-flash family are multimodal - they support text, images, and audio in one model, which simplifies configuration.