Switch between cloud providers and models to avoid rate limits and optimize costs
AIRun extends Claude Code with multi-provider support. Switch between your Claude subscription, AWS Bedrock, Google Vertex AI, Azure, local models, and more — all using the same ai command.
The most common reason to switch providers: Claude Pro has rate limits that can block you for hours:
```shell
# Working with Claude Pro, hit rate limit
claude
# "Rate limit exceeded. Try again in 4 hours 23 minutes."

# Immediately continue with AWS
ai --aws --resume
```
You can continue your exact conversation on a different provider without losing context.
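If you switch providers often, the fail-then-resume pattern can be wrapped in a small shell helper. A minimal sketch, assuming a rate-limited (or otherwise failed) run exits non-zero; AIRun's actual exit codes are not documented here, so treat this as illustrative:

```shell
# ai_with_fallback: run the given ai invocation, and on failure resume the
# same conversation on AWS Bedrock. Assumes a failed run exits non-zero,
# which is an assumption about the CLI, not documented behavior.
ai_with_fallback() {
  "$@" || ai --aws --resume
}

# Hypothetical usage:
#   ai_with_fallback ai --sonnet
```

Because --resume preserves the conversation across providers, the fallback continues the same session rather than starting over.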
```shell
# Use cheap Haiku for simple analysis
ai --aws --haiku analyze-logs.md

# Use expensive Opus for complex reasoning
ai --aws --opus review-architecture.md

# Use free local model for experimentation
ai --ollama --model qwen3-coder
```
The --resume flag picks up your previous conversation on a different provider:
```shell
# Start with Claude Pro
ai
# [work for a while...]
# "Rate limit exceeded"

# Resume on AWS with same context
ai --aws --resume
```
What --resume does:
Loads the most recent conversation session
Continues from where you left off
Preserves all context and history
Works across any provider switch
Use cases:
Rate limit recovery
```shell
ai                 # Hit rate limit
ai --aws --resume  # Continue immediately
```
Cost optimization mid-task
```shell
ai --opus            # Start with powerful model
ai --haiku --resume  # Switch to cheap model for simple tasks
ai --opus --resume   # Back to powerful for complex work
```
Local to cloud escalation
```shell
ai --ollama               # Try local first
ai --aws --opus --resume  # Escalate to cloud for hard problem
```
Testing providers
```shell
ai --aws --resume     # Try AWS implementation
ai --vertex --resume  # Compare Vertex response
ai --azure --resume   # Test Azure behavior
```
Use ai-sessions to view your active conversation sessions and see which one would be resumed.
If you frequently use a specific provider, save it as your default:
```shell
# Set AWS + Opus as default
ai --aws --opus --set-default

# Now 'ai' with no flags uses AWS + Opus
ai
ai --resume

# Clear saved default
ai --clear-default

# Back to auto-detection (Claude Pro if logged in)
ai
```
Saved defaults are stored in ~/.ai-runner/default.conf.

Flag precedence (highest to lowest):
1. Explicit flags on the command line
2. Saved default (from --set-default)
3. Auto-detection (Claude Pro if logged in)
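That resolution order (explicit flags, then the saved default, then auto-detection) can be sketched as a small shell lookup. This is illustrative only, not AIRun's actual implementation, and the auto-detect label is assumed:

```shell
# Illustrative sketch of flag-precedence resolution; not AIRun's real code.
DEFAULT_CONF="$HOME/.ai-runner/default.conf"

resolve_provider() {
  # $1: provider flag passed on the command line (may be empty)
  if [ -n "$1" ]; then
    echo "$1"                          # 1. explicit flag wins
  elif [ -f "$DEFAULT_CONF" ]; then
    cat "$DEFAULT_CONF"                # 2. saved default from --set-default
  else
    echo "claude-pro"                  # 3. auto-detection (label assumed)
  fi
}
```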
```shell
ai --aws --opus       # AWS Bedrock + Opus
ai --vertex --sonnet  # Vertex AI + Sonnet
ai --azure --haiku    # Azure + Haiku
```
Use --haiku for cost savings on simple tasks. Use --opus for complex architecture decisions or challenging debugging. Use --sonnet (default) for everyday coding.
```shell
# Claude Pro with teams
ai --team

# AWS with teams
ai --aws --opus --team

# Local Ollama with teams
ai --ollama --team
```
Teams coordination uses Claude Code’s internal task list and mailbox — it’s provider-independent. Token usage scales with team size (5 teammates ≈ 5× tokens).
Agent teams only work in interactive mode — not supported in shebang/piped script modes.
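Since team token usage scales roughly linearly with team size, a back-of-envelope budget is just multiplication. A hypothetical helper (the budgeting logic is illustrative, not an AIRun feature):

```shell
# estimate_team_tokens SOLO_TOKENS TEAMMATES
# Rough estimate from the "5 teammates = ~5x tokens" rule of thumb above.
estimate_team_tokens() {
  echo $(( $1 * $2 ))
}

# e.g. a task that costs roughly 20000 tokens solo, run with 5 teammates:
#   estimate_team_tokens 20000 5   # prints 100000
```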
```shell
# 1. Start working with Claude Pro
ai

# 2. Hit rate limit mid-task
# "Rate limit exceeded. Try again in 4 hours 23 minutes."

# 3. Immediately switch to API
ai --aws --resume

# 4. Continue working
# [Complete the task...]

# 5. Later, switch back to Pro for free tier
ai --pro --resume
```
```shell
# Use free local model for exploration
ai --ollama
# [Prototype and experiment...]

# Switch to cloud for production-quality code
ai --aws --opus --resume
# [Final implementation...]

# Use cheap Haiku for documentation
ai --aws --haiku --resume
# [Generate docs...]
```
```
Active AIRun Sessions:

1. AWS Bedrock + Sonnet 4.6
   Started: 2 hours ago
   Location: ~/projects/backend/
   Status: Active

2. Claude Pro + Opus 4.6
   Started: 15 minutes ago
   Location: ~/projects/frontend/
   Status: Active

3. Ollama (qwen3-coder)
   Started: 1 day ago
   Location: ~/experiments/
   Status: Idle
```