name: switchailocal
description: Unified LLM proxy for AI agents. Route all model requests through http://localhost:18080/v1. Provides FREE access to Gemini CLI, Claude CLI, Codex, and Vibe via your existing subscriptions. Use when: (1) making LLM calls using provider prefixes, (2) switching between CLI/Local/Cloud providers, (3) needing to attach local files/folders to prompts via CLI, (4) requiring intelligent routing between models, or (5) needing to monitor provider health and analytics.

switchAILocal Proxy

Unified LLM proxy for AI agents. Always use http://localhost:18080/v1 as your base URL.

The killer feature: use your paid CLI subscriptions (Gemini Pro, Claude Pro, etc.) through the API. Requests are FREE of per-token charges because they run through the subscription you already pay for.

Quick Start

1. Make a request (FREE with CLI)

curl http://localhost:18080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "geminicli:",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

2. Configure Python Client

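from openai import OpenAI

# Point the OpenAI SDK at the local proxy; the key must be one the server accepts (see config.yaml).
client = OpenAI(base_url="http://localhost:18080/v1", api_key="sk-test-123")
response = client.chat.completions.create(model="geminicli:", messages=[{"role": "user", "content": "Hi!"}])
print(response.choices[0].message.content)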

🗺️ Skill Files

File	Description
SKILL.md (this file)	Core workflow and endpoint reference
references/routing.md	Intelligent routing and matrix setup
references/multimodal.md	Vision and image processing
references/examples.md	Real-world agentic use cases
references/management-api.md	Full Monitoring & Operations API
references/steering.md	Conditional routing rules
references/hooks.md	Automation and event hooks
references/memory.md	Analytics and history

⚠️ Critical: Model Format

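NEVER use bare model names. The format is ALWAYS `provider:` (the provider's default model) or `provider:model`. The only exception is the routing keyword `auto`, which takes no colon.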

❌ Wrong	✅ Correct	Why
`gemini-2.5-pro`	`geminicli:gemini-2.5-pro`	Needs provider prefix
`claude-3-5-sonnet`	`claudecli:`	Needs provider prefix; bare `claudecli:` selects the default model
`llama3`	`ollama:llama3`	Needs provider prefix
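
The format rule can be checked before a request is sent. The following is a minimal sketch, assuming the only accepted shapes are `provider:`, `provider:model`, and the keyword `auto` from the provider tables. The helper name `split_model` is hypothetical and not part of switchAILocal.

```python
def split_model(model: str) -> tuple[str, str]:
    """Split a model string into (provider, model_name).

    An empty model_name means "use the provider's default model".
    Raises ValueError for bare model names such as "llama3".
    """
    if model == "auto":  # routing keyword, written without a colon
        return ("auto", "")
    # Split on the first colon only, so model names may contain colons themselves.
    provider, sep, name = model.partition(":")
    if not sep or not provider:
        raise ValueError(
            f"bare model name {model!r}: expected 'provider:' or 'provider:model'"
        )
    return (provider, name)


print(split_model("geminicli:gemini-2.5-pro"))
print(split_model("claudecli:"))
```

Calling `split_model("llama3")` raises `ValueError`, which catches the mistake locally instead of surfacing as a "Model not found" response.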

🏗️ Provider Reference

1. CLI Providers (FREE!)

These providers use your human's existing CLI subscriptions. Best for agents.

Prefix	CLI	Subscription Required
`geminicli:`	`gemini`	Google AI Premium/Pro
`claudecli:`	`claude`	Claude Pro/Max
`codex:`	`codex`	OpenAI Plus
`vibe:`	`vibe`	Mistral Le Chat

2. Local & Cloud

Prefix	Source	Cost
`ollama:`	Local Ollama	FREE
`auto` (keyword, no colon)	Local Cortex	FREE (requires plugin)
`switchai:`	Traylinx Cloud	Per-token
`groq:`	Groq Cloud	Per-token

🚀 Core Features

CLI Attachments & Flags

Pass local context and control autonomy via CLI extensions.

{
  "model": "geminicli:",
  "messages": [{"role": "user", "content": "Fix this code"}],
  "extra_body": {
    "cli": {
      "attachments": [{"type": "folder", "path": "./src"}],
      "flags": {"auto_approve": true, "yolo": true}
    }
  }
}
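
A minimal sketch of assembling the request body above in Python. The helper name `build_cli_request` is hypothetical; it only reproduces the JSON structure shown above and sends nothing. Only the `auto_approve` flag is parameterized here; other flags such as `yolo` would go in the same `flags` object.

```python
import json


def build_cli_request(prompt, folder, auto_approve=False, model="geminicli:"):
    """Assemble a chat request that attaches a local folder for a CLI provider."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "extra_body": {
            "cli": {
                # One attachment entry per file or folder to expose to the CLI.
                "attachments": [{"type": "folder", "path": folder}],
                "flags": {"auto_approve": auto_approve},
            }
        },
    }


body = build_cli_request("Fix this code", "./src", auto_approve=True)
print(json.dumps(body, indent=2))
```

If you use the OpenAI Python SDK instead of raw HTTP, note that its `extra_body` argument merges its keys into the top level of the request, so check which nesting the server expects before relying on it.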

Streaming

Add "stream": true to any request for SSE token streaming.
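
A minimal sketch of consuming such a stream over raw HTTP. It assumes the proxy emits OpenAI-style chunks: `data: {json}` lines with the text under `choices[0].delta.content`, terminated by `data: [DONE]`. That chunk shape is an assumption based on the OpenAI-compatible endpoint, and `iter_stream_text` is a hypothetical helper.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel (assumed)
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Hand-written sample lines standing in for a real response body.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # Hello!
```

With the OpenAI Python SDK, passing `stream=True` to `client.chat.completions.create` returns an iterator of chunks, so manual parsing like this is only needed when working with the HTTP response directly.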

🌲 Decision Tree

What do you need?
├─ FREE + Powerful + Files
│   └─ CLI Providers (geminicli:, claudecli:)
├─ FREE + Private + Fast
│   └─ Local Ollama (ollama:llama3.2)
├─ Ultra-Fast Production
│   └─ Groq Cloud (groq:llama-3.3-70b)
└─ I don't know, you pick
    └─ Intelligent Routing (auto)
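
The tree can be written as a small selection function. This is a sketch only: `pick_model` and its parameter names are hypothetical, and the order in which the branches are checked is an assumption, since the tree does not say which need wins when several apply.

```python
def pick_model(need_files=False, private=False, low_latency=False):
    """Map the decision tree's branches to a model string."""
    if need_files:
        return "geminicli:"          # CLI provider, default model, supports attachments
    if private:
        return "ollama:llama3.2"     # runs on the local machine
    if low_latency:
        return "groq:llama-3.3-70b"  # per-token cloud provider
    return "auto"                    # no preference: let the router decide


print(pick_model(private=True))
```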

🛠️ Troubleshooting & Best Practices

Problem	Fix
Connection error	Check if server is running on port 18080
Model not found	Ensure you used the `provider:` prefix
401 Unauthorized	Check API key in `config.yaml`

Best Practices

Prefer CLI Providers: They are free and support file attachments.
Check Status: Use GET /v1/providers to see what is active.
Use auto: For simple tasks, let the router pick the best model.
Local for Privacy: Use ollama: for confidential data.
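
A minimal sketch of the status check using only the standard library. The `Authorization: Bearer` header is an assumption based on the OpenAI-compatible API, the helper names are hypothetical, and the response is returned as decoded JSON without assuming any particular fields.

```python
import json
import urllib.request

BASE_URL = "http://localhost:18080/v1"


def providers_request(api_key, base_url=BASE_URL):
    """Build (but do not send) the GET /v1/providers request."""
    return urllib.request.Request(
        f"{base_url}/providers",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="GET",
    )


def fetch_providers(api_key, timeout=5.0):
    """Send the request and return the decoded JSON body, whatever its shape."""
    with urllib.request.urlopen(providers_request(api_key), timeout=timeout) as resp:
        return json.load(resp)


req = providers_request("sk-test-123")
print(req.get_method(), req.full_url)
```

If the server is not running, `fetch_providers` raises `urllib.error.URLError`, which corresponds to the "Connection error" row in the troubleshooting table.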

Route wisely. Save tokens. Use CLI. 🚀
