Route requests to GPT-4o, Claude 4, Gemini, Llama, and 100+ other models through a single unified endpoint.
Access every major foundation model through one API. Switch providers in one line of code.
GPT-4o · GPT-4o-mini · o1 · o3-mini
Claude Opus 4.6 · Claude Sonnet 4.6 · Claude Haiku 4.5
Gemini 2.5 Pro · Gemini 2.5 Flash · Gemma 3
Llama 4 · Llama 4 Maverick · Llama 3.3 70B
Mistral Large · Mistral Medium · Codestral
DeepSeek-R1 · DeepSeek-V3 · DeepSeek-R1-Lite
Qwen3 · Qwen3-Coder · QwQ-32B
GLM-4-Plus · GLM-4-Long · GLM-Z1
Cohere · Yi · MiniMax · Moonshot · Baichuan · and more
Drop-in compatible with the OpenAI SDK. Switch models by changing a single parameter.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.your-gateway.com/v1",
    api_key="your-api-key",
)

# Use any model: just change the name
response = client.chat.completions.create(
    model="gpt-4o",  # or "claude-opus-4-6", "gemini-2.5-pro"
    messages=[{
        "role": "user",
        "content": "Hello, how are you?",
    }],
)
print(response.choices[0].message.content)
curl https://api.your-gateway.com/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.your-gateway.com/v1',
  apiKey: 'your-api-key',
});

// Route to any model instantly
const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);
Enterprise-grade infrastructure for production AI applications.
Automatically route requests to the best model based on cost, latency, and quality requirements.
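To make the idea of routing on cost, latency, and quality concrete, here is a minimal sketch of one possible selection rule: pick the cheapest model that meets a quality floor and a latency ceiling. This is not the gateway's actual routing logic. The `ModelProfile` type, the `pick_model` function, the model names, and every number in `CATALOG` are hypothetical placeholders for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str
    cost_per_1k_tokens: float  # placeholder value, not a real price
    p50_latency_ms: float      # placeholder value, not a measurement
    quality: float             # placeholder score in [0, 1]


# Hypothetical catalog: names and numbers are made up for illustration.
CATALOG = [
    ModelProfile("model-small", 0.10, 300, 0.60),
    ModelProfile("model-medium", 0.50, 600, 0.80),
    ModelProfile("model-large", 2.00, 1500, 0.95),
]


def pick_model(catalog, min_quality=0.0, max_latency_ms=float("inf")):
    """Return the cheapest model that satisfies both constraints."""
    eligible = [
        m for m in catalog
        if m.quality >= min_quality and m.p50_latency_ms <= max_latency_ms
    ]
    if not eligible:
        raise ValueError("no model satisfies the requested constraints")
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)


print(pick_model(CATALOG, min_quality=0.75).name)    # model-medium
print(pick_model(CATALOG, max_latency_ms=400).name)  # model-small
```

Treating cost as the objective and the other two as hard constraints is only one design choice; a weighted score over all three would be an equally reasonable sketch.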
One API key to access all providers. Manage credentials, permissions, and budgets in one place.
Track usage, costs, latency, and error rates across all models with granular dashboards.
Automatic failover between providers. If a model or provider is down, requests are rerouted to an alternative.
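Failover is described here as happening on the gateway side, so clients should not need to implement it. Purely to illustrate the pattern, the sketch below tries a list of models in order and returns the first successful result. The `complete_with_fallback` helper and the `send` callable are hypothetical names, and `fake_send` stands in for a real request.

```python
def complete_with_fallback(send, models, messages):
    """Try each model in order and return (model, result) for the first success.

    `send(model, messages)` is a hypothetical callable that performs one
    request and raises an exception on failure.
    """
    failed = []
    for model in models:
        try:
            return model, send(model, messages)
        except Exception:  # a real client would catch narrower error types
            failed.append(model)
    raise RuntimeError(f"all models failed: {failed}")


# Stand-in for a real request function: the first model is "down".
def fake_send(model, messages):
    if model == "gpt-4o":
        raise ConnectionError("provider unavailable")
    return f"{model} answered"


used, reply = complete_with_fallback(
    fake_send,
    ["gpt-4o", "claude-opus-4-6"],
    [{"role": "user", "content": "Hello!"}],
)
print(used, "->", reply)  # claude-opus-4-6 -> claude-opus-4-6 answered
```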
Per-user and per-model rate limits, enforced with a token bucket algorithm. Protect your budget and infrastructure.
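For readers unfamiliar with token buckets, the following is a minimal sketch of the general technique, keyed by (user, model). It is an illustration, not the gateway's implementation. The `TokenBucket` class, the `allow_request` helper, and the capacity and refill values are assumptions chosen for the example.

```python
import time


class TokenBucket:
    """A bucket holds up to `capacity` tokens and refills at a steady rate."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock  # injectable so the bucket can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per (user, model) pair, created on first use.
buckets = {}


def allow_request(user, model, capacity=5, refill_per_sec=1.0):
    key = (user, model)
    if key not in buckets:
        buckets[key] = TokenBucket(capacity, refill_per_sec)
    return buckets[key].allow()


# With no refill, a bucket of capacity 2 admits two requests and rejects the third.
results = [
    allow_request("alice", "gpt-4o", capacity=2, refill_per_sec=0.0)
    for _ in range(3)
]
print(results)  # [True, True, False]
```

The bucket capacity sets the allowed burst size, while the refill rate sets the sustained request rate.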
Full SSE streaming support with a consistent response format across all providers and models.
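As a rough sketch of what consuming such a stream involves, the code below extracts text fragments from server-sent event lines. It assumes the gateway emits OpenAI-style chat completion chunks (`data: {...}` lines ending with `data: [DONE]`), which the page implies but does not state. The `iter_deltas` helper and the `sample` payload are hypothetical.

```python
import json


def iter_deltas(lines):
    """Yield text fragments from OpenAI-style `data:` SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample stream; a real one would arrive over HTTP.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo!"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # Hello!
```

When using the OpenAI SDK, manual parsing like this is usually unnecessary: passing `stream=True` to `client.chat.completions.create` returns an iterator of chunks whose text is in `chunk.choices[0].delta.content`.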
Pay only for what you use. No hidden fees. No minimums.
Get your API key in 30 seconds. No credit card required.