# OfoxAI

> Unified LLM API Gateway — one API for 100+ models including GPT-5.2, Claude Opus 4.5, Gemini 3, DeepSeek V3.2. OpenAI/Anthropic/Gemini protocol compatible. 99.9% SLA. China direct connect. Works with CherryStudio, Claude Code, Cursor, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, OpenCode.

This is the complete documentation for OfoxAI. For a concise overview with links, see [llms.txt](https://ofox.ai/llms.txt).

@doc-version: 2.0.0
@last-updated: 2026-02-24
@language: en, zh
@canonical: https://ofox.ai/llms-full.txt
@see-also: https://ofox.ai/llms.txt
@documentation: https://docs.ofox.ai

---

# Overview

OfoxAI is a unified LLM API gateway designed for developers building AI-powered applications. We aggregate 100+ top AI models from leading providers and offer them through standard API interfaces (OpenAI, Anthropic, and Google compatible).

## Why OfoxAI?

1. **Single Integration**: One API key, 100+ models
2. **Cost Savings**: Pay-as-you-go pricing, with open-source models up to 70% off
3. **Reliability**: Enterprise-grade infrastructure with a 99.9% SLA
4. **Flexibility**: Switch between models without code changes
5. **China Ready**: Dedicated China endpoints with HK express routes, no VPN needed
6. **Tool Ecosystem**: Native support for CherryStudio, Claude Code, Cursor, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, OpenCode, Windsurf

---

# API Reference

## Authentication

All API requests require an API key passed in the `Authorization` header:

```
Authorization: Bearer YOUR_API_KEY
```

## Base URLs

| Protocol | Endpoint |
|-----------|---------------------------------|
| OpenAI | https://api.ofox.ai/v1 |
| Anthropic | https://api.ofox.ai/anthropic |
| Gemini | https://api.ofox.ai/gemini |

All endpoints work globally and are optimized for China access via HK express routes.

For complete API documentation, SDKs, and integration guides, visit [https://docs.ofox.ai](https://docs.ofox.ai).
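The auth scheme and base URLs above can be sketched as a small client-side helper. This is an illustrative convenience only — `BASE_URLS` and `make_headers` are hypothetical names, not part of any OfoxAI SDK:

```python
# Illustrative mapping of the three protocol endpoints documented above,
# plus a helper that builds the Bearer auth header every request needs.
# These names are hypothetical, not part of an official SDK.
BASE_URLS = {
    "openai": "https://api.ofox.ai/v1",
    "anthropic": "https://api.ofox.ai/anthropic",
    "gemini": "https://api.ofox.ai/gemini",
}


def make_headers(api_key: str) -> dict:
    """Build the standard headers for a raw HTTP request to the gateway."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


print(make_headers("sk-test")["Authorization"])  # Bearer sk-test
```

In practice you would pass the relevant entry of `BASE_URLS` to your SDK's `base_url` parameter, as the examples below show.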
## Chat Completions (OpenAI Compatible)

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ofox.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="openai/gpt-5.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    temperature=0.7,
    max_tokens=1000,
)

print(response.choices[0].message.content)
```

## Messages API (Anthropic Compatible)

```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.ofox.ai/anthropic",
    api_key="YOUR_API_KEY",
)

message = client.messages.create(
    model="anthropic/claude-opus-4.5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude!"},
    ],
)

print(message.content[0].text)
```

## Gemini API (Google Compatible)

```python
import google.generativeai as genai

genai.configure(
    api_key="YOUR_API_KEY",
    transport="rest",
    client_options={"api_endpoint": "https://api.ofox.ai/gemini"},
)

model = genai.GenerativeModel("gemini/gemini-3-pro")
response = model.generate_content("Hello!")
print(response.text)
```

## Node.js Example

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.ofox.ai/v1",
  apiKey: "YOUR_API_KEY",
});

const response = await client.chat.completions.create({
  model: "openai/gpt-5.2",
  messages: [{ role: "user", content: "Hello!"
  }],
});

console.log(response.choices[0].message.content);
```

## cURL Example

```bash
curl https://api.ofox.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

---

# Pricing

## Plans

### Free Tier
- 10+ free models available
- Access to all models
- 3 RAG knowledge bases
- Basic MCP tools
- Community support

### Pro Plan (Pay as you go)
- Pay per token, no monthly fees
- Flagship models 20% off
- Open-source models up to 70% off
- Unlimited RAG knowledge bases
- Full MCP tools access
- Unlimited team members
- Ticket + online support
- 99.9% SLA

### Enterprise
- Custom tiered pricing
- Private cloud / hybrid cloud deployment
- Custom RAG solutions
- Custom MCP integrations
- SSO/SAML support
- Dedicated success manager
- 24/7 support hotline
- 99.99% SLA

## Payment Methods
- Alipay, WeChat Pay (China)
- Credit card (Global)

---

# Supported Models

## Flagship Models

### OpenAI
- GPT-5.2, GPT-5.2 Mini
- o3, o4-mini
- GPT-4o, GPT-4o Mini

### Anthropic
- Claude Opus 4.5
- Claude Sonnet 4
- Claude Haiku 4

### Google
- Gemini 3 Pro
- Gemini 2.5 Flash, Gemini 2.5 Pro

### xAI
- Grok 4, Grok 4.1

## Open Source Models

### DeepSeek
- DeepSeek-V3.2
- DeepSeek-R1

### Alibaba (Qwen)
- Qwen3-Max, Qwen3-Plus

### Moonshot (Kimi)
- Kimi-K2

### Meta
- Llama 4

### ByteDance (Doubao)
- Doubao Pro

## AIGC Models

### Image Generation
- Flux 2 Max, Seedream 4.5, Nano-Banana
- Z-Image, Wan 2.5, Qwen-Image-Turbo

### Video Generation
- Sora 2 Pro, Veo 3.1, Kling 2.6 Pro

### Voice & Digital Human
- ElevenLabs, IndexTTS2, Gemini TTS
- Seedance V1 Pro, Whisper V3
- HeyGen 5.0, Synthesia, Pika 2.1

---

# Vibe Coding - Unlimited AI API for Coding Tools

@keywords: claude code rate limit, cursor api alternative, claude unlimited, vibe coding, ai coding tools, anthropic api proxy, CherryStudio, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, OpenCode, Windsurf

## Overview
Vibe Coding is the developer workflow in which you describe what you want in natural language and AI writes the code. Tools like CherryStudio, Claude Code, Cursor, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, and OpenCode have made this possible — but rate limits keep breaking the flow. OfoxAI addresses this with unlimited Claude/GPT API access through native protocol compatibility.

## Problem

1. **Rate Limits**: The official Claude/GPT APIs enforce RPM/TPM limits that interrupt long coding sessions
2. **429 Errors**: Frequent 429 (Too Many Requests) errors during heavy usage
3. **China Access**: High latency or blocked access for developers in China

## Solution

A two-line configuration change:

```bash
# Add to ~/.zshrc or ~/.bashrc
export ANTHROPIC_BASE_URL=https://api.ofox.ai/anthropic
export ANTHROPIC_API_KEY=YOUR_API_KEY
```

For OpenAI-compatible tools:

```bash
export OPENAI_BASE_URL=https://api.ofox.ai/v1
export OPENAI_API_KEY=YOUR_API_KEY
```

## Supported Tools

| Tool | Compatibility | Setup |
|--------------|---------------|---------------------------------------|
| CherryStudio | Full | Settings > API Provider |
| Claude Code | Full | Environment variable |
| Cursor | Full | Settings > Models > Override API URL |
| Chatbox | Full | Settings > API Provider |
| Cline | Full | Extension settings |
| Codex | Full | Environment variable |
| Zed | Full | Settings > AI Provider |
| OpenClaw | Full | Settings > API |
| Kilo | Full | Settings > Provider |
| OpenCode | Full | Environment variable |
| Windsurf | Full | Environment variable |
| Aider | Full | Environment variable |

## Key Benefits

1. **Unlimited RPM/TPM**: No request or token limits
2. **Native Protocol**: 100% Anthropic/OpenAI/Gemini API compatible, zero code changes
3. **99.9% SLA**: Enterprise-grade reliability
4. **China Optimized**: HK express routes with low latency
5. **All Models**: Claude Opus 4.5, Sonnet 4, GPT-5.2 and the full lineup supported

## FAQ

Q: Will my existing Claude Code/Cursor setup work?
A: Yes, just change the base URL.
All features including tool use, vision, streaming, and extended thinking work identically.

Q: Is there really no rate limit?
A: Correct. OfoxAI aggregates capacity across multiple accounts and regions to provide unlimited access.

Q: How does latency compare to the official API?
A: Global: ~50ms overhead. China: significantly faster due to HK express infrastructure.

Q: Which tools are supported?
A: CherryStudio, Claude Code, Cursor, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, OpenCode, Windsurf, Aider — any tool that supports a custom API base URL.

---

# Features

## RAG (Retrieval-Augmented Generation)

Build knowledge bases from your documents and query them with any LLM:

- Support for PDF, Word, Markdown, HTML
- Automatic chunking and embedding
- Semantic search with vector databases
- Context injection into LLM prompts

## MCP (Model Context Protocol)

Extend LLM capabilities with external tools:

- Web search integration
- Code execution sandboxes
- Database queries
- API integrations
- Custom tool development

## Agent Apps

Build autonomous AI agents:

- Multi-step reasoning
- Tool orchestration
- Memory and context management
- Human-in-the-loop workflows

## Observability

Full-stack monitoring:

- Request tracing and cost analytics
- Anomaly alerts and error tracking
- Per-model performance metrics
- Usage dashboards

---

# China Zone

Dedicated infrastructure for China users:

## Features

- HK express routes to Aliyun/Volcengine/Huawei Cloud/Tencent Cloud
- No VPN needed — direct connect from mainland China
- Low latency API access
- Alipay/WeChat Pay supported
- PIPL compliant, data stays in China
- Lark, DingTalk, WeCom native integration

## Compatible Tools in China

CherryStudio, Claude Code, Cursor, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, OpenCode — all work seamlessly in China via OfoxAI endpoints.
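Tool configuration in mainland China is identical to the global setup — the same endpoints are reachable directly, so no region-specific base URL is needed. For example, for Anthropic-protocol tools:

```shell
# Same two-line setup as the global configuration; the endpoint below is
# reachable directly from mainland China, no VPN required.
export ANTHROPIC_BASE_URL=https://api.ofox.ai/anthropic
export ANTHROPIC_API_KEY=YOUR_API_KEY
```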
## Endpoints

- `https://api.ofox.ai/v1` (OpenAI)
- `https://api.ofox.ai/anthropic` (Anthropic)
- `https://api.ofox.ai/gemini` (Gemini)

---

# Company

OfoxAI is building the infrastructure layer for AI applications. Our mission is to make AI accessible to every developer.

## Contact

- Website: https://ofox.ai
- Documentation: https://docs.ofox.ai
- GitHub: https://github.com/ofoxai
- Twitter: https://x.com/ofoxai

## Legal

- Terms of Service: https://ofox.ai/terms
- Privacy Policy: https://ofox.ai/privacy