# OfoxAI

> Unified LLM API Gateway — one API for 100+ models including GPT-5.2, Claude Opus 4.5, Gemini 3, DeepSeek V3.2. OpenAI/Anthropic/Gemini protocol compatible. 99.9% SLA. China direct connect. Works with CherryStudio, Claude Code, Cursor, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, OpenCode.

This is the complete documentation for OfoxAI. For a concise overview with links, see [llms.txt](https://ofox.ai/llms.txt).

@doc-version: 2.0.0
@last-updated: 2026-02-24
@language: en, zh
@canonical: https://ofox.ai/llms-full.txt
@see-also: https://ofox.ai/llms.txt
@documentation: https://docs.ofox.ai

---

# Overview

OfoxAI is a unified LLM API gateway designed for developers building AI-powered applications. We aggregate 100+ top AI models from leading providers and offer them through standard API interfaces (OpenAI, Anthropic, and Google compatible).

## Why OfoxAI?

1. **Single Integration**: One API key, 100+ models
2. **Cost Savings**: Pay-as-you-go pricing, with open-source models up to 70% off
3. **Reliability**: Enterprise-grade infrastructure with a 99.9% SLA
4. **Flexibility**: Switch between models without code changes
5. **China Ready**: Dedicated China endpoints with HK express routes, no VPN needed
6. **Tool Ecosystem**: Native support for CherryStudio, Claude Code, Cursor, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, OpenCode, Windsurf

---

# API Reference

## Authentication

All API requests require an API key passed in the `Authorization` header:

```
Authorization: Bearer YOUR_API_KEY
```

## Base URLs

| Protocol | Endpoint |
|-----------|---------------------------------|
| OpenAI | https://api.ofox.ai/v1 |
| Anthropic | https://api.ofox.ai/anthropic |
| Gemini | https://api.ofox.ai/gemini |

All endpoints work globally and are optimized for China access via HK express routes.

For complete API documentation, SDKs, and integration guides, visit [https://docs.ofox.ai](https://docs.ofox.ai).
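The auth scheme and base URLs above can be sketched as a small client-side helper. This is an illustrative convenience only — `BASE_URLS` and `make_headers` are hypothetical names, not part of any OfoxAI SDK:

```python
# Illustrative mapping of the three protocol endpoints documented above,
# plus a helper that builds the Bearer auth header every request needs.
# These names are hypothetical, not part of an official SDK.
BASE_URLS = {
    "openai": "https://api.ofox.ai/v1",
    "anthropic": "https://api.ofox.ai/anthropic",
    "gemini": "https://api.ofox.ai/gemini",
}


def make_headers(api_key: str) -> dict:
    """Build the standard headers for a raw HTTP request to the gateway."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


print(make_headers("sk-test")["Authorization"])  # Bearer sk-test
```

In practice you would pass the relevant entry of `BASE_URLS` to your SDK's `base_url` parameter, as the examples below show.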
## Chat Completions (OpenAI Compatible)

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ofox.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="openai/gpt-5.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    temperature=0.7,
    max_tokens=1000,
)

print(response.choices[0].message.content)
```

## Messages API (Anthropic Compatible)

```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.ofox.ai/anthropic",
    api_key="YOUR_API_KEY",
)

message = client.messages.create(
    model="anthropic/claude-opus-4.5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude!"},
    ],
)

print(message.content[0].text)
```

## Gemini API (Google Compatible)

```python
import google.generativeai as genai

genai.configure(
    api_key="YOUR_API_KEY",
    transport="rest",
    client_options={"api_endpoint": "https://api.ofox.ai/gemini"},
)

model = genai.GenerativeModel("gemini/gemini-3-pro")
response = model.generate_content("Hello!")
print(response.text)
```

## Node.js Example

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.ofox.ai/v1",
  apiKey: "YOUR_API_KEY",
});

const response = await client.chat.completions.create({
  model: "openai/gpt-5.2",
  messages: [{ role: "user", content: "Hello!"
  }],
});

console.log(response.choices[0].message.content);
```

## cURL Example

```bash
curl https://api.ofox.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

---

# Pricing

## Plans

### Free Tier
- 10+ free models available
- Access to all models
- 3 RAG knowledge bases
- Basic MCP tools
- Community support

### Pro Plan (Pay as you go)
- Pay per token, no monthly fees
- Flagship models 20% off
- Open-source models up to 70% off
- Unlimited RAG knowledge bases
- Full MCP tools access
- Unlimited team members
- Ticket + online support
- 99.9% SLA

### Enterprise
- Custom tiered pricing
- Private cloud / hybrid cloud deployment
- Custom RAG solutions
- Custom MCP integrations
- SSO/SAML support
- Dedicated success manager
- 24/7 support hotline
- 99.99% SLA

## Payment Methods
- Alipay, WeChat Pay (China)
- Credit card (Global)

---

# Supported Models

## Flagship Models

### OpenAI
- GPT-5.2, GPT-5.2 Mini
- o3, o4-mini
- GPT-4o, GPT-4o Mini

### Anthropic
- Claude Opus 4.5
- Claude Sonnet 4
- Claude Haiku 4

### Google
- Gemini 3 Pro
- Gemini 2.5 Flash, Gemini 2.5 Pro

### xAI
- Grok 4, Grok 4.1

## Open Source Models

### DeepSeek
- DeepSeek-V3.2
- DeepSeek-R1

### Alibaba (Qwen)
- Qwen3-Max, Qwen3-Plus

### Moonshot (Kimi)
- Kimi-K2

### Meta
- Llama 4

### ByteDance (Doubao)
- Doubao Pro

## AIGC Models

### Image Generation
- Flux 2 Max, Seedream 4.5, Nano-Banana
- Z-Image, Wan 2.5, Qwen-Image-Turbo

### Video Generation
- Sora 2 Pro, Veo 3.1, Kling 2.6 Pro

### Voice & Digital Human
- ElevenLabs, IndexTTS2, Gemini TTS
- Seedance V1 Pro, Whisper V3
- HeyGen 5.0, Synthesia, Pika 2.1

---

# Vibe Coding - Unlimited AI API for Coding Tools

@keywords: claude code rate limit, cursor api alternative, claude unlimited, vibe coding, ai coding tools, anthropic api proxy, CherryStudio, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, OpenCode, Windsurf

## Overview
Vibe Coding is the developer workflow in which you describe what you want in natural language and AI writes the code. Tools like CherryStudio, Claude Code, Cursor, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, and OpenCode have made this possible — but rate limits keep breaking the flow. OfoxAI addresses this with unlimited Claude/GPT API access through native protocol compatibility.

## Problem

1. **Rate Limits**: The official Claude/GPT APIs enforce RPM/TPM limits that interrupt long coding sessions
2. **429 Errors**: Frequent 429 (Too Many Requests) errors during heavy usage
3. **China Access**: High latency or blocked access for developers in China

## Solution

A two-line configuration change:

```bash
# Add to ~/.zshrc or ~/.bashrc
export ANTHROPIC_BASE_URL=https://api.ofox.ai/anthropic
export ANTHROPIC_API_KEY=YOUR_API_KEY
```

For OpenAI-compatible tools:

```bash
export OPENAI_BASE_URL=https://api.ofox.ai/v1
export OPENAI_API_KEY=YOUR_API_KEY
```

## Supported Tools

| Tool | Compatibility | Setup |
|--------------|---------------|---------------------------------------|
| CherryStudio | Full | Settings > API Provider |
| Claude Code | Full | Environment variable |
| Cursor | Full | Settings > Models > Override API URL |
| Chatbox | Full | Settings > API Provider |
| Cline | Full | Extension settings |
| Codex | Full | Environment variable |
| Zed | Full | Settings > AI Provider |
| OpenClaw | Full | Settings > API |
| Kilo | Full | Settings > Provider |
| OpenCode | Full | Environment variable |
| Windsurf | Full | Environment variable |
| Aider | Full | Environment variable |

## Key Benefits

1. **Unlimited RPM/TPM**: No request or token limits
2. **Native Protocol**: 100% Anthropic/OpenAI/Gemini API compatible, zero code changes
3. **99.9% SLA**: Enterprise-grade reliability
4. **China Optimized**: HK express routes with low latency
5. **All Models**: Claude Opus 4.5, Sonnet 4, GPT-5.2 and the full lineup supported

## FAQ

Q: Will my existing Claude Code/Cursor setup work?
A: Yes, just change the base URL.
All features including tool use, vision, streaming, and extended thinking work identically.

Q: Is there really no rate limit?
A: Correct. OfoxAI aggregates capacity across multiple accounts and regions to provide unlimited access.

Q: How does latency compare to the official API?
A: Global: ~50ms overhead. China: significantly faster due to HK express infrastructure.

Q: Which tools are supported?
A: CherryStudio, Claude Code, Cursor, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, OpenCode, Windsurf, Aider — any tool that supports a custom API base URL.

---

# Features

## RAG (Retrieval-Augmented Generation)

Build knowledge bases from your documents and query them with any LLM:

- Support for PDF, Word, Markdown, HTML
- Automatic chunking and embedding
- Semantic search with vector databases
- Context injection into LLM prompts

## MCP (Model Context Protocol)

Extend LLM capabilities with external tools:

- Web search integration
- Code execution sandboxes
- Database queries
- API integrations
- Custom tool development

## Agent Apps

Build autonomous AI agents:

- Multi-step reasoning
- Tool orchestration
- Memory and context management
- Human-in-the-loop workflows

## Observability

Full-stack monitoring:

- Request tracing and cost analytics
- Anomaly alerts and error tracking
- Per-model performance metrics
- Usage dashboards

---

# China Zone

Dedicated infrastructure for China users:

## Features

- HK express routes to Aliyun/Volcengine/Huawei Cloud/Tencent Cloud
- No VPN needed — direct connect from mainland China
- Low latency API access
- Alipay/WeChat Pay supported
- PIPL compliant, data stays in China
- Lark, DingTalk, WeCom native integration

## Compatible Tools in China

CherryStudio, Claude Code, Cursor, Chatbox, Cline, Codex, Zed, OpenClaw, Kilo, OpenCode — all work seamlessly in China via OfoxAI endpoints.
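Tool configuration in mainland China is identical to the global setup — the same endpoints are reachable directly, so no region-specific base URL is needed. For example, for Anthropic-protocol tools:

```shell
# Same two-line setup as the global configuration; the endpoint below is
# reachable directly from mainland China, no VPN required.
export ANTHROPIC_BASE_URL=https://api.ofox.ai/anthropic
export ANTHROPIC_API_KEY=YOUR_API_KEY
```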
## Endpoints

- `https://api.ofox.ai/v1` (OpenAI)
- `https://api.ofox.ai/anthropic` (Anthropic)
- `https://api.ofox.ai/gemini` (Gemini)

---

# Company

OfoxAI is building the infrastructure layer for AI applications. Our mission is to make AI accessible to every developer.

## Contact

- Website: https://ofox.ai
- Documentation: https://docs.ofox.ai
- GitHub: https://github.com/ofoxai
- Twitter: https://x.com/ofoxai

## Legal

- Terms of Service: https://ofox.ai/terms
- Privacy Policy: https://ofox.ai/privacy