Changelog
Every step of OfoxAI — new models, new features, new experience. Updated weekly.
Analytics · Jun 16, 2026
📊 Usage and Cost, in One Report
Usage and Cost used to live on separate pages, so reconciling meant bouncing back and forth. They’re now merged into a single Analytics page — how much you used and how much you spent, all in one place, at a glance.
- One overview — Call volume, spend, tokens, and other key metrics together in a single view
- Drill down any way — Switch between model, member, API key, and app views in one click
- Flexible filtering — Custom date ranges plus combined filters surface your heaviest-used models and cost breakdown at a glance
Entry point: Analytics.
v1.2.6-20260605
🔐 API Key IP Allowlists
Bind an API key to trusted source IPs — even if the key leaks, requests from non-allowlisted addresses won’t go through.
- Supports single IPs and CIDR ranges, up to 50 entries per key
- Requests from non-allowlisted sources get a straight 403
- Leave it empty for no restriction; existing keys are unaffected
Entry point: API Key Management → individual key details.
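As a rough illustration of the rules above, here is a minimal Python sketch of allowlist matching. It is not OfoxAI's implementation: the function names are hypothetical, and the only facts taken from this entry are the 50-entry limit, support for single IPs and CIDR ranges, the 403 for non-matching sources, and "empty means unrestricted".

```python
import ipaddress

MAX_ENTRIES = 50  # per-key limit stated in this changelog entry


def parse_allowlist(entries):
    """Parse single IPs and CIDR ranges into network objects (hypothetical helper)."""
    if len(entries) > MAX_ENTRIES:
        raise ValueError(f"at most {MAX_ENTRIES} entries per key")
    # ip_network accepts a bare address such as "203.0.113.7" as a /32 (or /128) network.
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def status_for(source_ip, allowlist):
    """Return 200 if the request may proceed, 403 otherwise."""
    if not allowlist:  # an empty allowlist means no restriction
        return 200
    address = ipaddress.ip_address(source_ip)
    return 200 if any(address in network for network in allowlist) else 403
```

For example, with `parse_allowlist(["203.0.113.7", "198.51.100.0/24"])`, a request from `198.51.100.42` passes and one from `192.0.2.1` gets a 403.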
v1.2.3-20260603
🟢 One-Click Google Login
Our second social login after GitHub: sign in, sign up, or link an account. Accounts that share the same email are linked automatically, and we remember your last login method for next time.
🎮 Playground Is Live
A new Playground entry in the console sidebar — try models, tune parameters, and compare results right in your browser, without writing a line of code. Open it at chat.ofox.ai.
🌐 Your Language Follows Your Account
Your language preference is saved to your account, so it carries across devices — and even system emails go out in the language you chose. A new language card now lives in Settings → Account.
New Models · Jun 2, 2026
🤖 New Models
- MiniMax M3 (MiniMax) — MiniMax’s new-generation flagship
- Qwen3.7 Plus (Alibaba Bailian) — The Qwen3.7 Plus tier, with direct support across all three protocols
- The xAI Grok family is live — Grok 4.3 and other xAI models have landed in the Model Plaza
Campaign · Jun 1, 2026
🎁 15% Off All GPT in June
The entire GPT lineup is 15% off all month, Jun 1 – Jul 1. No coupon code — the discount applies automatically at checkout. Catalog: GPT models.
v1.1.9-20260529
🌏 Japanese UI
The platform interface adds Japanese (ja), bringing us to four languages: English / 简体中文 / Русский / 日本語. The language switcher now uses a 🌐 icon and shows each language’s full native name, so it’s easier to find yours.
🤖 New Models
- Claude Opus 4.8 (Anthropic) — Anthropic’s new-generation flagship, another step up in reasoning and writing
New Models · May 22, 2026
🤖 New Models
- Qwen3.7 Max (Alibaba Bailian) — Qwen3.7’s top tier, with direct connectivity across the OpenAI, Anthropic, and Gemini protocols
New Models · May 20, 2026
🤖 New Models
- Gemini 3.5 Flash (Google) — The high-speed tier of Gemini 3.5
- Gemini 3.1 Flash Lite (Google) — A lighter, more economical Flash Lite tier
v1.1.6-20260519
🧾 Edit Your Own Invoice Details — Updated Instantly
Invoices and receipts get an upgrade — your billing details, your call.
- Fill in your own header — Maintain your company name, tax ID, address, and other billing info, and invoices pick it up automatically; edit and regenerate, and the new invoice updates instantly
- Real payment methods — Receipts show the method actually used, like Visa ····4242 or WeChat Pay
- Cross-currency detail — The real charge currency and exchange rate are spelled out (e.g. 1 SGD = 5.5654 CNY)
- Export anytime — Invoice and receipt links stay valid long-term; print to PDF straight from your browser
Entry point: Wallet → Orders → View Invoice. Manage your billing header under Settings → Organization.
v1.1.5-20260514
🔐 Authorize Third-Party Apps with Your OfoxAI Account (OAuth)
Third-party apps and AI agents can now connect to your OfoxAI account through standard OAuth — no more handing them your API key directly.
- Authorize once, call securely — Once authorized, an app can call models on your behalf and look up your balance, usage, and limits
- Precise attribution — Every call maps to a specific app, so usage and spend are crystal clear
- Revoke anytime — Manage authorized apps from the console and pull access with one click
- CLI-friendly — Device-code authorization means command-line and terminal tools sign in smoothly too
A unified login and authorization foundation for the growing ecosystem of tools and agents built on OfoxAI.
New Capabilities · May 7, 2026
🎙️ Audio Transcription (Speech-to-Text)
New OpenAI audio transcription models turn recordings and speech straight into text — call them with the same OpenAI-compatible protocol you already use: GPT-4o Mini Transcribe and GPT-4o Transcribe Diarize (with speaker diarization).
v1.1.4-20260502
🎁 GPT May Bonanza
Spend-and-get-back across the entire GPT lineup — six tiers, up to $250 back.
- Campaign window — May 1 – May 15
- Redemption window — May 16 – May 18
- Coverage — The full lineup, including GPT-5.5, the entire GPT-5.4 family, and GPT Image 2
- Teams — Member spend is pooled automatically to reach higher tiers together
Campaign page: GPT May Bonanza.
v1.1.0-20260428
💰 Budget Controls — Across Team, Member, and API Key
Turn “how much we spend” from a verbal agreement into a system-enforced limit. A single organization can now set spend caps along three dimensions × three time windows:
| Dimension | Use case |
|---|---|
| Team (Organization) | Company- or project-wide budget |
| Member (User) | Per-employee monthly quota |
| API Key | Independent budget for a specific app or service |
Each dimension supports daily / monthly / lifetime caps independently. Requests that would exceed any cap are rejected automatically.
Progress bars surface three warning levels:
- 🟢 40% — usage is healthy
- 🟡 80% — approaching the limit
- 🔴 110% — exceeded (a buffer prevents bursty traffic from instantly tripping the cap)
Hierarchy is validated for you: API Key cap ≤ Member cap ≤ Team cap. The UI shows the parent quota in real time so you can’t accidentally misconfigure.
Entry point: Settings → Quotas
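The hierarchy rule and the rejection rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Caps` class and both function names are hypothetical, an unset cap is modeled as `None`, and the 110% burst buffer mentioned above is deliberately left out because its exact behavior is not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Caps:
    """Spend caps in USD for one time window; None means the cap is unset."""
    team: Optional[float] = None
    member: Optional[float] = None
    api_key: Optional[float] = None


def hierarchy_errors(caps):
    """Check API Key cap <= Member cap <= Team cap, skipping unset levels."""
    errors = []
    if caps.member is not None and caps.team is not None and caps.member > caps.team:
        errors.append("member cap exceeds team cap")
    if caps.api_key is not None:
        # Compare against the nearest parent cap that is actually set.
        parent = caps.member if caps.member is not None else caps.team
        if parent is not None and caps.api_key > parent:
            errors.append("API key cap exceeds its parent cap")
    return errors


def would_exceed(spent, request_cost, cap):
    """True when admitting the request would push spend past the cap."""
    return cap is not None and spent + request_cost > cap
```

A full check would run `would_exceed` once per dimension and per window (daily, monthly, lifetime) and reject the request if any call returns `True`.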
⏱️ Team-Level RPM Quota
Introducing team-level rate limits (RPM) to stop multiple API keys from collectively blowing past your upstream provider’s limits.
- RPM is aggregated across the entire team, not measured per key
- Default is 100 RPM — contact [email protected] for higher limits
- Excess requests get an automatic 429 Too Many Requests
Useful for: bursty CI/CD traffic, runaway batch jobs, and unifying limits across collaborative teams.
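To make "aggregated across the entire team" concrete, here is a minimal sliding-window sketch in Python. The windowing algorithm is an assumption (this entry does not say how the gateway counts requests), and the class name and `team_id` key are hypothetical. The point it shows is that the counter is keyed by team, so every API key on the team draws from the same budget.

```python
from collections import defaultdict, deque


class TeamRateLimiter:
    """Sliding 60-second window shared by every API key on a team."""

    def __init__(self, rpm=100):  # 100 RPM is the documented default
        self.rpm = rpm
        self._hits = defaultdict(deque)  # team_id -> timestamps of admitted requests

    def check(self, team_id, now):
        """Return 200 and record the request, or 429 when the team is over its RPM."""
        hits = self._hits[team_id]
        while hits and now - hits[0] >= 60.0:
            hits.popleft()  # drop requests that have left the window
        if len(hits) >= self.rpm:
            return 429
        hits.append(now)
        return 200
```

Passing `now` explicitly (for example from `time.monotonic()`) keeps the sketch deterministic and easy to test.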
🪙 Balance OpenAPI
A new endpoint, GET /v1/user/balance, returns the account’s available balance, lifetime credits, and lifetime spend — using any OfoxAI API key.
```shell
curl https://api.ofox.ai/v1/user/balance \
  -H "Authorization: Bearer $OFOX_API_KEY"
```
The response shape is compatible with third-party tools like cc-switch, so you can plug OfoxAI in as a balance provider directly.
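The same call can be made from Python with only the standard library. This is a sketch, not an official client: the function names are hypothetical, and because this entry does not list the JSON field names, the body is returned as-is rather than unpacked into specific keys.

```python
import json
import urllib.request

BALANCE_URL = "https://api.ofox.ai/v1/user/balance"


def build_balance_request(api_key):
    """Build the GET request with the same Bearer header as the curl call."""
    return urllib.request.Request(
        BALANCE_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_balance(api_key, timeout=10.0):
    """Send the request and return the decoded JSON body unchanged."""
    with urllib.request.urlopen(build_balance_request(api_key), timeout=timeout) as resp:
        return json.load(resp)
```

Typical usage would be `fetch_balance(os.environ["OFOX_API_KEY"])`, after which you can inspect the returned dictionary to see which keys carry the balance, lifetime credits, and lifetime spend.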
🧰 cc-switch Integration
OfoxAI now works natively with cc-switch — switch to OfoxAI inside cc-switch and you’ll see your live balance, no extra glue code required.

Set it up in four steps:
- Open the usage-query config — click the 📊 icon in the top-right of the OfoxAI provider card
- Enable usage queries — flip the toggle on
- Paste your API key — any user-level OfoxAI API key works (create one in the Dashboard)
- Endpoint — choose “Generic Template” and set the URL to https://api.ofox.ai/v1
Save, and the provider card immediately shows live status like Remaining: 64.77 USD.
Full walkthrough: cc-switch Integration Guide.
New Models · Apr 24, 2026
🤖 New Models
- GPT-5.5 (OpenAI) — A new flagship for complex, professional workloads. 1M+ token context (922K input / 128K output), with end-to-end gains in reasoning reliability and token efficiency over GPT-5.4
- DeepSeek V4 Pro (DeepSeek) — A 1.6T-parameter MoE flagship with 49B active params and 1M token context, optimized for advanced reasoning, code, and long-running agent workflows
- DeepSeek V4 Flash (DeepSeek) — A 284B-parameter / 13B-active MoE accelerator with 1M token context, built for high throughput and low latency at an aggressive price point
New Models · Apr 21, 2026
🤖 New Models
- Kimi K2.6 (Moonshot AI) — Moonshot’s smartest Kimi yet, with across-the-board upgrades to code, reasoning, and visual understanding
- GPT Image 2 (OpenAI) — Next-generation image model with richer, more accurate detail
New Models · Apr 16, 2026
🤖 New Models
- Claude Opus 4.7 (Anthropic) — Anthropic’s new flagship — another step up in reasoning and writing quality
Campaign · Apr 15, 2026
🎁 GPT April Rebate — Up to $250 Back
- Window — Apr 15 – Apr 25, 11 days only
- Rebate — Flat 25% back across the GPT lineup, six tiers, up to $250
- Redemption — Credits never expire; redeem in one click after the campaign ends
- Teams — Member spend is pooled automatically to unlock higher tiers
Campaign page: GPT April Rebate.
v1.0.55-20260407
🎁 Gift Card System
Enter a gift card code on the Wallet page — balance credits instantly. The most elegant way to give someone AI as a gift.
- Privacy by default — Transaction records show only the last four digits of the card
- Safe by design — Multi-layer anti-abuse protection with end-to-end encryption
🔍 Model Verify Tool
First, let’s set the record straight: OfoxAI is not a reseller gateway.
- Entity — Operated by NICE TALK PTE. LTD. (a global LLM platform)
- Licensing — Official authorization from model providers
- Compute — Azure, AWS, Google Cloud, Alibaba Cloud, Z.AI, Moonshot, Volcano Engine — direct from the cloud providers
- Routing — Edge CDN straight to each provider, no repackaging, no model swapping
So users can verify model authenticity on any LLM gateway, we’ve released a free tool. Point it at any API base + key, and it tells you whether the model has been substituted.
Tool: Model Verify. Works on any platform, not just OfoxAI.
v1.0.54-20260403
💳 Payments and Top-Ups, Upgraded
- Airwallex, alongside Stripe — more choice for international payments
- USD, CNY, or SGD — settle in the currency you already think in
- Top-up cap raised to $10,000 — headroom for larger customers
- $3 first-top-up bonus via partner referral — users referred by a partner get $3 credit on their first top-up, automatically
🏢 Enterprise Page — Spend More, Save More
Automatic rebates when your monthly spend hits a threshold. No application. No sales call. Credit lands on the first of next month.
| Tier | Monthly Spend | Rebate |
|---|---|---|
| Bronze | $1,000+ | 3% |
| Silver | $5,000+ | 4% |
| Gold | $10,000+ | 5% |
| Platinum | $20,000+ | 7% |
Stacks with these enterprise capabilities:
- 0% platform fee — pay the model provider’s list price
- Global edge routing — Tokyo / Singapore / Frankfurt POPs
- 99.99% availability SLA — multi-region redundancy with auto-failover
- Zero content retention — prompts and responses are not logged, not used for training
See: Enterprise.
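As a worked example of the rebate table, the Python sketch below looks up the tier for a given monthly spend. One assumption is labeled loudly: the rate is applied to the whole month's spend, not just the portion above the threshold. This entry does not say which applies, so treat the dollar figures as illustrative. The function name is hypothetical.

```python
from decimal import Decimal

# (tier name, minimum monthly spend in USD, rebate rate), highest tier first
TIERS = [
    ("Platinum", Decimal("20000"), Decimal("0.07")),
    ("Gold", Decimal("10000"), Decimal("0.05")),
    ("Silver", Decimal("5000"), Decimal("0.04")),
    ("Bronze", Decimal("1000"), Decimal("0.03")),
]


def monthly_rebate(spend):
    """Return (tier name, rebate in USD); (None, 0) below the Bronze threshold."""
    spend = Decimal(str(spend))
    for name, threshold, rate in TIERS:
        if spend >= threshold:
            # Assumption: the rate applies to the full monthly spend.
            return name, spend * rate
    return None, Decimal("0")
```

Under that assumption, $12,000 of monthly spend lands in Gold and earns 12,000 × 5% = $600 of credit.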
🤖 New Models
- GLM-5V-Turbo (Zhipu) — Turbo-accelerated variant of GLM’s multimodal line
- Qwen3.6 Plus (Alibaba Bailian) — Latest Plus tier of Qwen3.6
v1.0.47-20260327
🏷️ One Model, Many Names
Short names, legacy IDs — call a model however you want. Migration becomes a no-op. The router normalizes aliases automatically.
A few examples:
| Canonical ID | Aliases |
|---|---|
| anthropic/claude-opus-4.7 | claude-opus-4.7 · claude-opus-4-7 · claude-opus-4-7-20260416 |
| anthropic/claude-sonnet-4.6 | claude-sonnet-4.6 · claude-sonnet-4-6 · claude-sonnet-4-6-20260217 |
| openai/gpt-5.4-pro | gpt-5.4-pro |
| openai/gpt-5.4 | gpt-5.4 |
| moonshotai/kimi-k2.6 | kimi-k2.6 |
| z-ai/glm-5.1 | glm-5.1 |
Fetch the full alias list via GET https://api.ofox.ai/v1/models — every model carries its aliases array in the response.
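If you want to resolve aliases on the client side, a small lookup table is enough. The sketch below assumes each model entry is a dictionary with an `id` key for the canonical ID and the `aliases` array mentioned above. The `id` key and the function names are assumptions, so check them against the real `/v1/models` response before relying on this.

```python
def build_alias_map(models):
    """Map every alias, and each canonical ID itself, to its canonical ID."""
    alias_map = {}
    for model in models:
        canonical = model["id"]  # assumed key name for the canonical ID
        alias_map[canonical] = canonical
        for alias in model.get("aliases", []):
            alias_map[alias] = canonical
    return alias_map


def canonical_id(name, alias_map):
    """Resolve a name to its canonical ID; unknown names pass through unchanged."""
    return alias_map.get(name, name)


# Two rows from the alias table, written in the assumed response shape.
MODELS = [
    {"id": "anthropic/claude-opus-4.7",
     "aliases": ["claude-opus-4.7", "claude-opus-4-7", "claude-opus-4-7-20260416"]},
    {"id": "openai/gpt-5.4", "aliases": ["gpt-5.4"]},
]
```

With `alias_map = build_alias_map(MODELS)`, the call `canonical_id("claude-opus-4-7", alias_map)` resolves to `anthropic/claude-opus-4.7`.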
🖼️ Per-Image Billing
The Images API now bills per generated image, with transparent pricing. Standard sizes map to each provider’s native dimensions automatically — no client-side changes required.
📊 Image Usage, Fully Visible
Image generation is now a first-class dimension on the dashboard, usage, cost, and rankings pages. Monthly image spend is visible at a glance.
🤖 New Models
- GLM 5.1 (Zhipu) — Next-generation GLM with across-the-board capability upgrades
🔗 Shorter Invitation Links
Invitation links shortened to /x/your-code. Easier to remember, easier to share.
v1.0.39-20260320
🔄 Model Fallback — Automatic on Upstream Errors
When the primary model returns a 4xx or 5xx, the gateway automatically tries up to three fallback models. Works across OpenAI, Anthropic, and Gemini. Zero client-side changes. See the Fallback docs.
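The gateway does this for you, but the control flow is easy to picture. The Python sketch below is a simplified model of it, not the gateway's code: `send` is a hypothetical caller-supplied function, and trying the fallbacks strictly in list order is an assumption.

```python
MAX_FALLBACKS = 3  # "up to three fallback models" per this entry


def call_with_fallback(send, primary, fallbacks):
    """Try the primary model, then each fallback, until one avoids a 4xx/5xx.

    `send(model)` is a caller-supplied function returning (status_code, body).
    Returns (model, status_code, body) for the first success or the last failure.
    """
    last = None
    for model in [primary, *fallbacks[:MAX_FALLBACKS]]:
        status, body = send(model)
        last = (model, status, body)
        if status < 400:  # anything below 4xx counts as success here
            return last
    return last  # every attempt failed; surface the final error
```

Returning the model name alongside the response lets the caller see which model actually answered.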
⚔️ OfoxAI vs OpenRouter, Side by Side
OpenRouter charges 5.5% per top-up. We don’t. Same 100+ models, and you keep 10%+ more once you pass $1,000/month in spend. Full breakdown: OfoxAI vs OpenRouter.
🤖 New Models
- GLM-5-Turbo (Zhipu) — Turbo-accelerated variant of GLM-5
- GPT-5.4 Mini / Nano (OpenAI) — Lightweight pair of GPT-5.4, dramatically lower cost per call
- MiniMax M2.7 / M2.7 Highspeed — MiniMax’s next-gen M2.7; Highspeed is tuned for low latency
v1.0.36-20260313
🎊 March Claude Rebate
A clean 20% rebate across every tier. Copy the coupon OFOXAI2603 with one click from the campaign modal.
| Top-Up | Rebate | You Get |
|---|---|---|
| $20 | $4 | $24 |
| $50 | $10 | $60 |
| $100 | $20 | $120 |
| $200 | $40 | $240 |
| $500 | $100 | $600 |
Campaign page: Claude Spring, Round 2.
🤖 New Models
- GPT-5.4 / GPT-5.4 Pro (OpenAI) — New flagship pair; Pro offers a higher reasoning ceiling
- Gemini Embedding 2 Preview (Google) — Google’s next-generation multimodal embedding
🖼️ Embeddings, Every Modality
Gemini Embedding now handles all four modalities: text, image, audio, and video. Direct integrations with Qwen and Volcengine multimodal embeddings ship at the same time.
⚡ Usage Data, Fresh by the Hour
Usage aggregation moved from daily to hourly. Spend shows up on the dashboard shortly after it happens.
💰 Clearer Coupons
Every order now shows discount and gift amounts at a glance.
v1.0.32-20260303
🎉 March Claude Campaign Goes Live
The dashboard gains a campaign banner and a live spend-progress bar. Coupon errors are now localized in English and Chinese. Campaign page: Claude Spring, Round 1.
🤖 New Models
- GPT-5.3 Chat (OpenAI) — Conversation-tuned variant of GPT-5.3
- Gemini 3.1 Flash Lite Preview (Google) — Gemini 3.1’s lightweight preview
- Nano Banana 2 (Google) — Gemini 3.1 Flash Image Preview — next-generation image generation
🏷️ Navigation Refresh
- “My Billing” → “My Wallet” — a closer match to how users think about the page
- “Models” → “Model Plaza” — framed as a catalog to browse
- Blog link added to the header
v1.0.30-20260226
🔒 One-Click GitHub Login
A new GitHub OAuth option on the sign-in page. The system remembers your last login method for next time. Settings supports binding, unbinding, and GitHub profile sync.
🤖 New Models
- The full Qwen3.5 family, all at once (Alibaba Bailian) — Flash / 27B / 35B A3B / 122B A10B / 397B A17B
- GPT-5.3 Codex (OpenAI) — GPT-5.3 purpose-built for code
- Gemini 3.1 Pro Preview (Google) — Gemini 3.1 Pro preview release
- Qwen3 Coder Next (Alibaba Bailian) — Qwen’s new code-specialized model
📱 Mobile-Responsive Console
Users, Organizations, and Orders modules are now fully mobile-responsive. Collapsible sidebar, smart column hiding, and a touch-friendly experience on small screens.
v1.0.27-20260217
📊 Your Analytics Dashboard
Three interactive charts for Usage, Cost, and Requests. See monthly trends, rank your models, and combine filters across Provider, Model, User, API Key, and time range. Which model is doing the heavy lifting? Now it’s obvious.
🤖 New Models
- Claude Sonnet 4.6 (Anthropic) — Sonnet’s latest, the pragmatic value pick
- Qwen3.5 Plus (Alibaba Bailian) — Qwen3.5 Plus tier live
- Doubao Seed 2.0, all four tiers (Volcengine) — Code / Lite / Mini / Pro — the full Seed 2.0 lineup online
🌐 Aligned with OpenAI
A chat/completions request that omits the stream parameter now defaults to non-streaming, exactly like OpenAI. Your code? Unchanged.
v1.0.24-20260212
🤖 New Models
- GLM-5 (Zhipu) — GLM’s new-generation flagship
- MiniMax M2.5 / M2.5 Lightning — MiniMax pair; Lightning tuned for low latency
🎊 First-Login Welcome
On first login, the welcome modal presents all three API endpoints — OpenAI, Anthropic, Gemini — with one-click copy. Paired with a burst of confetti, because first impressions matter to developers too.
🧠 Provider Affinity Cache
When the same user switches between different models, the gateway prefers the same underlying provider. Prompt cache hit rate climbs, responses get faster, costs come down.
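One way to picture provider affinity is a per-user "last provider" memory. The Python sketch below is only a guess at the shape of such logic: the class name, the provider labels, and the rule of falling back to the first candidate are all hypothetical.

```python
class ProviderAffinity:
    """Remember each user's last provider and prefer it when it can serve the model."""

    def __init__(self):
        self._last = {}  # user_id -> provider name used most recently

    def choose(self, user_id, candidates):
        """Pick a provider from `candidates`, which is ordered by default preference."""
        if not candidates:
            raise ValueError("no provider can serve this model")
        preferred = self._last.get(user_id)
        # Stick with the previous provider when possible so its prompt cache stays warm.
        provider = preferred if preferred in candidates else candidates[0]
        self._last[user_id] = provider
        return provider
```

If a user's first request goes to `"aws"`, a later request for a different model that `"aws"` can also serve stays on `"aws"` even when another provider is listed first.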
🎟️ Angel Referral Program
Full referral system shipped: card-based UI, one-click join dialog, and usage-history table. Two-way rewards for both inviter and invitee, plus one-click personal invite poster generation.
v1.0.20-20260206
🤖 New Models
- Claude Opus 4.6 (Anthropic) — Anthropic’s new flagship, raising the bar on reasoning and writing once more
🌍 English / Chinese Parity
Over 1,100 translation keys shipped. Full English / Chinese parity across the platform. Language preference is remembered via cookie.
🔍 Web Search Billing
Web Search tool calls across OpenAI, Anthropic, and Gemini are now accurately billed, per invocation.
📊 Dashboard Refresh
- Personalized greeting by username, instead of a generic “Hi”
- Weekly usage stats replace the single-day view
- API Key display, three modes: none, masked, or full
💵 Clearer Pricing Display
Prices now drop trailing zeros automatically, so $0.6000 displays as $0.6. Low-balance error messages now show amounts in dollars, which is easier to read and needs no mental math.
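The trailing-zero rule is simple to reproduce if you want your own tooling to match the console. A minimal Python sketch follows; the function name is hypothetical and the console's exact formatting code is not shown in this entry.

```python
from decimal import Decimal


def format_price(amount):
    """Render a dollar amount without trailing zeros, e.g. 0.6000 becomes $0.6."""
    text = format(Decimal(str(amount)), "f")  # fixed-point, never scientific notation
    if "." in text:
        # Strip zeros after the decimal point, then a dangling point if one remains.
        text = text.rstrip("0").rstrip(".")
    return f"${text}"
```

Whole-dollar amounts such as `"12.00"` collapse to `$12`, and values without a decimal point are left untouched.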
📚 Documentation Site Launched
- Full OpenAI / Anthropic / Gemini protocol references
- Integration guides for 10+ tools — Claude Code, Codex, Gemini CLI, Zed, Cline, Cherry Studio, OpenClaw, OpenCode, and more — with end-to-end setup coverage
v1.0.1 ~ v1.0.9 · Jan 20 – Feb 1, 2026 — Two Weeks of Laying the Foundation
We didn’t take a breath after launch. Every release in these two weeks made the platform more stable, more precise, and easier to plug in.
💻 Claude Code, First-Class
We build with Claude Code ourselves. On Jan 21, the gateway shipped full Claude Code compatibility — point the API base at OfoxAI, swap the sk-*** key, and every Claude model just works.
🧠 Thinking Blocks
Thinking blocks — the model’s reasoning chain — now flow through end-to-end for Claude and Gemini. You see how the model thinks, not just the answer.
🌐 Native Gemini Protocol
Beyond OpenAI compatibility — Gemini’s native generateContent API is live. Google’s official SDK connects directly, with no translation loss.
💵 Multi-Currency Stripe
CNY, SGD, and more — in addition to USD. Exchange-rate snapshots are stored per order. Asia-Pacific users can now pay in their local currency.
🎯 Billing Precision to 6 Decimals
NanoDollar-level precision. Even an API call that costs a fraction of a cent is recorded and billed accurately. No rounding away large customer savings. No shortchanging small ones.
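To see why six decimal places matter, consider a single small call. The Python sketch below assumes per-million-token pricing and banker's rounding, neither of which is specified in this entry, and the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN

SIX_PLACES = Decimal("0.000001")  # six decimal places of a dollar


def call_cost(tokens, usd_per_million_tokens):
    """Cost of one call in USD, kept to six decimal places."""
    cost = Decimal(tokens) * Decimal(str(usd_per_million_tokens)) / Decimal(1_000_000)
    # Rounding mode is an assumption; the changelog only states the precision.
    return cost.quantize(SIX_PLACES, rounding=ROUND_HALF_EVEN)
```

A 1,234-token call at an assumed $0.15 per million tokens costs $0.000185. Rounded to cents it would be recorded as $0.00, which is exactly the loss that six-decimal billing avoids.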
v1.0.0 · Jan 16, 2026 — The Gateway Goes Live
“From today on, one hundred models. One key.”
This is the day the OfoxAI platform opened to the public.
🚀 Day-One Capabilities
- Three protocols, one surface — OpenAI, Anthropic, and Gemini, all natively compatible. Zero code changes
- 100+ models — Claude, GPT, Gemini, DeepSeek, Qwen, and more — unified behind a single key. Full catalog: Model Plaza
- Smart routing — Provider × Model level routing chooses the fastest, steadiest path automatically. See Provider Routing
- API keys, self-serve — Create, rotate, and observe usage from the Dashboard
- Pay-as-you-go — The model provider’s list price. Zero platform fee. See Pricing
- Stripe checkout — Credit-card top-ups, balance tracked in real time
- Global edge — Tokyo, Singapore, and Frankfurt points of presence
🌐 The Infrastructure Underneath
Not a reseller gateway. A platform. Requests flow through edge CDN straight to Azure, AWS, Google Cloud, Alibaba Cloud, Z.AI, Moonshot, and Volcano Engine.
Day 1 · Dec 27, 2025 — How It Began
“Give developers the simplest way to reach the smartest models in the world.”
🦊 The First Line of Code
In late December 2025, a single small commit laid down the first line of OfoxAI’s code:
feat: initialize ofox-studio monorepo
⚡ The Moment We Knew
Three days later, we got two things working at the same time: Claude on AWS Bedrock, and GPT on Azure — two hyperscalers, two top-tier models, directly connected, no reseller in the middle.
When both responses landed in the terminal at the same moment, we knew: this is going to work.
This wasn’t a demo-grade adapter. This was real multi-cloud direct connectivity. Google Cloud, Alibaba Cloud, Z.AI, Moonshot, and Volcano Engine followed, one after another. “Not a reseller gateway, a platform” — that principle was set in stone from Day 3.
🌱 The Starting Point
commit 0001
One line of code, one direction. Make the world’s smartest intelligence accessible to anyone.
Engines, ignite.