MiniMax M2.7 API Pricing 2026: Free Tier, Setup, and How It Stacks Against DeepSeek and Kimi

TL;DR: MiniMax M2.7 sits in the open-weight sweet spot: $0.30 input / $1.20 output per million tokens, a 205K context window, and a 50 on Artificial Analysis’s Intelligence Index, close to closed-source flagships at roughly a tenth of their price. It is now a better default than the original M2 (released October 2025) and slots in cleanly next to DeepSeek V4 and Kimi K2.6 for Chinese-vendor coverage. Free trial credits are available at signup, and a recently updated license permits free personal use; commercial deployment still requires the paid API or separate authorization.

For builders running long-horizon agents on a budget, MiniMax M2.7 is the first open-weight model that hits Claude Opus territory on real benchmarks without Claude Opus pricing — at $1.20 per million output tokens, you’re paying roughly 5% of Opus 4.7’s $25/M output rate for an Intelligence Index score within striking distance.

This is a comparison piece — not the solo MiniMax M2.5/M2.7 access guide. Here we focus on whether MiniMax M2.7 actually wins for your workload once you stack it against the other two Chinese contenders developers are seriously evaluating in mid-2026.

MiniMax M2.7 at a glance

MiniMax M2.7 is a 230-billion-parameter Mixture-of-Experts model with 10B active parameters at inference time, released March 18, 2026. It is positioned as an “agentic” model — trained to handle long-running tool-use workflows like multi-step debugging, document generation, and root cause analysis rather than one-shot chat.

| Spec | MiniMax M2.7 |
| --- | --- |
| Released | March 18, 2026 |
| Context window | 205,000 tokens |
| Parameters | 230B total / 10B active (MoE) |
| Input price (official API) | $0.30 / M tokens |
| Output price (official API) | $1.20 / M tokens |
| Artificial Analysis Intelligence Index | 50 |
| Output speed | ~48 tokens/sec (standard); ~100 tok/s (HighSpeed variant) |
| License | Open weights, Modified-MIT (commercial use requires authorization) |

A few things worth flagging from those numbers:

  • The Intelligence Index jump from M2 (36) to M2.7 (50) is large. That moves it from “decent open-weight” to “competitive with closed-source flagships on most reasoning tasks.” For context, Claude Opus 4.7 sits in the high 50s to low 60s on the same index.
  • ~48 tok/s is below average for open-weight models of its size, and well below the ~96 tok/s median for comparable reasoning models. If your workload is latency-sensitive (interactive coding agents, voice frontends), evaluate the M2.7 HighSpeed variant (~100 tok/s) or benchmark before committing.
  • 205K context is real, but effective accuracy past 128K still degrades on most open models. Treat the upper third as room for retrieval, not as a place you actually expect the model to reason across.
  • The “open weights” label needs an asterisk. M2.7 ships under a Modified-MIT license that explicitly bans commercial deployment without prior authorization — a break from MiniMax’s previous MIT releases (M2, M2.5). Personal, research, and self-hosted experimentation are free; running it as a service for paying customers is not.
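To make the throughput point concrete, here is a minimal sketch that converts the two speeds quoted above into wall-clock streaming time. It assumes a steady decode rate and ignores time-to-first-token and network overhead, so treat the results as a lower bound rather than a measurement. The dictionary keys are illustrative labels, not API model names.

```python
# Throughput figures quoted in this article; real speeds vary by provider and load.
SPEEDS_TOK_PER_SEC = {
    "m2.7-standard": 48.0,    # illustrative label, not an API model ID
    "m2.7-highspeed": 100.0,  # illustrative label, not an API model ID
}


def generation_seconds(output_tokens: int, tokens_per_sec: float) -> float:
    """Seconds needed to stream `output_tokens` at a steady decode rate."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return output_tokens / tokens_per_sec


for label, speed in SPEEDS_TOK_PER_SEC.items():
    seconds = generation_seconds(1200, speed)
    print(f"{label}: {seconds:.1f}s for a 1,200-token reply")
```

For a 1,200-token reply that works out to about 25 seconds on the standard model versus 12 seconds on HighSpeed, which is the gap an interactive user would actually feel.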

Free tier and signup credits

MiniMax M2.7 has three free-access paths in 2026:

  1. Direct trial credits via platform.minimax.io. New accounts get credits at signup; the exact amount has shifted across promotions, so check the dashboard after activating your developer account. Credits are tied to phone or email verification.
  2. Free personal use under the updated M2.7 license. Self-hosted weights from Hugging Face are free for individual programming, research, and non-commercial application building. Commercial production deployment still requires the paid API or a separate license grant.
  3. Hugging Face Spaces hosted demo. Fine for kicking the tires on prompts, but not a real API endpoint for production. OpenRouter also exposes M2.7 if you’d rather not register a MiniMax account, though that route is pay-as-you-go rather than free.

If you want to test M2.7 alongside competitors without juggling separate accounts, the aggregator route is simpler. ofox.ai’s unified API carries M2.5 and M2.7 plus DeepSeek, Kimi, Qwen, and the closed-source flagships through a single OpenAI-compatible endpoint. One key, one base URL, swap the model field to compare.

Setup via the OpenAI SDK (works direct or through ofox)

MiniMax exposes an OpenAI-compatible chat completions endpoint, so if you’ve already written code against openai-python or openai-node, swap the base URL and model field — nothing else changes.

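import os

from openai import OpenAI

# OFOX_API_KEY is an illustrative variable name; use whatever your environment defines.
client = OpenAI(
    api_key=os.environ["OFOX_API_KEY"],
    base_url="https://api.ofox.ai/v1",  # or the MiniMax direct endpoint
)

response = client.chat.completions.create(
    model="minimax/minimax-m2.7",  # ofox naming; confirm via GET /v1/models
    messages=[{"role": "user", "content": "Refactor this Python script for async I/O."}],
)
# The first choice carries the assistant's reply text.
print(response.choices[0].message.content)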

The model identifier convention on ofox follows <vendor>/<model> — check GET /v1/models for the current list, since model strings shift when vendors rename versions. The OpenAI SDK migration guide covers the auth and base URL swap in more detail if you’re moving an existing project.
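Since model strings shift, it can help to look them up rather than hard-code them. The sketch below filters the model list for a vendor substring. `client.models.list()` is the OpenAI SDK's wrapper for GET /v1/models; the environment variable name `OFOX_API_KEY` and the helper names are assumptions made for illustration, and the network call only runs if that variable is set.

```python
import os


def matching_model_ids(model_ids, needle):
    """Return the IDs containing `needle` (case-insensitive), sorted."""
    needle = needle.lower()
    return sorted(mid for mid in model_ids if needle in mid.lower())


def fetch_model_ids(api_key, base_url):
    """Fetch model IDs via GET /v1/models using the OpenAI SDK."""
    # Imported lazily so the filter above is usable without the SDK installed.
    from openai import OpenAI

    client = OpenAI(api_key=api_key, base_url=base_url)
    return [model.id for model in client.models.list()]


# Hypothetical variable name; nothing is sent over the network unless it is set.
api_key = os.environ.get("OFOX_API_KEY")
if api_key:
    ids = fetch_model_ids(api_key, "https://api.ofox.ai/v1")
    print(matching_model_ids(ids, "minimax"))
```

Resolving the identifier at startup, and failing loudly if nothing matches, is one way to catch a vendor rename before it surfaces as a 404 in production.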

For tool use and agent workflows — which is where M2.7’s training shines — wire it up with the standard OpenAI tools parameter. The function calling complete guide has the schema patterns; nothing M2.7-specific is required.
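As a rough illustration of that wiring, the sketch below defines one tool in the OpenAI `tools` format, builds the request arguments, and dispatches a returned tool call to a local function. The tool itself (`read_log_tail`) and the helper names are hypothetical; only the schema shape and the fact that tool-call arguments arrive as a JSON string come from the OpenAI chat completions format.

```python
import json

# Hypothetical tool for illustration; the outer shape follows the OpenAI "tools" format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "read_log_tail",
        "description": "Return the last N lines of a named service log.",
        "parameters": {
            "type": "object",
            "properties": {
                "service": {"type": "string"},
                "lines": {"type": "integer", "minimum": 1},
            },
            "required": ["service"],
        },
    },
}]


def read_log_tail(service, lines=20):
    """Stand-in implementation; a real agent would read actual logs here."""
    return f"(last {lines} lines of {service}.log)"


LOCAL_TOOLS = {"read_log_tail": read_log_tail}


def run_tool_call(name, arguments_json):
    """Dispatch one tool call; arguments arrive from the API as a JSON string."""
    if name not in LOCAL_TOOLS:
        raise KeyError(f"model requested unknown tool: {name}")
    return LOCAL_TOOLS[name](**json.loads(arguments_json))


def build_request(user_prompt):
    """Keyword arguments for client.chat.completions.create(**kwargs)."""
    return {
        "model": "minimax/minimax-m2.7",
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": TOOLS,
    }


# With a real client, each entry in response.choices[0].message.tool_calls
# exposes .function.name and .function.arguments, which feed run_tool_call().
print(run_tool_call("read_log_tail", '{"service": "api", "lines": 5}'))
```

Keeping the dispatcher separate from the API call makes the tool layer testable without spending tokens.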

MiniMax M2.7 vs DeepSeek V4 vs Kimi K2.6

This is the comparison most developers actually need. All three are Chinese-vendor models, all three sit in the “cheap-but-capable” tier, and the differences are not obvious from the marketing pages.

| Dimension | MiniMax M2.7 | DeepSeek V4 Flash / Pro | Kimi K2.6 |
| --- | --- | --- | --- |
| Input price (official) | $0.30 / M | $0.14 / M (Flash) · $1.74 / M (Pro list) | $0.95 / M |
| Output price (official) | $1.20 / M | $0.28 / M (Flash) · $3.48 / M (Pro list) | $4.00 / M |
| Context window | 205K | 1,000K | 260K |
| Released | March 2026 | April 24, 2026 | April 20, 2026 |
| Multimodal | Text only | Text only | Text + image + video |
| Strongest at | Long-horizon agents, multi-step workflows | Math, code, sheer cost-per-token (Flash); flagship reasoning (Pro) | Coding, UI/UX generation, tool use |
| Output speed | ~48 tok/s (standard) | Fast | Moderate |
| Cache pricing | Standard | 98% off cache-hit input ($0.0028/M on Flash) | Provider-dependent |
| License | Modified-MIT (non-commercial) | Open weights | Open weights |

Read it dimension by dimension:

Raw price-per-token. DeepSeek V4 Flash demolishes everyone on raw cost — $0.14 in / $0.28 out per million tokens, with a 1M context window. If your workload is output-heavy (generating documents, long answers) and you don’t need M2.7’s agent training, V4 Flash is the cheaper hammer by a wide margin. DeepSeek’s cache-hit discount (98% off on repeated context — $0.0028/M cached input on Flash) cuts costs further on RAG-style workloads where the retrieved context is stable. V4 Pro is the flagship comparison if you need reasoning depth: list price $1.74/$3.48, currently on a 75%-off promotion through May 31, 2026 ($0.435/$0.87).
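To see how these list prices play out on a concrete workload, here is a small cost sketch using the figures from the table. The monthly volumes and the 80% cache-hit share are made-up inputs. The cache discount is applied only to V4 Flash, because that is the only cached rate this article states; the other models are billed at list price here, which is an assumption rather than a statement about their actual cache pricing.

```python
# USD per million tokens, taken from the comparison table in this article.
PRICES = {
    "MiniMax M2.7": {"input": 0.30, "output": 1.20},
    "DeepSeek V4 Flash": {"input": 0.14, "output": 0.28},
    "DeepSeek V4 Pro": {"input": 1.74, "output": 3.48},
    "Kimi K2.6": {"input": 0.95, "output": 4.00},
}
# Fraction knocked off cached input. Only the Flash figure is stated in the
# article; every other model is assumed to have no discount in this sketch.
CACHE_HIT_DISCOUNT = {"DeepSeek V4 Flash": 0.98}


def workload_cost(model, input_tokens, output_tokens, cached_fraction=0.0):
    """Cost in USD, with `cached_fraction` of the input served from cache."""
    price = PRICES[model]
    discount = CACHE_HIT_DISCOUNT.get(model, 0.0)
    effective_input = price["input"] * (1 - cached_fraction * discount)
    return (input_tokens * effective_input + output_tokens * price["output"]) / 1_000_000


# Made-up workload: 500M input tokens, 50M output tokens, 80% of input cached.
for name in PRICES:
    cost = workload_cost(name, 500_000_000, 50_000_000, cached_fraction=0.8)
    print(f"{name}: ${cost:,.2f}")
```

Swapping in your own token volumes is the quickest way to check whether the per-token gap matters at your scale or disappears into rounding error.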

Context window. DeepSeek V4 ships with a 1M-token window, easily the largest of the three. Kimi K2.6 sits at 260K, MiniMax M2.7 at 205K. Worth pairing this with our long-context benchmarks article — none of these is great past 128K once you measure actual retrieval accuracy rather than nominal window, so don’t over-index on the numeric ceiling.
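One way to act on that advice is to budget prompts against an effective limit instead of the nominal window. The sketch below packs retrieved chunks, most relevant first, until a 128K budget is spent. The 4-characters-per-token estimate is a crude heuristic for English text, not any vendor's tokenizer, and the function names and the 8,000-token output reserve are illustrative choices.

```python
EFFECTIVE_LIMIT = 128_000  # tokens; the accuracy threshold this article assumes


def estimate_tokens(text):
    """Very rough estimate: about 4 characters per token for English prose."""
    return max(1, len(text) // 4)


def select_chunks(chunks, reserved_for_output=8_000, limit=EFFECTIVE_LIMIT):
    """Keep chunks in the given (most-relevant-first) order until the budget is spent.

    Returns the kept chunks and the estimated tokens they consume.
    """
    budget = limit - reserved_for_output
    kept, used = [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break  # stop at the first chunk that does not fit
        kept.append(chunk)
        used += cost
    return kept, used


docs = ["x" * 200_000, "y" * 200_000, "z" * 200_000]  # ~50K tokens each
kept, used = select_chunks(docs)
print(f"kept {len(kept)} of {len(docs)} chunks, ~{used:,} tokens")
```

For production use, replacing the character heuristic with the provider's real token counts would make the budget trustworthy near the boundary.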

Modality. Only Kimi K2.6 handles image and video input. If your stack is text-only, this is irrelevant. If you’re doing UI-from-screenshot or document-with-figures workflows, Kimi is the only one of the three you can actually use.

Agent workloads. MiniMax M2.7 was explicitly trained for multi-agent collaboration, live debugging, and document generation across multi-step pipelines. Kimi K2.6 was trained for long-horizon coding and multi-agent orchestration. Both are agent-strong; the practical difference is that MiniMax tends to maintain coherence over longer plans, while Kimi is faster to iterate within a tight code-edit loop. DeepSeek V4 Pro is positioned for long-horizon agent workflows too, but V4 Flash is best treated as a low-cost reasoning workhorse rather than a fully autonomous agent driver.

Tooling maturity. All three expose OpenAI-compatible APIs at this point. DeepSeek’s tool-calling implementation is the most battle-tested (it’s been around longest), Kimi’s is solid, MiniMax’s is newer and you’ll occasionally hit edge cases with deeply nested tool schemas.

Which one should you pick?

A simple decision rule that has held up across our use of all three:

  • Pick MiniMax M2.7 if you’re building long-horizon agents — research assistants, multi-step debugging, financial modeling pipelines — where the model needs to maintain a coherent plan across dozens of tool calls. The agentic training shows up most on workloads longer than 5 minutes of continuous reasoning. Just make sure the Modified-MIT license fits your commercial posture.
  • Pick DeepSeek V4 Flash if your workload is high-volume reasoning or code completion with stable context (RAG, document Q&A) and you want the lowest dollar-per-task cost. The 98% cache-hit discount is the kicker: on workloads where 80% of your context is the same retrieved chunks, your blended input rate drops to roughly 22% of list price (0.2 + 0.8 × 0.02 = 0.216). Step up to DeepSeek V4 Pro if you need more reasoning depth and can absorb the higher list price.
  • Pick Kimi K2.6 if you need multimodal input (screenshots, diagrams, video) or you’re doing coding-driven UI generation. K2.6 is the only one of the three that handles non-text input natively, and it’s been clocking competitive SWE-bench scores against Claude Opus on real coding workflows.

For mixed workloads — which is most production teams — running all three behind an aggregator and routing per-task-type is the right move. We cover that pattern in detail in the LLM API selection decision matrix and the hybrid routing pattern guide for production setups.
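A per-task router can start out very small. The sketch below encodes the decision rule from this section as a lookup table. The task-type labels are hypothetical, and the values are display names rather than API model strings, which you would resolve against your aggregator's model list.

```python
# Task labels are illustrative; values are display names, not API model IDs.
ROUTES = {
    "agent": "MiniMax M2.7",                 # long-horizon, many tool calls
    "bulk_reasoning": "DeepSeek V4 Flash",   # high volume, stable context
    "deep_reasoning": "DeepSeek V4 Pro",     # more reasoning depth, higher price
    "multimodal": "Kimi K2.6",               # image or video input
}


def pick_model(task_type, has_images=False):
    """Choose a model per the article's decision rule."""
    if has_images:
        # Only Kimi K2.6 accepts non-text input among the three vendors.
        return ROUTES["multimodal"]
    if task_type not in ROUTES:
        raise ValueError(f"no route defined for task type: {task_type!r}")
    return ROUTES[task_type]


print(pick_model("agent"))
print(pick_model("bulk_reasoning"))
print(pick_model("agent", has_images=True))
```

Raising on unknown task types, rather than silently falling back to a default, keeps routing mistakes visible while the table is still being tuned.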

What changed from M2 to M2.7

If you’re already on M2 (the October 2025 release), the upgrade decision is mostly straightforward — with one caveat:

  • Intelligence Index: 36 → 50. That’s a meaningful capability jump, not a polish release. M2.7 handles reasoning chains that M2 visibly struggles with.
  • Pricing: $0.30 input, $1.20 output. M2.7 lands at roughly the same per-token rate as M2 on the official API — the capability delta is essentially free.
  • Context: ~205K, unchanged. Don’t upgrade for context — both M2 and M2.7 sit in the same window.
  • Agentic training. This is the real upgrade — M2.7 was explicitly post-trained for multi-agent and long-horizon workflows. M2’s agent behavior was an emergent capability; M2.7’s is intentional.
  • License caveat. M2 was MIT (commercial OK). M2.7 is Modified-MIT and bars commercial use without authorization. If your deployment was relying on the MIT terms for self-hosted commercial inference, you’ll either need to stay on M2/M2.5, route through the paid API, or get separate authorization from MiniMax.

There’s no migration code work. The model identifier is the only thing that changes on your side. Run a sample of your real prompts through both to confirm the quality bump shows up on your specific workload before flipping production over.
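That side-by-side check can be sketched as a small harness. Here `ask` and `judge` are caller-supplied functions and entirely hypothetical: `ask(model, prompt)` would wrap your chat completions call, and `judge(prompt, old_answer, new_answer)` returns "old", "new", or "tie", whether it is a human review step or an automated grader. The model strings in the demo are placeholders.

```python
def compare_models(prompts, ask, old_model, new_model, judge):
    """Run every prompt through both models and tally which answer is preferred."""
    tally = {old_model: 0, new_model: 0, "tie": 0}
    for prompt in prompts:
        old_answer = ask(old_model, prompt)
        new_answer = ask(new_model, prompt)
        verdict = judge(prompt, old_answer, new_answer)  # "old", "new", or "tie"
        # Any unexpected verdict is counted as a tie rather than raising.
        key = {"old": old_model, "new": new_model}.get(verdict, "tie")
        tally[key] += 1
    return tally


# Stub functions so the harness runs without network access.
def fake_ask(model, prompt):
    return f"[{model}] answer to: {prompt}"


def fake_judge(prompt, old_answer, new_answer):
    return "new" if len(new_answer) > len(old_answer) else "tie"


print(compare_models(["q1", "q2"], fake_ask, "m2", "m2.7", fake_judge))
```

Because the only production change is the model identifier, the same `ask` wrapper can serve both sides of the comparison.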

Common gotchas

A few rough edges worth knowing before you commit:

  1. Output speed. ~48 tok/s on the standard model is noticeably slower than DeepSeek V4 or Kimi K2.6 at default settings. If you’re streaming to a user, plan for that — or evaluate the M2.7 HighSpeed variant (~100 tok/s) at roughly 2x the price.
  2. Tool schema edge cases. Deeply nested or recursive JSON schemas in function definitions occasionally trip M2.7. The standard workaround is flattening to one or two levels of nesting.
  3. License surprise. The “open weights” framing is genuine, but the Modified-MIT license bars commercial use without prior authorization. Read the actual LICENSE file in MiniMax’s Hugging Face repo before you ship anything that runs the weights for paying customers.
  4. Effective context. Like every model in this tier, accuracy past 128K is materially worse than at 32K. Use the 205K window as room for retrieval, not as a license to dump your entire codebase into the prompt and expect perfect recall.
  5. Region and compliance. MiniMax is a China-headquartered vendor. If your compliance posture requires keeping data out of China-located inference, route through an aggregator that gives you a non-China inference path or pick DeepSeek/Kimi via a Western-hosted aggregator. The LLM API gateway choice guide covers the routing options.
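For the tool-schema gotcha, a pre-flight check can flag definitions that nest too deeply and hoist nested properties to the top level. This is a minimal sketch under simplifying assumptions: it handles plain `properties` and `items`, does not resolve `$ref`, and drops `required` lists when flattening. The `__` separator and the function names are arbitrary choices for illustration.

```python
def nesting_depth(schema):
    """Depth of nested objects/arrays in a JSON Schema fragment (a flat object is 1)."""
    if not isinstance(schema, dict):
        return 0
    children = list(schema.get("properties", {}).values())
    if "items" in schema:
        children.append(schema["items"])
    deepest = max((nesting_depth(child) for child in children), default=0)
    is_container = schema.get("type") in ("object", "array")
    return deepest + (1 if is_container else 0)


def flatten_properties(schema, prefix="", sep="__"):
    """Hoist nested object properties to one level, joining names with `sep`."""
    flat = {}
    for name, sub in schema.get("properties", {}).items():
        key = f"{prefix}{sep}{name}" if prefix else name
        if sub.get("type") == "object" and "properties" in sub:
            flat.update(flatten_properties(sub, key, sep))
        else:
            flat[key] = sub
    return flat


nested = {
    "type": "object",
    "properties": {
        "filter": {
            "type": "object",
            "properties": {
                "range": {
                    "type": "object",
                    "properties": {"start": {"type": "string"}},
                },
            },
        },
        "limit": {"type": "integer"},
    },
}

flat_schema = {"type": "object", "properties": flatten_properties(nested)}
print(nesting_depth(nested), "->", nesting_depth(flat_schema))
print(sorted(flat_schema["properties"]))
```

The tool handler then has to rebuild the nested structure from the flattened argument names before calling the underlying function.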

Bottom line

MiniMax M2.7 has earned its slot as one of the three Chinese-vendor models worth seriously evaluating in mid-2026. The pricing is competitive (closer to DeepSeek V4 Flash than to Kimi on input, well under Kimi on output), the agentic training is the real selling point, and the free trial credits plus free-personal-use license are generous enough to actually finish a meaningful proof of concept. The version-number bump from M2 to M2.7 is bigger than the decimal suggests — if you’re still on M2 and don’t depend on the original MIT license terms, upgrade now.

The honest test for any open-weight model in 2026 isn’t “can it match Claude Opus on benchmarks?” — it’s “can it run your workload at a tenth of the cost without making you babysit it?” MiniMax M2.7 is the first model in this class where the answer is plausibly yes.

If you’re picking one for an agent stack, M2.7 is now the default Chinese-vendor pick over Kimi K2.6 for text-only workloads and over DeepSeek V4 Flash for anything resembling autonomous tool use. For broader model selection beyond just the Chinese-vendor tier, our LLM leaderboard and the Claude vs GPT vs Gemini comparison guide cover the full picture against closed-source flagships.

