Markup: 25% over direct provider price
Credit purchase fee: 0% (pay what you see)
Free signup credit: $2.00 (≈ 10M tokens on DeepSeek V4-Flash)
Minimum top-up: $20.00 (no monthly commitment)

Per-model pricing

All prices in USD per 1 million tokens. Input = your prompt, Output = model's response. Prices already include our 25% markup.

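Each entry gives the model, its API ID and notes, then the input price, output price, and context window. The XinoAPI price comes first, with the direct provider price in parentheses.

DeepSeek V4-Flash (Popular)
deepseek-v4-flash · 1M context, 384K output, dual reasoning modes
Input: $0.175 (direct $0.14) · Output: $0.35 (direct $0.28) · Context: 1M

DeepSeek V4-Pro (Flagship)
deepseek-v4-pro · 1.6T params, SWE-bench 80.6%, approaches Claude Opus
Input: $0.533 (direct $0.4264) · Output: $1.066 (direct $0.8528) · Context: 1M

DeepSeek Chat (legacy)
deepseek-chat · Alias of v4-flash non-thinking · deprecates 2026-07-24
Input: $0.175 (direct $0.14) · Output: $0.35 (direct $0.28) · Context: 128K

DeepSeek Reasoner (legacy)
deepseek-reasoner · Alias of v4-flash thinking · deprecates 2026-07-24
Input: $0.533 (direct $0.4264) · Output: $1.066 (direct $0.8528) · Context: 128K

Qwen Plus (New)
qwen3.6-plus · Alibaba flagship, 78.8% SWE-Bench, 1M context
Input: $0.50 (direct $0.40) · Output: $1.50 (direct $1.20) · Context: 1M

Qwen Max (Flagship)
qwen3.6-max-preview · #1 on 6 coding benchmarks, closed weights
Input: $2.00 (direct $1.60) · Output: $8.00 (direct $6.40) · Context: 260K

Qwen Turbo
qwen-turbo · Fast and cheap, good for high-volume
Input: $0.0625 (direct $0.05) · Output: $0.25 (direct $0.20) · Context: 131K

GLM-5.1 (New)
glm-5.1 · Zhipu flagship, #1 on SWE-Bench Pro
Input: $1.25 (direct $1.00) · Output: $5.00 (direct $4.00) · Context: 200K

GLM-5 (New)
glm-5 · 744B MoE, approaches Claude Opus-level coding
Input: $1.25 (direct $1.00) · Output: $5.00 (direct $4.00) · Context: 200K

GLM-4.7
glm-4.7 · 73.8% SWE-Bench, best value for coding
Input: $0.1875 (direct $0.15) · Output: $0.75 (direct $0.60) · Context: 128K

GLM-4.7 Flash (Low cost)
glm-4.7-flash · Free tier, good for simple completions
Input: $0.0625 (direct $0.05) · Output: $0.0625 (direct $0.05) · Context: 203K

Kimi K2.6 (New)
kimi-k2.6 · Long-context coding stability, 256K window
Input: $0.75 (direct $0.60) · Output: $3.75 (direct $3.00) · Context: 256K

Kimi K2.5
kimi-k2.5 · 1T MoE, 32B active, proven stable
Input: $0.75 (direct $0.60) · Output: $3.75 (direct $3.00) · Context: 256K

MiniMax M2.7
MiniMax-M2.7 · 230B MoE, strong on software engineering
Input: $0.375 (direct $0.30) · Output: $1.50 (direct $1.20) · Context: 245K

The price in parentheses is the direct provider price. The XinoAPI price is what you pay. All prices in USD per 1 million tokens.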
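
The two prices listed for each model are related by a flat multiplier. As a minimal sketch, assuming the 25% markup is applied uniformly to every model (the function name `xino_price` is illustrative and not part of any XinoAPI SDK), the listed prices can be reproduced from the direct provider prices:

```python
from decimal import Decimal

MARKUP = Decimal("0.25")  # the 25% markup stated on this page


def xino_price(direct_price_usd: str) -> Decimal:
    """Return the marked-up price per 1M tokens for a direct provider price."""
    return Decimal(direct_price_usd) * (1 + MARKUP)


# Direct provider prices per 1M tokens (input, output), copied from the table.
direct_prices = {
    "deepseek-v4-flash": ("0.14", "0.28"),
    "deepseek-v4-pro": ("0.4264", "0.8528"),
    "qwen3.6-plus": ("0.40", "1.20"),
}

for model, (direct_in, direct_out) in direct_prices.items():
    print(model, xino_price(direct_in), xino_price(direct_out))
```

Decimal arithmetic is used here so that values such as 0.4264 × 1.25 come out exactly rather than with binary floating-point noise.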

Real-world cost examples

What common workloads actually cost at XinoAPI prices.

Chat Assistant (500 users/day)

$11/mo
Model: DeepSeek V4-Flash
Daily requests: 500
Avg tokens/req: 2K in, 1K out
Monthly input: 30M tok
Monthly output: 15M tok

Code Assistant (10 devs)

$19/mo
Model: DeepSeek V4-Flash
Daily requests: 300
Avg tokens/req: 8K in, 2K out
Monthly input: 72M tok
Monthly output: 18M tok

Document Summary (bulk)

$48/mo
Model: Qwen 3.6 Plus
Documents/month: 10,000
Avg tokens/doc: 8K in, 500 out
Monthly input: 80M tok
Monthly output: 5M tok

Reasoning Workflow

$34/mo
Model: DeepSeek V4-Pro
Daily requests: 100
Avg tokens/req: 5K in, 8K out
Monthly input: 15M tok
Monthly output: 24M tok
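
The monthly figures above follow from request volume, average token counts, and the per-1M prices. The sketch below shows that arithmetic for the two DeepSeek V4-Flash workloads, assuming a 30-day month and "K" meaning 1,000 tokens (both assumptions inferred from the monthly totals, not stated explicitly). The function name `monthly_cost` is illustrative.

```python
def monthly_cost(requests_per_day, in_tokens, out_tokens,
                 in_price, out_price, days=30):
    """Estimate monthly cost in USD; prices are USD per 1M tokens."""
    monthly_in = requests_per_day * in_tokens * days
    monthly_out = requests_per_day * out_tokens * days
    return (monthly_in * in_price + monthly_out * out_price) / 1_000_000


# DeepSeek V4-Flash: $0.175 input / $0.35 output per 1M tokens.
chat = monthly_cost(500, 2_000, 1_000, 0.175, 0.35)
code = monthly_cost(300, 8_000, 2_000, 0.175, 0.35)

print(f"Chat assistant: ${chat:.2f}/mo")
print(f"Code assistant: ${code:.2f}/mo")
```

Under these assumptions the two workloads come to about $10.50 and $18.90 per month, which the cards round to $11 and $19. Swapping in your own request shape and the prices of another model gives a comparable estimate.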

Free to start, no credit card

Every new account gets $2.00 in credits on signup. That's roughly 10 million tokens on DeepSeek V4-Flash — enough to build and ship a working prototype.

$2.00

Production billing

No token-price discounting. Larger customers get operational guarantees, billing support, and dedicated routing instead of hidden price cuts.

$20 minimum · PAYG · Cards via Stripe
$500+ · Invoice · ACH / wire available
$5K+/mo · SLA · Priority routing and support
Enterprise · Custom · Dedicated channels and compliance

For enterprise volumes (>$5K/month), invoicing, or dedicated routing, contact sales.

Data and compliance notes

Pricing is not the whole decision. XinoAPI is designed for users outside mainland China and routes requests to third-party model providers with different data policies.

Mainland China access
Current policy: Not permitted for registration, purchase, dashboard access, or API use.
Why it matters: Maintains a clear cross-border service boundary for Chinese LLM inference export.

Prompt/response storage
Current policy: No plaintext content retention by default; billing uses metadata such as model, tokens, status, and timestamps.
Why it matters: Reduces data exposure for production agent and application workloads.

Provider terms
Current policy: Users must comply with each upstream provider's terms, data policy, and regional restrictions.
Why it matters: XinoAPI is a gateway, not the developer or operator of upstream models.

Sensitive data
Current policy: Use the Privacy SDK for local PII and secret redaction before sending prompts.
Why it matters: Provider-side policies vary, especially for models operated in mainland China.

See the Compliance Center and Security Whitepaper for the full policy.

Pricing FAQ

Is XinoAPI cheaper than using DeepSeek or Qwen directly?

No. Direct provider prices are 20% lower than XinoAPI's, because XinoAPI adds a 25% markup on top of them. You pay that markup for unified billing, no-KYC access, sub-200ms latency from Singapore, and the included Privacy SDK. If you're based in China and only use one model, direct providers will be cheaper. If you're outside China, need multiple models, or want built-in security features, XinoAPI is usually the better economic choice despite the markup.

Are there any hidden fees?

No. You pay per token at the published rate. There's no API request fee separate from tokens, no inactivity fee, and credits never expire. The minimum Stripe top-up is $20 to keep card processing overhead sustainable.

How is this different from OpenRouter's pricing?

OpenRouter charges 0% token markup + 5.5% fee on credit purchases (minimum $0.80). XinoAPI charges 25% token markup + 0% purchase fee. At typical usage (~$50-100/month), OpenRouter ends up roughly 16% cheaper in total (1.055 / 1.25 ≈ 0.84). XinoAPI's value is in the security features, Chinese model specialization, and no-KYC requirement, not price competition.
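
To see how the gap depends on purchase size, here is a rough sketch that uses only the figures quoted in this answer. It assumes OpenRouter's token prices equal the direct provider prices, that each month's credits are bought in a single purchase, and that the $0.80 minimum applies per purchase. The function names are illustrative.

```python
def openrouter_cost(direct_spend_usd, fee_rate=0.055, min_fee=0.80):
    """Direct-price token spend plus the credit purchase fee."""
    fee = max(direct_spend_usd * fee_rate, min_fee)
    return direct_spend_usd + fee


def xinoapi_cost(direct_spend_usd, markup=0.25):
    """Direct-price token spend plus the 25% token markup, no purchase fee."""
    return direct_spend_usd * (1 + markup)


for spend in (10, 50, 100):
    o = openrouter_cost(spend)
    x = xinoapi_cost(spend)
    print(f"${spend} of tokens at direct prices: "
          f"OpenRouter ${o:.2f}, XinoAPI ${x:.2f}, "
          f"OpenRouter cheaper by {(1 - o / x):.1%}")
```

The minimum fee narrows the gap for small purchases; above roughly $15 per purchase the 5.5% rate applies and the ratio stays constant.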

What payment methods are accepted?

Credit and debit cards (Visa, Mastercard, AmEx) via Stripe. Bank transfers (ACH, wire) are available for top-ups above $500. Cryptocurrency payments are coming soon. We do not accept Alipay or WeChat Pay; use the direct provider APIs if you need these payment methods.

Do credits expire?

No. Credits never expire on any plan. Unused credits remain in your account indefinitely.

Can I get a refund for unused credits?

Yes, within 30 days of purchase and provided no more than 10% of the credit has been consumed. Contact support@xinoapi.com with your account email and order ID.
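
The two conditions can be checked before contacting support. This is a sketch only: it assumes the 30-day window is inclusive and that the 10% threshold is measured against the individual purchase, neither of which is spelled out above. The function name `refund_eligible` is hypothetical.

```python
from datetime import date


def refund_eligible(purchase_date, today, purchased_usd, consumed_usd):
    """Apply the two refund conditions: 30-day window and at most 10% consumed."""
    within_window = 0 <= (today - purchase_date).days <= 30
    low_usage = consumed_usd <= 0.10 * purchased_usd
    return within_window and low_usage


# $20 top-up, $1.50 used, checked 19 days and 44 days after purchase.
print(refund_eligible(date(2026, 6, 1), date(2026, 6, 20), 20.00, 1.50))
print(refund_eligible(date(2026, 6, 1), date(2026, 7, 15), 20.00, 1.50))
```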

How many tokens will $1 buy me?

Depends on the model. On DeepSeek V4-Flash input, $1 buys about 5.7 million tokens (roughly 4 million words). On DeepSeek V4-Pro output, $1 buys about 940,000 tokens. Use the cost calculator for exact estimates based on your expected request shape.
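
The conversion is a single division. A minimal sketch using the published per-1M prices (the function name `tokens_per_dollar` is illustrative):

```python
def tokens_per_dollar(price_per_million_usd, budget_usd=1.0):
    """Number of tokens a budget buys at a USD-per-1M-token price."""
    return budget_usd / price_per_million_usd * 1_000_000


flash_input = tokens_per_dollar(0.175)   # DeepSeek V4-Flash input
pro_output = tokens_per_dollar(1.066)    # DeepSeek V4-Pro output

print(f"V4-Flash input: {flash_input:,.0f} tokens per $1")
print(f"V4-Pro output:  {pro_output:,.0f} tokens per $1")
```

This gives roughly 5.7 million and 938,000 tokens respectively, in line with the figures quoted above. A real request mixes input and output tokens, so a blended estimate needs both prices.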

Do you charge for failed requests?

No. If an upstream provider returns a 5xx error or the request fails before the model generates output, you're not charged. Rate limit errors (429) and your own malformed requests (4xx) also don't consume credits.
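
For client-side cost tracking, the rule above can be approximated as follows. This is a sketch of the stated policy, not XinoAPI's billing code: the `output_generated` flag is a hypothetical input, and requests that fail partway through generation are not covered by the description above.

```python
def is_billable(status_code: int, output_generated: bool) -> bool:
    """Return True only for requests that would consume credits."""
    # 4xx (including 429 rate limits) and 5xx responses are not charged.
    if 400 <= status_code < 600:
        return False
    # Requests that fail before the model generates output are not charged.
    return output_generated


print(is_billable(200, True))
print(is_billable(429, False))
print(is_billable(503, False))
```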

Is the Privacy SDK free?

Yes. The XinoAPI Privacy SDK is MIT-licensed and free for any use, including with other LLM providers. Install from PyPI with pip install xinoapi-privacy.

Do you offer a free tier for students / open-source projects?

Yes. Open-source maintainers and students can apply for $20/month in free credits by emailing community@xinoapi.com with a link to their project or their student ID.

What's the price of DeepSeek V4 on XinoAPI?

DeepSeek V4-Flash costs $0.175 per 1M input tokens and $0.35 per 1M output tokens. DeepSeek V4-Pro costs $0.533 input and $1.066 output per 1M tokens, based on Tencent Cloud's June 2026 V4-Pro price cut converted at Stripe's 7.0380 CNY/USD rate with a 25% XinoAPI markup. Both models support 1M context and 384K max output.
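
The V4-Pro conversion can be sketched as below. The exchange rate and markup come from the answer above. The CNY prices of 3 and 6 per 1M tokens are not stated on this page: they are back-calculated from the listed direct USD prices and should be treated as an assumption. Function names are illustrative.

```python
from decimal import Decimal

RATE_CNY_PER_USD = Decimal("7.0380")  # exchange rate quoted above
MARKUP_FACTOR = Decimal("1.25")       # 25% markup


def implied_cny(direct_usd: str) -> Decimal:
    """Back out the CNY price implied by a listed direct USD price."""
    return Decimal(direct_usd) * RATE_CNY_PER_USD


def cny_to_xino_usd(price_cny: str) -> Decimal:
    """Convert a CNY price per 1M tokens to a marked-up USD price."""
    return Decimal(price_cny) / RATE_CNY_PER_USD * MARKUP_FACTOR


print(implied_cny("0.4264"), implied_cny("0.8528"))
# Assumed CNY prices (3 and 6 per 1M tokens), rounded to three decimals.
print(round(cny_to_xino_usd("3"), 3), round(cny_to_xino_usd("6"), 3))
```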

Which DeepSeek model should I use?

Use deepseek-v4-flash for general tasks (chat, RAG, code completion) — it's the successor to V3.2 at the same price. Use deepseek-v4-pro when you need flagship reasoning performance (SWE-bench 80.6%, approaches Claude Opus 4.6). Note that deepseek-chat and deepseek-reasoner are now aliases of V4-Flash and will be deprecated on 2026-07-24 — migrate to explicit V4 model IDs before then.
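
One way to prepare for the deprecation is to resolve legacy aliases in your own code before sending requests. The mapping below reflects only what this page states. How to select thinking or non-thinking mode on the explicit ID is not described here, so check the API documentation for that. The helper name `resolve_model` is hypothetical.

```python
# Legacy aliases named on this page and the explicit model ID behind each.
LEGACY_ALIASES = {
    "deepseek-chat": "deepseek-v4-flash",      # alias of non-thinking mode
    "deepseek-reasoner": "deepseek-v4-flash",  # alias of thinking mode
}


def resolve_model(model_id: str) -> str:
    """Return the explicit V4 model ID for a legacy alias, else the ID unchanged."""
    return LEGACY_ALIASES.get(model_id, model_id)


print(resolve_model("deepseek-chat"))
print(resolve_model("deepseek-v4-pro"))
```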

What's the cheapest model on XinoAPI?

GLM-4.7 Flash is free (limited quota). Among paid models, Qwen 3.5 Turbo at $0.0625 input / $0.25 output per 1M tokens is the cheapest. For quality-to-cost ratio, DeepSeek V4-Flash ($0.175 / $0.35) with 1M context is usually the best choice.

Start building with $2 free credits

No credit card required. Models from 5 Chinese LLM providers. Unified OpenAI-compatible API.