TL;DR
Use direct provider APIs if you're based in China and only need one model. Use OpenRouter if you mainly use OpenAI/Anthropic/Google and occasionally need Chinese models. Use XinoAPI if Chinese models are your primary workload, you need no-KYC access from outside China, or you need a provider with built-in PII redaction and router-integrity verification.

Feature comparison

All data verified April 2026. Prices in USD per million tokens.

XinoAPI OpenRouter LiteLLM (self-hosted) Direct (DeepSeek/etc)
Pricing model 25% token markup 0% markup + 5.5% on credit purchase Self-hosted (infra cost only) Provider's native price
KYC required No No You set policy Yes (Chinese phone / ID)
Chinese models 5 providers, all latest Some, via aggregators Bring your own keys 1 provider per API key
Unified billing Yes Yes No (per-provider) No
OpenAI SDK compatible Yes Yes Yes Most (not all)
Client-side PII redaction SDK Included No No No
Response integrity (HMAC signing) Yes No No No
Audit log (hash-chained) Yes Usage only Self-configured Usage only
Streaming (SSE) Yes Yes Yes Yes
Failover / fallback routing Yes Yes Yes N/A
Free tier on signup $2.00 credits Rate-limited free models N/A Varies by provider
Payment methods Card, Stripe Card, crypto N/A Alipay, WeChat, Card
Open source SDK only No Yes No

Price comparison

Effective price per million tokens after all fees. Assumes $50 credit purchase.

Model Direct price XinoAPI (1.25x) OpenRouter (effective)
DeepSeek V4-Flash (input) $0.14 $0.175 $0.15 (5.5% credit fee)
DeepSeek V4-Flash (output) $0.28 $0.35 $0.30
DeepSeek V4-Pro (input) $0.4264 $0.533 $1.84
DeepSeek V4-Pro (output) $0.8528 $1.066 $3.67
Qwen Plus (input) $0.40 $0.50 $0.29
Qwen Plus (output) $1.20 $1.50 $1.74
GLM-5 (input) $1.00 $1.25 $1.06
GLM-5 (output) $4.00 $5.00 $3.38
GLM-5.1 (input) $1.00 $1.25 N/A
Kimi K2.6 (input) $0.60 $0.75 $1.00
Kimi K2.6 (output) $3.00 $3.75 $4.22
MiniMax M2.7 (input) $0.30 $0.375 N/A

Effective OpenRouter price assumes models are routed without additional aggregator hops. In practice some Chinese models go through 2-3 hops, which means your prompt is seen in plaintext by each one. See the security whitepaper.

Latency comparison

Time to first token (TTFT), measured from Singapore, April 2026. Averaged over 100 requests per model.

Model XinoAPI (Singapore) OpenRouter Direct provider
DeepSeek V4-Flash 180ms 320ms 140ms (from CN only)
Qwen 3.6 Plus 160ms 285ms 120ms (from CN only)
GLM-5 210ms 355ms 150ms (from CN only)
Kimi K2.6 195ms 410ms 145ms (from CN only)
MiniMax M2.7 230ms Not available 180ms (from CN only)

Direct provider latency is measured from mainland China. Accessing these same providers from outside China without a VPN typically fails entirely or adds 500ms+ of routing overhead.

Which one should you pick?

Match your situation to the right choice.

XinoAPI PICK

You primarily use Chinese LLMs and are based outside China.

  • Need DeepSeek, Qwen, GLM, Kimi, or MiniMax as main workload
  • Can't or don't want to complete Chinese KYC (phone, ID)
  • Want PII redaction or response integrity signing out of the box
  • Building agent workflows and concerned about router integrity (see arXiv:2604.08407)
  • Want one bill, one API key, five providers

OpenRouter

Primarily OpenAI/Anthropic/Google, occasional Chinese access.

  • Main workload is GPT-5, Claude, or Gemini
  • Need access to 200+ models including niche open-weight
  • OK with 2-3 hop routing for Chinese models (plaintext exposure)
  • Value zero token markup over latency optimization

LiteLLM (self-hosted)

Enterprise with existing keys and infrastructure team.

  • Have existing contracts with each provider
  • Can dedicate 1 engineer to ops / maintenance
  • Need custom routing logic or region pinning
  • Regulated industry requiring on-prem data handling
  • Willing to own security patches (March 2026 dependency confusion)

Direct provider APIs

Mainland China user, single model, maximum latency sensitivity.

  • Based in China with local payment methods
  • Using only one model, no multi-model routing needed
  • Can complete Chinese KYC without issues
  • Every millisecond of latency matters
  • Don't need audit logs or PII redaction

Migration

All comparison options use the OpenAI SDK format. Switching is typically one line of code.

From OpenRouter to XinoAPI

from openai import OpenAI

# Before
client = OpenAI(
  api_key="sk-or-...",
  base_url="https://openrouter.ai/api/v1"
)

# After
client = OpenAI(
  api_key="xino-...",
  base_url="https://api.xinoapi.com/v1"
)

# Model names:
# openrouter  → "deepseek/deepseek-chat"
# xinoapi     → "deepseek-chat"

From direct DeepSeek to XinoAPI

from openai import OpenAI

# Before — only DeepSeek, Chinese KYC required
client = OpenAI(
  api_key="sk-...",
  base_url="https://api.deepseek.com"
)

# After — 5 providers, no KYC, one bill
client = OpenAI(
  api_key="xino-...",
  base_url="https://api.xinoapi.com/v1"
)

# Model name unchanged: "deepseek-chat", "deepseek-reasoner"
# Plus you now have access to: qwen-plus, glm-4-plus, kimi-k2.5, MiniMax-M2.7

Frequently asked questions

Is XinoAPI cheaper than OpenRouter?

Not on token price. OpenRouter has 0% token markup but 5.5% credit purchase fee; XinoAPI has 25% token markup with no purchase fee. At typical usage patterns (≤ $100/month per user), OpenRouter is ~10-15% cheaper for the same model. XinoAPI's advantages are Chinese model specialization, sub-200ms Singapore-to-mainland latency, no-KYC access, and the included Privacy SDK. Pick on features, not price.

Can I use XinoAPI as a drop-in replacement for the OpenAI SDK?

Yes. The endpoint is OpenAI-compatible — change base_url to https://api.xinoapi.com/v1 and use your XinoAPI key. No other code changes required. Works with OpenAI Python, Node.js, and Go SDKs.

Does OpenRouter have DeepSeek, Qwen, and GLM?

OpenRouter has DeepSeek models via various aggregators, but its Qwen and GLM coverage is limited. For all five major Chinese models (DeepSeek, Qwen, GLM, Kimi, MiniMax) under one API key, you need XinoAPI or self-hosted LiteLLM with your own provider contracts.

How do I access DeepSeek API from outside China?

DeepSeek's direct API accepts international payments but requires phone verification that can be difficult outside China. For zero-friction access, use an API proxy — XinoAPI or OpenRouter both forward to DeepSeek with no Chinese KYC. XinoAPI routes from Singapore for lower latency (180ms TTFT vs 320ms on OpenRouter in our tests).

What is the safest way to use Chinese LLM APIs?

The 2026 study on malicious LLM routers found that 9 of 428 tested commodity routers inject malicious code, and 17 steal AWS credentials. Defense: use a provider with response integrity signing, deploy a client-side threat scanner, and maintain an audit log. XinoAPI ships these three defenses built-in; for other providers you can add the open-source XinoAPI Privacy SDK as a client-side layer.

Does OpenRouter offer streaming (SSE)?

Yes, OpenRouter, XinoAPI, LiteLLM, and all major direct providers support SSE streaming. Set stream=true in your request and handle the event stream as you would with the OpenAI SDK.

Why does self-hosted LiteLLM show "Self-hosted" for pricing?

LiteLLM is open-source software you run on your own infrastructure. There's no per-token fee to LiteLLM itself — you pay whatever the underlying providers charge you directly, plus your infra cost (typically $30-100/month for a single VM). The tradeoff is you own ops, security patches, and scaling.

Can I migrate from one to another without code changes?

Yes, in most cases only the base_url and api_key change. Model names differ slightly — OpenRouter uses provider/model format (e.g., deepseek/deepseek-chat) while XinoAPI and direct APIs use the bare model name (deepseek-chat). A quick sed replacement usually suffices.

Which provider is best for agent workflows (Claude Code, Cursor, Codex)?

For Chinese models in agent contexts, the main risk is tool-call injection by malicious intermediaries. XinoAPI ships response signing + threat scanning specifically to defend agent workflows. Direct providers have the lowest hop count (highest security) but require you to be in China. OpenRouter's multi-hop architecture introduces risk — if any hop in the chain is compromised, tool calls can be rewritten. For production agent workloads, XinoAPI or direct APIs are the safer choices.

Try XinoAPI with $2 free credits

5 Chinese LLMs, one API key, no KYC. Takes 2 minutes to get your first response.