Welcome to SiliconFlow!

🎉 GLM-4.5 on SiliconFlow 🎉

One Platform
All Your AI Inference Needs

From small dev teams to large enterprises: unified serverless, reserved, or private-cloud inference, with no fragmentation.

Get Started for Free
Contact Sales

Models

Model Description
FLUX.1 Kontext [pro] ...
FLUX.1 Kontext [max] ...
FLUX 1.1 [pro] ...
Ultra ...
Wan2.1-T2V-14B (Turbo) ...
Wan2.1-T2V-14B ...
Wan2.1-I2V-14B-720P (Turbo) ...

MULTIMODAL

High-Speed Inference for Image, Video, and Beyond

From image generation to visual understanding, our platform accelerates multimodal models with unmatched performance.

Get Started

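As a quick illustration, an image-generation call might look like the sketch below. It assumes an OpenAI-style HTTP endpoint, a SILICONFLOW_API_KEY environment variable, and an illustrative model identifier; check the API reference for the exact endpoint path, payload fields, and response shape.

```python
# Minimal sketch of a text-to-image request over an OpenAI-style HTTP API.
# Endpoint path, payload fields, and response shape are assumptions here;
# consult the official API reference for the real contract.
import os
import requests

api_key = os.environ["SILICONFLOW_API_KEY"]  # assumed env var name

resp = requests.post(
    "https://api.siliconflow.cn/v1/images/generations",  # assumed endpoint
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "model": "FLUX.1 Kontext [pro]",  # display name from the table; the API id may differ
        "prompt": "a watercolor fox in a snowy pine forest",
        "image_size": "1024x1024",        # assumed parameter name
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json())  # typically a URL or base64-encoded image payload
```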

LLMs

Run Powerful LLMs Faster, Smarter, at Any Scale

Serve open and commercial LLMs through our optimized stack. Lower latency, higher throughput, and predictable costs.


Model
DeepSeek-R1
DeepSeek-V3
GLM-4.5
GLM-4.5-Air
Qwen3-235B-A2
Qwen3-235B-2507
Kimi-K2-Instruct
GLM-4.1V-9B
ERNIE-4.5-300B
Hunyuan-A13B-Instruct
MiniMax-M1-80k
Qwen3-30B-A3B
Qwen3-32B
Qwen3-14B
Qwen3-8B
Qwen3-Reranker-8B
Qwen3-Embedding-8B
Qwen3-Reranker-4B
Qwen3-Embedding-4B
Qwen3-Reranker-0.6B

Explore More
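Many serving stacks expose these models through an OpenAI-compatible chat endpoint, so a standard client with an overridden base URL is often all you need. The sketch below assumes such compatibility; the base URL, model identifier format, and environment variable name are illustrative, not confirmed.

```python
# Minimal sketch: one chat completion against an OpenAI-compatible endpoint.
# Base URL, model id format, and env var name are assumptions; check the
# platform docs for exact values.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["SILICONFLOW_API_KEY"],  # assumed env var name
    base_url="https://api.siliconflow.cn/v1",   # assumed base URL
)

reply = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",  # id format is an assumption; see the list above
    messages=[{"role": "user", "content": "Explain MoE inference in two sentences."}],
)
print(reply.choices[0].message.content)
```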

Products

Flexible Deployment Options, Built for Every Use Case

Run models serverlessly, on dedicated endpoints, or bring your own setup.

Get Started


Serverless

Run any model instantly — no setup, no scaling headaches. Just call the API and pay only for what you use (see the sketch after these options).

Learn More

Fine-tuning

Easily adapt base models to your data. Fine-tune with built-in monitoring and elastic compute, without managing infrastructure.

Learn More

Reserved GPUs

Lock in GPU capacity for stable performance and predictable billing. Ideal for high-volume or scheduled inference jobs.

Learn More
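To make the serverless option concrete: assuming the same OpenAI-compatible endpoint as above, a streamed, pay-as-you-go request might look like this sketch. The endpoint and model id remain assumptions.

```python
# Minimal sketch: streaming tokens from a serverless chat endpoint,
# assuming OpenAI compatibility. Endpoint and model id are assumptions.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["SILICONFLOW_API_KEY"],  # assumed env var name
    base_url="https://api.siliconflow.cn/v1",   # assumed base URL
)

stream = client.chat.completions.create(
    model="Qwen/Qwen3-8B",  # id format is an assumption
    messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
    stream=True,  # tokens arrive incrementally; you pay only for what you generate
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```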


Ready to accelerate your AI development?

Get Started for Free

