Python-native Serverless GPU

端流 DuanFlow

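Call cloud GPUs the way you call a local function. DuanFlow automatically matches the best compute available in real time and deploys your Python code as a scalable AI Endpoint.

Call cloud GPUs from Python. No clusters, no YAML, no idle machines.

< 3s cold-start target

0 idle cost

H100 on-demand scheduling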

app.py

import duanflow as df

app = df.App("my-ai-app")

@app.function(
    gpu="H100",
    memory=80,
    optimize="price"
)
def run_inference(prompt):
    # Put your AI workload here (`model` is assumed to be loaded elsewhere)
    return model.generate(prompt)

run_inference.remote("Explain what DuanFlow is")

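Core Demo

From function to Endpoint in one click

A mock demo of how DuanFlow reads your code's resource requirements, selects the best GPU available in real time, and completes the cloud deployment.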

Code Editor Python SDK

inference.py

import duanflow as df
app = df.App("my-ai-app")

@app.function(gpu="H100", memory=80)
def run_inference(prompt):
    # Put your AI workload here
    return "Generated result..."

Deploy Control / Smart Scheduling Panel

Ready

Matching the best compute resources… Scanning live GPU inventory

Found the current best GPU (¥0.98/hour) → scheduling 2 x H100. Scheduling across the lowest available price

Deployment complete! The Endpoint is ready to call: POST https://api.duanflow.cn/my-ai-app/run_inference
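Calling the deployed Endpoint from a client might look like the sketch below, which uses only Python's standard library. The JSON body shape (`{"prompt": ...}`) and the absence of an auth header are assumptions; the demo shows only the method and the URL.

```python
import json
import urllib.request

# URL as shown in the demo panel.
ENDPOINT_URL = "https://api.duanflow.cn/my-ai-app/run_inference"


def build_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the endpoint."""
    # Assumption: the endpoint accepts a JSON object keyed by argument name.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request("Explain what DuanFlow is")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req, timeout=30)
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.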

Endpoint Waiting for deployment...

Current best GPU ¥0.98/hour 2 x H100 · auto-highlighted

Other available resources ¥1.12/hour · A100 pool

Other available resources ¥1.28/hour · H100 backup

Other available resources ¥1.35/hour · premium zone

Docs

Write your infrastructure into code

The mock docs entry point uses a bilingual layout so the SDK, deployment, Endpoint, and GPU scheduling features are easy to demo.

Quickstart

Install the SDK, create an app, and deploy your first GPU function in minutes.

pip install duanflow

Read guide

Functions / Cloud Functions

Use decorators to define CPU, memory, GPU type, timeout, and autoscaling policy.

@app.function(gpu="H100")

Read guide

Endpoints / API Deployment
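To show how a decorator can carry resource declarations, here is a self-contained stand-in that does not import `duanflow`. `ResourceSpec`, `function`, and the `timeout` and `max_containers` parameter names are hypothetical; only `gpu` and `memory` appear in the snippets on this page, and the GiB unit is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ResourceSpec:
    gpu: Optional[str] = None
    memory: int = 4           # GiB (assumed unit)
    timeout: int = 300        # seconds (hypothetical parameter)
    max_containers: int = 1   # autoscaling upper bound (hypothetical parameter)


def function(**resources) -> Callable:
    """Decorator factory that attaches a ResourceSpec to the function."""
    spec = ResourceSpec(**resources)  # raises TypeError on unknown keys

    def decorate(fn: Callable) -> Callable:
        fn.spec = spec  # a scheduler could read this at deploy time
        return fn

    return decorate


@function(gpu="H100", memory=80, timeout=60)
def run_inference(prompt: str) -> str:
    return f"echo: {prompt}"


print(run_inference.spec)
print(run_inference("hello"))
```

Keeping the declaration next to the function means the resource request travels with the code instead of living in a separate config file.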

Turn Python functions into secure HTTPS APIs with logs and versioned releases.

duanflow deploy app.py

Read guide

GPU Router / Smart Scheduling

Route workloads by price, latency, region, or GPU availability across resource pools.

optimize="price"

Read guide
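One way to picture `optimize="price"` is as a filter followed by a minimum over the live offers. The sketch below is a local illustration, not DuanFlow's scheduler: the `Offer` type, the `route` function, the "primary" pool name, and all latency and region values are hypothetical; the prices mirror the demo panel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    pool: str
    gpu: str
    price_per_hour: float  # CNY per hour
    latency_ms: float      # hypothetical value
    region: str            # hypothetical value
    available: bool = True


def route(offers, gpu, optimize="price", region=None):
    """Pick the best available offer for `gpu` by price or latency."""
    keys = {
        "price": lambda o: o.price_per_hour,
        "latency": lambda o: o.latency_ms,
    }
    if optimize not in keys:
        raise ValueError(f"unknown optimize target: {optimize!r}")
    candidates = [
        o for o in offers
        if o.available and o.gpu == gpu and (region is None or o.region == region)
    ]
    if not candidates:
        raise LookupError(f"no available {gpu} offer")
    return min(candidates, key=keys[optimize])


offers = [
    Offer("primary", "H100", 0.98, 40.0, "cn-east"),
    Offer("A100 pool", "A100", 1.12, 25.0, "cn-east"),
    Offer("H100 backup", "H100", 1.28, 15.0, "cn-north"),
]

print(route(offers, "H100", optimize="price").pool)    # primary
print(route(offers, "H100", optimize="latency").pool)  # H100 backup
```

Separating the hard constraints (GPU type, region, availability) from the ranking key makes it straightforward to add another optimization target later.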

Examples

Templates for common AI workloads

Every example follows the same SDK mental model: write a function, declare its resources, call it remotely.

LLM Inference

Serve Qwen on H100

@app.endpoint(gpu="H100", memory="80Gi")
def chat(prompt):
    # Load model once, autoscale on demand
    return qwen.generate(prompt)

View full example
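The "load model once" comment usually means keeping the model in process memory so that repeat requests in the same container skip initialization. A minimal sketch with `functools.lru_cache`, where `FakeModel` is a purely hypothetical stand-in for a real Qwen loader:

```python
from functools import lru_cache

LOAD_COUNT = 0  # counts how many times the expensive load runs


class FakeModel:
    """Hypothetical stand-in for a real model object."""

    def generate(self, prompt: str) -> str:
        return f"reply to: {prompt}"


@lru_cache(maxsize=1)
def get_model() -> FakeModel:
    global LOAD_COUNT
    LOAD_COUNT += 1  # a real loader would read weights onto the GPU here
    return FakeModel()


def chat(prompt: str) -> str:
    return get_model().generate(prompt)


chat("hi")
chat("hi again")
print(LOAD_COUNT)  # 1
```

The cache lives per process, so each new container still pays the load cost once on its first request.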

Batch Embeddings

Map thousands of documents

@app.function(gpu="A100", concurrency=128)
def embed(doc):
    # Automatically scales out across multiple containers
    return encoder.encode(doc)

vectors = embed.map(documents)

View full example
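Locally, the fan-out behind a `.map()` call can be approximated with a thread pool. This sketch is an analogy, not how DuanFlow runs containers: `fake_embed` is a hypothetical placeholder for `encoder.encode`, and `max_workers` plays the role that `concurrency` plays in the snippet above.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_embed(doc: str) -> list:
    # Placeholder "embedding": character count and word count.
    return [float(len(doc)), float(len(doc.split()))]


def parallel_map(fn, items, max_workers=8):
    """Apply fn to every item concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, items))


documents = ["hello world", "serverless gpu functions", "a"]
vectors = parallel_map(fake_embed, documents)
print(vectors)
```

`Executor.map` returns results in input order even when tasks finish out of order, which is the property a batch embedding job typically relies on.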

Fine-tuning

Launch training jobs

@app.job(gpu="H100", gpu_count=4)
def finetune(dataset):
    # Compute is released automatically once training completes
    trainer.train(dataset)
    return trainer.metrics

View full example