DOCUMENTATION
Get your first response in under 2 minutes.
Quick Start
Get your first response in 3 steps:
- Create a free account — $2 credit, no card required.
- Copy your API key from the dashboard.
- Make your first API call:
```shell
curl https://api.llmdiscount.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3.5-flash","messages":[{"role":"user","content":"Hello!"}]}'
```
Authentication
QwenBridge uses Bearer token authentication — the same format as OpenAI. Include your API key in every request:
```
Authorization: Bearer YOUR_QWENBRIDGE_KEY
```
Get your key at api.qwenbridge.com → Dashboard → API Keys.
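A minimal Python sketch of an authenticated request using only the standard library, based on the endpoint and header format above. The `build_request` helper is illustrative, not part of any SDK; replace `YOUR_QWENBRIDGE_KEY` with your own key.

```python
# Build an authenticated chat-completions request with stdlib urllib.
# URL and header format come from the docs above; build_request is a
# hypothetical helper for illustration. The request is built, not sent.
import json
import urllib.request

def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Return a POST request carrying the Bearer auth header."""
    return urllib.request.Request(
        "https://api.llmdiscount.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("YOUR_QWENBRIDGE_KEY", {
    "model": "qwen3.5-flash",
    "messages": [{"role": "user", "content": "Hello!"}],
})
# urllib.request.urlopen(req) would send it.
```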
Models
Available models and their IDs:
| Model ID | Name | Input/1M | Output/1M |
|---|---|---|---|
| qwen3.6-plus | Qwen3.6-Plus | $0.22 | $1.20 |
| qwen3-max | Qwen3-Max | $0.55 | $2.80 |
| qwen3.5-plus | Qwen3.5-Plus | $0.22 | $1.30 |
| qwq-plus | QwQ-Plus | $0.50 | $1.50 |
| qwen3.5-flash | Qwen3.5-Flash | $0.07 | $0.35 |
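To make the per-million-token pricing concrete, here is a small cost estimator with the prices transcribed from the table above. The `estimate_cost` helper is illustrative, not an API feature:

```python
# Illustrative cost estimator; PRICES is transcribed from the table above.
PRICES = {  # model_id: (input $/1M tokens, output $/1M tokens)
    "qwen3.6-plus":  (0.22, 1.20),
    "qwen3-max":     (0.55, 2.80),
    "qwen3.5-plus":  (0.22, 1.30),
    "qwq-plus":      (0.50, 1.50),
    "qwen3.5-flash": (0.07, 0.35),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request: tokens / 1M, times the per-1M price."""
    in_price, out_price = PRICES[model]
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# 10K prompt tokens + 1K completion tokens on qwen3.5-flash:
print(round(estimate_cost("qwen3.5-flash", 10_000, 1_000), 5))  # → 0.00105
```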
SDK Examples
The OpenAI SDK works without modification — just change the base URL and API key.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.llmdiscount.com/v1",
    api_key="YOUR_QWENBRIDGE_KEY",
)

response = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"},
    ],
)
print(response.choices[0].message.content)
```
Cursor
Drop-in replacement for OpenAI in Cursor's model config.
- Open Cursor → Settings → Models
- Add model: "qwen3.6-plus" (or any QwenBridge model)
- Set OpenAI Base URL to: https://api.llmdiscount.com/v1
- Paste your QwenBridge API key
- Test with Ctrl+K
```
openaiBaseUrl = "https://api.llmdiscount.com/v1"
```
Cline
Set QwenBridge as your OpenAI-compatible provider in VS Code.
- Install Cline from VS Code marketplace
- Set API Provider: OpenAI Compatible
- Base URL: https://api.llmdiscount.com/v1
- Paste your QwenBridge API key
- Select model: qwen3.6-plus
```
cline.openAiBaseUrl = "https://api.llmdiscount.com/v1"
```
Continue.dev
Configure QwenBridge as an OpenAI provider in config.json.
- Open ~/.continue/config.json
- Add model with provider: "openai"
- Set baseURL: "https://api.llmdiscount.com/v1"
- Set apiKey to your QwenBridge token
- Choose model: qwen3.5-plus
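Putting the steps above together, a `~/.continue/config.json` entry might look like the sketch below. The `models` array and the `title` field are assumptions about Continue's schema, and field names can differ between Continue versions — check Continue's own docs if a key is rejected.

```json
{
  "models": [
    {
      "title": "QwenBridge qwen3.5-plus",
      "provider": "openai",
      "model": "qwen3.5-plus",
      "baseURL": "https://api.llmdiscount.com/v1",
      "apiKey": "YOUR_QWENBRIDGE_KEY"
    }
  ]
}
```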
```
baseURL = "https://api.llmdiscount.com/v1"
```
Windsurf
Use QwenBridge models in Windsurf via OpenAI-compatible config.
- Open Windsurf → Preferences → AI
- Choose provider: OpenAI Compatible
- Set base URL: https://api.llmdiscount.com/v1
- Enter QwenBridge API key
- Select qwen3.6-plus as default model
```
openai.baseUrl = "https://api.llmdiscount.com/v1"
```
Endpoints
Base URL: https://api.llmdiscount.com/v1
| Method | Path | Description |
|---|---|---|
| POST | /chat/completions | Create a chat completion. OpenAI-compatible. |
| GET | /models | List all available models. |
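As a sketch of working with `GET /models`, the snippet below extracts model IDs from a response, assuming the OpenAI-style list shape (`{"object": "list", "data": [{"id": ...}, ...]}`). The sample payload is illustrative, not a captured response:

```python
# Parse a GET /models response, assuming the OpenAI-style list shape.
# The sample payload below is illustrative, not a real captured response.
import json

sample = json.loads("""
{"object": "list",
 "data": [{"id": "qwen3.6-plus", "object": "model"},
          {"id": "qwen3.5-flash", "object": "model"}]}
""")

model_ids = [m["id"] for m in sample["data"]]
print(model_ids)  # → ['qwen3.6-plus', 'qwen3.5-flash']
```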
Error Codes
| Code | Label | Description |
|---|---|---|
| 400 | Bad Request | Malformed request body or missing required fields. |
| 401 | Unauthorized | Invalid or missing API key. |
| 429 | Too Many Requests | Rate limit exceeded. Retry after the specified delay. |
| 500 | Internal Server Error | Something went wrong on our end. Retry with backoff. |
Rate Limits
Free accounts: 60 requests/minute, 100K tokens/day
Paid accounts: 600 requests/minute, unlimited tokens
All limits are per API key. If you hit a rate limit, the API returns 429 Too Many Requests with a Retry-After header.
```python
# Handle rate limits with exponential backoff and jitter
import random
import time

import openai

for attempt in range(5):
    try:
        resp = client.chat.completions.create(...)
        break
    except openai.RateLimitError:
        wait = 2 ** attempt + random.random()
        time.sleep(wait)
else:
    raise RuntimeError("still rate limited after 5 attempts")
```