DOCUMENTATION
Get your first response in under 2 minutes.
Quick Start
Get your first response in 3 steps:
- Create a free account — $2 credit, no card required.
- Copy your API key from the dashboard.
- Make your first API call:
```shell
curl https://api.llmdiscount.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3.5-flash","messages":[{"role":"user","content":"Hello!"}]}'
```
Authentication
QwenBridge uses Bearer token authentication — the same format as OpenAI. Include your API key in every request:
```
Authorization: Bearer YOUR_QWENBRIDGE_KEY
```
Get your key at api.qwenbridge.com → Dashboard → API Keys.
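A minimal Python sketch of an authenticated request using only the standard library, based on the endpoint and header format above. The `build_request` helper is illustrative, not part of any SDK; replace `YOUR_QWENBRIDGE_KEY` with your own key.

```python
# Build an authenticated chat-completions request with stdlib urllib.
# URL and header format come from the docs above; build_request is a
# hypothetical helper for illustration. The request is built, not sent.
import json
import urllib.request

def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Return a POST request carrying the Bearer auth header."""
    return urllib.request.Request(
        "https://api.llmdiscount.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("YOUR_QWENBRIDGE_KEY", {
    "model": "qwen3.5-flash",
    "messages": [{"role": "user", "content": "Hello!"}],
})
# urllib.request.urlopen(req) would send it.
```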
Models
Available models and their IDs:
| Model ID | Name | Input/1M | Output/1M |
|---|---|---|---|
| qwen3.6-plus | Qwen3.6-Plus | $0.22 | $1.20 |
| qwen3-max | Qwen3-Max | $0.55 | $2.80 |
| qwen3.5-plus | Qwen3.5-Plus | $0.22 | $1.30 |
| qwq-plus | QwQ-Plus | $0.50 | $1.50 |
| qwen3.5-flash | Qwen3.5-Flash | $0.07 | $0.35 |
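To make the per-million-token pricing concrete, here is a small cost estimator with the prices transcribed from the table above. The `estimate_cost` helper is illustrative, not an API feature:

```python
# Illustrative cost estimator; PRICES is transcribed from the table above.
PRICES = {  # model_id: (input $/1M tokens, output $/1M tokens)
    "qwen3.6-plus":  (0.22, 1.20),
    "qwen3-max":     (0.55, 2.80),
    "qwen3.5-plus":  (0.22, 1.30),
    "qwq-plus":      (0.50, 1.50),
    "qwen3.5-flash": (0.07, 0.35),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request: tokens / 1M, times the per-1M price."""
    in_price, out_price = PRICES[model]
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# 10K prompt tokens + 1K completion tokens on qwen3.5-flash:
print(round(estimate_cost("qwen3.5-flash", 10_000, 1_000), 5))  # → 0.00105
```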
SDK Examples
The OpenAI SDK works without modification — just change the base URL and API key.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.llmdiscount.com/v1",
    api_key="YOUR_QWENBRIDGE_KEY",
)

response = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"},
    ],
)
print(response.choices[0].message.content)
```
Cursor
Drop-in replacement for OpenAI in Cursor's model config.
- Open Cursor → Settings → Models
- Add model: "qwen3.6-plus" (or any QwenBridge model)
- Set OpenAI Base URL to: https://api.llmdiscount.com/v1
- Paste your QwenBridge API key
- Test with Ctrl+K
```
openaiBaseUrl = "https://api.llmdiscount.com/v1"
```
Cline
Set QwenBridge as your OpenAI-compatible provider in VS Code.
- Install Cline from VS Code marketplace
- Set API Provider: OpenAI Compatible
- Base URL: https://api.llmdiscount.com/v1
- Paste your QwenBridge API key
- Select model: qwen3.6-plus
```
cline.openAiBaseUrl = "https://api.llmdiscount.com/v1"
```
Continue.dev
Configure QwenBridge as an OpenAI provider in config.json.
- Open ~/.continue/config.json
- Add model with provider: "openai"
- Set baseURL: "https://api.llmdiscount.com/v1"
- Set apiKey to your QwenBridge token
- Choose model: qwen3.5-plus
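Putting the steps above together, a `~/.continue/config.json` entry might look like the sketch below. The `models` array and the `title` field are assumptions about Continue's schema, and field names can differ between Continue versions — check Continue's own docs if a key is rejected.

```json
{
  "models": [
    {
      "title": "QwenBridge qwen3.5-plus",
      "provider": "openai",
      "model": "qwen3.5-plus",
      "baseURL": "https://api.llmdiscount.com/v1",
      "apiKey": "YOUR_QWENBRIDGE_KEY"
    }
  ]
}
```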
```
baseURL = "https://api.llmdiscount.com/v1"
```
Windsurf
Use QwenBridge models in Windsurf via OpenAI-compatible config.
- Open Windsurf → Preferences → AI
- Choose provider: OpenAI Compatible
- Set base URL: https://api.llmdiscount.com/v1
- Enter QwenBridge API key
- Select qwen3.6-plus as default model
```
openai.baseUrl = "https://api.llmdiscount.com/v1"
```
Endpoints
Base URL: https://api.llmdiscount.com/v1
| Method | Path | Description |
|---|---|---|
| POST | /chat/completions | Create a chat completion. OpenAI-compatible. |
| GET | /models | List all available models. |
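As a sketch of working with `GET /models`, the snippet below extracts model IDs from a response, assuming the OpenAI-style list shape (`{"object": "list", "data": [{"id": ...}, ...]}`). The sample payload is illustrative, not a captured response:

```python
# Parse a GET /models response, assuming the OpenAI-style list shape.
# The sample payload below is illustrative, not a real captured response.
import json

sample = json.loads("""
{"object": "list",
 "data": [{"id": "qwen3.6-plus", "object": "model"},
          {"id": "qwen3.5-flash", "object": "model"}]}
""")

model_ids = [m["id"] for m in sample["data"]]
print(model_ids)  # → ['qwen3.6-plus', 'qwen3.5-flash']
```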
Error Codes
| Code | Label | Description |
|---|---|---|
| 400 | Bad Request | Malformed request body or missing required fields. |
| 401 | Unauthorized | Invalid or missing API key. |
| 429 | Too Many Requests | Rate limit exceeded. Retry after the specified delay. |
| 500 | Internal Server Error | Something went wrong on our end. Retry with backoff. |
Rate Limits
Free accounts: 60 requests/minute, 100K tokens/day
Paid accounts: 600 requests/minute, unlimited tokens
All limits are per API key. If you hit a rate limit, the API returns 429 Too Many Requests with a Retry-After header.
```python
# Handle rate limits with exponential backoff and jitter
import random
import time

import openai

for attempt in range(5):
    try:
        resp = client.chat.completions.create(...)
        break
    except openai.RateLimitError:
        wait = 2 ** attempt + random.random()
        time.sleep(wait)
else:
    raise RuntimeError("still rate limited after 5 attempts")
```