Qwen OpenAI-Compatible Proxy Server - works with opencode, crush, Claude Code Router, Roo Code, Cline, and most other OpenAI-compatible clients
A proxy server that exposes Qwen models through an OpenAI-compatible API endpoint, with tool calling and streaming support.
You may see errors or 504 Gateway Timeout responses with contexts of roughly 130,000 to 150,000 tokens or more; this appears to be a practical limit for Qwen models. Qwen Code itself also tends to break down and get stuck at this limit.
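If you work near that ceiling, one workaround is to trim conversation history on the client side before requests reach the proxy. A minimal sketch, assuming a rough 4-characters-per-token estimate (both the ratio and the budget below are illustrative, not part of this project):

```javascript
// Rough client-side guard against the practical context limit noted above.
// The 4-characters-per-token ratio and the 120k budget are heuristics,
// not values from this project; tune them for your workload.
const TOKEN_BUDGET = 120000;

function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

function trimHistory(messages) {
  // Walk from the newest message backwards, keeping as much history as fits.
  // Note: a leading system prompt can be dropped this way; pin it separately
  // if your client relies on one.
  const kept = [];
  let used = 0;
  for (const msg of [...messages].reverse()) {
    const cost = estimateTokens(msg.content ?? '');
    if (used + cost > TOKEN_BUDGET && kept.length > 0) break;
    kept.unshift(msg);
    used += cost;
  }
  return kept;
}
```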
- Install Dependencies: `npm install`
- Authenticate: you need to authenticate with Qwen to generate the required credentials file.
  - Run `npm run auth:add <account>` to authenticate with your Qwen account. This creates the `~/.qwen/oauth_creds.json` file needed by the proxy server.
  - Alternatively, you can use the official `qwen-code` CLI tool from QwenLM/qwen-code.
- Start the Server: `npm start`
- Use the Proxy: point your OpenAI-compatible client to `http://localhost:8080/v1`. Any API key works; the proxy does not validate it.
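Once the server is running, you can sanity-check it with a plain HTTP request (this sketch assumes Node 18+ with a global `fetch`; the model name matches the one used elsewhere in this README):

```javascript
// Quick smoke test against the running proxy.
// The Authorization value is arbitrary; the proxy does not validate it.
const res = await fetch('http://localhost:8080/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer anything'
  },
  body: JSON.stringify({
    model: 'qwen3-coder-plus',
    messages: [{ role: 'user', content: 'ping' }]
  })
});

console.log((await res.json()).choices[0].message.content);
```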
The proxy supports multiple Qwen accounts to overcome the 2,000 requests per day limit per account. Accounts are automatically rotated when quota limits are reached.
- List existing accounts: `npm run auth:list`
- Add a new account: `npm run auth:add <account-id>`. Replace `<account-id>` with a unique identifier for your account (e.g., `account2`, `team-account`).
- Remove an account: `npm run auth:remove <account-id>`
- When you have multiple accounts configured, the proxy will automatically rotate between them
- Each account has a 2,000-request-per-day limit
- When an account reaches its limit, Qwen's API will return a quota exceeded error
- The proxy detects these quota errors and automatically switches to the next available account
- If a `DEFAULT_ACCOUNT` is configured, the proxy will use that account first before rotating to others
- Request counts are tracked locally and reset daily at UTC midnight
- You can check request counts for all accounts with `npm run auth:counts`
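To make the rotation rules above concrete, here is a simplified, hypothetical sketch; the names are illustrative and not the proxy's actual internals:

```javascript
// Illustrative only: a condensed version of the rotation behavior
// described above (quota error -> switch to the next account).
const DAILY_LIMIT = 2000;                    // per-account daily request limit
const accounts = ['account1', 'account2'];   // hypothetical account ids
const counts = new Map();                    // per-account counts, reset at UTC midnight
let current = 0;                             // index of the account in use

function isQuotaError(err) {
  // Stand-in check; the real proxy inspects Qwen's quota-exceeded error.
  return err?.status === 429;
}

async function sendWithRotation(makeRequest) {
  for (let tried = 0; tried < accounts.length; tried++) {
    const account = accounts[current];
    // Skip accounts whose locally tracked count already hit the limit.
    if ((counts.get(account) ?? 0) >= DAILY_LIMIT) {
      current = (current + 1) % accounts.length;
      continue;
    }
    try {
      const result = await makeRequest(account);
      counts.set(account, (counts.get(account) ?? 0) + 1);
      return result;
    } catch (err) {
      if (!isQuotaError(err)) throw err;
      current = (current + 1) % accounts.length; // rotate and retry
    }
  }
  throw new Error('All accounts have hit their daily quota');
}
```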
The proxy provides real-time feedback in the terminal:
- Shows which account is being used for each request
- Displays current request count for each account
- Notifies when an account is rotated due to quota limits
- Indicates which account will be tried next during rotation
- Shows which account is configured as the default account on server startup
- Marks the default account in the account list display
The proxy server can be configured using environment variables. Create a `.env` file in the project root or set the variables directly in your environment.
- `LOG_FILE_LIMIT`: Maximum number of debug log files to keep (default: 20)
- `DEBUG_LOG`: Set to `true` to enable debug logging (default: false)
- `STREAM`: Set to `true` to enable streaming responses (default: false)
  - Important: Set this to `true` when using tools like opencode or crush that require streaming responses
- `DEFAULT_ACCOUNT`: Specify which account the proxy should use by default (when using a multi-account setup)
  - Should match the name used when adding an account with `npm run auth:add <name>`
  - If not set or invalid, the proxy will use the first available account
Example `.env` file:
```
# Keep only the 10 most recent log files
LOG_FILE_LIMIT=10

# Enable debug logging (log files will be created)
DEBUG_LOG=true

# Enable streaming responses (disabled by default)
# Required for tools like opencode and crush
STREAM=true

# Specify which account to use by default (when using multi-account setup)
# Should match the name used when adding an account with 'npm run auth:add <name>'
DEFAULT_ACCOUNT=my-primary-account
```
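For reference, here is a minimal sketch of how these variables could be read in Node, assuming the standard `dotenv` package; the proxy's actual loading code may differ:

```javascript
// Load .env and apply the defaults documented above.
import 'dotenv/config';

const config = {
  logFileLimit: Number(process.env.LOG_FILE_LIMIT ?? 20),
  debugLog: process.env.DEBUG_LOG === 'true',
  stream: process.env.STREAM === 'true',
  defaultAccount: process.env.DEFAULT_ACCOUNT ?? null
};
```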
Example usage with the official OpenAI Node.js client:

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'fake-key', // Not used, but required by the OpenAI client
  baseURL: 'http://localhost:8080/v1'
});

async function main() {
  const response = await openai.chat.completions.create({
    model: 'qwen3-coder-plus',
    messages: [
      { role: 'user', content: 'Hello!' }
    ]
  });
  console.log(response.choices[0].message.content);
}

main();
```
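If the proxy is started with `STREAM=true`, the same client can consume streamed tokens via the OpenAI SDK's standard streaming interface (reusing the `openai` client from above):

```javascript
// Streaming variant; requires STREAM=true on the proxy side.
const stream = await openai.chat.completions.create({
  model: 'qwen3-coder-plus',
  messages: [{ role: 'user', content: 'Hello!' }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
```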
Supported endpoint: `POST /v1/chat/completions`
This proxy server supports tool calling, allowing you to use it with clients like opencode, crush, Roo Code, Cline, and Kilo Code.
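For clients that speak the OpenAI API directly, tool definitions pass through in the standard format. A minimal sketch, reusing the client from the example above (`get_weather` is a made-up tool for illustration):

```javascript
// Hypothetical tool definition to illustrate tool calling through the proxy.
const response = await openai.chat.completions.create({
  model: 'qwen3-coder-plus',
  messages: [{ role: 'user', content: "What's the weather in Berlin?" }],
  tools: [{
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get the current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city']
      }
    }
  }]
});

// If the model decides to use the tool, the call shows up here:
console.log(response.choices[0].message.tool_calls);
```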
To use with opencode, add the following to `~/.config/opencode/opencode.json`:
```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "myprovider": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "proxy",
      "options": {
        "baseURL": "http://localhost:8080/v1"
      },
      "models": {
        "qwen3-coder-plus": {
          "name": "qwen3"
        }
      }
    }
  }
}
```
Note: For opencode to work properly with streaming responses, you need to enable streaming in the proxy server by setting `STREAM=true` in your `.env` file.
To use with crush, add the following to `~/.config/crush/crush.json`:
```json
{
  "$schema": "https://charm.land/crush.json",
  "providers": {
    "proxy": {
      "type": "openai",
      "base_url": "http://localhost:8080/v1",
      "api_key": "",
      "models": [
        {
          "id": "qwen3-coder-plus",
          "name": "qwen3-coder-plus",
          "cost_per_1m_in": 0.0,
          "cost_per_1m_out": 0.0,
          "cost_per_1m_in_cached": 0,
          "cost_per_1m_out_cached": 0,
          "context_window": 150000,
          "default_max_tokens": 64000
        }
      ]
    }
  }
}
```
Note: For crush to work properly with streaming responses, you need to enable streaming in the proxy server by setting `STREAM=true` in your `.env` file.
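To use with Claude Code Router, add a provider entry like the following to its configuration file (the `transformer` entries are Claude Code Router features):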
```json
{
  "LOG": false,
  "Providers": [
    {
      "name": "qwen-code",
      "api_base_url": "http://localhost:8080/v1/chat/completions/",
      "api_key": "wdadwa-random-stuff",
      "models": ["qwen3-coder-plus"],
      "transformer": {
        "use": [
          [
            "maxtoken",
            {
              "max_tokens": 65536
            }
          ],
          "enhancetool",
          "cleancache"
        ]
      }
    }
  ],
  "Router": {
    "default": "qwen-code,qwen3-coder-plus"
  }
}
```
To use with Roo Code, Kilo Code, or Cline:
- Go to settings in the client
- Choose the OpenAI-compatible option
- Set the URL to `http://localhost:8080/v1`
- Use any random API key (the value doesn't matter)
- Type or choose the model name exactly as `qwen3-coder-plus`
- Disable streaming in the checkbox for Roo Code or Kilo Code
- Change the max output setting from -1 to 65000
- You can raise the context window size to around 300k, but responses slow down noticeably past 150k tokens, so keep that in mind
The proxy now displays token counts in the terminal for each request, showing both input tokens and API-returned usage statistics (prompt, completion, and total tokens).
The proxy includes comprehensive token usage tracking that monitors daily input and output token consumption across all accounts. View detailed token usage reports with either `npm run auth:tokens` or `npm run tokens`. Both commands display a clean table showing daily token usage trends, lifetime totals, and request counts. For more information, see `docs/token-usage-tracking.md`.
For more detailed documentation, see the `docs/` directory. For information about configuring a default account, see `docs/default-account.md`.