Documentation Index Fetch the complete documentation index at: https://aisa.one/docs/llms.txt
Use this file to discover all available pages before exploring further.
View on GitHub ->
Chinese LLM routing through AIsa. Send Chinese-language tasks to Qwen, DeepSeek, GLM, Baichuan, and related model families.
Install
aisa skills install cn-llm
What can agents do with it?
Chinese prompts Route Chinese-language tasks to strong Chinese LLMs.
Model selection Choose between Qwen, DeepSeek, GLM, Baichuan, and related models.
Bilingual workflows Switch between Chinese and global model families.
Agent defaults Give agents practical routing heuristics.
🔥 What You Can Do
Intelligent Chat
"Use Qwen to answer Chinese questions, use DeepSeek for coding"
Deep Reasoning
"Use DeepSeek-R1 for complex reasoning tasks"
Code Generation
"Use DeepSeek-Coder to generate Python code with explanations"
Long Text Processing
"Use Qwen-Long for ultra-long document summarization"
Model Comparison
"Compare response quality between Qwen-Max and DeepSeek-V3"
Supported Models
Qwen (Alibaba)
Model Input Price Output Price Features qwen3-max $1.37/M $5.48/M Most powerful general model qwen3-max-2026-01-23 $1.37/M $5.48/M Latest version qwen3-coder-plus $2.86/M $28.60/M Enhanced code generation qwen3-coder-flash $0.72/M $3.60/M Fast code generation qwen3-coder-480b-a35b-instruct $2.15/M $8.60/M 480B large model qwen3-vl-plus $0.43/M $4.30/M Vision-language model qwen3-vl-flash $0.86/M $0.86/M Fast vision model qwen3-omni-flash $4.00/M $16.00/M Multimodal model qwen-vl-max $0.23/M $0.57/M Vision-language qwen-plus-2025-12-01 $1.26/M $12.60/M Plus version qwen-mt-flash $0.168/M $0.514/M Fast machine translation qwen-mt-lite $0.13/M $0.39/M Lite machine translation
DeepSeek
Model Input Price Output Price Features deepseek-r1 $2.00/M $8.00/M Reasoning model, supports Tools deepseek-v3 $1.00/M $4.00/M General chat, 671B parameters deepseek-v3-0324 $1.20/M $4.80/M V3 stable version deepseek-v3.1 $4.00/M $12.00/M Latest Terminus version
Note : Prices are in M (million tokens). Model availability may change, see console.aisa.one/pricing for the latest list.
Quick Start
export AISA_API_KEY = "your-key"
API Endpoints
OpenAI Compatible Interface
POST https://api.aisa.one/v1/chat/completions
Qwen Example
curl -X POST "https://api.aisa.one/v1/chat/completions" \
-H "Authorization: Bearer $AISA_API_KEY " \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-max",
"messages": [
{"role": "system", "content": "You are a professional Chinese assistant."},
{"role": "user", "content": "Please explain what a large language model is?"}
],
"temperature": 0.7,
"max_tokens": 1000
}'
DeepSeek Example
# DeepSeek-V3 general chat (671B parameters)
curl -X POST "https://api.aisa.one/v1/chat/completions" \
-H "Authorization: Bearer $AISA_API_KEY " \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-v3",
"messages": [{"role": "user", "content": "Write a quicksort algorithm in Python"}],
"temperature": 0.3
}'
# DeepSeek-R1 deep reasoning (supports Tools)
curl -X POST "https://api.aisa.one/v1/chat/completions" \
-H "Authorization: Bearer $AISA_API_KEY " \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-r1",
"messages": [{"role": "user", "content": "A farmer needs to cross a river with a wolf, a sheep, and a cabbage. The boat can only carry the farmer and one item at a time. If the farmer is not present, the wolf will eat the sheep, and the sheep will eat the cabbage. How can the farmer safely cross?"}]
}'
# DeepSeek-V3.1 Terminus latest version
curl -X POST "https://api.aisa.one/v1/chat/completions" \
-H "Authorization: Bearer $AISA_API_KEY " \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-v3.1",
"messages": [{"role": "user", "content": "Implement an LRU cache with get and put operations"}]
}'
Qwen3 Code Generation Example
curl -X POST "https://api.aisa.one/v1/chat/completions" \
-H "Authorization: Bearer $AISA_API_KEY " \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-coder-plus",
"messages": [{"role": "user", "content": "Implement a thread-safe Map in Go"}]
}'
Parameter Reference
Parameter Type Required Description modelstring Yes Model identifier messagesarray Yes Message list temperaturenumber No Randomness (0-2, default 1) max_tokensinteger No Maximum tokens to generate streamboolean No Stream output (default false) top_pnumber No Nucleus sampling parameter (0-1)
{
"id" : "chatcmpl-xxx" ,
"object" : "chat.completion" ,
"created" : 1234567890 ,
"model" : "qwen-max" ,
"choices" : [
{
"index" : 0 ,
"message" : {
"role" : "assistant" ,
"content" : "A large language model (LLM) is a deep learning-based..."
},
"finish_reason" : "stop"
}
],
"usage" : {
"prompt_tokens" : 30 ,
"completion_tokens" : 150 ,
"total_tokens" : 180 ,
"cost" : 0.001
}
}
Streaming Output
curl -X POST "https://api.aisa.one/v1/chat/completions" \
-H "Authorization: Bearer $AISA_API_KEY " \
-H "Content-Type: application/json" \
-d '{
"model": "qwen-plus",
"messages": [{"role": "user", "content": "Tell a Chinese folk story"}],
"stream": true
}'
Returns Server-Sent Events (SSE) format:
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Once"}}]}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":" upon"}}]}
...
data: [DONE]
Python Client
CLI Usage
# Qwen chat
python3 scripts/cn_llm_client.py chat --model qwen3-max --message "Hello, please introduce yourself"
# Qwen3 code generation
python3 scripts/cn_llm_client.py chat --model qwen3-coder-plus --message "Write a binary search algorithm"
# DeepSeek-R1 reasoning
python3 scripts/cn_llm_client.py chat --model deepseek-r1 --message "Which is larger, 9.9 or 9.11? Please reason in detail"
# DeepSeek-V3 chat
python3 scripts/cn_llm_client.py chat --model deepseek-v3 --message "Tell a story" --stream
# With system prompt
python3 scripts/cn_llm_client.py chat --model qwen3-max --system "You are a classical poetry expert" --message "Write a poem about plum blossoms"
# Model comparison
python3 scripts/cn_llm_client.py compare --models "qwen3-max,deepseek-v3" --message "What is quantum computing?"
# List supported models
python3 scripts/cn_llm_client.py models
Python SDK Usage
from cn_llm_client import CNLLMClient
client = CNLLMClient() # Uses AISA_API_KEY environment variable
# Qwen chat
response = client.chat(
model = "qwen3-max" ,
messages = [{ "role" : "user" , "content" : "Hello!" }]
)
print (response[ "choices" ][ 0 ][ "message" ][ "content" ])
# Qwen3 code generation
response = client.chat(
model = "qwen3-coder-plus" ,
messages = [
{ "role" : "system" , "content" : "You are a professional programmer." },
{ "role" : "user" , "content" : "Implement a singleton pattern in Python" }
],
temperature = 0.3
)
# Streaming output
for chunk in client.chat_stream(
model = "deepseek-v3" ,
messages = [{ "role" : "user" , "content" : "Tell a story about an idiom" }]
):
print (chunk, end = "" , flush = True )
# Model comparison
results = client.compare_models(
models = [ "qwen3-max" , "deepseek-v3" , "deepseek-r1" ],
message = "Explain what machine learning is"
)
for model, result in results.items():
print ( f " { model } : { result[ 'response' ][: 100 ] } ..." )
Use Cases
1. Chinese Content Generation
# Copywriting
response = client.chat(
model = "qwen3-max" ,
messages = [
{ "role" : "system" , "content" : "You are a professional copywriter." },
{ "role" : "user" , "content" : "Write a product introduction for a smart watch" }
]
)
2. Code Development
# Code generation and explanation
response = client.chat(
model = "qwen3-coder-plus" ,
messages = [{ "role" : "user" , "content" : "Implement a thread-safe Map in Go" }]
)
3. Complex Reasoning
# Mathematical reasoning
response = client.chat(
model = "deepseek-r1" ,
messages = [{ "role" : "user" , "content" : "Prove: For any positive integer n, n³-n is divisible by 6" }]
)
4. Visual Understanding
# Image understanding
response = client.chat(
model = "qwen3-vl-plus" ,
messages = [
{ "role" : "user" , "content" : [
{ "type" : "text" , "text" : "Describe the content of this image" },
{ "type" : "image_url" , "image_url" : { "url" : "https://example.com/image.jpg" }}
]}
]
)
5. Model Routing Strategy
MODEL_MAP = {
"chat" : "qwen3-max" , # General chat
"code" : "qwen3-coder-plus" , # Code generation
"reasoning" : "deepseek-r1" , # Complex reasoning
"vision" : "qwen3-vl-plus" , # Visual understanding
"fast" : "qwen3-coder-flash" , # Fast response
"translate" : "qwen-mt-flash" # Machine translation
}
def route_by_task ( task_type : str , message : str ) -> str :
model = MODEL_MAP .get(task_type, "qwen3-max" )
return client.chat( model = model, messages = [{ "role" : "user" , "content" : message}])
Error Handling
Errors return JSON with error field:
{
"error" : {
"code" : "model_not_found" ,
"message" : "Model 'xxx' is not available"
}
}
Common error codes:
401 - Invalid or missing API Key
402 - Insufficient balance
404 - Model not found
429 - Rate limit exceeded
500 - Server error
Pricing
Model Input ($/M) Output ($/M) qwen3-max $1.37 $5.48 qwen3-coder-plus $2.86 $28.60 qwen3-coder-flash $0.72 $3.60 qwen3-vl-plus $0.43 $4.30 deepseek-v3 $1.00 $4.00 deepseek-r1 $2.00 $8.00 deepseek-v3.1 $4.00 $12.00
Price unit: $ per Million tokens. Each response includes usage.cost and usage.credits_remaining.
Get Started
Register at aisa.one
Get API Key
Top up (pay-as-you-go)
Set environment variable: export AISA_API_KEY="your-key"
Full API Reference
See API Reference for complete endpoint documentation.
Get started
Sign up at aisa.one (new accounts start with $2 free credit).
Generate an API key from the console.
Set your key and install the skill:
export AISA_API_KEY = "your-key"
aisa skills install cn-llm
Start a new agent session so the runtime loads the updated skill instructions.
Chinese LLMs Overview of Chinese model families in AIsa.
AIsa LLM Router General-purpose model routing across providers.
Model Catalog Browse supported model IDs.