How to call models
Use yourAISA_API_KEY as a Bearer token. For OpenAI-compatible SDKs, set the base URL to https://api.aisa.one/v1.
Endpoint types
| Endpoint | Current model count |
|---|---|
POST /v1/chat/completions | 55 |
POST /v1/messages | 10 |
POST /v1/responses | 4 |
POST /v1/images/generations | 1 |
POST /v1beta/models/{model}:generateContent | 1 |
POST /v1/chat/completions. Claude models can also expose Anthropic-compatible POST /v1/messages. Selected models expose POST /v1/responses, Gemini-compatible POST /v1beta/models/{model}:generateContent, or image-generation routes. Always pass the exact model string shown below.
Capability vocabulary
| Capability | What it means in AIsa Model Gateway |
|---|---|
| Text | Natural-language generation, summarization, analysis, translation, and long-context reasoning. |
| Coding | Code reasoning, code completion, software-agent planning, and tool-use workflows. |
| Vision | Image/document understanding, visual coding, and spatial reasoning over visual inputs. |
| Audio | Audio understanding or speech-to-speech style interaction where supported by the upstream model. |
| Image | Image generation, image editing, image consistency, or rendering text in images. |
| Video | Video understanding, temporal reasoning, long-video work, image-to-video, or omni/video workflows. |
Provider summary
| Provider | Models | Capabilities | Example model IDs |
|---|---|---|---|
| OpenAI | 8 | Coding, Text, Vision, Image | gpt-5, gpt-5-mini, gpt-5.2, gpt-5.2-chat-latest … |
| Anthropic | 7 | Coding, Text, Vision | claude-haiku-4-5-20251001, claude-opus-4-1-20250805, claude-opus-4-5-20251101, claude-opus-4-6 … |
| Google Gemini | 2 | Text | gemini-3-pro-preview, gemini-3.5-flash |
| xAI | 4 | Text, Vision, Coding | grok-4.20-0309-non-reasoning, grok-4.20-0309-reasoning, grok-4.3, grok-build-0.1 |
| DeepSeek | 6 | Coding, Text | deepseek-r1, deepseek-v3, deepseek-v3.1, deepseek-v3.2 … |
| Alibaba | 16 | Audio, Coding, Text, Video, Vision, Image | qwen-flash, qwen-mt-flash, qwen-mt-lite, qwen-plus-2025-12-01 … |
| Moonshot | 3 | Coding, Text, Video, Vision | kimi-k2-thinking, kimi-k2.5, kimi-k2.6 |
| MiniMax | 2 | Coding, Text, Video, Vision | MiniMax-M2.5, MiniMax-M3 |
| Zhipu GLM | 1 | Coding, Text | glm-5 |
| ByteDance | 8 | Text, Video, Vision, Coding, Image | seed-1-6-250915, seed-1-6-flash-250715, seed-1-8-251228, seed-2-0-lite-260228 … |
Complete model details
OpenAI
| Model ID | Context | Capabilities | Endpoint(s) | Billing |
|---|---|---|---|---|
gpt-5 | 400,000 | Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 10.0000 out per 1M tokens (cache read $0.1250/M) |
gpt-5-mini | 400,000 | Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 1.2000 out per 1M tokens (cache read $0.0150/M) |
gpt-5.2 | 400,000 | Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 14.0000 out per 1M tokens (cache read $1.7500/M) |
gpt-5.2-chat-latest | 400,000 | Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 14.0000 out per 1M tokens (cache read $1.7500/M) |
gpt-5.3-codex | 1,000,000 | Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 14.0000 out per 1M tokens (cache read $1.7500/M) |
gpt-5.4 | 1,050,000 | Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/responses | 15.0000 out per 1M tokens (cache read $2.5000/M) |
gpt-5.5 | 400,000 | Coding, Text, Vision; code reasoning, long context, reasoning, vision | POST /v1/messages, POST /v1/chat/completions, POST /v1/responses | 40.0000 out per 1M tokens (cache read $5.0000/M) |
gpt-image-2 | N/A | Image, Vision; image editing, image generation, text in images, vision | POST /v1/images/generations | 30.0000 out per 1M tokens (cache write $2.0000/M) |
Anthropic
| Model ID | Context | Capabilities | Endpoint(s) | Billing |
|---|---|---|---|---|
claude-haiku-4-5-20251001 | 200,000 | Coding, Text, Vision; reasoning, long context, spatial vision, document vision, visual coding, code reasoning, agentic coding | POST /v1/messages, POST /v1/chat/completions | 5.0000 out per 1M tokens (cache read 2.0000/M) |
claude-opus-4-1-20250805 | 200,000 | Coding, Text, Vision; reasoning, long context, document vision, visual coding, code reasoning, agentic coding | POST /v1/messages, POST /v1/chat/completions | 75.0000 out per 1M tokens (cache read 30.0000/M) |
claude-opus-4-5-20251101 | 1,000,000 | Coding, Text, Vision; reasoning, long context, creative writing, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/messages, POST /v1/chat/completions | 25.0000 out per 1M tokens (cache read 10.0000/M) |
claude-opus-4-6 | 1,000,000 | Coding, Text, Vision; reasoning, long context, creative writing, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/messages, POST /v1/chat/completions | 25.0000 out per 1M tokens (cache read 10.0000/M) |
claude-opus-4-7 | 1,000,000 | Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/messages, POST /v1/chat/completions | 25.0000 out per 1M tokens (cache read 10.0000/M) |
claude-opus-4-8 | 1,000,000 | Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/messages, POST /v1/chat/completions | 25.0000 out per 1M tokens (cache read 10.0000/M) |
claude-sonnet-4-5-20250929 | 200,000 | Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/messages, POST /v1/chat/completions | 15.0000 out per 1M tokens (cache read 6.0000/M) |
Google Gemini
| Model ID | Context | Capabilities | Endpoint(s) | Billing |
|---|---|---|---|---|
gemini-3-pro-preview | N/A | Text; long context, creative writing | POST /v1/chat/completions | 12.0000 out per 1M tokens |
gemini-3.5-flash | N/A | Text; long context, creative writing | POST /v1beta/models/{model}:generateContent, POST /v1/chat/completions | 9.0000 out per 1M tokens (cache read $0.1500/M) |
xAI
| Model ID | Context | Capabilities | Endpoint(s) | Billing |
|---|---|---|---|---|
grok-4.20-0309-non-reasoning | 1,000,000 | Text, Vision; long context, creative writing, spatial vision, document vision | POST /v1/chat/completions | 2.5000 out per 1M tokens |
grok-4.20-0309-reasoning | 1,000,000 | Text, Vision; reasoning, long context, creative writing, spatial vision, document vision | POST /v1/chat/completions | 2.5000 out per 1M tokens |
grok-4.3 | 1,000,000 | Text, Vision; reasoning, long context, creative writing, spatial vision, document vision | POST /v1/chat/completions | 2.5000 out per 1M tokens |
grok-build-0.1 | 256,000 | Coding, Text, Vision; reasoning, long context, visual coding, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 2.0000 out per 1M tokens |
DeepSeek
| Model ID | Context | Capabilities | Endpoint(s) | Billing |
|---|---|---|---|---|
deepseek-r1 | 262,144 | Coding, Text; code reasoning, long context, reasoning | POST /v1/chat/completions | 1.6058 out per 1M tokens (cache read $0.4018/M) |
deepseek-v3 | 262,144 | Coding, Text; code reasoning, long context, reasoning | POST /v1/chat/completions | 0.8029 out per 1M tokens (cache read $0.2009/M) |
deepseek-v3.1 | 262,144 | Coding, Text; code reasoning, long context, reasoning | POST /v1/chat/completions | 1.2047 out per 1M tokens (cache read $0.4018/M) |
deepseek-v3.2 | 128,000 | Coding, Text; reasoning, long context, creative writing, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 0.3017 out per 1M tokens (cache read $0.2009/M) |
deepseek-v4-flash | 262,144 | Coding, Text; code reasoning, long context, reasoning | POST /v1/chat/completions | 0.1960 out per 1M tokens (cache read $0.0020/M) |
deepseek-v4-pro | 262,144 | Coding, Text; code reasoning, long context, reasoning | POST /v1/messages, POST /v1/chat/completions, POST /v1/responses | 0.6090 out per 1M tokens (cache read $0.0025/M) |
Alibaba
| Model ID | Context | Capabilities | Endpoint(s) | Billing |
|---|---|---|---|---|
qwen-flash | 1,000,000 | Audio, Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, speech-to-speech, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 0.1800 out per 1M tokens (cache read $0.0225/M) |
qwen-mt-flash | 1,000,000 | Text; long context, translation | POST /v1/chat/completions | 0.2205 out per 1M tokens (cache read $0.0720/M) |
qwen-mt-lite | 1,000,000 | Text; translation | POST /v1/chat/completions | 0.2520 out per 1M tokens (cache read $0.0840/M) |
qwen-plus-2025-12-01 | 1,000,000 | Audio, Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, speech-to-speech, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 0.8400 out per 1M tokens (cache read $0.2800/M) |
qwen3-coder-480b-a35b-instruct | 262,144 | Coding, Text; reasoning, long context, code reasoning, agentic coding | POST /v1/chat/completions | 5.2500 out per 1M tokens (cache read $1.0500/M) |
qwen3-coder-flash | 1,000,000 | Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 1.0500 out per 1M tokens (cache read $0.2100/M) |
qwen3-coder-plus | 1,000,000 | Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 3.5000 out per 1M tokens (cache read $0.7000/M) |
qwen3-max | 262,144 | Audio, Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, speech-to-speech, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 3.6000 out per 1M tokens (cache read $0.7200/M) |
qwen3-vl-flash | 131,072 | Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 0.2800 out per 1M tokens (cache read $0.0350/M) |
qwen3-vl-flash-2025-10-15 | 131,072 | Coding, Text, Video, Vision; reasoning, long context, translation, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 0.2800 out per 1M tokens (cache read $0.0350/M) |
qwen3-vl-plus | 131,072 | Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 1.1200 out per 1M tokens (cache read $0.1400/M) |
qwen3.6-plus | 1,000,000 | Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, omni/video understanding, long video, temporal video, code reasoning, agentic coding | POST /v1/chat/completions | 1.6510 out per 1M tokens (cache read $0.2760/M) |
qwen3.6-plus-2026-04-02 | 262,144 | Coding, Text, Vision; code reasoning, long context, reasoning, vision | POST /v1/chat/completions | 1.6510 out per 1M tokens (cache read $0.2760/M) |
qwen3.7-max | 1,000,000 | Coding, Text; reasoning, long context, translation, creative writing, code reasoning, agentic coding | POST /v1/chat/completions | 3.4657 out per 1M tokens (cache read 1.4441/M) |
wan2.7-image | N/A | Image, Text, Vision; reasoning, vision, image generation, image editing, text in images, image consistency | POST /v1/chat/completions | $0.030 / request |
wan2.7-image-pro | N/A | Image, Text, Video, Vision; reasoning, long context, vision, image generation, image editing, text in images, image consistency, image-to-video | POST /v1/chat/completions | $0.075 / request |
Moonshot
| Model ID | Context | Capabilities | Endpoint(s) | Billing |
|---|---|---|---|---|
kimi-k2-thinking | 256,000 | Coding, Text; reasoning, long context, code reasoning, agentic coding | POST /v1/chat/completions | 1.6058 out per 1M tokens (cache read $0.4018/M) |
kimi-k2.5 | 262,144 | Coding, Text, Video, Vision; reasoning, long context, spatial vision, document vision, visual coding, omni/video understanding, long video, code reasoning, agentic coding | POST /v1/chat/completions | 2.1077 out per 1M tokens (cache read $0.4018/M) |
kimi-k2.6 | 128,000 | Text; long context, reasoning | POST /v1/chat/completions | 2.5992 out per 1M tokens (cache read $0.6257/M) |
MiniMax
| Model ID | Context | Capabilities | Endpoint(s) | Billing |
|---|---|---|---|---|
MiniMax-M2.5 | 262,144 | Coding, Text; reasoning, long context, creative writing, code reasoning, agentic coding | POST /v1/chat/completions | 0.8400 out per 1M tokens (cache read $0.2100/M) |
MiniMax-M3 | 1,000,000 | Coding, Text, Video, Vision; reasoning, long context, code reasoning, agentic coding, vision, long video | POST /v1/messages, POST /v1/chat/completions, POST /v1/responses | 0.8400 out per 1M tokens (cache read $0.0500/M) |
Zhipu GLM
| Model ID | Context | Capabilities | Endpoint(s) | Billing |
|---|---|---|---|---|
glm-5 | 128,000 | Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 1.8060 out per 1M tokens (cache read $0.4011/M) |
ByteDance
| Model ID | Context | Capabilities | Endpoint(s) | Billing |
|---|---|---|---|---|
seed-1-6-250915 | 262,144 | Text, Video, Vision; reasoning, long context, creative writing, vision, omni/video understanding | POST /v1/chat/completions | 0.9000 out per 1M tokens (cache read $0.2250/M) |
seed-1-6-flash-250715 | 262,144 | Text, Video, Vision; reasoning, long context, spatial vision, omni/video understanding, temporal video | POST /v1/chat/completions | 0.2700 out per 1M tokens (cache read $0.0675/M) |
seed-1-8-251228 | 262,144 | Coding, Text, Video, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 1.8000 out per 1M tokens (cache read $0.2250/M) |
seed-2-0-lite-260228 | 262,144 | Coding, Text, Video, Vision; reasoning, long context, creative writing, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 2.0000 out per 1M tokens (cache read $0.2500/M) |
seed-2-0-mini-260215 | 262,144 | Coding, Text, Video, Vision; reasoning, long context, spatial vision, document vision, omni/video understanding, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 0.4000 out per 1M tokens (cache read $0.1000/M) |
seed-2-0-pro-260328 | 262,144 | Coding, Text, Video, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | 3.0000 out per 1M tokens (cache read $0.5000/M) |
seedream-4-5-251128 | N/A | Image, Vision; vision, image generation, image editing, text in images, image consistency | POST /v1/chat/completions | $0.036 / request |
seedream-5-0-260128 | 262,144 | Image, Vision; image editing, image generation, text in images, vision | POST /v1/chat/completions | $0.035 / request |
Choosing a model
| Need | Start with | Why |
|---|---|---|
| Frontier text + vision | gpt-5.5, claude-opus-4-8, gpt-5.4 | Strong reasoning and broad multimodal/coding coverage. |
| Agentic coding | gpt-5.3-codex, claude-opus-4-8, qwen3-coder-plus, MiniMax-M3 | Coding, long-context, and agentic sub-capabilities. |
| Low-cost high-volume text | qwen-flash, deepseek-v4-flash, qwen-mt-flash | Very low input/output pricing for routine tasks. |
| Long-context Chinese or bilingual work | qwen3.6-plus, qwen3.7-max, MiniMax-M3 | 1M-token context options with Chinese-language strength. |
| Visual/document tasks | qwen3-vl-plus, claude-opus-4-8, gpt-5.4 | Vision/document/spatial capability tags. |
| Image generation | gpt-image-2, seedream-5-0-260128, wan2.7-image-pro | Image-generation and image-editing model IDs with per-request billing. |
Notes for agents
- Do not invent AIsa model IDs. Use the exact
modelstrings in the tables. - Do not assume a model supports every modality its upstream family supports. Use the capability tags listed here or check the live model page.
- If a model appears in aisa.one/models but not in a static table, the pricing API has likely enabled it at runtime; prefer the live catalog.
- Pricing tables are informational. The final billed amount appears in AIsa Usage Logs and may include workspace-level pricing rules.