AIsa’s Model Gateway routes LLM and media-generation requests through a single API key. This guide is the stable Markdown source for agents that read llms.txt and need to know which AIsa model IDs exist, which endpoint each model uses, and what each model can do. The live catalog is aisa.one/models. The tables below were refreshed from live Model Gateway metadata on June 4, 2026. Model availability and prices can change, so treat the live catalog as the source of truth before production routing.

How to call models

Use your AISA_API_KEY as a Bearer token. For OpenAI-compatible SDKs, set the base URL to https://api.aisa.one/v1.
```python
from openai import OpenAI

# Point the OpenAI SDK at the AIsa gateway instead of the default OpenAI host.
client = OpenAI(
    api_key="YOUR_AISA_API_KEY",
    base_url="https://api.aisa.one/v1",
)

# Pass the exact model ID as listed in this guide.
response = client.chat.completions.create(
    model="qwen3.7-max",
    messages=[{"role": "user", "content": "Compare these model options for a coding agent."}],
)
print(response.choices[0].message.content)
```
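
For clients without an OpenAI-compatible SDK, the same call can be built as a plain HTTP request. The sketch below uses only the Python standard library and assumes the gateway accepts the standard OpenAI chat-completions JSON body at the base URL given above. The helper name `build_chat_request` is hypothetical, and the request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.aisa.one/v1"  # base URL from this guide


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/chat/completions request.

    Hypothetical helper for illustration. The JSON body follows the
    OpenAI chat-completions shape, which is assumed to be accepted as-is.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # API key sent as a Bearer token
            "Content-Type": "application/json",
        },
    )


req = build_chat_request("YOUR_AISA_API_KEY", "qwen3.7-max", "Hello")
# To send it, pass `req` to urllib.request.urlopen and JSON-decode the response body.
```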

Endpoint types

| Endpoint | Current model count |
| --- | --- |
| POST /v1/chat/completions | 55 |
| POST /v1/messages | 10 |
| POST /v1/responses | 4 |
| POST /v1/images/generations | 1 |
| POST /v1beta/models/{model}:generateContent | 1 |

Most text, vision, audio, coding, and multimodal models use POST /v1/chat/completions. Claude models also expose the Anthropic-compatible POST /v1/messages, as do a few non-Claude models (gpt-5.5, deepseek-v4-pro, MiniMax-M3). Selected models expose POST /v1/responses or the Gemini-compatible POST /v1beta/models/{model}:generateContent, and gpt-image-2 uses the image-generation route POST /v1/images/generations. Always pass the exact model string shown below.
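
As a rough illustration of how an endpoint entry from this guide maps to a request URL, the sketch below fills the `{model}` placeholder used by the Gemini-compatible route. The helper `endpoint_url` is hypothetical, and serving every route from the host `api.aisa.one` is an assumption inferred from the base URL above, not something this guide states for the `/v1beta` path.

```python
GATEWAY_HOST = "https://api.aisa.one"  # assumption: all routes share the host of the base URL


def endpoint_url(endpoint: str, model: str) -> str:
    """Convert an entry like 'POST /v1/chat/completions' into a full URL.

    Hypothetical helper: splits off the HTTP method and substitutes the
    model ID wherever the path contains the literal '{model}' placeholder.
    """
    method, path = endpoint.split(" ", 1)
    if method != "POST":
        raise ValueError(f"unexpected HTTP method: {method}")
    return GATEWAY_HOST + path.replace("{model}", model)


chat_url = endpoint_url("POST /v1/chat/completions", "qwen3.7-max")
gemini_url = endpoint_url(
    "POST /v1beta/models/{model}:generateContent", "gemini-3.5-flash"
)
```

Paths without a placeholder are returned unchanged, so the model ID still has to be sent in the request body for those routes.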

Capability vocabulary

| Capability | What it means in AIsa Model Gateway |
| --- | --- |
| Text | Natural-language generation, summarization, analysis, translation, and long-context reasoning. |
| Coding | Code reasoning, code completion, software-agent planning, and tool-use workflows. |
| Vision | Image/document understanding, visual coding, and spatial reasoning over visual inputs. |
| Audio | Audio understanding or speech-to-speech style interaction where supported by the upstream model. |
| Image | Image generation, image editing, image consistency, or rendering text in images. |
| Video | Video understanding, temporal reasoning, long-video work, image-to-video, or omni/video workflows. |
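
One way an agent might use these capability tags is to filter candidate models by the tags a task requires. The following is a minimal sketch: `CATALOG_EXCERPT` is a hand-copied excerpt of a few rows from the model tables in this guide, and `models_with` is a hypothetical helper, not part of any AIsa SDK.

```python
# Excerpt of capability tags copied from the model tables in this guide.
# A real agent would load the full, current list from the live catalog.
CATALOG_EXCERPT = {
    "gpt-5": {"Coding", "Text", "Vision"},
    "qwen-flash": {"Audio", "Coding", "Text", "Video", "Vision"},
    "qwen-mt-lite": {"Text"},
    "kimi-k2.6": {"Text"},
    "wan2.7-image": {"Image", "Text", "Vision"},
}


def models_with(required: set[str]) -> list[str]:
    """Return model IDs (sorted) whose tags include every required capability."""
    return sorted(
        model_id
        for model_id, tags in CATALOG_EXCERPT.items()
        if required <= tags  # subset test: all required tags are present
    )


vision_coders = models_with({"Vision", "Coding"})
image_models = models_with({"Image"})
```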

Provider summary

| Provider | Models | Capabilities | Example model IDs |
| --- | --- | --- | --- |
| OpenAI | 8 | Coding, Text, Vision, Image | gpt-5, gpt-5-mini, gpt-5.2, gpt-5.2-chat-latest |
| Anthropic | 7 | Coding, Text, Vision | claude-haiku-4-5-20251001, claude-opus-4-1-20250805, claude-opus-4-5-20251101, claude-opus-4-6 |
| Google Gemini | 2 | Text | gemini-3-pro-preview, gemini-3.5-flash |
| xAI | 4 | Text, Vision, Coding | grok-4.20-0309-non-reasoning, grok-4.20-0309-reasoning, grok-4.3, grok-build-0.1 |
| DeepSeek | 6 | Coding, Text | deepseek-r1, deepseek-v3, deepseek-v3.1, deepseek-v3.2 |
| Alibaba | 16 | Audio, Coding, Text, Video, Vision, Image | qwen-flash, qwen-mt-flash, qwen-mt-lite, qwen-plus-2025-12-01 |
| Moonshot | 3 | Coding, Text, Video, Vision | kimi-k2-thinking, kimi-k2.5, kimi-k2.6 |
| MiniMax | 2 | Coding, Text, Video, Vision | MiniMax-M2.5, MiniMax-M3 |
| Zhipu GLM | 1 | Coding, Text | glm-5 |
| ByteDance | 8 | Text, Video, Vision, Coding, Image | seed-1-6-250915, seed-1-6-flash-250715, seed-1-8-251228, seed-2-0-lite-260228 |

Complete model details

OpenAI

| Model ID | Context | Capabilities | Endpoint(s) | Billing |
| --- | --- | --- | --- | --- |
| gpt-5 | 400,000 | Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $1.2500 in / $10.0000 out per 1M tokens (cache read $0.1250/M) |
| gpt-5-mini | 400,000 | Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.1500 in / $1.2000 out per 1M tokens (cache read $0.0150/M) |
| gpt-5.2 | 400,000 | Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $1.7500 in / $14.0000 out per 1M tokens (cache read $1.7500/M) |
| gpt-5.2-chat-latest | 400,000 | Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $1.7500 in / $14.0000 out per 1M tokens (cache read $1.7500/M) |
| gpt-5.3-codex | 1,000,000 | Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $1.7500 in / $14.0000 out per 1M tokens (cache read $1.7500/M) |
| gpt-5.4 | 1,050,000 | Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/responses | $2.5000 in / $15.0000 out per 1M tokens (cache read $2.5000/M) |
| gpt-5.5 | 400,000 | Coding, Text, Vision; code reasoning, long context, reasoning, vision | POST /v1/messages, POST /v1/chat/completions, POST /v1/responses | $5.0000 in / $40.0000 out per 1M tokens (cache read $5.0000/M) |
| gpt-image-2 | N/A | Image, Vision; image editing, image generation, text in images, vision | POST /v1/images/generations | $8.0000 in / $30.0000 out per 1M tokens (cache write $2.0000/M) |
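
The per-1M-token prices in the Billing column can be turned into a rough per-request estimate. The sketch below is a hypothetical helper that ignores cache read/write pricing and any workspace-level pricing rules, so it is only an approximation of what would be billed. The example uses the gpt-5 prices listed in this section ($1.2500 in, $10.0000 out).

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    in_price_per_m: float,
    out_price_per_m: float,
) -> float:
    """Approximate request cost in USD from per-1M-token prices.

    Hypothetical helper: cache pricing and workspace-level rules are
    deliberately not modeled here.
    """
    total = input_tokens * in_price_per_m + output_tokens * out_price_per_m
    return total / 1_000_000


# gpt-5 at $1.2500 in / $10.0000 out per 1M tokens:
# 10,000 input tokens  -> 10,000 * 1.25 / 1e6 = $0.0125
#  2,000 output tokens ->  2,000 * 10.0 / 1e6 = $0.0200
cost = estimate_cost_usd(10_000, 2_000, 1.25, 10.0)
print(f"${cost:.4f}")  # prints $0.0325
```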

Anthropic

| Model ID | Context | Capabilities | Endpoint(s) | Billing |
| --- | --- | --- | --- | --- |
| claude-haiku-4-5-20251001 | 200,000 | Coding, Text, Vision; reasoning, long context, spatial vision, document vision, visual coding, code reasoning, agentic coding | POST /v1/messages, POST /v1/chat/completions | $1.0000 in / $5.0000 out per 1M tokens (cache read $0.1000/M; cache write $2.0000/M) |
| claude-opus-4-1-20250805 | 200,000 | Coding, Text, Vision; reasoning, long context, document vision, visual coding, code reasoning, agentic coding | POST /v1/messages, POST /v1/chat/completions | $15.0000 in / $75.0000 out per 1M tokens (cache read $1.5000/M; cache write $30.0000/M) |
| claude-opus-4-5-20251101 | 1,000,000 | Coding, Text, Vision; reasoning, long context, creative writing, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/messages, POST /v1/chat/completions | $5.0000 in / $25.0000 out per 1M tokens (cache read $0.5000/M; cache write $10.0000/M) |
| claude-opus-4-6 | 1,000,000 | Coding, Text, Vision; reasoning, long context, creative writing, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/messages, POST /v1/chat/completions | $5.0000 in / $25.0000 out per 1M tokens (cache read $0.5000/M; cache write $10.0000/M) |
| claude-opus-4-7 | 1,000,000 | Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/messages, POST /v1/chat/completions | $5.0000 in / $25.0000 out per 1M tokens (cache read $0.5000/M; cache write $10.0000/M) |
| claude-opus-4-8 | 1,000,000 | Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/messages, POST /v1/chat/completions | $5.0000 in / $25.0000 out per 1M tokens (cache read $0.5000/M; cache write $10.0000/M) |
| claude-sonnet-4-5-20250929 | 200,000 | Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding | POST /v1/messages, POST /v1/chat/completions | $3.0000 in / $15.0000 out per 1M tokens (cache read $0.3000/M; cache write $6.0000/M) |

Google Gemini

| Model ID | Context | Capabilities | Endpoint(s) | Billing |
| --- | --- | --- | --- | --- |
| gemini-3-pro-preview | N/A | Text; long context, creative writing | POST /v1/chat/completions | $2.0000 in / $12.0000 out per 1M tokens |
| gemini-3.5-flash | N/A | Text; long context, creative writing | POST /v1beta/models/{model}:generateContent, POST /v1/chat/completions | $1.5000 in / $9.0000 out per 1M tokens (cache read $0.1500/M) |

xAI

| Model ID | Context | Capabilities | Endpoint(s) | Billing |
| --- | --- | --- | --- | --- |
| grok-4.20-0309-non-reasoning | 1,000,000 | Text, Vision; long context, creative writing, spatial vision, document vision | POST /v1/chat/completions | $1.2500 in / $2.5000 out per 1M tokens |
| grok-4.20-0309-reasoning | 1,000,000 | Text, Vision; reasoning, long context, creative writing, spatial vision, document vision | POST /v1/chat/completions | $1.2500 in / $2.5000 out per 1M tokens |
| grok-4.3 | 1,000,000 | Text, Vision; reasoning, long context, creative writing, spatial vision, document vision | POST /v1/chat/completions | $1.2500 in / $2.5000 out per 1M tokens |
| grok-build-0.1 | 256,000 | Coding, Text, Vision; reasoning, long context, visual coding, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $1.0000 in / $2.0000 out per 1M tokens |

DeepSeek

| Model ID | Context | Capabilities | Endpoint(s) | Billing |
| --- | --- | --- | --- | --- |
| deepseek-r1 | 262,144 | Coding, Text; code reasoning, long context, reasoning | POST /v1/chat/completions | $0.4018 in / $1.6058 out per 1M tokens (cache read $0.4018/M) |
| deepseek-v3 | 262,144 | Coding, Text; code reasoning, long context, reasoning | POST /v1/chat/completions | $0.2009 in / $0.8029 out per 1M tokens (cache read $0.2009/M) |
| deepseek-v3.1 | 262,144 | Coding, Text; code reasoning, long context, reasoning | POST /v1/chat/completions | $0.4018 in / $1.2047 out per 1M tokens (cache read $0.4018/M) |
| deepseek-v3.2 | 128,000 | Coding, Text; reasoning, long context, creative writing, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.2009 in / $0.3017 out per 1M tokens (cache read $0.2009/M) |
| deepseek-v4-flash | 262,144 | Coding, Text; code reasoning, long context, reasoning | POST /v1/chat/completions | $0.0980 in / $0.1960 out per 1M tokens (cache read $0.0020/M) |
| deepseek-v4-pro | 262,144 | Coding, Text; code reasoning, long context, reasoning | POST /v1/messages, POST /v1/chat/completions, POST /v1/responses | $0.3045 in / $0.6090 out per 1M tokens (cache read $0.0025/M) |

Alibaba

| Model ID | Context | Capabilities | Endpoint(s) | Billing |
| --- | --- | --- | --- | --- |
| qwen-flash | 1,000,000 | Audio, Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, speech-to-speech, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.0225 in / $0.1800 out per 1M tokens (cache read $0.0225/M) |
| qwen-mt-flash | 1,000,000 | Text; long context, translation | POST /v1/chat/completions | $0.0720 in / $0.2205 out per 1M tokens (cache read $0.0720/M) |
| qwen-mt-lite | 1,000,000 | Text; translation | POST /v1/chat/completions | $0.0840 in / $0.2520 out per 1M tokens (cache read $0.0840/M) |
| qwen-plus-2025-12-01 | 1,000,000 | Audio, Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, speech-to-speech, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.2800 in / $0.8400 out per 1M tokens (cache read $0.2800/M) |
| qwen3-coder-480b-a35b-instruct | 262,144 | Coding, Text; reasoning, long context, code reasoning, agentic coding | POST /v1/chat/completions | $1.0500 in / $5.2500 out per 1M tokens (cache read $1.0500/M) |
| qwen3-coder-flash | 1,000,000 | Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.2100 in / $1.0500 out per 1M tokens (cache read $0.2100/M) |
| qwen3-coder-plus | 1,000,000 | Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.7000 in / $3.5000 out per 1M tokens (cache read $0.7000/M) |
| qwen3-max | 262,144 | Audio, Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, speech-to-speech, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.7200 in / $3.6000 out per 1M tokens (cache read $0.7200/M) |
| qwen3-vl-flash | 131,072 | Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.0350 in / $0.2800 out per 1M tokens (cache read $0.0350/M) |
| qwen3-vl-flash-2025-10-15 | 131,072 | Coding, Text, Video, Vision; reasoning, long context, translation, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.0350 in / $0.2800 out per 1M tokens (cache read $0.0350/M) |
| qwen3-vl-plus | 131,072 | Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.1400 in / $1.1200 out per 1M tokens (cache read $0.1400/M) |
| qwen3.6-plus | 1,000,000 | Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, omni/video understanding, long video, temporal video, code reasoning, agentic coding | POST /v1/chat/completions | $0.2760 in / $1.6510 out per 1M tokens (cache read $0.2760/M) |
| qwen3.6-plus-2026-04-02 | 262,144 | Coding, Text, Vision; code reasoning, long context, reasoning, vision | POST /v1/chat/completions | $0.2760 in / $1.6510 out per 1M tokens (cache read $0.2760/M) |
| qwen3.7-max | 1,000,000 | Coding, Text; reasoning, long context, translation, creative writing, code reasoning, agentic coding | POST /v1/chat/completions | $1.1550 in / $3.4657 out per 1M tokens (cache read $0.1155/M; cache write $1.4441/M) |
| wan2.7-image | N/A | Image, Text, Vision; reasoning, vision, image generation, image editing, text in images, image consistency | POST /v1/chat/completions | $0.030 / request |
| wan2.7-image-pro | N/A | Image, Text, Video, Vision; reasoning, long context, vision, image generation, image editing, text in images, image consistency, image-to-video | POST /v1/chat/completions | $0.075 / request |

Moonshot

| Model ID | Context | Capabilities | Endpoint(s) | Billing |
| --- | --- | --- | --- | --- |
| kimi-k2-thinking | 256,000 | Coding, Text; reasoning, long context, code reasoning, agentic coding | POST /v1/chat/completions | $0.4018 in / $1.6058 out per 1M tokens (cache read $0.4018/M) |
| kimi-k2.5 | 262,144 | Coding, Text, Video, Vision; reasoning, long context, spatial vision, document vision, visual coding, omni/video understanding, long video, code reasoning, agentic coding | POST /v1/chat/completions | $0.4018 in / $2.1077 out per 1M tokens (cache read $0.4018/M) |
| kimi-k2.6 | 128,000 | Text; long context, reasoning | POST /v1/chat/completions | $0.6257 in / $2.5992 out per 1M tokens (cache read $0.6257/M) |

MiniMax

| Model ID | Context | Capabilities | Endpoint(s) | Billing |
| --- | --- | --- | --- | --- |
| MiniMax-M2.5 | 262,144 | Coding, Text; reasoning, long context, creative writing, code reasoning, agentic coding | POST /v1/chat/completions | $0.2100 in / $0.8400 out per 1M tokens (cache read $0.2100/M) |
| MiniMax-M3 | 1,000,000 | Coding, Text, Video, Vision; reasoning, long context, code reasoning, agentic coding, vision, long video | POST /v1/messages, POST /v1/chat/completions, POST /v1/responses | $0.2100 in / $0.8400 out per 1M tokens (cache read $0.0500/M) |

Zhipu GLM

| Model ID | Context | Capabilities | Endpoint(s) | Billing |
| --- | --- | --- | --- | --- |
| glm-5 | 128,000 | Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.4011 in / $1.8060 out per 1M tokens (cache read $0.4011/M) |

ByteDance

| Model ID | Context | Capabilities | Endpoint(s) | Billing |
| --- | --- | --- | --- | --- |
| seed-1-6-250915 | 262,144 | Text, Video, Vision; reasoning, long context, creative writing, vision, omni/video understanding | POST /v1/chat/completions | $0.2250 in / $0.9000 out per 1M tokens (cache read $0.2250/M) |
| seed-1-6-flash-250715 | 262,144 | Text, Video, Vision; reasoning, long context, spatial vision, omni/video understanding, temporal video | POST /v1/chat/completions | $0.0675 in / $0.2700 out per 1M tokens (cache read $0.0675/M) |
| seed-1-8-251228 | 262,144 | Coding, Text, Video, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.2250 in / $1.8000 out per 1M tokens (cache read $0.2250/M) |
| seed-2-0-lite-260228 | 262,144 | Coding, Text, Video, Vision; reasoning, long context, creative writing, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.2500 in / $2.0000 out per 1M tokens (cache read $0.2500/M) |
| seed-2-0-mini-260215 | 262,144 | Coding, Text, Video, Vision; reasoning, long context, spatial vision, document vision, omni/video understanding, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.1000 in / $0.4000 out per 1M tokens (cache read $0.1000/M) |
| seed-2-0-pro-260328 | 262,144 | Coding, Text, Video, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding | POST /v1/chat/completions | $0.5000 in / $3.0000 out per 1M tokens (cache read $0.5000/M) |
| seedream-4-5-251128 | N/A | Image, Vision; vision, image generation, image editing, text in images, image consistency | POST /v1/chat/completions | $0.036 / request |
| seedream-5-0-260128 | 262,144 | Image, Vision; image editing, image generation, text in images, vision | POST /v1/chat/completions | $0.035 / request |

Choosing a model

| Need | Start with | Why |
| --- | --- | --- |
| Frontier text + vision | gpt-5.5, claude-opus-4-8, gpt-5.4 | Strong reasoning and broad multimodal/coding coverage. |
| Agentic coding | gpt-5.3-codex, claude-opus-4-8, qwen3-coder-plus, MiniMax-M3 | Coding, long-context, and agentic sub-capabilities. |
| Low-cost high-volume text | qwen-flash, deepseek-v4-flash, qwen-mt-flash | Very low input/output pricing for routine tasks. |
| Long-context Chinese or bilingual work | qwen3.6-plus, qwen3.7-max, MiniMax-M3 | 1M-token context options with Chinese-language strength. |
| Visual/document tasks | qwen3-vl-plus, claude-opus-4-8, gpt-5.4 | Vision/document/spatial capability tags. |
| Image generation | gpt-image-2, seedream-5-0-260128, wan2.7-image-pro | Image-generation and image-editing model IDs. seedream-5-0-260128 and wan2.7-image-pro are billed per request; gpt-image-2 is listed with per-token billing. |
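
Because availability can change, an agent may want to walk a row's "Start with" list in order and take the first model that is currently live. The sketch below assumes the caller has already obtained the set of live model IDs from the live catalog by some means. `PREFERENCES` copies two rows from the table above, and `pick_model` is a hypothetical helper.

```python
from typing import Optional

# Candidate lists copied from two rows of the "Choosing a model" table.
PREFERENCES = {
    "agentic-coding": [
        "gpt-5.3-codex", "claude-opus-4-8", "qwen3-coder-plus", "MiniMax-M3",
    ],
    "low-cost-text": ["qwen-flash", "deepseek-v4-flash", "qwen-mt-flash"],
}


def pick_model(need: str, available: set[str]) -> Optional[str]:
    """Return the first preferred model ID that is in `available`, else None.

    `available` is assumed to be the set of model IDs currently listed in
    the live catalog; how it is fetched is outside this sketch.
    """
    for model_id in PREFERENCES[need]:
        if model_id in available:
            return model_id
    return None


live_ids = {"qwen3-coder-plus", "MiniMax-M3", "qwen-flash"}
choice = pick_model("agentic-coding", live_ids)  # first two candidates are not live here
```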

Notes for agents

  • Do not invent AIsa model IDs. Use the exact model strings in the tables.
  • Do not assume a model supports every modality its upstream family supports. Use the capability tags listed here or check the live model page.
  • If a model appears in aisa.one/models but not in a static table, the pricing API has likely enabled it at runtime; prefer the live catalog.
  • Pricing tables are informational. The final billed amount appears in AIsa Usage Logs and may include workspace-level pricing rules.
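
The first note can be enforced mechanically by checking a requested model ID against a known list before any request is routed. This is a minimal sketch: `KNOWN_MODEL_IDS` holds only a few IDs copied from the tables in this guide, and `require_known_model` is a hypothetical helper. The match is exact and case-sensitive, since the guide asks for the exact model strings.

```python
# A few model IDs copied verbatim from this guide. A real agent would load
# the complete, current list from the live catalog instead.
KNOWN_MODEL_IDS = frozenset({
    "gpt-5.5",
    "claude-opus-4-8",
    "qwen3.7-max",
    "deepseek-v4-pro",
    "MiniMax-M3",
})


def require_known_model(model_id: str) -> str:
    """Return model_id unchanged if it is a known ID, else raise ValueError.

    The comparison is exact, so 'minimax-m3' does not match 'MiniMax-M3'.
    """
    if model_id not in KNOWN_MODEL_IDS:
        raise ValueError(f"unknown model ID: {model_id!r}; check the live catalog")
    return model_id


model = require_known_model("MiniMax-M3")
```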