AIsa LLM Model Catalog - API Endpoints, Capabilities, Context Windows

AIsa’s Model Gateway routes LLM and media-generation requests through one API key. This guide is the stable Markdown source for agents that read llms.txt and need to understand which AIsa model IDs exist, which endpoint each model uses, and what each model can do. The live catalog is aisa.one/models. The tables below were refreshed from live Model Gateway metadata on June 4, 2026. Model availability and prices can change, so use the live catalog for the final source of truth before production routing.

How to call models

Use your AISA_API_KEY as a Bearer token. For OpenAI-compatible SDKs, set the base URL to https://api.aisa.one/v1.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_AISA_API_KEY",
    base_url="https://api.aisa.one/v1"
)

response = client.chat.completions.create(
    model="qwen3.7-max",
    messages=[{"role": "user", "content": "Compare these model options for a coding agent."}]
)
print(response.choices[0].message.content)

Endpoint types

Endpoint	Current model count
`POST /v1/chat/completions`	55
`POST /v1/messages`	10
`POST /v1/responses`	4
`POST /v1/images/generations`	1
`POST /v1beta/models/{model}:generateContent`	1

Most text, vision, audio, coding, and multimodal models use POST /v1/chat/completions. Claude models can also expose Anthropic-compatible POST /v1/messages. Selected models expose POST /v1/responses, Gemini-compatible POST /v1beta/models/{model}:generateContent, or image-generation routes. Always pass the exact model string shown below.

Capability vocabulary

Capability	What it means in AIsa Model Gateway
Text	Natural-language generation, summarization, analysis, translation, and long-context reasoning.
Coding	Code reasoning, code completion, software-agent planning, and tool-use workflows.
Vision	Image/document understanding, visual coding, and spatial reasoning over visual inputs.
Audio	Audio understanding or speech-to-speech style interaction where supported by the upstream model.
Image	Image generation, image editing, image consistency, or rendering text in images.
Video	Video understanding, temporal reasoning, long-video work, image-to-video, or omni/video workflows.

Provider summary

Provider	Models	Capabilities	Example model IDs
OpenAI	8	Coding, Text, Vision, Image	`gpt-5`, `gpt-5-mini`, `gpt-5.2`, `gpt-5.2-chat-latest` …
Anthropic	7	Coding, Text, Vision	`claude-haiku-4-5-20251001`, `claude-opus-4-1-20250805`, `claude-opus-4-5-20251101`, `claude-opus-4-6` …
Google Gemini	2	Text	`gemini-3-pro-preview`, `gemini-3.5-flash`
xAI	4	Text, Vision, Coding	`grok-4.20-0309-non-reasoning`, `grok-4.20-0309-reasoning`, `grok-4.3`, `grok-build-0.1`
DeepSeek	6	Coding, Text	`deepseek-r1`, `deepseek-v3`, `deepseek-v3.1`, `deepseek-v3.2` …
Alibaba	16	Audio, Coding, Text, Video, Vision, Image	`qwen-flash`, `qwen-mt-flash`, `qwen-mt-lite`, `qwen-plus-2025-12-01` …
Moonshot	3	Coding, Text, Video, Vision	`kimi-k2-thinking`, `kimi-k2.5`, `kimi-k2.6`
MiniMax	2	Coding, Text, Video, Vision	`MiniMax-M2.5`, `MiniMax-M3`
Zhipu GLM	1	Coding, Text	`glm-5`
ByteDance	8	Text, Video, Vision, Coding, Image	`seed-1-6-250915`, `seed-1-6-flash-250715`, `seed-1-8-251228`, `seed-2-0-lite-260228` …

Complete model details

OpenAI

Model ID	Context	Capabilities	Endpoint(s)	Billing
`gpt-5`	400,000	Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$1.2500 in /$ 10.0000 out per 1M tokens (cache read $0.1250/M)
`gpt-5-mini`	400,000	Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.1500 in /$ 1.2000 out per 1M tokens (cache read $0.0150/M)
`gpt-5.2`	400,000	Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$1.7500 in /$ 14.0000 out per 1M tokens (cache read $1.7500/M)
`gpt-5.2-chat-latest`	400,000	Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$1.7500 in /$ 14.0000 out per 1M tokens (cache read $1.7500/M)
`gpt-5.3-codex`	1,000,000	Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$1.7500 in /$ 14.0000 out per 1M tokens (cache read $1.7500/M)
`gpt-5.4`	1,050,000	Coding, Text, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding	`POST /v1/responses`	$2.5000 in /$ 15.0000 out per 1M tokens (cache read $2.5000/M)
`gpt-5.5`	400,000	Coding, Text, Vision; code reasoning, long context, reasoning, vision	`POST /v1/messages`, `POST /v1/chat/completions`, `POST /v1/responses`	$5.0000 in /$ 40.0000 out per 1M tokens (cache read $5.0000/M)
`gpt-image-2`	N/A	Image, Vision; image editing, image generation, text in images, vision	`POST /v1/images/generations`	$8.0000 in /$ 30.0000 out per 1M tokens (cache write $2.0000/M)

Anthropic

Model ID	Context	Capabilities	Endpoint(s)	Billing
`claude-haiku-4-5-20251001`	200,000	Coding, Text, Vision; reasoning, long context, spatial vision, document vision, visual coding, code reasoning, agentic coding	`POST /v1/messages`, `POST /v1/chat/completions`	$1.0000 in /$ 5.0000 out per 1M tokens (cache read $0.1000/M; cache write$ 2.0000/M)
`claude-opus-4-1-20250805`	200,000	Coding, Text, Vision; reasoning, long context, document vision, visual coding, code reasoning, agentic coding	`POST /v1/messages`, `POST /v1/chat/completions`	$15.0000 in /$ 75.0000 out per 1M tokens (cache read $1.5000/M; cache write$ 30.0000/M)
`claude-opus-4-5-20251101`	1,000,000	Coding, Text, Vision; reasoning, long context, creative writing, document vision, visual coding, code reasoning, code completion, agentic coding	`POST /v1/messages`, `POST /v1/chat/completions`	$5.0000 in /$ 25.0000 out per 1M tokens (cache read $0.5000/M; cache write$ 10.0000/M)
`claude-opus-4-6`	1,000,000	Coding, Text, Vision; reasoning, long context, creative writing, document vision, visual coding, code reasoning, code completion, agentic coding	`POST /v1/messages`, `POST /v1/chat/completions`	$5.0000 in /$ 25.0000 out per 1M tokens (cache read $0.5000/M; cache write$ 10.0000/M)
`claude-opus-4-7`	1,000,000	Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding	`POST /v1/messages`, `POST /v1/chat/completions`	$5.0000 in /$ 25.0000 out per 1M tokens (cache read $0.5000/M; cache write$ 10.0000/M)
`claude-opus-4-8`	1,000,000	Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding	`POST /v1/messages`, `POST /v1/chat/completions`	$5.0000 in /$ 25.0000 out per 1M tokens (cache read $0.5000/M; cache write$ 10.0000/M)
`claude-sonnet-4-5-20250929`	200,000	Coding, Text, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, code reasoning, code completion, agentic coding	`POST /v1/messages`, `POST /v1/chat/completions`	$3.0000 in /$ 15.0000 out per 1M tokens (cache read $0.3000/M; cache write$ 6.0000/M)

Google Gemini

Model ID	Context	Capabilities	Endpoint(s)	Billing
`gemini-3-pro-preview`	N/A	Text; long context, creative writing	`POST /v1/chat/completions`	$2.0000 in /$ 12.0000 out per 1M tokens
`gemini-3.5-flash`	N/A	Text; long context, creative writing	`POST /v1beta/models/{model}:generateContent`, `POST /v1/chat/completions`	$1.5000 in /$ 9.0000 out per 1M tokens (cache read $0.1500/M)

xAI

Model ID	Context	Capabilities	Endpoint(s)	Billing
`grok-4.20-0309-non-reasoning`	1,000,000	Text, Vision; long context, creative writing, spatial vision, document vision	`POST /v1/chat/completions`	$1.2500 in /$ 2.5000 out per 1M tokens
`grok-4.20-0309-reasoning`	1,000,000	Text, Vision; reasoning, long context, creative writing, spatial vision, document vision	`POST /v1/chat/completions`	$1.2500 in /$ 2.5000 out per 1M tokens
`grok-4.3`	1,000,000	Text, Vision; reasoning, long context, creative writing, spatial vision, document vision	`POST /v1/chat/completions`	$1.2500 in /$ 2.5000 out per 1M tokens
`grok-build-0.1`	256,000	Coding, Text, Vision; reasoning, long context, visual coding, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$1.0000 in /$ 2.0000 out per 1M tokens

DeepSeek

Model ID	Context	Capabilities	Endpoint(s)	Billing
`deepseek-r1`	262,144	Coding, Text; code reasoning, long context, reasoning	`POST /v1/chat/completions`	$0.4018 in /$ 1.6058 out per 1M tokens (cache read $0.4018/M)
`deepseek-v3`	262,144	Coding, Text; code reasoning, long context, reasoning	`POST /v1/chat/completions`	$0.2009 in /$ 0.8029 out per 1M tokens (cache read $0.2009/M)
`deepseek-v3.1`	262,144	Coding, Text; code reasoning, long context, reasoning	`POST /v1/chat/completions`	$0.4018 in /$ 1.2047 out per 1M tokens (cache read $0.4018/M)
`deepseek-v3.2`	128,000	Coding, Text; reasoning, long context, creative writing, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.2009 in /$ 0.3017 out per 1M tokens (cache read $0.2009/M)
`deepseek-v4-flash`	262,144	Coding, Text; code reasoning, long context, reasoning	`POST /v1/chat/completions`	$0.0980 in /$ 0.1960 out per 1M tokens (cache read $0.0020/M)
`deepseek-v4-pro`	262,144	Coding, Text; code reasoning, long context, reasoning	`POST /v1/messages`, `POST /v1/chat/completions`, `POST /v1/responses`	$0.3045 in /$ 0.6090 out per 1M tokens (cache read $0.0025/M)

Alibaba

Model ID	Context	Capabilities	Endpoint(s)	Billing
`qwen-flash`	1,000,000	Audio, Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, speech-to-speech, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.0225 in /$ 0.1800 out per 1M tokens (cache read $0.0225/M)
`qwen-mt-flash`	1,000,000	Text; long context, translation	`POST /v1/chat/completions`	$0.0720 in /$ 0.2205 out per 1M tokens (cache read $0.0720/M)
`qwen-mt-lite`	1,000,000	Text; translation	`POST /v1/chat/completions`	$0.0840 in /$ 0.2520 out per 1M tokens (cache read $0.0840/M)
`qwen-plus-2025-12-01`	1,000,000	Audio, Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, speech-to-speech, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.2800 in /$ 0.8400 out per 1M tokens (cache read $0.2800/M)
`qwen3-coder-480b-a35b-instruct`	262,144	Coding, Text; reasoning, long context, code reasoning, agentic coding	`POST /v1/chat/completions`	$1.0500 in /$ 5.2500 out per 1M tokens (cache read $1.0500/M)
`qwen3-coder-flash`	1,000,000	Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.2100 in /$ 1.0500 out per 1M tokens (cache read $0.2100/M)
`qwen3-coder-plus`	1,000,000	Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.7000 in /$ 3.5000 out per 1M tokens (cache read $0.7000/M)
`qwen3-max`	262,144	Audio, Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, speech-to-speech, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.7200 in /$ 3.6000 out per 1M tokens (cache read $0.7200/M)
`qwen3-vl-flash`	131,072	Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.0350 in /$ 0.2800 out per 1M tokens (cache read $0.0350/M)
`qwen3-vl-flash-2025-10-15`	131,072	Coding, Text, Video, Vision; reasoning, long context, translation, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.0350 in /$ 0.2800 out per 1M tokens (cache read $0.0350/M)
`qwen3-vl-plus`	131,072	Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.1400 in /$ 1.1200 out per 1M tokens (cache read $0.1400/M)
`qwen3.6-plus`	1,000,000	Coding, Text, Video, Vision; reasoning, long context, translation, creative writing, spatial vision, document vision, visual coding, omni/video understanding, long video, temporal video, code reasoning, agentic coding	`POST /v1/chat/completions`	$0.2760 in /$ 1.6510 out per 1M tokens (cache read $0.2760/M)
`qwen3.6-plus-2026-04-02`	262,144	Coding, Text, Vision; code reasoning, long context, reasoning, vision	`POST /v1/chat/completions`	$0.2760 in /$ 1.6510 out per 1M tokens (cache read $0.2760/M)
`qwen3.7-max`	1,000,000	Coding, Text; reasoning, long context, translation, creative writing, code reasoning, agentic coding	`POST /v1/chat/completions`	$1.1550 in /$ 3.4657 out per 1M tokens (cache read $0.1155/M; cache write$ 1.4441/M)
`wan2.7-image`	N/A	Image, Text, Vision; reasoning, vision, image generation, image editing, text in images, image consistency	`POST /v1/chat/completions`	$0.030 / request
`wan2.7-image-pro`	N/A	Image, Text, Video, Vision; reasoning, long context, vision, image generation, image editing, text in images, image consistency, image-to-video	`POST /v1/chat/completions`	$0.075 / request

Moonshot

Model ID	Context	Capabilities	Endpoint(s)	Billing
`kimi-k2-thinking`	256,000	Coding, Text; reasoning, long context, code reasoning, agentic coding	`POST /v1/chat/completions`	$0.4018 in /$ 1.6058 out per 1M tokens (cache read $0.4018/M)
`kimi-k2.5`	262,144	Coding, Text, Video, Vision; reasoning, long context, spatial vision, document vision, visual coding, omni/video understanding, long video, code reasoning, agentic coding	`POST /v1/chat/completions`	$0.4018 in /$ 2.1077 out per 1M tokens (cache read $0.4018/M)
`kimi-k2.6`	128,000	Text; long context, reasoning	`POST /v1/chat/completions`	$0.6257 in /$ 2.5992 out per 1M tokens (cache read $0.6257/M)

MiniMax

Model ID	Context	Capabilities	Endpoint(s)	Billing
`MiniMax-M2.5`	262,144	Coding, Text; reasoning, long context, creative writing, code reasoning, agentic coding	`POST /v1/chat/completions`	$0.2100 in /$ 0.8400 out per 1M tokens (cache read $0.2100/M)
`MiniMax-M3`	1,000,000	Coding, Text, Video, Vision; reasoning, long context, code reasoning, agentic coding, vision, long video	`POST /v1/messages`, `POST /v1/chat/completions`, `POST /v1/responses`	$0.2100 in /$ 0.8400 out per 1M tokens (cache read $0.0500/M)

Zhipu GLM

Model ID	Context	Capabilities	Endpoint(s)	Billing
`glm-5`	128,000	Coding, Text; reasoning, long context, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.4011 in /$ 1.8060 out per 1M tokens (cache read $0.4011/M)

ByteDance

Model ID	Context	Capabilities	Endpoint(s)	Billing
`seed-1-6-250915`	262,144	Text, Video, Vision; reasoning, long context, creative writing, vision, omni/video understanding	`POST /v1/chat/completions`	$0.2250 in /$ 0.9000 out per 1M tokens (cache read $0.2250/M)
`seed-1-6-flash-250715`	262,144	Text, Video, Vision; reasoning, long context, spatial vision, omni/video understanding, temporal video	`POST /v1/chat/completions`	$0.0675 in /$ 0.2700 out per 1M tokens (cache read $0.0675/M)
`seed-1-8-251228`	262,144	Coding, Text, Video, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, long video, temporal video, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.2250 in /$ 1.8000 out per 1M tokens (cache read $0.2250/M)
`seed-2-0-lite-260228`	262,144	Coding, Text, Video, Vision; reasoning, long context, creative writing, spatial vision, document vision, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.2500 in /$ 2.0000 out per 1M tokens (cache read $0.2500/M)
`seed-2-0-mini-260215`	262,144	Coding, Text, Video, Vision; reasoning, long context, spatial vision, document vision, omni/video understanding, temporal video, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.1000 in /$ 0.4000 out per 1M tokens (cache read $0.1000/M)
`seed-2-0-pro-260328`	262,144	Coding, Text, Video, Vision; reasoning, long context, creative writing, spatial vision, document vision, visual coding, omni/video understanding, long video, temporal video, code reasoning, code completion, agentic coding	`POST /v1/chat/completions`	$0.5000 in /$ 3.0000 out per 1M tokens (cache read $0.5000/M)
`seedream-4-5-251128`	N/A	Image, Vision; vision, image generation, image editing, text in images, image consistency	`POST /v1/chat/completions`	$0.036 / request
`seedream-5-0-260128`	262,144	Image, Vision; image editing, image generation, text in images, vision	`POST /v1/chat/completions`	$0.035 / request

Choosing a model

Need	Start with	Why
Frontier text + vision	`gpt-5.5`, `claude-opus-4-8`, `gpt-5.4`	Strong reasoning and broad multimodal/coding coverage.
Agentic coding	`gpt-5.3-codex`, `claude-opus-4-8`, `qwen3-coder-plus`, `MiniMax-M3`	Coding, long-context, and agentic sub-capabilities.
Low-cost high-volume text	`qwen-flash`, `deepseek-v4-flash`, `qwen-mt-flash`	Very low input/output pricing for routine tasks.
Long-context Chinese or bilingual work	`qwen3.6-plus`, `qwen3.7-max`, `MiniMax-M3`	1M-token context options with Chinese-language strength.
Visual/document tasks	`qwen3-vl-plus`, `claude-opus-4-8`, `gpt-5.4`	Vision/document/spatial capability tags.
Image generation	`gpt-image-2`, `seedream-5-0-260128`, `wan2.7-image-pro`	Image-generation and image-editing model IDs with per-request billing.

Notes for agents

Do not invent AIsa model IDs. Use the exact model strings in the tables.
Do not assume a model supports every modality its upstream family supports. Use the capability tags listed here or check the live model page.
If a model appears in aisa.one/models but not in a static table, the pricing API has likely enabled it at runtime; prefer the live catalog.
Pricing tables are informational. The final billed amount appears in AIsa Usage Logs and may include workspace-level pricing rules.

AIsa Documentation

Tutorials

Pricing

Dashboard

Learn

Compare

Agent Discovery

Changelog

AIsa LLM Model Catalog - API Endpoints, Capabilities, Context Windows

How to call models

Endpoint types

Capability vocabulary

Provider summary

Complete model details

OpenAI

Anthropic

Google Gemini

xAI

DeepSeek

Alibaba

Moonshot

MiniMax

Zhipu GLM

ByteDance

Choosing a model

Notes for agents

​How to call models

​Endpoint types

​Capability vocabulary

​Provider summary

​Complete model details

​OpenAI

​Anthropic

​Google Gemini

​xAI

​DeepSeek

​Alibaba

​Moonshot

​MiniMax

​Zhipu GLM

​ByteDance

​Choosing a model

​Notes for agents

How to call models

Endpoint types

Capability vocabulary

Provider summary

Complete model details

OpenAI

Anthropic

Google Gemini

xAI

DeepSeek

Alibaba

Moonshot

MiniMax

Zhipu GLM

ByteDance

Choosing a model

Notes for agents