
- Input tokens: the tokens included in your prompt
- Output tokens: the tokens generated by the model
How Token-Based Billing Works

- Your prompt is converted into input tokens.
- The model generates output tokens.
- Both input and output tokens are counted separately.
- The total cost is calculated using the model’s pricing.
Total Cost = (Input tokens ÷ 1,000,000 × Input price) + (Output tokens ÷ 1,000,000 × Output price)
For example:
- If a model charges $1.00 per 1M input tokens
- And you send 2,000 input tokens
- The input cost is:
2,000 ÷ 1,000,000 × 1.00 = $0.002
The same calculation applies to output tokens.
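
The formula and worked example above can be sketched in a few lines of Python. This is only an illustration: the function name `token_cost` and its parameter names are hypothetical and not part of any AISA API, and it ignores any group-level pricing adjustments.

```python
def token_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the cost in USD of one request.

    Prices are given in USD per 1,000,000 tokens, as in the pricing tables.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Worked example from the text: 2,000 input tokens at $1.00 per 1M tokens.
# Output tokens are set to 0 here to isolate the input cost.
print(f"${token_cost(2_000, 0, 1.00, 0.0):.4f}")  # prints $0.0020
```

Input and output tokens are priced separately, so both counts and both prices are needed to get the total for a request.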
What Counts as Tokens?
Tokens represent fragments of text processed by the model. They may include:
- Words
- Punctuation
- Numbers
- Formatting characters
Model Versions and Naming
Some models include version identifiers such as:
- Date-based versions (e.g., -2025-12-11)
- “thinking” variants
- “mini” or “flash” variants
Group-Based Pricing
If your workspace uses multiple groups, pricing may vary by group. Group-level pricing rules and ratios are applied automatically during billing. You can view the final calculated cost for each request in the Usage Logs page.
AI Model Pricing Table
AISA supports multiple types of AI models. Pricing is categorized based on how the model consumes compute:
- Token-based pricing: used for text and multimodal LLM inference.
- Media-based pricing: used for image generation and video generation models.
OpenAI
| Model Name | Input (USD / 1M tokens) | Output (USD / 1M tokens) |
|---|---|---|
| gpt-4.1 | 1.4000 | 5.6000 |
| gpt-4.1-mini | 0.2800 | 1.1200 |
| gpt-4o | 1.7500 | 7.0000 |
| gpt-4o-mini | 0.1050 | 0.4200 |
| gpt-5 | 0.8750 | 7.0000 |
| gpt-5-mini | 0.1750 | 1.4000 |
| gpt-5.2 | 1.2250 | 9.8000 |
| gpt-5.2-2025-12-11 | 1.2250 | 9.8000 |
| gpt-5.2-chat-latest | 1.2250 | 9.8000 |
| gpt-5.3-codex | 1.2250 | 9.8000 |
| gpt-5.4 | 1.7500 | 10.5000 |
| gpt-oss-120b | 0.0280 | 0.1330 |
Anthropic
| Model Name | Input (USD / 1M tokens) | Output (USD / 1M tokens) |
|---|---|---|
| claude-3-7-sonnet-20250219 | 3.0000 | 15.0000 |
| claude-3-7-sonnet-20250219-thinking | 3.0000 | 15.0000 |
| claude-haiku-4-5-20251001 | 1.0000 | 5.0000 |
| claude-sonnet-4-20250514 | 3.0000 | 15.0000 |
| claude-sonnet-4-20250514-thinking | 3.0000 | 15.0000 |
| claude-sonnet-4-5-20250929 | 3.0000 | 15.0000 |
| claude-sonnet-4-6 | 3.0000 | 15.0000 |
| claude-sonnet-4-6-thinking | 3.0000 | 15.0000 |
| claude-opus-4-20250514 | 15.0000 | 75.0000 |
| claude-opus-4-20250514-thinking | 15.0000 | 75.0000 |
| claude-opus-4-1-20250805 | 15.0000 | 75.0000 |
| claude-opus-4-1-20250805-thinking | 15.0000 | 75.0000 |
| claude-opus-4-5-20251101 | 5.0000 | 25.0000 |
| claude-opus-4-6 | 5.0000 | 25.0000 |
| claude-opus-4-7 | 5.0000 | 25.0000 |
Google
| Model Name | Input (USD / 1M tokens) | Output (USD / 1M tokens) |
|---|---|---|
| gemini-2.5-flash | 0.2100 | 1.7500 |
| gemini-2.5-flash-lite | 0.0700 | 0.2800 |
| gemini-2.5-pro | 0.8750 | 7.0000 |
| gemini-3-pro-preview | 1.4000 | 8.4000 |
| gemini-3.1-pro-preview | 1.4000 | 8.4000 |
DeepSeek
| Model Name | Input (USD / 1M tokens) | Output (USD / 1M tokens) |
|---|---|---|
| deepseek-r1 | 0.4018 | 1.6058 |
| deepseek-v3.1 | 0.4018 | 1.2047 |
| deepseek-v3.2 | 0.2009 | 0.3017 |
Qwen (Alibaba)
| Model Name | Input (USD / 1M tokens) | Output (USD / 1M tokens) |
|---|---|---|
| qwen-flash | 0.0220 | 0.1800 |
| qwen-mt-flash | 0.0720 | 0.2200 |
| qwen-mt-lite | 0.0840 | 0.2520 |
| qwen-plus-2025-12-01 | 0.2800 | 0.8400 |
| qwen3-coder-480b-a35b-instruct | 1.0500 | 5.2500 |
| qwen3-coder-flash | 0.2100 | 1.0500 |
| qwen3-coder-plus | 0.7000 | 3.5000 |
| qwen3-max | 0.7200 | 3.6000 |
| qwen3-vl-flash | 0.0350 | 0.2800 |
| qwen3-vl-flash-2025-10-15 | 0.0350 | 0.2800 |
| qwen3-vl-plus | 0.1400 | 1.1200 |
| qwen3.6-plus | 0.2760 | 1.6510 |
Moonshot AI
| Model Name | Input (USD / 1M tokens) | Output (USD / 1M tokens) |
|---|---|---|
| kimi-k2-thinking | 0.4020 | 1.6060 |
| kimi-k2.5 | 0.4020 | 2.1080 |
MiniMax
| Model Name | Input (USD / 1M tokens) | Output (USD / 1M tokens) |
|---|---|---|
| MiniMax-M2.5 | 0.2100 | 0.8400 |
Zhipu GLM
| Model Name | Input (USD / 1M tokens) | Output (USD / 1M tokens) |
|---|---|---|
| glm-5 | 0.4010 | 1.8060 |
ByteDance (Seed)
| Model Name | Input (USD / 1M tokens) | Output (USD / 1M tokens) |
|---|---|---|
| seed-1-6-250915 | 0.2250 | 0.9000 |
| seed-1-6-flash-250715 | 0.0680 | 0.2700 |
| seed-1-8-251228 | 0.2250 | 1.8000 |
| seed-2-0-mini-260215 | 0.1000 | 0.4000 |
| seed-2-0-lite-260228 | 0.2500 | 2.0000 |
| seed-2-0-pro-260328 | 0.5000 | 3.0000 |
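
To compare models for a given workload, the listed prices can be placed in a lookup table. The sketch below copies three rows from the tables on this page; the names `PRICES`, `estimate_cost`, and `compare` are hypothetical, the token counts are made up for illustration, and group-level pricing ratios are not taken into account.

```python
# (input price, output price) in USD per 1M tokens, copied from the tables on this page.
PRICES = {
    "gpt-4o-mini": (0.1050, 0.4200),
    "claude-haiku-4-5-20251001": (1.0000, 5.0000),
    "gemini-2.5-flash": (0.2100, 1.7500),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request for a model listed in PRICES."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def compare(input_tokens, output_tokens):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {m: estimate_cost(m, input_tokens, output_tokens) for m in PRICES}
    return sorted(costs.items(), key=lambda pair: pair[1])


# Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
for model, cost in compare(10_000, 2_000):
    print(f"{model}: ${cost:.6f}")
```

Treat the result as an estimate only; the amount actually billed for a request is the one shown in Usage Logs.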
Image & Video Generation Pricing
Some models generate media rather than tokens. These models are billed per asset (pay-per-view).
| Model Name | Provider | Pricing |
|---|---|---|
| gemini-3-pro-image-preview | Google | $0.100 per image |
| seedream-4-5-251128 | ByteDance | $0.040 per image |
| wan2.7-image | Qwen | $0.030 per image |
| wan2.7-image-pro | Qwen | $0.075 per image |
| wan2.7-i2v | Qwen | $1.836 per video (i2v) |
Important Notes
- All prices are listed in USD.
- Text-based models are billed per input and output token.
- Image generation models are billed per generated image.
- Video generation models are billed per second of generated video.
- Pricing is usage-based and calculated per request.
- Model availability and pricing may change over time.
- Always refer to the Marketplace for the most up-to-date pricing information.
- The final billed amount for each request can be verified in Usage Logs.