Media Gen - AIsa

AI-powered image and video generation for autonomous agents. One AISA_API_KEY unlocks Gemini 3 Pro Image and Qwen Wan 2.6: high-fidelity synthesis from text or reference images.

Install

aisa skills install media-gen

What can agents do with it?

Marketing creatives

“Generate a cinematic hero image for a product launch deck.”

Social content

“Create a 5-second video loop for the next post.”

Storyboarding

“Produce 6 key frames illustrating the happy path of a user journey.”

Reference-to-video

“Animate this static mock into a cinematic slow push-in.”

Photorealism

“8k ultra-detailed cyberpunk skyline with neon rain.”

Agent reports

“Generate visual illustrations for a research report agent.”

Core capabilities

Image generation — gemini-3-pro-image-preview via the /v1/models/{model}:generateContent endpoint. Returns base64 image data.
Video generation — wan2.6-t2v (text-to-video) and image-to-video via an async task system. POST creates a task, GET polls status.
Asynchronous workflow — long-running video jobs are handled with X-DashScope-Async: enable + status polling.
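
The task-creation half of the async pattern listed above can be sketched in Python. This is a minimal sketch, not the official client: `build_video_task_request` is a hypothetical helper name, the host and path are the ones used in this page's curl examples, and nothing is sent over the network.

```python
import json

# Host and path as they appear in this page's curl examples.
BASE_URL = "https://api.aisa.one"
VIDEO_TASK_PATH = "/apis/v1/services/aigc/video-generation/video-synthesis"


def build_video_task_request(prompt, api_key, model="wan2.6-t2v", img_url=None):
    """Return (url, headers, body) for a video task. Hypothetical helper."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Marks the job as asynchronous, as described in the list above.
        "X-DashScope-Async": "enable",
    }
    task_input = {"prompt": prompt}
    if img_url is not None:
        # Optional reference image, mirroring the img_url field in the curl example.
        task_input["img_url"] = img_url
    body = json.dumps({"model": model, "input": task_input}).encode("utf-8")
    return BASE_URL + VIDEO_TASK_PATH, headers, body


url, headers, body = build_video_task_request("cinematic sunset", "your-key")
print(url)
print(headers["X-DashScope-Async"])
```

The returned triple can be handed to any HTTP library; keeping request construction separate from sending makes the header and payload easy to inspect or test.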

Quick start

export AISA_API_KEY="your-key"

Image generation

curl -X POST "https://api.aisa.one/v1/models/gemini-3-pro-image-preview:generateContent" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "A cute red panda, ultra-detailed, cinematic lighting"}]}
    ]
  }'

The response returns base64-encoded image data in candidates[0].content.parts[0].inline_data.data.
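
Turning that field into an image file is a base64 decode. The sketch below assumes the response has been parsed into a Python dict with the shape described above; `extract_image_bytes` is a hypothetical name, scanning every part instead of only `parts[0]` is a defensive choice, and accepting the camelCase `inlineData` key is an assumption about alternative JSON casing, not something this page documents.

```python
import base64


def extract_image_bytes(response):
    """Return the decoded bytes of the first inline image in the response."""
    parts = response["candidates"][0]["content"]["parts"]
    for part in parts:
        # `inline_data` is the documented key; `inlineData` is an assumed variant.
        inline = part.get("inline_data") or part.get("inlineData")
        if inline and "data" in inline:
            return base64.b64decode(inline["data"])
    raise ValueError("no inline image data found in response")


# Stand-in response with the documented shape (not real API output).
fake_response = {
    "candidates": [
        {
            "content": {
                "parts": [
                    {"inline_data": {"data": base64.b64encode(b"PNG-bytes").decode()}}
                ]
            }
        }
    ]
}
image = extract_image_bytes(fake_response)
print(len(image))
```

Writing the result to disk is then `open("out.png", "wb").write(image)`.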

Video generation (async)

# 1. Create the task
curl -X POST "https://api.aisa.one/apis/v1/services/aigc/video-generation/video-synthesis" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-DashScope-Async: enable" \
  -d '{
    "model": "wan2.6-t2v",
    "input": {
      "prompt": "cinematic close-up, slow push-in, shallow depth of field",
      "img_url": "https://example.com/reference.jpg"
    }
  }'

# 2. The response includes a task_id. Poll for completion:
curl "https://api.aisa.one/apis/v1/services/aigc/tasks?task_id=YOUR_TASK_ID" \
  -H "Authorization: Bearer $AISA_API_KEY"

When the task status is SUCCEEDED, the response includes a video_url you can download.
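
The create-then-poll flow can be wrapped in a small loop. This is a sketch under stated assumptions: `wait_for_video` and `fetch_task` are hypothetical names, the caller supplies `fetch_task` (for example, a function that issues the GET request above and returns a dict), and the `"status"` and `"video_url"` keys as well as the `PENDING`, `RUNNING`, `FAILED`, and `CANCELED` state names are assumptions. Only `SUCCEEDED` and `video_url` are named on this page, so adapt the key lookups to the actual response body.

```python
import time


def wait_for_video(task_id, fetch_task, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_task(task_id) until the task succeeds, fails, or times out."""
    failure_states = {"FAILED", "CANCELED"}  # assumed names, not from this page
    waited = 0.0
    while True:
        task = fetch_task(task_id)
        status = task.get("status")
        if status == "SUCCEEDED":
            return task["video_url"]
        if status in failure_states:
            raise RuntimeError(f"task {task_id} ended with status {status}")
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} not finished after {timeout} seconds")
        sleep(interval)
        waited += interval


# Stand-in for the GET request: two unfinished replies, then success.
replies = iter([
    {"status": "PENDING"},
    {"status": "RUNNING"},
    {"status": "SUCCEEDED", "video_url": "https://example.com/out.mp4"},
])
video_url = wait_for_video("YOUR_TASK_ID", lambda _id: next(replies), sleep=lambda s: None)
print(video_url)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in normal use the default `time.sleep` applies.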

Python client

# Image
python3 scripts/media_gen_client.py image --prompt "A cute red panda" --out out.png

# Video — create + poll + download in one command
python3 scripts/media_gen_client.py video-wait \
  --prompt "cinematic close-up, slow push-in" \
  --download --out out.mp4

# Or create/poll separately
python3 scripts/media_gen_client.py video-create --prompt "cinematic sunset"
python3 scripts/media_gen_client.py video-status --task-id YOUR_TASK_ID

Endpoint reference

| Endpoint | Method | Purpose |
| --- | --- | --- |
| `/v1/models/{model}:generateContent` | POST | Image generation (Gemini 3) |
| `/apis/v1/services/aigc/video-generation/video-synthesis` | POST | Create video task |
| `/apis/v1/services/aigc/tasks` | GET | Poll video task status |
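
For agents that assemble requests themselves, the three paths can be turned into full URLs with a few helpers. This is an illustrative sketch: the function names are hypothetical, and the host is the one used in this page's curl examples.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.aisa.one"  # host used in this page's curl examples


def image_url(model):
    """Full URL for POST /v1/models/{model}:generateContent."""
    return f"{BASE_URL}/v1/models/{model}:generateContent"


def video_create_url():
    """Full URL for the POST that creates a video task."""
    return f"{BASE_URL}/apis/v1/services/aigc/video-generation/video-synthesis"


def video_status_url(task_id):
    """Full URL for the GET that polls a task, with task_id as a query parameter."""
    return f"{BASE_URL}/apis/v1/services/aigc/tasks?" + urlencode({"task_id": task_id})


print(image_url("gemini-3-pro-image-preview"))
print(video_status_url("YOUR_TASK_ID"))
```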

Get started

1. Sign up at aisa.one (new accounts start with $2 free credit).
2. Generate an API key from the console.
3. Run export AISA_API_KEY="your-key", then install the skill:
```
aisa skills install media-gen
```

Related

Video API reference

Create-task and poll-status endpoints with live playgrounds.

Gemini generateContent

Image generation endpoint reference.

Async Operations

How task creation and polling work end-to-end.