POST /chat/completions
Create chat completion
curl --request POST \
  --url https://api.aisa.one/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "gpt-4.1",
  "messages": [
    {
      "role": "user",
      "content": "<string>"
    }
  ],
  "stream": false,
  "logprobs": true,
  "top_logprobs": 123,
  "functions": [
    {
      "name": "<string>",
      "description": "<string>",
      "parameters": {}
    }
  ],
  "function_call": "auto"
}
'
Creates a model response for the given chat conversation. Learn more in the text generation, vision, and audio guides. Parameter support can differ depending on the model used to generate the response, particularly for newer reasoning models. Parameters that are only supported for reasoning models are noted below. For the current state of unsupported parameters in reasoning models, refer to the reasoning guide.
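The same request body can be assembled in any language; the following Python sketch mirrors the curl example above (the `build_chat_request` helper is illustrative, not part of the API, and the `requests` call is shown only as a comment):

```python
import json

API_URL = "https://api.aisa.one/v1/chat/completions"

def build_chat_request(model, user_content, stream=False):
    """Assemble a minimal chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
        "stream": stream,
    }

payload = build_chat_request("gpt-4.1", "Hello!")
# Send with any HTTP client, e.g. (sketch):
#   requests.post(API_URL, json=payload,
#                 headers={"Authorization": f"Bearer {token}"})
print(json.dumps(payload))
```

Optional fields from the body schema below (`logprobs`, `functions`, and so on) can be merged into the same dictionary before sending.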

Streaming responses

Set "stream": true to receive server-sent events (SSE) as each token is generated. Streaming lowers time-to-first-token, which makes it ideal for chat UIs.
curl https://api.aisa.one/v1/chat/completions \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "Write a haiku about APIs."}],
    "stream": true
  }'

Stream anatomy

Each line of the SSE stream looks like:
data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"Quiet"},"index":0}]}
data: {"id":"chatcmpl-...","choices":[{"delta":{"content":" packets"},"index":0}]}
...
data: {"id":"chatcmpl-...","choices":[{"delta":{},"finish_reason":"stop","index":0}]}
data: [DONE]
  • Each data: line is a JSON object. The first chunk includes the role; subsequent chunks contain only delta.content.
  • The stream ends with a final chunk whose finish_reason is set, followed by a literal data: [DONE] line.
  • If tool calls are used, delta.tool_calls arrives incrementally and should be concatenated by index.
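The chunk handling above can be sketched in Python. This is an illustrative accumulator (not an official SDK); it folds the `data:` lines into the final text and the terminating finish_reason, handling only text content:

```python
import json

def accumulate_sse(lines):
    """Fold a chat-completions SSE stream into (text, finish_reason).

    `lines` are the raw `data: ...` lines from the response body.
    Tool-call deltas would be merged by `index` the same way; only
    text content is handled in this sketch.
    """
    text, finish_reason = [], None
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # literal sentinel, not JSON
        chunk = json.loads(data)
        choice = chunk["choices"][0]
        text.append(choice["delta"].get("content") or "")
        finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(text), finish_reason
```

Run against the sample stream above, this returns ("Quiet packets", "stop").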

Handling errors and timeouts

  • Mid-stream errors arrive as a normal SSE event with an error key instead of choices. Close the stream and surface the error to the caller.
  • Stream disconnects (network blip, client timeout) cannot be resumed — restart the request. The partial response is not billed beyond the tokens you received.
  • Idle timeout: AIsa closes streams that are idle (no tokens) for more than 60 s. Set your client read timeout to 120 s to give a safety margin.
  • Client backpressure: stop reading from the stream if your downstream consumer is slow — AIsa throttles delivery rather than dropping tokens.
Streaming bills the same per-token rate as non-streaming. You pay for tokens that were delivered, even if the stream is cut off mid-response.
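A minimal sketch of the error-handling rules above, assuming a Python client (the `classify_chunk` helper is hypothetical; the `requests` usage in the comment shows where the 120 s read timeout would go):

```python
import json

READ_TIMEOUT_S = 120  # safety margin over AIsa's 60 s idle cutoff

def classify_chunk(data):
    """Classify one decoded SSE payload: ('done'|'error'|'delta', detail)."""
    if data == "[DONE]":
        return "done", None
    chunk = json.loads(data)
    if "error" in chunk:
        # Mid-stream error: close the stream, surface this to the caller.
        return "error", chunk["error"]
    return "delta", chunk["choices"][0]["delta"].get("content") or ""

# With requests (sketch): pass a (connect, read) timeout and iterate lines:
#   resp = requests.post(url, json=payload, stream=True,
#                        timeout=(10, READ_TIMEOUT_S))
#   for line in resp.iter_lines(decode_unicode=True):
#       ...
```

On a disconnect, restart the request from scratch; per the notes above, the stream cannot be resumed and only delivered tokens are billed.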

Authorizations

Authorization · string · header · required
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body (application/json)

model · string · Example: "gpt-4.1"
messages · object[]
stream · boolean · Example: false
logprobs · boolean
top_logprobs · integer
functions · object[]
function_call · Example: "auto"
Response

200 · Successful completion