Create chat completion
Chat API
OpenAI Chat
Create chat completion
POST
Create chat completion
Creates a model response for the given chat conversation. Learn more in the
text generation, vision,
and audio guides.
Parameter support can differ depending on the model used to generate the
response, particularly for newer reasoning models. Parameters that are only
supported for reasoning models are noted below. For the current state of
unsupported parameters in reasoning models,
refer to the reasoning guide.
Streaming responses
Set"stream": true to receive server-sent events (SSE) as each token is generated. This produces a lower time-to-first-token and is ideal for chat UIs.
Stream anatomy
Each line of the SSE stream looks like:- Each
data:line is a JSON object. The first chunk includes therole; subsequent chunks contain onlydelta.content. - The stream ends with a final chunk whose
finish_reasonis set, followed by a literaldata: [DONE]line. - If tool calls are used,
delta.tool_callsarrives incrementally and should be concatenated byindex.
Handling errors and timeouts
- Mid-stream errors arrive as a normal SSE event with an
errorkey instead ofchoices. Close the stream and surface the error to the caller. - Stream disconnects (network blip, client timeout) cannot be resumed — restart the request. The partial response is not billed beyond the tokens you received.
- Idle timeout: AIsa closes streams that are idle (no tokens) for more than 60 s. Set your client read timeout to 120 s to give a safety margin.
- Client backpressure: stop reading from the stream if your downstream consumer is slow — AIsa throttles delivery rather than dropping tokens.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
application/json
Response
200
Successful completion