API Reference

/v1/chat/completions

The main chat endpoint, compatible with the OpenAI v1 Chat Completions API: messages, streaming, tools, and `response_format`. Default model: `Qwen/Qwen3.6-35B-A3B`.

Overview

Creates a conversational completion from a list of role-typed messages (`system`, `user`, `assistant`, `tool`). Supports SSE streaming, parallel tool calling, and `response_format=json_object` or `json_schema` for structured output.

Endpoint and model

POST `https://api.tesseraai.cloud/v1/chat/completions`. The `model` field literal is `Qwen/Qwen3.6-35B-A3B`.

  • Context: 32K default, up to 128K configurable on Scale, up to 256K native on Enterprise.
  • Modes: both `direct` and `thinking` are available. The mode is selected by the `chat_template_kwargs.enable_thinking` body field (pass it via `extra_body` when using the OpenAI SDK). OpenAI's `reasoning_effort` parameter is not translated, so set the flag explicitly. Details at /docs/concepts/direct-vs-thinking.
  • Streaming: add `"stream": true`. Returns OpenAI-compatible SSE events.
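
The options above can be sketched as a small request-body builder. This is a minimal illustration, not an official client: `build_chat_body` is a hypothetical helper, and it assumes `enable_thinking` takes a boolean. It only builds the JSON body and does not send anything.

```python
import json

MODEL = "Qwen/Qwen3.6-35B-A3B"


def build_chat_body(messages, thinking=False, stream=False, **params):
    """Build a JSON body for POST /v1/chat/completions (hypothetical helper)."""
    body = {"model": MODEL, "messages": messages, **params}
    # The mode switch is a body field; reasoning_effort is not translated.
    # Assumption: enable_thinking is a boolean flag.
    body["chat_template_kwargs"] = {"enable_thinking": thinking}
    if stream:
        # Ask for OpenAI-compatible SSE events instead of a single JSON reply.
        body["stream"] = True
    return body


body = build_chat_body(
    [{"role": "user", "content": "Summarise Cervantes in one sentence."}],
    thinking=True,
    max_tokens=200,
)
print(json.dumps(body, indent=2))
```

With the OpenAI SDK the same flag would go in `extra_body`, for example `extra_body={"chat_template_kwargs": {"enable_thinking": True}}`, since the SDK has no named argument for it.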

Request

POST /v1/chat/completions
curl https://api.tesseraai.cloud/v1/chat/completions \
  -H "Authorization: Bearer $TESSERA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3.6-35B-A3B",
    "messages": [
      {"role": "system", "content": "You are a concise assistant."},
      {"role": "user", "content": "Summarise Cervantes in one sentence."}
    ],
    "temperature": 0.3,
    "max_tokens": 200
  }'

Response

The response has the same structure as OpenAI's: `id`, `object`, `created`, `model`, `choices[].message.content`, and `usage.prompt_tokens`, `usage.completion_tokens`, `usage.total_tokens`.

Response (non-streaming)
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1730812345,
  "model": "Qwen/Qwen3.6-35B-A3B",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "..."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 24, "completion_tokens": 41, "total_tokens": 65}
}
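
A short sketch of reading the fields listed above from a non-streaming response. `read_completion` is a hypothetical helper, and the sample is the response shown above with its placeholder values left as they are.

```python
import json

# The non-streaming sample response, with its placeholders kept verbatim.
SAMPLE = """
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1730812345,
  "model": "Qwen/Qwen3.6-35B-A3B",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "..."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 24, "completion_tokens": 41, "total_tokens": 65}
}
"""


def read_completion(resp):
    """Return (content, finish_reason, total_tokens) for the first choice."""
    choice = resp["choices"][0]
    usage = resp["usage"]
    return choice["message"]["content"], choice["finish_reason"], usage["total_tokens"]


content, finish_reason, total_tokens = read_completion(json.loads(SAMPLE))
print(content, finish_reason, total_tokens)
```

A `finish_reason` other than `"stop"` is worth checking before using `content`; in the OpenAI format, `"length"` means the reply was cut off by `max_tokens`.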

Streaming, tools and structured JSON

  • `"stream": true` enables SSE; the client receives `chat.completion.chunk` events until the final `[DONE]`.
  • `tools` and `tool_choice` follow OpenAI v1; see /docs/concepts/tools-function-calling.
  • `response_format: {"type": "json_object"}` returns free-form JSON; `{"type": "json_schema", "json_schema": {...}}` returns structured JSON validated against a schema.
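
To illustrate the streaming bullet, here is a minimal sketch of assembling the reply text from SSE lines. `collect_stream_text` is a hypothetical helper. The sample lines are hand-written, not captured output, and assume the OpenAI chunk layout in which text arrives in `choices[].delta.content`.

```python
import json


def collect_stream_text(lines):
    """Join the content deltas from an iterable of SSE lines (hypothetical helper)."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
    return "".join(parts)


# Hand-written sample in the assumed OpenAI-compatible chunk shape.
SAMPLE_STREAM = [
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "Hola"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": ", mundo"}}]}',
    "",
    "data: [DONE]",
]

text = collect_stream_text(SAMPLE_STREAM)
print(text)
```

The helper concatenates deltas from every choice in arrival order, which is only appropriate when a single choice is requested; with several choices the deltas would need grouping by `index`.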