Concepts

Tiers and limits

Tessera's limits protect dedicated capacity and keep flat-rate billing predictable for every customer.

Summary by tier

Tier	Sustained RPM	Burst	Context	Thinking
Async	Queued	N/A	16 K	Not recommended
Lite	50	100 for 5 min/h	8 K default · 32 K configurable	100 req/month
Pro	200	400 for 5 min/h	8 K default · 32 K configurable	1,000 req/month
Pro+	500	700 for 5 min/h	16 K default · 32 K configurable	Unlimited
Scale	5,000+	Negotiable	Custom	Unlimited · high priority
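
To stay under a sustained RPM figure from the table, a client can space its own requests. The following is a minimal client-side sketch, not a Tessera SDK feature: the `Pacer` class and the `SUSTAINED_RPM` dictionary are hypothetical names, and the sketch ignores the burst allowance entirely.

```python
import time

# Sustained RPM values copied from the table above; the dict name is illustrative.
SUSTAINED_RPM = {"Lite": 50, "Pro": 200, "Pro+": 500}


class Pacer:
    """Spaces calls evenly so the caller stays at or below a sustained RPM."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / rpm  # seconds between consecutive requests
        self.clock = clock
        self.sleep = sleep
        self.next_slot = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request slot, then reserve the following one."""
        now = self.clock()
        if self.next_slot is not None and now < self.next_slot:
            self.sleep(self.next_slot - now)
            now = self.next_slot
        self.next_slot = now + self.interval
```

Calling `pacer.wait()` before each request on the Lite tier spaces requests 1.2 seconds apart. Even pacing is the simplest choice; a token bucket would be needed to make use of the burst column.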

Bundle sublimits

Embeddings have a separate quota so that they do not block conversational traffic.
Whisper and TTS are limited by RPM and by a reasonable audio or text size.
Handle 429 responses with exponential backoff and jittered retries.
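
The backoff recommendation above can be sketched as follows. This is an illustration under stated assumptions, not Tessera's official client: `send` is a hypothetical callable that returns a `(status, body)` tuple, and the `base`, `cap`, and `max_attempts` defaults are arbitrary. The delay uses "full jitter", a uniform random value between zero and an exponentially growing ceiling.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Full-jitter delay in seconds for a zero-based retry attempt."""
    # The ceiling doubles with each attempt and is clamped at `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call `send` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning a (status, body) tuple.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        # Do not sleep after the final attempt; just return the last 429.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    return status, body
```

Jitter matters because clients that were throttled at the same moment would otherwise all retry at the same moment too.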

What happens when you exceed a limit

The API returns `429` with enough metadata to retry. On Pro and higher plans, you can negotiate a larger burst or move batch workloads to the Async tier.
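
The page does not specify which retry metadata accompanies a `429`. As a sketch, assuming it includes the standard HTTP `Retry-After` header expressed in seconds (an assumption, not something confirmed here), a client could derive its wait time like this. The `retry_wait` name and the `fallback` default are hypothetical.

```python
def retry_wait(headers, fallback=1.0):
    """Return seconds to wait, preferring a numeric Retry-After header if present.

    Assumes the 429 response carries the standard Retry-After header;
    that is an assumption of this sketch.
    """
    # Header names are case-insensitive in HTTP, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get("retry-after")
    if raw is None:
        return fallback
    try:
        seconds = float(raw)
    except ValueError:
        # Retry-After may also be an HTTP date; this sketch does not parse it.
        return fallback
    return max(0.0, seconds)
```

When the header is absent or unparseable, the function falls back to a caller-chosen value, which in practice could be a jittered backoff delay.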