Concepts

Tiers and limits

Tessera limits protect dedicated capacity and keep flat-rate billing predictable for every customer.

Tier summary

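Tier  | Sustained RPM | Burst           | Context                        | Thinking
------|---------------|-----------------|--------------------------------|---------------------------
Async | Queue         | Not applicable  | 16K                            | Not recommended
Lite  | 50            | 100 for 5 min/h | 8K default · 32K configurable  | 100 req/month
Pro   | 200           | 400 for 5 min/h | 8K default · 32K configurable  | 1,000 req/month
Pro+  | 500           | 700 for 5 min/h | 16K default · 32K configurable | Unlimited
Scale | 5,000+        | Negotiable      | Custom                         | Unlimited · high priority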
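
One way to stay under a tier's sustained RPM is to pace requests on the client. The sketch below is illustrative only: `Pacer` and `SUSTAINED_RPM` are hypothetical names, not part of any Tessera SDK, and the code spaces calls evenly without trying to use the burst allowance.

```python
import time

# Sustained RPM values copied from the tier summary. Async and Scale are
# omitted because they have no fixed per-minute figure.
SUSTAINED_RPM = {"Lite": 50, "Pro": 200, "Pro+": 500}


class Pacer:
    """Spaces calls evenly so the long-run rate stays at or below `rpm`.

    Hypothetical helper for illustration; it ignores the burst allowance.
    """

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / rpm   # seconds between consecutive requests
        self._clock = clock
        self._sleep = sleep
        self._next_slot = clock()    # earliest time the next request may go

    def wait(self):
        """Block until the next send slot and return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_slot - now)
        if delay:
            self._sleep(delay)
        # Schedule the following slot one interval after this one.
        self._next_slot = max(now, self._next_slot) + self.interval
        return delay
```

Calling `Pacer(SUSTAINED_RPM["Lite"]).wait()` before each request keeps a Lite client at roughly one request every 1.2 seconds. A token bucket would be the usual choice if the client should also take advantage of the burst window.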

Bundle sublimits

  • Embeddings have a separate quota so conversational traffic is not blocked.
  • Whisper and TTS are limited by RPM and reasonable audio or text size.
  • Handle 429 responses with exponential backoff and jittered retries.
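
The backoff advice in the last bullet can be sketched as follows. This is a minimal example using the "full jitter" variant of exponential backoff. `RateLimited`, `backoff_delay`, and `call_with_retry` are hypothetical names, and the base delay, cap, and attempt count are illustrative defaults rather than values recommended by Tessera.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical exception a client wrapper raises on HTTP 429."""


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it succeeds or `max_attempts` calls hit 429."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(backoff_delay(attempt))
```

Here `send` is any zero-argument callable that performs one request and raises `RateLimited` when the response status is 429. Randomizing the whole delay, rather than adding a small random offset, spreads retries from many clients so they do not all return at the same moment.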

When a limit is exceeded

The API returns `429` with enough metadata to retry. On Pro and higher, you can negotiate a larger burst or move batch workloads to Async.
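
This page does not say which fields carry the retry metadata. As a sketch only, assuming the response includes the standard HTTP `Retry-After` header (an assumption, not something confirmed here), a client could turn it into a wait time like this. `retry_after_seconds` is a hypothetical helper.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(headers, now=None):
    """Return the wait in seconds from Retry-After, or None if unusable.

    ASSUMPTION: the 429 response carries the standard Retry-After header,
    either as whole seconds or as an HTTP date. `headers` is a plain dict
    here, so the lookup is case-sensitive.
    """
    value = headers.get("Retry-After")
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():                      # delta-seconds form, e.g. "30"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:                  # treat a naive date as UTC
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

When the function returns `None`, falling back to exponential backoff with jitter is a reasonable default.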