Service Level Commitments (SLA)
Version 1.0 · Effective from May 1, 2026
Introduction
At Tessera we believe an SLA must be concrete, verifiable and compensable. Every commitment in this document is based on measured metrics, not estimates, and a breach generates automatic credits on the next invoice. This page is the single source of truth for our commitments. If you find a discrepancy between this document and a signed contract, the contract prevails.
Definitions
- Monthly uptime: Percentage of the calendar month during which the api.tesseraai.cloud/v1 API responds with legitimate 2xx or 4xx codes. 5xx responses caused by Tessera and timeouts count as downtime.
- TTFT (Time To First Token): Time between the gateway receiving a request and emitting the first model token, measured in milliseconds. Excludes network latency between client and gateway.
- E2E latency: Total time between request entry at the gateway and emission of the last token, measured in seconds.
- Success rate: Percentage of requests returning a response with valid model content, free of 5xx errors and internal-failure truncations. Excludes 4xx errors caused by the client.
- Scheduled maintenance: Update or configuration operation planned with the per-tier minimum lead time. Performed preferably during low-traffic windows.
- Emergency maintenance: Urgent operation to mitigate a critical vulnerability (a CVE with a CVSS score of 9.0 or higher) or an active incident. It does not require advance notice but is documented after the fact.
- Service credit: Economic compensation applied as a deduction on the customer’s next invoice, calculated as a percentage of the affected tier’s monthly fee.
- Measurement period: Calendar month (day 1 to last day), Europe/Madrid timezone. Metrics are calculated at month-end over the customer’s aggregated requests.
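The request-based definitions above can be illustrated with a short sketch. The record fields, the function names and the nearest-rank percentile method are assumptions made for illustration; this document does not specify how the gateway stores requests or which percentile method Tessera uses.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class RequestRecord:
    """Hypothetical per-request log entry; field names are assumptions."""
    status: int                 # HTTP status code returned by the gateway
    ttft_ms: Optional[float]    # time to first token; None if no token was emitted
    timed_out: bool = False     # request hit a timeout
    truncated: bool = False     # response cut short by an internal failure


def success_rate(records: Sequence[RequestRecord]) -> float:
    """Percentage of requests with valid content, excluding client-caused 4xx."""
    eligible = [r for r in records if not 400 <= r.status < 500]
    if not eligible:
        return 100.0  # assumption: no eligible traffic counts as no failures
    ok = sum(
        1 for r in eligible
        if 200 <= r.status < 300 and not r.timed_out and not r.truncated
    )
    return 100.0 * ok / len(eligible)


def p95(values: Sequence[float]) -> float:
    """95th percentile by the nearest-rank method (an assumed choice)."""
    if not values:
        raise ValueError("p95 needs at least one value")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]
```

A monthly TTFT P95 would then be `p95([r.ttft_ms for r in records if r.ttft_ms is not None])` over the month's records for one context-size band.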
Per-tier commitments
The table below summarises per-tier commitments. Values are contractual and measured according to the definitions above.
| Commitment | Lite | Pro | Pro+ | Scale | Enterprise |
|---|---|---|---|---|---|
| Monthly uptime | 99.0 % | 99.5 % | 99.5 % | 99.9 % | 99.95 % (negotiable up to 99.99 %) |
| TTFT P95, context ≤ 16K | < 3s | < 2s | < 2s | < 1s | tailored |
| TTFT P95, context 16K – 64K | not applicable | < 5s (best-effort) | < 10s | < 5s | tailored |
| TTFT P95, context 64K – 128K | not applicable | not applicable | < 30s | < 20s | tailored |
| TTFT P95, context 128K – 256K+ | not applicable | not applicable | not applicable | < 60s (with caveats) | tailored |
| Generation success rate | 99.5 % | 99.5 % | 99.5 % | 99.5 % | 99.5 % |
| Support response time (business hours) | 24h | 8h | 4h | 1h | 30 min |
| P1 incident response time (service down) | 4h | 2h | 1h | 30 min | 15 min (24/7) |
| Scheduled-maintenance notice | 48h | 72h | 7 days | 14 days | 30 days + negotiated window |
| Maximum scheduled maintenance / month | 4h | 2h | 1h | 30 min (or 0 with active redundancy) | 0 |
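To make the uptime percentages tangible, the sketch below converts a commitment into the downtime it allows in a given calendar month. The function name is hypothetical, and the calculation assumes every day has exactly 24 hours, ignoring the daylight-saving shifts that occur in the Europe/Madrid timezone in March and October.

```python
import calendar
from decimal import Decimal


def downtime_budget_minutes(uptime_pct: str, year: int, month: int) -> Decimal:
    """Minutes of downtime a monthly uptime commitment leaves in one month."""
    days_in_month = calendar.monthrange(year, month)[1]
    total_minutes = Decimal(days_in_month * 24 * 60)
    return total_minutes * (Decimal(100) - Decimal(uptime_pct)) / Decimal(100)


# Print the budget per tier for May 2026 (percentages taken from the table above).
for tier, pct in [("Lite", "99.0"), ("Pro", "99.5"), ("Pro+", "99.5"),
                  ("Scale", "99.9"), ("Enterprise", "99.95")]:
    print(tier, downtime_budget_minutes(pct, 2026, 5), "minutes")
```

May 2026 has 31 days, or 44,640 minutes, so the 99.9 % Scale commitment leaves 44.64 minutes of downtime and the 99.0 % Lite commitment leaves 446.4 minutes.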
Credits for breaches
When a commitment is not met in a measurement period, the customer receives a credit calculated on the affected tier’s monthly fee. Percentages are cumulative by category within the same month.
Monthly uptime below commitment
| Difference vs commitment | Credit |
|---|---|
| 0.1 % – 1 % | 10 % of monthly fee |
| 1 % – 5 % | 25 % of monthly fee |
| 5 % – 10 % | 50 % of monthly fee |
| More than 10 % | 100 % of monthly fee + option to cancel without penalty |
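The uptime bands can be expressed as a small lookup. This is a sketch under stated assumptions, not Tessera's billing logic: it reads "difference" as percentage points between committed and measured uptime, treats a shortfall under 0.1 points as no credit, and assigns a value that falls exactly on a shared boundary (1, 5 or 10 points) to the lower band. The function name is hypothetical.

```python
from decimal import Decimal


def uptime_credit_pct(committed_pct: str, measured_pct: str) -> int:
    """Credit (as % of monthly fee) for an uptime shortfall in percentage points."""
    shortfall = Decimal(committed_pct) - Decimal(measured_pct)
    if shortfall < Decimal("0.1"):
        return 0      # assumption: below the first band, no credit
    if shortfall <= 1:
        return 10
    if shortfall <= 5:
        return 25
    if shortfall <= 10:
        return 50
    return 100        # this band also carries the option to cancel without penalty
```

For example, a Scale customer committed to 99.9 % who measures 99.5 % has a shortfall of 0.4 points, which this reading maps to a 10 % credit.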
TTFT P95 above commitment (calendar month)
| Difference vs commitment | Credit |
|---|---|
| Up to 50 % over | 5 % of monthly fee |
| 50 % – 100 % over | 15 % of monthly fee |
| More than 100 % over | 25 % of monthly fee |
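The TTFT bands follow the same pattern, with the overage expressed relative to the commitment. The sketch assumes the overage is `(measured - committed) / committed`, that a measured P95 equal to the commitment already counts as a breach (the commitments are written as strict "<" limits), and that boundary values of exactly 50 % and 100 % fall in the lower band. The function name is hypothetical.

```python
from decimal import Decimal


def ttft_credit_pct(committed_s: str, measured_s: str) -> int:
    """Credit (as % of monthly fee) for a monthly TTFT P95 above commitment."""
    committed = Decimal(committed_s)
    measured = Decimal(measured_s)
    if measured < committed:
        return 0      # commitment met
    overage_pct = (measured - committed) / committed * 100
    if overage_pct <= 50:
        return 5
    if overage_pct <= 100:
        return 15
    return 25
```

With a 2 s commitment, a measured P95 of 2.5 s is 25 % over and maps to a 5 % credit, while 3.5 s is 75 % over and maps to 15 %.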
Success rate below 99.5 %
| Difference vs commitment | Credit |
|---|---|
| 99.0 % – 99.4 % | 10 % of monthly fee |
| 98.0 % – 98.9 % | 25 % of monthly fee |
| Below 98 % | 50 % of monthly fee + mandatory joint technical review |
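The success-rate table is written to one decimal place, which leaves values such as 99.45 % or 98.95 % between bands. The sketch below assumes the bands are contiguous (each band runs from its lower bound up to the next band's lower bound); that reading is an assumption, not something this document states, and the function name is hypothetical.

```python
from decimal import Decimal


def success_rate_credit_pct(measured_pct: str) -> int:
    """Credit (as % of monthly fee) for a generation success rate below 99.5 %."""
    rate = Decimal(measured_pct)
    if rate >= Decimal("99.5"):
        return 0      # commitment met
    if rate >= Decimal("99.0"):
        return 10
    if rate >= Decimal("98.0"):
        return 25
    return 50         # this band also triggers the mandatory joint technical review
```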
Support response-time breach
| Incidents in month | Credit |
|---|---|
| 1 incident | Internal note, no credit |
| 2 – 3 incidents | 5 % of monthly fee |
| More than 3 incidents | 10 % of monthly fee |
Credits are calculated on the contracted tier’s monthly fee. They apply automatically on the next invoice after verification against Tessera’s internal logs. The customer may request manual review if they disagree with the calculation.
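As a worked illustration of how a monthly credit might be totalled, the sketch below maps support response-time breaches to a percentage and then applies the combined percentage to the monthly fee. Three points are assumptions rather than stated terms: that credits from different categories add together, that the total is capped at 100 % of the monthly fee, and that amounts are rounded to two decimal places. Function names and category keys are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Mapping


def support_credit_pct(breaches_in_month: int) -> int:
    """Credit (as % of monthly fee) for support response-time breaches."""
    if breaches_in_month <= 1:
        return 0      # a single incident is an internal note only
    if breaches_in_month <= 3:
        return 5
    return 10


def credit_amount(monthly_fee: str, category_pcts: Mapping[str, int]) -> Decimal:
    """Total credit in currency units, summing categories (assumed) with a 100 % cap (assumed)."""
    total_pct = min(sum(category_pcts.values()), 100)
    amount = Decimal(monthly_fee) * Decimal(total_pct) / Decimal(100)
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For a monthly fee of 200.00, an uptime credit of 10 %, a TTFT credit of 5 % and two support breaches (5 %) would total 20 %, or 40.00, under these assumptions.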
Exclusions
The following cases do not count as breaches and do not generate credit:
- Scheduled maintenance with the per-tier agreed lead time.
- Emergency maintenance to mitigate a critical vulnerability (a CVE with a CVSS score of 9.0 or higher).
- Outages of upstream providers (Cloudflare, DNS roots, regional hyperscaler). Tessera will provide documented evidence of the external cause.
- Customer use beyond tier limits (exceeded RPM, context above maximum, etc.).
- Outages caused by customer code, including infinite loops, mishandled errors and abusive retry patterns.
- Force majeure: natural disasters, armed conflict, government decisions or international sanctions.
- Suspension after 30 days of confirmed non-payment.
- Requests exceeding documented reasonable limits (Whisper audio > 2h, TTS text > 50K characters, embedding batches > 10K documents per request).
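A client can check the documented reasonable limits before sending a request, so that oversized requests are caught locally. The sketch below encodes the three limits from the last exclusion; the service keys and the function name are hypothetical, and "K" is assumed to mean 1,000.

```python
# Documented reasonable limits; a request above these is excluded from the SLA.
DOCUMENTED_LIMITS = {
    "whisper": 2 * 60 * 60,   # audio length in seconds (2 h)
    "tts": 50_000,            # text length in characters
    "embeddings": 10_000,     # documents per batch request
}


def exceeds_documented_limit(service: str, size: float) -> bool:
    """Return True when a request is larger than the documented limit for its service."""
    try:
        limit = DOCUMENTED_LIMITS[service]
    except KeyError:
        raise ValueError(f"no documented limit for service {service!r}") from None
    return size > limit
```

For instance, `exceeds_documented_limit("embeddings", 12_000)` flags a batch that should be split into smaller requests before sending.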
Claim procedure
- The customer identifies the breach using account logs or the metrics published at status.tesseraai.cloud.
- The customer opens a ticket at support@tesseraai.cloud with the subject SLA Claim: [short description].
- Tessera replies with a preliminary analysis within the support response time of the customer’s tier.
- If the breach is confirmed, the credit is applied automatically on the next invoice.
- If the customer disagrees with the analysis, the claim is escalated to a joint technical review by a founder and a senior engineer, resolved within five business days.
Status page and measurement
Tessera publishes real-time service metrics at status.tesseraai.cloud. The metrics shown there are the single source of truth for SLA verification. All metrics are measured from internal gateways over real customer requests, not from external synthetic probes. This means the metrics reflect the actual experience, not a proxy.
- Data retained for 90 days for retrospective verification.
- Monthly uptime history published on day 5 of the following month.
- Customers can export their individual metrics via API at any time.
- Audit logs available for Pro+, Scale and Enterprise tiers, exportable to your own S3 or GCS bucket.
Special cases
9.1 Long context (more than 16K tokens)
TTFT SLAs for context above 16K reflect the physical reality of the model: more tokens means more compute time. If your use case requires intensive long context, we recommend evaluating Tessera Pro+ or Scale for stricter guarantees.
9.2 Thinking mode
Thinking mode (extended reasoning) is not covered by the TTFT SLA. Latency in thinking mode is typically ten times that of direct mode and depends on task complexity. Each tier has a documented monthly thinking-mode quota in its plan.
9.3 Bundle services (embeddings, rerank, Whisper, TTS)
Bundle services have an uptime and success-rate SLA identical to the main LLM. TTFT SLAs do not apply to these services; each has an equivalent metric:
- Embeddings: P95 latency under 500 ms for batches up to 100 documents.
- Whisper: real-time factor under 0.5 (one hour of audio processed in under 30 minutes).
- TTS: P95 latency under 2 s for texts up to 500 characters.
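The real-time factor used for Whisper is the ratio of processing time to audio duration. A minimal sketch, with hypothetical function names, shows how a transcription job can be checked against the 0.5 target:

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; lower is faster."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def meets_whisper_target(processing_seconds: float, audio_seconds: float,
                         target: float = 0.5) -> bool:
    """True when the job's real-time factor is strictly under the target."""
    return real_time_factor(processing_seconds, audio_seconds) < target
```

One hour of audio (3,600 s) processed in 25 minutes (1,500 s) has a factor of about 0.42 and meets the target; the same audio processed in exactly 30 minutes has a factor of 0.5 and does not, because the target is "under 0.5".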
9.4 Transparent failover
Tessera uses intelligent routing across multiple GPU providers. When a primary provider degrades, traffic is automatically routed to secondary providers. During failover, the served model may vary (always within the compatible Qwen 3.x family) without affecting the API. The SLA applies to the aggregate service, not to a specific server.
Contact
- SLA legal questions: legal@tesseraai.cloud
- Breach claims: support@tesseraai.cloud
Changelog
| Version | Date | Changes |
|---|---|---|
| 1.0 | 2026-05-01 | Initial publication. |