Legal-commercial document

Service Level Commitments (SLA)

Version 1.0 · Effective from May 1, 2026

Last updated: May 1, 2026. Applies to customers with an active Tessera Lite, Pro, Pro+, Scale or Enterprise contract.

Introduction

At Tessera we believe an SLA must be concrete, verifiable and compensable. Every commitment in this document is based on measured metrics, not estimates, and a breach generates automatic credits on the next invoice. This page is the single source of truth for our commitments. If you find a discrepancy between this document and a signed contract, the contract prevails.

Definitions

Monthly uptime
Percentage of the calendar month during which the api.tesseraai.cloud/v1 API responds with legitimate 2xx or 4xx codes. 5xx responses caused by Tessera and timeouts count as downtime.
TTFT (Time To First Token)
Time between the gateway receiving a request and emitting the first model token, measured in milliseconds. Excludes network latency between client and gateway.
E2E latency
Total time between request entry at the gateway and emission of the last token, measured in seconds.
Success rate
Percentage of requests returning a response with valid model content, free of 5xx errors and internal-failure truncations. Excludes 4xx errors caused by the client.
Scheduled maintenance
An update or configuration operation announced with at least the per-tier minimum notice. It is carried out during low-traffic windows where possible.
Emergency maintenance
Urgent operation to mitigate a critical vulnerability (CVSS score of 9.0 or higher) or an active incident. It does not require advance notice but is documented after the fact.
Service credit
Economic compensation applied as a deduction on the customer’s next invoice, calculated as a percentage of the affected tier’s monthly fee.
Measurement period
Calendar month (day 1 to last day), Europe/Madrid timezone. Metrics are calculated at month-end over the customer’s aggregated requests.
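
As a rough illustration of how these definitions translate into numbers, the following Python sketch computes a success rate, a P95 TTFT and a monthly uptime figure. The Request record, the nearest-rank percentile method and the treatment of every 4xx response as client-caused are assumptions made for the example; this is not Tessera's measurement code.

```python
from dataclasses import dataclass


@dataclass
class Request:
    status: int              # HTTP status code returned by the gateway
    ttft_ms: float           # time to first token, in milliseconds
    truncated: bool = False  # output cut short by an internal failure


def success_rate(requests):
    """Percentage of requests with valid content, ignoring 4xx responses."""
    counted = [r for r in requests if not 400 <= r.status < 500]
    if not counted:
        return 100.0
    ok = sum(1 for r in counted if r.status < 400 and not r.truncated)
    return 100.0 * ok / len(counted)


def p95(values):
    """95th percentile using the nearest-rank method (an assumption)."""
    if not values:
        raise ValueError("p95 needs at least one sample")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def uptime_pct(downtime_minutes, days_in_month):
    """Share of the calendar month that was not downtime, in percent."""
    total_minutes = days_in_month * 24 * 60
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes
```

For instance, 43.2 minutes of downtime in a 30-day month corresponds to 99.9 % uptime under this calculation.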

Per-tier commitments

The table below summarises per-tier commitments. Values are contractual and measured according to the definitions above.

Commitment | Lite | Pro | Pro+ | Scale | Enterprise
Monthly uptime | 99.0 % | 99.5 % | 99.5 % | 99.9 % | 99.95 % (negotiable up to 99.99 %)
TTFT P95, context ≤ 16K | < 3s | < 2s | < 2s | < 1s | tailored
TTFT P95, context 16K – 64K | not applicable | < 5s (best-effort) | < 10s | < 5s | tailored
TTFT P95, context 64K – 128K | not applicable | not applicable | < 30s | < 20s | tailored
TTFT P95, context 128K – 256K+ | not applicable | not applicable | not applicable | < 60s (with caveats) | tailored
Generation success rate | 99.5 % | 99.5 % | 99.5 % | 99.5 % | 99.5 %
Support response time (business hours) | 24h | 8h | 4h | 1h | 30 min
P1 incident response time (service down) | 4h | 2h | 1h | 30 min | 15 min (24/7)
Scheduled-maintenance notice | 48h | 72h | 7 days | 14 days | 30 days + negotiated window
Maximum scheduled maintenance / month | 4h | 2h | 1h | 30 min (or 0 with active redundancy) | 0
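
To make the uptime percentages more tangible, the sketch below converts each tier's monthly uptime commitment into the amount of downtime it leaves room for. The function name and the 30-day default are assumptions for illustration; the actual measurement period is the calendar month, and scheduled maintenance with the agreed notice is excluded separately.

```python
# Monthly uptime commitments per tier, taken from the table above.
UPTIME_COMMITMENT = {
    "Lite": 99.0,
    "Pro": 99.5,
    "Pro+": 99.5,
    "Scale": 99.9,
    "Enterprise": 99.95,
}


def downtime_budget_minutes(uptime_pct, days_in_month=30):
    """Minutes of downtime allowed before the commitment is missed."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - uptime_pct) / 100.0


for tier, pct in UPTIME_COMMITMENT.items():
    print(f"{tier}: {downtime_budget_minutes(pct):.1f} min")
```

Over a 30-day month this works out to roughly 432 minutes for Lite, 216 for Pro and Pro+, 43.2 for Scale and 21.6 for Enterprise.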

Credits for breaches

When a commitment is not met in a measurement period, the customer receives a credit calculated on the affected tier’s monthly fee. Percentages are cumulative by category within the same month.

Monthly uptime below commitment

Difference vs commitment | Credit
0.1 % – 1 % | 10 % of monthly fee
1 % – 5 % | 25 % of monthly fee
5 % – 10 % | 50 % of monthly fee
More than 10 % | 100 % of monthly fee + option to cancel without penalty
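
The bands above can be read as a simple lookup. The sketch below is one possible interpretation, not the contractual algorithm: it assumes the difference is expressed in percentage points, that a value exactly on a shared boundary falls in the lower band, and that a shortfall under 0.1 points earns no credit. The function name is hypothetical.

```python
def uptime_credit_pct(committed, measured):
    """Credit (% of monthly fee) for an uptime shortfall in percentage points."""
    # Round to avoid float noise such as 99.5 - 99.4 == 0.0999999...
    shortfall = round(committed - measured, 6)
    if shortfall < 0.1:
        return 0
    if shortfall <= 1:
        return 10
    if shortfall <= 5:
        return 25
    if shortfall <= 10:
        return 50
    return 100  # the table also grants the option to cancel without penalty


print(uptime_credit_pct(99.9, 98.0))  # shortfall of 1.9 points
```

With a 99.9 % commitment and 98.0 % measured uptime, the shortfall is 1.9 points and the sketch returns 25.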

TTFT P95 above commitment (calendar month)

Difference vs commitment | Credit
Up to 50 % over | 5 % of monthly fee
50 % – 100 % over | 15 % of monthly fee
More than 100 % over | 25 % of monthly fee
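
As an illustration, the overage can be computed relative to the committed TTFT and mapped to a band. This sketch assumes the overage is measured as a percentage of the commitment, that exact boundary values fall in the lower band, and that a P95 equal to the commitment earns no credit; none of these details are stated explicitly in the table. The function name is hypothetical.

```python
def ttft_credit_pct(committed_s, measured_p95_s):
    """Credit (% of monthly fee) for a monthly TTFT P95 above the commitment."""
    overage_pct = (measured_p95_s - committed_s) / committed_s * 100.0
    if overage_pct <= 0:
        return 0
    if overage_pct <= 50:
        return 5
    if overage_pct <= 100:
        return 15
    return 25


print(ttft_credit_pct(2.0, 3.5))  # 75 % over a 2 s commitment
```

A measured P95 of 3.5 s against a 2 s commitment is 75 % over, which lands in the 15 % band.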

Success rate below 99.5 %

Measured success rate | Credit
99.0 % – 99.4 % | 10 % of monthly fee
98.0 % – 98.9 % | 25 % of monthly fee
Below 98 % | 50 % of monthly fee + mandatory joint technical review
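
The bands above leave small gaps (for example between 99.4 % and 99.5 %). The sketch below closes them with plain thresholds at 99.5, 99.0 and 98.0, which is an assumption made for the example; the function name and the returned tuple are likewise hypothetical.

```python
def success_rate_credit(measured_pct):
    """Return (credit % of monthly fee, joint technical review required)."""
    if measured_pct >= 99.5:
        return 0, False
    if measured_pct >= 99.0:
        return 10, False
    if measured_pct >= 98.0:
        return 25, False
    return 50, True


print(success_rate_credit(98.7))
```

A measured rate of 98.7 % returns a 25 % credit with no mandatory review under these assumed thresholds.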

Support response-time breach

Incidents in month | Credit
1 incident | Internal note, no credit
2 – 3 incidents | 5 % of monthly fee
More than 3 incidents | 10 % of monthly fee

Credits are calculated on the contracted tier’s monthly fee. They apply automatically on the next invoice after verification against Tessera’s internal logs. The customer may request manual review if they disagree with the calculation.
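
A minimal sketch of how per-category credits might be combined into a single invoice deduction follows. It reads "cumulative by category" as adding the percentages of the different categories, and it caps the total at 100 % of the monthly fee; the cap is an assumption, since this document does not state one. All names are hypothetical.

```python
from typing import Dict, Optional


def support_credit_pct(breaches_in_month: int) -> int:
    """Credit (% of monthly fee) for support response-time breaches."""
    if breaches_in_month <= 1:
        return 0  # a single incident is recorded as an internal note
    if breaches_in_month <= 3:
        return 5
    return 10


def total_credit(monthly_fee: float,
                 category_pcts: Dict[str, float],
                 cap_pct: Optional[float] = 100.0) -> float:
    """Sum category percentages and convert them to an invoice deduction."""
    total_pct = sum(category_pcts.values())
    if cap_pct is not None:
        total_pct = min(total_pct, cap_pct)  # assumed cap, not in the SLA text
    return monthly_fee * total_pct / 100.0


credits = {"uptime": 10, "ttft": 5, "support": support_credit_pct(2)}
print(total_credit(1000.0, credits))
```

With a monthly fee of 1000, the example categories add up to 20 % and the deduction is 200.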

Exclusions

The following cases do not count as breaches and do not generate credit:

  1. Scheduled maintenance announced with the per-tier notice period.
  2. Emergency maintenance to mitigate a critical vulnerability (CVSS score of 9.0 or higher).
  3. Outages of upstream providers (Cloudflare, DNS roots, regional hyperscaler). Tessera will provide documented evidence of the external cause.
  4. Customer use beyond tier limits (exceeded RPM, context above maximum, etc.).
  5. Outages caused by customer code, including infinite loops, mishandled errors and abusive retry patterns.
  6. Force majeure: natural disasters, armed conflict, government decisions or international sanctions.
  7. Suspension after 30 days of confirmed non-payment.
  8. Requests exceeding documented reasonable limits (Whisper audio > 2h, TTS text > 50K characters, embedding batches > 10K documents per request).
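
Clients may want to check a request against the limits in item 8 before sending it, since requests beyond them fall outside the SLA. The sketch below encodes only the three limits quoted above; the function name and the kind labels are hypothetical and do not correspond to any Tessera API.

```python
# Limits quoted in exclusion 8 of this document.
DOCUMENTED_LIMITS = {
    "whisper": 2 * 60 * 60,  # audio duration in seconds (2 h)
    "tts": 50_000,           # text length in characters
    "embeddings": 10_000,    # documents per request
}


def exceeds_documented_limit(kind, size):
    """Return True if a request of this size is outside SLA coverage."""
    if kind not in DOCUMENTED_LIMITS:
        raise ValueError(f"unknown request kind: {kind!r}")
    return size > DOCUMENTED_LIMITS[kind]


print(exceeds_documented_limit("tts", 62_000))
```

A 62,000-character TTS request exceeds the 50K limit, so the check returns True.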

Claim procedure

  1. The customer identifies the breach using account logs or the metrics published at status.tesseraai.cloud.
  2. The customer opens a ticket by writing to support@tesseraai.cloud with the subject SLA Claim: [short description].
  3. Tessera replies with a preliminary analysis within the support response time of the customer’s tier.
  4. If the breach is confirmed, the credit is applied automatically to the next invoice.
  5. If the customer disagrees with the analysis, the claim escalates to a joint technical review by a founder and a senior engineer, resolved within five business days.
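
For teams that automate step 2, the claim message could be assembled with Python's standard library as sketched below. The helper name and the example sender are hypothetical, and the sketch assumes a ticket is opened by plain email; only the recipient address and the subject prefix come from this document. Sending the message is left to the caller.

```python
from email.message import EmailMessage

SUPPORT_ADDRESS = "support@tesseraai.cloud"


def build_sla_claim(sender, short_description, details):
    """Build an SLA claim email with the subject format described above."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SUPPORT_ADDRESS
    msg["Subject"] = f"SLA Claim: {short_description}"
    msg.set_content(details)
    return msg


claim = build_sla_claim(
    "ops@example.com",  # hypothetical customer address
    "monthly uptime below commitment",
    "Measured uptime and supporting log excerpts go here.",
)
print(claim["Subject"])
```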

Status page and measurement

Tessera publishes real-time service metrics at status.tesseraai.cloud. The metrics shown there are the single source of truth for SLA verification. All metrics are measured from internal gateways over real customer requests, not from external synthetic probes. This means the metrics reflect the actual experience, not a proxy.

  • Data retained for 90 days for retrospective verification.
  • Monthly uptime history published on day 5 of the following month.
  • Customers can export their individual metrics via API at any time.
  • Audit logs available for Pro+, Scale and Enterprise tiers, exportable to your own S3 or GCS bucket.

Special cases

9.1 Long context (more than 16K tokens)

TTFT commitments for context above 16K are looser because processing more input tokens takes more compute time before the first token can be emitted. If your use case relies heavily on long context, we recommend evaluating Tessera Pro+ or Scale, which carry contractual TTFT commitments for larger contexts.
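
To check which TTFT commitment applies to a given request size, the per-tier table can be expressed as a lookup. The sketch below assumes 1K means 1,000 tokens and that a context exactly on a band boundary belongs to the lower band; Enterprise is omitted because its values are tailored. The names are hypothetical.

```python
from bisect import bisect_left

# Upper bounds of the first three context bands, in tokens (assuming 1K = 1,000).
BAND_UPPER_BOUNDS = [16_000, 64_000, 128_000]

# TTFT P95 limits in seconds per band; None means "not applicable".
TTFT_LIMITS = {
    "Lite": [3, None, None, None],
    "Pro": [2, 5, None, None],   # the 16K-64K value is best-effort
    "Pro+": [2, 10, 30, None],
    "Scale": [1, 5, 20, 60],     # the 128K-256K+ value comes with caveats
}


def ttft_commitment_seconds(tier, context_tokens):
    """Return the TTFT P95 limit in seconds, or None if not covered."""
    if tier not in TTFT_LIMITS:
        raise ValueError(f"tailored or unknown tier: {tier!r}")
    band = bisect_left(BAND_UPPER_BOUNDS, context_tokens)
    return TTFT_LIMITS[tier][band]


print(ttft_commitment_seconds("Pro+", 100_000))
```

A 100,000-token request on Pro+ falls in the 64K – 128K band, so the lookup returns 30.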

9.2 Thinking mode

Thinking mode (extended reasoning) is not covered by the TTFT SLA. Latency in thinking mode is typically ten times that of direct mode and depends on task complexity. Each tier has a documented monthly thinking-mode quota in its plan.

9.3 Bundle services (embeddings, rerank, Whisper, TTS)

Bundle services carry the same uptime and success-rate commitments as the main LLM. TTFT commitments do not apply to these services; each has an equivalent metric instead:

  • Embeddings: P95 latency under 500 ms for batches up to 100 documents.
  • Whisper: real-time factor under 0.5 (one hour of audio processed in under 30 minutes).
  • TTS: P95 latency under 2 s for texts up to 500 characters.
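
The Whisper metric is a ratio of processing time to audio duration. A minimal sketch of the calculation follows, with hypothetical function names; the 0.5 threshold is the one quoted above.

```python
WHISPER_MAX_RTF = 0.5  # commitment: real-time factor strictly under 0.5


def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def meets_whisper_commitment(processing_seconds, audio_seconds):
    """True if the job was processed faster than the committed factor."""
    return real_time_factor(processing_seconds, audio_seconds) < WHISPER_MAX_RTF


# One hour of audio processed in 25 minutes.
print(meets_whisper_commitment(25 * 60, 60 * 60))
```

One hour of audio processed in 25 minutes gives a factor of about 0.42, which is within the commitment; exactly 30 minutes gives 0.5, which is not.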

9.4 Transparent failover

Tessera uses intelligent routing across multiple GPU providers. When a primary provider degrades, traffic is automatically routed to secondary providers. During failover, the served model may vary (always within the compatible Qwen 3.x family) without affecting the API. The SLA applies to the aggregate service, not to a specific server.

Contact

Changelog

Version | Date | Changes
1.0 | 2026-05-01 | Initial publication.