INFERENCE INFRASTRUCTURE · BERLIN

SUBSTRATE.

The layer your models run on. Routing, inference, and observability with a latency budget you can read.

11ms p50 latency
99.99% uptime
14 regions

11ms

p50 inference latency · Llama-3.1-70B · eu-central-1 · measured over 1M requests, 30 days

We publish p50, p95, and p99. The number nobody else shows you is the one under load. Ours is on the next screen.
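For readers who want to check published percentiles against their own request logs, here is a minimal sketch of a nearest-rank percentile over latency samples. The function names `percentile` and `latency_summary` are illustrative assumptions, not part of any Substrate API, and the methodology at /benchmarks may use a different estimator.

```python
def percentile(samples_ms, p):
    """Nearest-rank percentile for an integer p in 1..100 (assumed method)."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(samples_ms)
    # Integer ceiling of p * n / 100, so no float rounding is involved.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_summary(samples_ms):
    """Return the three percentiles the page says it publishes."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}
```

Feed it per-request latencies in milliseconds from your own traces; comparing p95 and p99 under load is where the spread between providers tends to show.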

PLATFORM / 4 MODULES

S.01 · 11ms p50

Inference

Any open model, autoscaled. Cold start < 900ms.

S.02 · 14 regions

Routing

Latency- and cost-aware routing across 14 regions.

S.03 · 100% traced

Observability

Per-request traces, token accounting, replay.

S.04 · 6 min/epoch

Fine-tuning

LoRA + full-weight, versioned, one API.

BENCHMARK / vs NAMED BASELINES

THE ONLY VENDOR PAGE
THAT SHOWS THE NUMBERS.

Provider            p50     p95     Throughput     Cost / 1M tok
Substrate           11ms    34ms    2,400 tok/s    $0.42
Hyperscaler A       28ms    91ms    1,100 tok/s    $0.71
Hyperscaler B       24ms    77ms    1,350 tok/s    $0.68
Self-hosted (ref)   19ms    58ms    1,800 tok/s    $0.55

Llama-3.1-70B · eu-central-1 · 1M requests · methodology at /benchmarks
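To read the table as ratios rather than raw figures, the arithmetic can be sketched as below. The numbers are copied from the benchmark rows above; the names `BENCHMARK` and `ratios_vs` are illustrative assumptions.

```python
# Benchmark rows from the table: p50 ms, p95 ms, tokens/s, USD per 1M tokens.
BENCHMARK = {
    "Substrate": {"p50": 11, "p95": 34, "throughput": 2400, "cost": 0.42},
    "Hyperscaler A": {"p50": 28, "p95": 91, "throughput": 1100, "cost": 0.71},
    "Hyperscaler B": {"p50": 24, "p95": 77, "throughput": 1350, "cost": 0.68},
    "Self-hosted (ref)": {"p50": 19, "p95": 58, "throughput": 1800, "cost": 0.55},
}


def ratios_vs(reference, rows=BENCHMARK):
    """Express every provider's metrics as a multiple of the reference row."""
    base = rows[reference]
    return {
        provider: {metric: value / base[metric] for metric, value in metrics.items()}
        for provider, metrics in rows.items()
    }


if __name__ == "__main__":
    for provider, metrics in ratios_vs("Substrate").items():
        line = ", ".join(f"{name} x{ratio:.2f}" for name, ratio in metrics.items())
        print(f"{provider}: {line}")
```

A ratio above 1 for latency or cost means slower or more expensive than the reference row; for throughput, a ratio above 1 means faster.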

REQUEST LIFECYCLE / 4 STAGES

01

REQUEST

Client call hits the nearest edge.

02

ROUTE

Latency/cost router selects a region + model.

03

INFER

Model executes on autoscaled capacity.

04

RETURN

Streamed tokens, fully traced.

Four stages, summed: under 11ms at p50. Every number on this page is one you can reproduce.
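The ROUTE stage can be pictured with a minimal sketch of a latency- and cost-weighted selector. This is an assumption about how such a router might score candidates, not Substrate's actual algorithm; `Candidate`, `pick_route`, the weighting scheme, and the `eu-west-1` region name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    region: str             # e.g. "eu-central-1"
    model: str
    est_latency_ms: float   # assumed to come from recent measurements
    cost_per_1m_tok: float  # USD


def pick_route(candidates, latency_weight=0.5, max_latency_ms=None):
    """Pick the candidate with the lowest weighted latency/cost score.

    latency_weight=1.0 optimizes purely for latency, 0.0 purely for cost.
    max_latency_ms, if set, drops candidates over the latency budget first.
    """
    eligible = [
        c for c in candidates
        if max_latency_ms is None or c.est_latency_ms <= max_latency_ms
    ]
    if not eligible:
        raise ValueError("no candidate fits the latency budget")

    # Normalize both axes to 0..1 so the weights are comparable.
    max_lat = max(c.est_latency_ms for c in eligible) or 1.0
    max_cost = max(c.cost_per_1m_tok for c in eligible) or 1.0

    def score(c):
        return (latency_weight * c.est_latency_ms / max_lat
                + (1.0 - latency_weight) * c.cost_per_1m_tok / max_cost)

    return min(eligible, key=score)
```

Filtering on the budget before scoring keeps a cheap but slow region from winning a request that has a hard latency ceiling.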

PRICING / TRANSPARENT

Developer

$0 to start

Pay per token. 1M free tokens / month. No card.

Start building

Scale

$0.42 / 1M tok

Volume routing, all 14 regions, full observability.

Start building

Enterprise

Custom · annual

Dedicated capacity, SLA 99.99%, private regions, SSO.

Book a demo

“If you can’t MEASURE it, you can’t sell it to an engineer.”

— Mara Køhl, Founder & CTO, Substrate

TRUSTED BY TEAMS AT

  • NORTHLOOP
  • KAITEN AI
  • LEDGERWORKS
  • VOXEL LABS
  • MERIDIAN
  • PARSEC

We cut p95 in half the week we moved. The benchmark page was the only vendor page our staff engineers didn’t roll their eyes at.

VP Engineering, Kaiten AI

A studio piece by ALO Design Pros.

Swiss precision as the material, for systems whose numbers are the argument.