64% of outages are caused by changes. Legible catches the unsafe ones before they reach production — using the production telemetry you're already collecting. Less 3 AM pages. More confident deploys.
Every one of these scenarios has caused a real outage. Legible detects all of them before deployment reaches production.
A deploy removes the fraud-check call from the checkout flow. Tests pass because the test suite mocks the fraud service. Canary looks healthy because fraud checks don't affect latency metrics.
Legible detects that a REQUIRED node (fraud-check) is absent from the post-deployment workflow graph. This is an invariant violation — no amount of healthy metrics can override it.
A change increases retry attempts from 2 to 5 on bank timeouts. Under peak traffic, retry amplification overwhelms the bank integration service with a 45-minute cascade outage.
Stage behavioral delta shows retry amplification of 3.5x — higher than the 2.5x ceiling configured for this workflow. The boundary flags EXCEEDED before production traffic hits.
A developer adds a direct call from payment-service to notification-service, bypassing the event bus. The new synchronous dependency creates a single point of failure nobody noticed.
New edge (payment→notification) was not in the Trusted Change Boundary. Structurally significant unexplained change — IIC predicted retry changes, not a new dependency.
A Consul config update changes connection pool size for inventory service. +40ms on p95 — individually acceptable, but shifts the checkout workflow p99 past the 3-second SLA.
Temporal conformance dimension detects latency shift exceeds hard_max_latency for checkout workflow. Even though each service looks healthy individually, workflow-level impact triggers governance.
A feature flag gradually rolls out a new recommendation algorithm. Over 3 weeks, the call pattern drifts from baseline. No single deployment caused the drift — it accumulated through progressive rollout.
Boundary drift detection tracks boundary composition over time. When the non-SBD percentage exceeds 40%, it emits BOUNDARY_DRIFT_WARNING before it causes an incident.
No synthetic tests. No policy guesswork. Just observed reality from your existing OpenTelemetry traces.
Projected impact based on design partner analysis. Results depend on deployment volume and system complexity.
When you skip staging for a hotfix, Legible enters Restricted Authority Mode. Less evidence means less automated trust — not no governance.
When the hotfix later goes through normal staging, Legible automatically reconciles:
If you're tired of being paged for problems that could have been caught in the pipeline, let's talk.
Talk to us →