AI ENGINEERING · OBSERVABILITY + FINOPS · 2026
Every agent run, priced and explained.
Confidential · scale-stage AI product team
THE BRIEF
A product team had a multi-agent system running in production and no way to answer why any single run cost what it did. We built the trace layer first: structured traces, token accounting, cost attribution per request path. Then used it to find waste: routine work routed to cheaper models, retry storms contained, caching added where hit rates justified it.
STACK
Structured tracesToken accountingCost attributionLatency profiling
MEASURED OUTCOMES
Cost attribution coverage
·
every call
Cost per conversation turn
$0.11
→
$0.03
Stuck-run detection time
3 h
→
4 min
Dashboard p95 load
·
420 ms