DEVELOPER TOOLING · AGENT ORCHESTRATION · 2026
A 12-agent pipeline that ships its own code.
Confidential · B2B SaaS platform
THE BRIEF
A product team wanted autonomous ticket execution (spec, plan, implement, review, merge) with live visibility, so engineers could watch agents work and intervene when judgment was needed. We built the orchestration layer, the evaluation harness, and per-run cost accounting, using a mix of local and hosted models sized to what each task actually needs.
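As a rough illustration of what per-run cost accounting and intervention tracking could look like, here is a minimal sketch. Only the stage names come from the brief; `StageResult`, `RunLedger`, `intervention_rate`, and every field on them are hypothetical names invented for this example, and the real system may be structured quite differently.

```python
from dataclasses import dataclass, field

# Stage names are taken from the brief; everything else is a hypothetical sketch.
STAGES = ["spec", "plan", "implement", "review", "merge"]


@dataclass
class StageResult:
    stage: str                 # one of STAGES
    cost_usd: float            # assumed: cost attributed to this stage of the run
    needs_human: bool = False  # assumed: True when an agent hands off to an engineer


@dataclass
class RunLedger:
    """Collects per-stage results for a single ticket run."""
    ticket_id: str
    results: list[StageResult] = field(default_factory=list)

    def record(self, result: StageResult) -> None:
        if result.stage not in STAGES:
            raise ValueError(f"unknown stage: {result.stage}")
        self.results.append(result)

    @property
    def total_cost_usd(self) -> float:
        # Sum of stage costs, rounded to avoid float noise in reports.
        return round(sum(r.cost_usd for r in self.results), 4)

    @property
    def had_intervention(self) -> bool:
        return any(r.needs_human for r in self.results)


def intervention_rate(ledgers: list[RunLedger]) -> float:
    """Share of runs in which at least one stage needed a human."""
    if not ledgers:
        return 0.0
    return sum(l.had_intervention for l in ledgers) / len(ledgers)
```

With a ledger per run, cost per shipped ticket and human intervention rate both fall out as simple aggregates over completed runs, which is one plausible way to report the two figures without extra instrumentation.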
STACK
Multi-agent orchestration · Evaluation loops · Observability · Programmatic git control
MEASURED OUTCOMES
Agents in pipeline: 12
Median spec to merged PR: 3 d → 46 min
Cost per shipped ticket: $1.82
Human intervention rate: 100% → 12%