The brief: A mid-market SaaS company had a brittle ticket-routing rules engine. Customers waited days for a reply because tickets landed in the wrong queue, agents wasted hours triaging, and the rules were nobody’s full-time job.

What we built

A triage agent that reads each incoming ticket, classifies intent against the company’s actual queues (not a generic taxonomy), drafts a first-touch reply where the case is straightforward, and routes ambiguous tickets to a senior agent with an explanation of why it was unsure.

  • Built on Anthropic’s API with a thin orchestration layer in TypeScript.
  • Eval suite of 400 historical tickets, regraded by the internal QA lead — pass-rate gate on every deploy.
  • Shadow mode for two weeks before any auto-routing decisions reached production.
  • Grafana board with per-queue accuracy, false-routing rate, and escalation latency.

What changed

  • −47% median handling time within six weeks of go-live.
  • +22 points NPS on the first-reply survey.
  • Two support engineers redeployed to deflection projects.

What we left behind

An eval suite the team maintains, a runbook that on-call has actually used twice, and a quarterly review template so the agent does not silently degrade as the product changes.