01 · Section
Deflection is the only metric that matters
CSAT on AI-handled tickets is interesting; deflection rate — the percentage of tickets fully resolved without a human — is what justifies the build. The agents we have shipped land in the 35–60% range, with the upper end requiring a tightly scoped product and excellent documentation.
If your docs are weak, no agent on earth will save you. Fix the knowledge base first.
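To make the metric concrete, here is a minimal sketch of how deflection rate could be computed from a ticket export. The `Ticket` fields (`resolved`, `human_touched`) are assumptions for illustration, not the schema of any particular helpdesk tool; map them to whatever your system records.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    ticket_id: str
    resolved: bool        # ticket was closed as solved
    human_touched: bool   # a human agent replied or acted on it at any point


def deflection_rate(tickets: list[Ticket]) -> float:
    """Share of all tickets fully resolved without a human (0.0 to 1.0)."""
    if not tickets:
        return 0.0
    deflected = sum(1 for t in tickets if t.resolved and not t.human_touched)
    return deflected / len(tickets)


# Made-up sample data, only to show the arithmetic.
sample = [
    Ticket("T-1", resolved=True, human_touched=False),
    Ticket("T-2", resolved=True, human_touched=True),
    Ticket("T-3", resolved=False, human_touched=False),
    Ticket("T-4", resolved=True, human_touched=False),
]
print(f"{deflection_rate(sample):.0%}")  # 2 of 4 tickets deflected -> 50%
```

Note that the denominator is all tickets, not only the ones the agent attempted. Dividing by agent-handled tickets alone would inflate the number.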
02 · Section
The architecture that ships
Claude or GPT as the reasoning model. Embeddings of your docs, knowledge base and resolved tickets in pgvector. A small set of tools — order lookup, account state, refund initiation — exposed via function calling, with strict input schemas.
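The retrieval step can be sketched without a database. The example below ranks document chunks by cosine distance in plain Python, which is the same ordering pgvector's `<=>` operator gives you in SQL. The three-dimensional vectors and chunk titles are toy values; real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 minus cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query_vec: list[float],
          chunks: list[tuple[str, list[float]]],
          k: int = 2) -> list[str]:
    """Return the text of the k chunks closest to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine_distance(query_vec, c[1]))
    return [text for text, _ in ranked[:k]]


# Toy corpus: (chunk text, embedding). Values are invented for illustration.
chunks = [
    ("How refunds work", [0.9, 0.1, 0.0]),
    ("How to reset a password", [0.0, 0.2, 0.9]),
    ("How to track an order", [0.1, 0.9, 0.1]),
]
query = [0.8, 0.2, 0.0]  # pretend embedding of "can I get my money back?"

print(top_k(query, chunks, k=2))

# Rough pgvector equivalent (table and column names are hypothetical):
#   SELECT content FROM doc_chunks ORDER BY embedding <=> %s LIMIT 2;
```

The retrieved chunks are then placed in the model's context alongside the customer message.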
Always include a "transfer to human" tool with a low threshold for triggering it. Customers can then escalate before they get frustrated, and the agent earns trust by routing instead of guessing.
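A sketch of what strict input schemas and an always-available handoff might look like follows. The tool names mirror the ones listed above, but the field names, the JSON-Schema-style layout, and the choice to fall back to a handoff on invalid arguments are assumptions for illustration. In practice you would pass schemas like these to your model provider's function-calling interface and validate again on your own side before executing anything.

```python
# JSON-Schema-style definitions; field names are hypothetical.
TOOLS = {
    "order_lookup": {
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
    "account_state": {
        "properties": {"customer_id": {"type": "string"}},
        "required": ["customer_id"],
    },
    "refund_initiation": {
        "properties": {
            "order_id": {"type": "string"},
            "amount_cents": {"type": "integer"},
        },
        "required": ["order_id", "amount_cents"],
    },
    "transfer_to_human": {
        "properties": {"reason": {"type": "string"}},
        "required": ["reason"],
    },
}

_PY_TYPES = {"string": str, "integer": int}


def validate_args(tool_name: str, args: dict) -> list[str]:
    """Return validation errors; an empty list means the call is acceptable."""
    schema = TOOLS.get(tool_name)
    if schema is None:
        return [f"unknown tool: {tool_name}"]
    errors = []
    for key in schema["required"]:
        if key not in args:
            errors.append(f"missing field: {key}")
    for key, value in args.items():
        if key not in schema["properties"]:
            errors.append(f"unexpected field: {key}")  # strict: no extras
            continue
        expected = _PY_TYPES[schema["properties"][key]["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"wrong type for {key}")
    return errors


def route(tool_name: str, args: dict) -> tuple[str, dict]:
    """Run a valid call as requested; hand anything malformed to a human."""
    errors = validate_args(tool_name, args)
    if errors:
        return "transfer_to_human", {"reason": "; ".join(errors)}
    return tool_name, args


print(route("refund_initiation", {"order_id": "A1", "amount_cents": "500"}))
```

Routing malformed calls to a person is one way to apply the "routing instead of guessing" principle. A retry with the validation errors fed back to the model is another reasonable option.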
03 · Section
Guardrails are not optional
Output filters on PII, profanity and competitor names. Hard limits on refund and credit amounts the agent can issue without human review. An eval set of 100+ historical tickets that runs on every prompt change, with regression alerts.
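The first two guardrails can be sketched in a few lines. Everything specific here is an assumption: the regular expressions catch only obvious emails and card-like digit runs, `BLOCKED_TERMS` holds a placeholder competitor name, and the refund cap is an arbitrary number. A production filter would need broader patterns or a dedicated PII detection service.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

BLOCKED_TERMS = ["ExampleRival"]   # hypothetical competitor name
MAX_AUTO_REFUND_CENTS = 5_000      # hypothetical cap; set per your policy


def filter_output(text: str) -> str:
    """Redact obvious PII and blocked terms before the reply is sent."""
    text = EMAIL_RE.sub("[redacted email]", text)
    text = CARD_RE.sub("[redacted number]", text)
    for term in BLOCKED_TERMS:
        text = re.sub(re.escape(term), "[removed]", text, flags=re.IGNORECASE)
    return text


def refund_decision(amount_cents: int) -> str:
    """Decide whether the agent may issue a refund on its own."""
    if amount_cents <= 0:
        return "reject"
    if amount_cents > MAX_AUTO_REFUND_CENTS:
        return "human_review"
    return "auto_approve"


print(filter_output("Card 4111 1111 1111 1111 on file for jane@example.com"))
print(refund_decision(12_000))  # above the cap -> human_review
```

The refund check belongs in your own code, outside the prompt, so the limit holds even if the model is talked into ignoring its instructions.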
The agents that go viral for the wrong reasons skipped guardrails. The ones quietly saving teams 20 hours a week did not.
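The eval set with regression alerts mentioned above can start as something this small. The sketch assumes each historical ticket is labeled with the action the agent should take, and it uses a keyword stub in place of the real agent. The case texts, the baseline, and the tolerance are all invented for illustration.

```python
from typing import Callable

EvalCase = tuple[str, str]  # (ticket text, expected action label)


def run_eval(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Fraction of cases where the agent picks the expected action."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(1 for text, expected in cases if agent(text) == expected)
    return passed / len(cases)


def is_regression(new_score: float, baseline: float,
                  tolerance: float = 0.02) -> bool:
    """True when the new score falls below baseline by more than tolerance."""
    return new_score < baseline - tolerance


def stub_agent(ticket_text: str) -> str:
    """Stand-in for the real agent; replace with a call to your pipeline."""
    lowered = ticket_text.lower()
    if "refund" in lowered:
        return "refund_initiation"
    if "where is my order" in lowered:
        return "order_lookup"
    return "transfer_to_human"


CASES: list[EvalCase] = [
    ("I want a refund for order 123", "refund_initiation"),
    ("Where is my order?", "order_lookup"),
    ("I am furious and want to speak to someone", "transfer_to_human"),
    # Emotional case that should go to a person, which the stub gets wrong.
    ("Please refund the gift that missed her birthday", "transfer_to_human"),
]

score = run_eval(stub_agent, CASES)  # 3 of 4 correct -> 0.75
if is_regression(score, baseline=0.90):
    print(f"ALERT: eval score dropped to {score:.0%}")
```

Wiring `run_eval` into CI so it runs on every prompt change, and failing the build when `is_regression` returns true, gives you the alert without extra infrastructure.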
04 · Section
Where humans still win
Anything emotional — refunds for a missed birthday gift, complaints from long-time customers, churn save calls. Agents handle the long tail of routine queries; humans handle the moments that matter.
Design the system so the human team gets the high-leverage interactions, not the boring ones. That is the deal that makes both sides happy.
Key takeaways
- Track deflection rate, not just CSAT.
- Fix the knowledge base before deploying any agent — garbage in, garbage out.
- Always expose a low-friction "transfer to human" tool.
- Ship guardrails and an eval set from day one, not after the first incident.
Written by
Hassan Ali
8 min read · Posted in AI/ML