How I built an AI that handles 300 customer tickets a day - and what it taught me about trust
March 2025 · AI Engineering · 12 min read


The CEO called me into a meeting room with no windows. "We are losing customers faster than we can hire support agents." That conversation changed the next year of my life.

The Meeting

It was a Tuesday in February when my employer at the time - a mid-size e-commerce company - pulled me into a meeting that would define the next chapter of my career. The support queue had hit 312 tickets that morning. Average resolution time: 4.2 hours. Customer satisfaction was tanking. Three support agents had quit in the last month.

"Can AI fix this?" The question hung in the air like a dare.

I wanted to say yes. The honest answer was more nuanced: "It can, but not the way you are imagining."

What they were imagining was a chatbot. A magic box you plug in and suddenly everything is automated. What I was imagining was something far more interesting - and far more difficult.

The First Attempt (And Why It Failed)

My first prototype was embarrassing. I connected the Claude API to our ticketing system, fed it our FAQ, and let it generate responses. Day one: it told a customer their order had shipped when it was actually backordered. Day two: it offered a 50% discount to someone who was asking about return policy. Day three: I pulled the plug.

The problem was not the AI's intelligence. It was context. A support ticket is not just a question - it is a person with a history, an order with a status, a situation with nuance. Without access to all of that context, the AI was just a very confident guesser.

Building the Real System

I spent the next six weeks rebuilding from scratch. Three layers, each solving a specific problem:

Layer 1: Memory. I embedded every past ticket, resolution, knowledge base article, and FAQ into a vector database. When a new ticket arrived, the system instantly retrieved the five most relevant past cases. Not keyword matching - semantic similarity. "My package never arrived" would match against "delivery issue resolved with reshipping" even though they share no words.
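The retrieval step in Layer 1 can be sketched like this. The real system used an embedding model and a vector database; the bag-of-words vectors below are only a stand-in to show the retrieval shape (`embed`, `cosine`, and `top_k` are my hypothetical names), and unlike true semantic embeddings they cannot match texts that share no words.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words token counts.
    # A production system would call an embedding API and store the
    # resulting vectors in a vector database instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query: str, corpus: list[str], k: int = 5) -> list[str]:
    # Return the k most similar past cases for a new ticket.
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]
```

With real embeddings, "My package never arrived" scores close to "delivery issue resolved with reshipping" despite zero token overlap; this toy version still needs shared words, which is exactly the gap semantic search closes.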

Layer 2: Hands. This was the breakthrough. The AI could not just suggest responses - it could act. Through carefully designed tool-calling functions, it could pull order details from our ERP, check shipping status via carrier APIs, generate discount coupons within approved limits, and even initiate return processes. Every action was logged with a full audit trail.
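The tool layer can be sketched as a set of JSON-schema tool definitions (the shape Claude's tool-use API expects) plus a dispatcher that logs every call. The tool names, stub handlers, and the 15% coupon ceiling here are illustrative assumptions, not the author's actual ERP or carrier integrations.

```python
from datetime import datetime, timezone

# Illustrative tool definition in the Claude tool-use schema shape.
GET_ORDER_TOOL = {
    "name": "get_order",
    "description": "Fetch order details from the ERP by order ID.",
    "input_schema": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}

def get_order(args: dict) -> dict:
    # Stub: a real handler would query the ERP over its API.
    return {"order_id": args["order_id"], "status": "backordered"}

def create_coupon(args: dict) -> dict:
    # Enforce the approved discount ceiling BEFORE issuing anything.
    if args["percent"] > 15:
        raise ValueError("discount exceeds approved limit")
    return {"code": f"SAVE{args['percent']}"}

HANDLERS = {"get_order": get_order, "create_coupon": create_coupon}
AUDIT_LOG: list[dict] = []

def dispatch(name: str, args: dict) -> dict:
    """Run a tool call and record the successful action for the audit trail."""
    result = HANDLERS[name](args)
    AUDIT_LOG.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "tool": name,
        "args": args,
        "result": result,
    })
    return result
```

Putting the limit check inside the handler rather than trusting the model is the point: the AI can request a 50% coupon, but the tool refuses to mint one.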

Layer 3: Judgment. Every ticket got a confidence score. Above 90%? Auto-respond, no human needed. Between 70% and 90%? Draft a response and flag it for human review. Below 70%? Escalate immediately with a full context summary. This was the layer that made the whole thing trustworthy.
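The thresholds above translate directly into a small routing function. The enum names are mine, and treating exactly 90% as auto-respond (and exactly 70% as review) is an assumption, since the boundaries are left open.

```python
from enum import Enum

class Route(Enum):
    AUTO_RESPOND = "auto_respond"   # high confidence: send without review
    HUMAN_REVIEW = "human_review"   # mid confidence: draft, agent approves
    ESCALATE = "escalate"           # low confidence: human takes over

def route_ticket(confidence: float) -> Route:
    # Thresholds from the article: >= 0.9 auto, 0.7-0.9 review, < 0.7 escalate.
    # Boundary handling (>= vs >) is my choice, not specified in the post.
    if confidence >= 0.9:
        return Route.AUTO_RESPOND
    if confidence >= 0.7:
        return Route.HUMAN_REVIEW
    return Route.ESCALATE
```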

The Human Problem

Here is what nobody tells you about deploying AI in a team: the technology is the easy part.

When I presented the system to the support team, I could see the fear in their eyes. "So you built our replacement?" asked Maria, who had been with the company for seven years. I had to be honest with her.

"No. I built a tool that handles the boring stuff so you can focus on the cases that actually need a human brain."

We ran both systems in parallel for a month. Every AI response was visible to the agents. They could override anything. I tracked every override and fed corrections back into the system. By week two, something shifted. Maria started voluntarily checking the AI's draft before writing her own response. "It is actually pretty good at the shipping questions," she admitted.
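The override-tracking loop can be sketched as a collector that keeps only tickets where an agent actually changed the AI's draft; those corrected responses then become candidates for re-embedding into the retrieval corpus. The class and method names are hypothetical, and this is my reconstruction of the feedback loop, not the production code.

```python
from dataclasses import dataclass

@dataclass
class OverrideRecord:
    ticket_id: str
    ai_draft: str
    agent_final: str

class OverrideTracker:
    """Collect agent overrides so corrected responses can be fed
    back into the retrieval corpus as new exemplars."""

    def __init__(self) -> None:
        self.records: list[OverrideRecord] = []

    def log(self, ticket_id: str, ai_draft: str, agent_final: str) -> None:
        # Only record genuine overrides: drafts the agent rewrote.
        if ai_draft.strip() != agent_final.strip():
            self.records.append(OverrideRecord(ticket_id, ai_draft, agent_final))

    def corrections(self) -> list[str]:
        # Texts to embed and add to the vector store.
        return [r.agent_final for r in self.records]
```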

By week four, the team was competing with each other over who could handle the most complex escalated cases. The AI handled the repetitive work. The humans handled the interesting stuff. Everyone was happier.

The Numbers

After three months in production:

  • 85% of tickets resolved autonomously - verified against human quality standards
  • Average resolution time dropped from 4.2 hours to 1.8 hours
  • Customer satisfaction increased by 22% (we measured, we did not assume)
  • Zero support agents quit in that quarter
  • The company estimated annual savings at $300K in reduced overtime and hiring costs

The 15% That Taught Me Everything

The 85% that the AI handles? That is engineering. Interesting, but predictable. The 15% it escalates? That is where I learned the most about building AI systems.

A customer whose mother had just passed away, asking about canceling a subscription they had shared. A small business owner explaining they could not afford the return shipping on a bulk order that arrived damaged. A teenager asking if we could rush delivery because they were buying a birthday gift with their first paycheck.

No AI should handle those tickets. But an AI that recognizes them and routes them to a human who can? That is a system worth building.

The best AI systems do not replace human judgment. They create space for it.

What I Would Do Differently

If I built this system again tomorrow, I would start with the escalation logic, not the automation. Understanding what should NOT be automated is more valuable than automating everything you can.

I would also involve the support team from day one, not after the prototype was built. They knew things about customer behavior that no dataset could teach me. Maria told me that customers who use exclamation marks in their first message are not angry - they are excited. That insight alone improved our sentiment classification by 12%.

AI is not a replacement for human expertise. It is an amplifier. But you have to know what you are amplifying.

Igor Gawrys
AI Engineer & IT Consultant · Katowice, Poland