This article originally appeared on Edge Signals – Bart Lehane’s LinkedIn newsletter on customer experience, analytics, and AI.

Generative AI chatbots are transforming customer service – but also creating a new kind of risk. When AI assistants hallucinate, mishandle data, or go off-script, it’s not just bad CX – it’s a compliance, reputation, and operational issue. This article explores how CX leaders can balance automation with assurance, and why governance is now a must-have in the AI era. Here’s what you’ll learn:

✔ Why AI’s biggest CX challenge isn’t capability – it’s accountability.
✔ The real-world risks behind chatbot failures, from data leaks to viral mishaps.
✔ How regulation (like the EU AI Act) is redefining AI governance in customer service.
✔ The four lines of defense every organization needs for safe, compliant AI interactions.
✔ How an “assurance layer” turns AI from a liability into a sustainable competitive edge.
Generative AI chatbots have exploded into customer service. Every week brings a new headline about automation, efficiency, or “instant answers at scale.” For CX heads and contact centre leaders under pressure to do more with less, the temptation is obvious: deploy a bot, deflect tickets, declare victory!
But there’s another side to the story – the one that makes headlines for the wrong reasons. At EdgeTier, we see this tension every day: CX leaders eager to deploy AI, but worried about what happens when those systems go off-script.
In August 2025, Lenovo’s customer service chatbot, Lena, was caught out by a clever trick. Security researchers used a single 400-character prompt to make the ChatGPT-powered assistant reveal sensitive company data – including live session cookies from real support agents. Lenovo fixed the flaw fast, but the incident showed just how easily even the smartest AI can be duped.
In January 2024, a customer asked the UK parcel delivery company DPD’s chatbot for help with a missing package. Frustrated by unhelpful responses, he asked it to write a poem criticizing the company. It complied. Then he asked it to swear. It did that too, and the exchange instantly went viral on social media.
These aren’t isolated incidents, and they aren’t just bad CX – they’re risk events. That’s exactly why the next competitive frontier in AI customer experience isn’t actually the bot itself; it’s the assurance layer that surrounds it.
AI in customer service is no longer experimental. It’s mainstream. Industry analysts, like Gartner, project that by 2028, the majority of customer interactions will involve some form of generative AI. Service leaders are investing heavily in conversational AI to reduce handle times, increase availability, and scale support without scaling headcount.
The payoff is real. Faster resolution. Always-on support. Consistent tone across channels. Done right, it’s transformative.
But here’s the paradox: the same technology that makes AI so powerful also makes it unpredictable.
Large language models are probabilistic: they don’t retrieve facts, they predict what’s likely to come next based on patterns in their training data. That’s why, even with safeguards like retrieval-augmented generation (RAG) or domain-specific fine-tuning, they can still invent facts, misquote policy, or use outdated information.
In controlled conditions (think simple FAQs, narrow domains, clean data) LLMs perform astonishingly well. But customer service isn’t controlled. It’s messy, emotional, and full of edge cases. A customer asking about a refund might mention a medical emergency, a complaint, a competitor, and a billing error all in one message. The bot has to navigate policy, tone, data privacy, and empathy simultaneously.
And when it gets even one of those wrong, the consequences spread fast.
Research into LLM accuracy shows hallucination rates varying widely by context – from under 5% for straightforward questions to over 25% in complex, multi-step scenarios. Even retrieval systems can fail if they pull stale or irrelevant documents. The AI doesn’t know it’s wrong. It just sounds confident.
So yes, the bots are smart. But they’re not accountable. And accountability is what customer experience demands.

AI hallucination isn’t just a technical flaw anymore. It’s a compliance risk, a reputational risk, and an operational risk, all at once.
Regulators are moving fast.
The EU AI Act, which entered into force in 2024, introduces strict obligations for AI systems used in consumer-facing contexts. Companies must document their models, conduct risk assessments, monitor performance continuously, and report incidents. High-risk applications – like those handling financial services, utilities, or legal advice – face even tighter scrutiny.
In the US, the Federal Trade Commission launched “Operation AI Comply,” making it clear that companies using AI in consumer contexts can’t make deceptive claims, mishandle personal data, or hide behind “the algorithm did it” defenses. If your chatbot misleads a customer, even unintentionally, it can be treated as a deceptive business practice.
The UK Information Commissioner’s Office has made accuracy, transparency, and lawful data handling explicit priorities for AI-driven customer service. If your bot uses personal data incorrectly or makes decisions customers can’t understand, you’re exposed.
This is no longer a “nice to have.” It’s a legal obligation.
When a chatbot goes rogue, it’s public. Screenshots circulate on social media within minutes. Journalists pick up the story. Competitors take notes. The DPD and Lenovo incidents aren’t just embarrassing glitches; they become case studies in what not to do with AI. And here’s what makes it worse: customers don’t distinguish between “the bot said it” and “the company said it.”
Your chatbot is your brand. Every response it generates is a corporate statement. If it’s wrong, rude, or inappropriate, that’s on you. According to a recent report, only around 20% of customers are “okay” with the use of chatbots, and they rate the AI support experience at just 3 out of 5. It’s not a great picture…
Bad AI outputs don’t just disappear. They trigger escalations. Customers who receive incorrect information come back angrier, more confused, and less trusting. Agents spend extra time undoing the damage. Compensation claims pile up. Over time, trust erodes, and once lost, it’s expensive to rebuild.
For CX leaders, this is the pivot: what used to be a “quality issue” is now a governance issue. And governance requires systems, not hope!
Think of the assurance layer as the safety, governance, and quality system wrapped around your chatbots and AI assistants. It’s the infrastructure that monitors, audits, and enforces the standards your customers (and regulators) expect.
If your chatbot is the engine, the assurance layer is the dashboard, the seatbelt, and the airbag.
It does four things: it tests AI behaviour before launch, enforces guardrails in real time during live conversations, monitors quality continuously once the bot is in production, and documents everything so the whole operation is auditable and defensible.
This is the evolution of quality assurance for a new era of AI-driven CX. And just like traditional QA, it requires structure.
Here’s how to build it.
Every CX organization or contact centre needs four lines of defense for AI interactions. They mirror the same rigour that’s long existed in human-agent quality assurance, but now scaled, automated, and adapted for AI.
Before you put an AI chatbot in front of customers, test it like you’d test a new hire – but with even more rigour!
Imagine you’re about to launch a bot to handle order status inquiries. Before go-live, you’d run it through hundreds of real customer intents from your support history. Does it cite your knowledge base accurately, or does it improvise? When a customer asks about your return policy, does it quote the correct timeframe and conditions, or does it blend policies from different product categories?
You’d also red-team it: feed it edge cases, policy-sensitive prompts, and adversarial inputs. What happens when a customer asks the bot to override a rule? What if they claim an agent promised them something that’s not in policy? Does the bot hold the line, escalate appropriately, or make something up?
This is where you establish a baseline. You’re measuring accuracy against a gold set of real customer scenarios and approved responses. If the bot can’t pass this test in a safe environment, it’s not ready for production.
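To make that concrete, here’s a minimal sketch in Python of what a pre-launch check can look like. The gold-set format and the `ask_bot` function are placeholders (assumptions, not a real API) – the pattern is what matters: replay real intents, check answers against approved facts, and gate the release on a baseline accuracy.

```python
# Minimal pre-launch evaluation sketch. Assumptions: ask_bot() stands in for
# your chatbot platform's API, and the gold set is built from real intents
# with answers your team has already approved.

GOLD_SET = [
    {
        "question": "What is your return window for electronics?",
        "must_include": ["30 days", "original receipt"],   # facts the answer must state
    },
    {
        "question": "Can an agent waive the restocking fee?",
        "must_include": ["cannot be waived"],               # policy the bot must hold
    },
]


def ask_bot(question: str) -> str:
    """Placeholder for a call to the chatbot under test."""
    return "Electronics can be returned within 30 days with the original receipt."


def passes(answer: str, must_include: list[str]) -> bool:
    """A deliberately simple check: every required fact appears in the answer."""
    return all(fact.lower() in answer.lower() for fact in must_include)


def run_baseline(gold_set: list[dict]) -> float:
    """Replay every gold-set question and return the share of accurate answers."""
    correct = sum(passes(ask_bot(case["question"]), case["must_include"]) for case in gold_set)
    return correct / len(gold_set)


if __name__ == "__main__":
    accuracy = run_baseline(GOLD_SET)
    print(f"Baseline accuracy: {accuracy:.0%}")
    if accuracy < 0.95:   # illustrative release threshold
        print("Below the release threshold: not ready for production.")
```

In practice the scoring gets more sophisticated – semantic matching, policy classifiers, human review of failures – but the principle of gating the release on a measured baseline doesn’t change.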
Once live, you need guardrails that activate in the moment – before damage is done.
Picture this: a customer is chatting with your bot about a disputed charge. Mid-conversation, they share their full credit card number, hoping the bot can look it up. Without real-time safeguards, the bot might try to be helpful, and in doing so, echo that PII back in its response or store it improperly.
Real-time safeguards prevent this – for example, by detecting and masking personal data before the bot can echo it back or store it.
These aren’t about stopping the bot from being useful. They’re about stopping it from being harmful.
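As one illustration, here’s a minimal sketch of the PII safeguard described above: mask card numbers and email addresses before the customer’s message ever reaches the model, and again on the way back out. The regex patterns and the `call_model` stub are simplifications – production systems use dedicated PII-detection services – but the in-and-out filtering pattern is the point.

```python
import re

# Minimal real-time guardrail sketch: mask obvious PII before the customer's
# message reaches the model, and again before the reply goes back to the
# customer. Patterns are illustrative; production systems use dedicated
# PII-detection services.

CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")        # 13-16 digit card-like numbers
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")  # simple email matcher


def redact_pii(text: str) -> str:
    """Replace card numbers and email addresses with placeholder tokens."""
    text = CARD_PATTERN.sub("[REDACTED CARD]", text)
    text = EMAIL_PATTERN.sub("[REDACTED EMAIL]", text)
    return text


def call_model(prompt: str) -> str:
    """Placeholder for the actual LLM/chatbot call."""
    return f"Thanks! I received: '{prompt}'. Let me look into that charge for you."


def guarded_bot_turn(customer_message: str) -> str:
    """Apply the guardrail on the way in and on the way out of the model."""
    safe_input = redact_pii(customer_message)   # never let raw PII reach the model or logs
    reply = call_model(safe_input)
    return redact_pii(reply)                    # never echo PII back, even if the model does


if __name__ == "__main__":
    message = "My card 4111 1111 1111 1111 was charged twice, email me at jo@example.com"
    print(guarded_bot_turn(message))
```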
No AI stays static. Models update. Prompts change. Customer behavior evolves. Continuous monitoring is where quality lives or dies.
This is where you auto-QA every interaction (or a representative sample) for factuality, compliance, and tone. You’re looking for patterns: Has accuracy dropped since the last prompt update? Is the bot suddenly more likely to refuse reasonable requests? Are customers expressing frustration more often, even when the bot technically “resolved” their issue?
Things like phrase-tagging become vitally important. An influx of messages like “can I speak to a human?” or “you’re not answering my question” is a clear sign that the bot isn’t up to the task. Emotion detection and sentiment analysis help too; for example, when a customer expresses frustration, the bot should reply with empathy.
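Here’s a rough sketch of what phrase-tagging can look like at its simplest: a keyword pass over bot conversations that flags “wants a human” and frustration signals for review. The phrase lists and message format are assumptions; real monitoring layers intent and sentiment models on top of this.

```python
# Minimal phrase-tagging sketch: flag conversations where customers signal
# that the bot is failing them. Phrase lists and message format are assumptions;
# production monitoring usually adds ML-based intent and sentiment detection.

ESCALATION_PHRASES = ["speak to a human", "talk to an agent", "real person"]
FRUSTRATION_PHRASES = ["not answering my question", "this is useless", "you don't understand"]


def tag_conversation(messages: list[str]) -> set[str]:
    """Return the set of warning tags triggered by a customer's messages."""
    tags = set()
    for message in messages:
        lowered = message.lower()
        if any(phrase in lowered for phrase in ESCALATION_PHRASES):
            tags.add("wants_human")
        if any(phrase in lowered for phrase in FRUSTRATION_PHRASES):
            tags.add("frustrated")
    return tags


if __name__ == "__main__":
    conversation = [
        "Where is my parcel?",
        "You're not answering my question, can I speak to a human?",
    ]
    print(tag_conversation(conversation))  # flags both 'frustrated' and 'wants_human'
```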
You’re also watching for drift. If your bot’s CSAT was 4.2 last month and it’s 3.8 this month, something changed. Maybe the knowledge base wasn’t updated. Maybe customer expectations shifted.
The goal is to catch problems early, before they become crises. This is also where you correlate AI quality with business outcomes: higher factuality rates with higher CSAT, lower hallucination rates with lower escalation rates. That’s the story that earns budget and board-level buy-in.
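A drift check can start out very simply – compare this period’s quality metrics with the last period’s and alert when a drop exceeds a tolerance. The metric names, values, and thresholds below are illustrative, not a recommended configuration.

```python
# Minimal drift-check sketch: compare this period's quality metrics to the
# previous period and flag meaningful drops. Metric names, values, and
# thresholds are illustrative.

THRESHOLDS = {"csat": 0.2, "factuality": 0.03, "containment": 0.05}  # max tolerated drop

last_month = {"csat": 4.2, "factuality": 0.96, "containment": 0.72}
this_month = {"csat": 3.8, "factuality": 0.95, "containment": 0.71}


def detect_drift(previous: dict, current: dict, thresholds: dict) -> list[str]:
    """Return an alert for every metric that dropped by more than its threshold."""
    alerts = []
    for metric, limit in thresholds.items():
        drop = previous[metric] - current[metric]
        if drop > limit:
            alerts.append(f"{metric} dropped by {drop:.2f} (limit {limit}) - investigate")
    return alerts


if __name__ == "__main__":
    for alert in detect_drift(last_month, this_month, THRESHOLDS):
        print(alert)   # e.g. "csat dropped by 0.40 (limit 0.2) - investigate"
```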
The final line of defense is institutional: being able to prove your AI is under control.
This means keeping an auditable trail. Versioned prompts and models. Change logs with test results. Incident records and corrective actions. When a regulator, auditor, or journalist asks “how do you know your AI is safe?”, you need receipts.
This isn’t bureaucracy for its own sake. It’s how you demonstrate responsible AI practice under frameworks like the EU AI Act. It’s how you defend your company if a chatbot interaction ends up in court. And it’s how you build internal accountability so product, legal, and CX teams all know who’s responsible for what.
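What that trail might look like, at its simplest: an append-only log of every prompt, model, or knowledge-base change, with the test results and approver attached. The field names and JSON-lines storage below are assumptions for illustration, not a prescribed schema.

```python
import json
from datetime import datetime, timezone

# Minimal governance-trail sketch: append a change record every time the
# prompt, model, or knowledge base changes. Field names and the JSON-lines
# storage are assumptions, not a prescribed schema.


def record_change(path: str, entry: dict) -> None:
    """Append one change record, stamped with a UTC timestamp."""
    entry = {"timestamp": datetime.now(timezone.utc).isoformat(), **entry}
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


if __name__ == "__main__":
    record_change(
        "ai_change_log.jsonl",
        {
            "component": "refund_bot_prompt",
            "version": "v2025.03.1",          # versioned prompt
            "model": "example-llm-2025-02",   # hypothetical model identifier
            "baseline_accuracy": 0.97,        # result from the pre-launch gold set
            "approved_by": "cx-quality-team",
            "reason": "Updated return-policy wording after KB refresh",
        },
    )
```

The storage mechanism matters less than the discipline: every change is versioned, tested, and attributable to someone.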
Together, these four lines create a system that’s greater than the sum of its parts. Pre-launch testing catches the obvious failures. Real-time guardrails stop the dangerous ones. Continuous monitoring finds the subtle drift. And governance makes it all defensible.

The last two years have proven one thing: AI in CX is here to stay – but so is scrutiny.
Customers want help that’s accurate and empathetic. Regulators want transparency and accountability. Boards want proof that automation isn’t creating new liabilities. They all want the same thing: confidence that the technology is safe, accurate, and human-centric.
This is the new equation:
AI scale × Human oversight = Sustainable CX automation
The companies that win won’t be those with the flashiest chatbot demos or the highest deflection rates. They’ll be the ones who can prove their bots are accurate, compliant, and empathetic every single day. Gartner research shows that 53% of customers would consider switching to a competitor if they found out a company was going to use AI for customer service – don’t give them cause to leave!
That’s what EdgeTier is built for. We help CX teams deploy customer support (whether human or AI!) confidently by providing the visibility and QA infrastructure that makes everyone and everything accountable.
Because in the era of generative CX, trust isn’t an add-on. It’s the product.
Don’t let customer service become a liability. [See how EdgeTier lets you keep your eyes and ears on every conversation]