How to Train AI on Your Knowledge Base (No Fine-Tuning)

Created time

Jun 5, 2026 12:17 PM

Author

Mike Heap

"Training" your AI doesn't mean what you think

⚡

TL;DR: Modern support AI retrieves from your connected sources at answer time rather than fine-tuning a model on your docs. Edit a doc and the next reply reflects it, with nothing to retrain.

People mean two very different things by "training", and the two get muddled constantly.

The first is fine-tuning: taking a base model and training it further on your data so the knowledge is baked into the model itself. The second (and the one we use) is connecting your knowledge as sources the AI reads at the moment a customer asks. For support, it's almost always the second one, and the difference between them is bigger than it sounds.

Fine-tuning gives you a snapshot. The model knows what it knew at training time and nothing more, so every refund-policy change means going back and retraining.

Retrieval works the other way round: the AI looks up the relevant passage from your connected sources when the question lands, then writes the answer from what it found. Industry explainers settle on the same split: fine-tune for behaviour, use retrieval for knowledge (change your knowledge base, and the next answer reflects it).
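To make the retrieval idea concrete, here is a minimal sketch in Python. It assumes a toy word-overlap score in place of a real search index, and the source names and refund wording are made up for illustration. It isn't how any particular product retrieves, just the shape of "look it up at answer time".

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, sources: dict[str, str]) -> str:
    """Return the name of the source that shares the most words with the question."""
    words = tokenize(question)
    return max(sources, key=lambda name: len(words & tokenize(sources[name])))

# Hypothetical connected sources (names and wording are illustrative only).
sources = {
    "refund-policy": "Refund window: 30 days from purchase.",
    "shipping": "Orders ship within 2 business days.",
}

question = "What is the refund window?"

# First reply: the passage is looked up at answer time.
before = sources[retrieve(question, sources)]

# Someone edits the doc. There is no retraining step; the source simply changes.
sources["refund-policy"] = "Refund window: 14 days from purchase."

# Next reply: the same lookup now returns the corrected passage.
after = sources[retrieve(question, sources)]

print(before)
print(after)
```

The point of the sketch is the middle step: nothing about the model changes between the two lookups, only the document it reads.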

We learned this one the hard way. Our business actually started out on fine-tuning years ago, and we found it made everything less flexible (you really don't want to re-fine-tune every time you tweak a doc).

Fine-tuning is worth it for controlling tone or style, and even then it locks you to that style. With prompts you can change things in a few words and get 95 to 99 percent of the same result with far more room to move.

So the day-to-day job, I'd say, is building the corpus, connecting it, and checking the answers come out right. The nice payoff is that updates are live, so you fix a fact in a doc and the agent uses the corrected version on the very next reply.

The Answer Test: four things your knowledge has to pass

⚡

TL;DR: Before the AI can answer from your knowledge, the answer has to be captured, grounded, current and findable. Three of those four are down to you; the model only matters at the edges.

Most bad AI answers come down to the knowledge behind them, and they fail in one of four specific ways.

Before an AI can answer a customer from your knowledge base, the answer has to pass four tests: is it captured, is it grounded, is it current, and is it findable. I call it the Answer Test, and it's worth running on your own setup.

The four checks of the Answer Test: captured, grounded, current, findable, each with a short description of what it verifies.

What makes it a frame rather than a checklist is whose job each test is. Captured, grounded and current are mostly yours to get right, and findable is shared between how you connect the source and how the article is written. The model itself only moves the needle at the edges (most of what lifts a resolution rate is the customer's own work).

Test 1: Is it captured?

The first test is the most basic and the most often failed. If the answer to a question doesn't exist anywhere the AI can read, no amount of model quality saves it. It's garbage in, garbage out: if it isn't written down somewhere, the AI has nothing to work from.

This is where teams under-count their own knowledge base (I see it on calls constantly). They think "help center," connect it, and stop. Your real knowledge base is a stack of sources, most of which live outside the help center:

Published docs. Your help center, website, blog, FAQs and product docs. The obvious layer, and usually live in minutes through a connector.

Past resolved tickets. Years of answers your team already wrote, sitting in closed tickets. If your docs are thin (or you don't have a help center at all), this is the unlock. You can train on historic tickets to auto-draft starter articles from past conversations (we backfill up to 5,000 by default), so a company with no docs site can still get an agent live from day one.

Saved replies and internal know-how. Your macros, canned responses, and the playbooks and SOPs sitting in Notion or a shared drive. This is the tribal knowledge that never made it into a customer-facing article, and some of the best content you have.

Live system data. Order status, account fields, subscription state, pulled in through APIs. It isn't a document as such, but it's what the AI needs to answer "where's my order?", which no help-center article will ever hold.

A breakdown of the four sources that make up a real knowledge base: published docs, past resolved tickets, saved replies and internal know-how, and live system data.

The fix is an inventory. Take your top 20 customer questions and, for each, ask: where is the answer written down, and is that source connected? Most gaps, in our experience, come down to a missing source: the answer was never written anywhere the AI could reach.
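If a spreadsheet feels too loose, the inventory above can be sketched in a few lines of Python. The questions, source names and the `InventoryRow` structure are all hypothetical; the only idea taken from the text is the two-part check of "is it written down" and "is that source connected".

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class InventoryRow:
    question: str
    source: Optional[str] = None   # where the answer is written down, if anywhere
    connected: bool = False        # whether the AI can read that source

def find_gaps(rows: list[InventoryRow]) -> list[tuple[str, str]]:
    """Return (question, reason) for every question the AI can't reach an answer to."""
    gaps = []
    for row in rows:
        if row.source is None:
            gaps.append((row.question, "not written down"))
        elif not row.connected:
            gaps.append((row.question, f"written in {row.source}, but not connected"))
    return gaps

# Illustrative rows; a real inventory would hold your own top 20 questions.
inventory = [
    InventoryRow("How do I reset my password?", "help center", connected=True),
    InventoryRow("Can I pause my subscription?", "internal playbook", connected=False),
    InventoryRow("Do you offer student discounts?"),
]

gaps = find_gaps(inventory)
for question, reason in gaps:
    print(f"{question} -> {reason}")
```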

Test 2: Is it grounded?

The second test is whether the AI ties its answer to a verified source instead of guessing. People file this under the hallucination problem, but most of what gets called a hallucination isn't one.

When someone says the AI made something up, they usually look closer and find their docs were out of date, ambiguous, or just plain wrong. AI is genuinely good at grounding in truth and not inventing things these days, but hand it a false truth and it doesn't know not to use it (the model did its job; the source lied to it).

A well-built agent (ours included) answers only from your connected, verified sources, so it isn't free-associating off the open internet. And when you do want to check its working, your team can open any conversation and ask the AI why it gave an answer and where the information came from, with the full reasoning there in the dashboard. That audit trail is for your team (the customer never sees a sources footer), and it's the fastest way to find the stale article behind a wrong reply.

Test 3: Is it current?

The third test is freshness, the sneaky one I see catch teams out. An AI that confidently quotes a 2023 price, a dead promo code, or a feature you retired is just reading a source nobody updated. The model is fine; the fact behind it rotted.

The fix is plain hygiene: date-stamp the facts that decay, and keep one fact in one place so you aren't updating a refund window in six articles and missing two. Live updates are what make this cheap: correct the source once and the very next reply uses it.

One of our customers, Kriptomat, put it nicely in their case study: they're "a big fan of the direct integration with our pre-existing help articles, and how easy it is to re-train the agent when it's providing outdated information," which "helped our team out immeasurably especially during heavy inquiry surges." Change the article, and the next answer is right.
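The date-stamping habit from this section can be sketched as a small review check. This is an illustration under assumptions: the 180-day review window, the `last_verified` field and the example facts are all invented, so pick whatever cadence suits how fast your own prices and policies change.

```python
from datetime import date, timedelta

def stale_facts(facts: list[dict], today: date, max_age_days: int = 180) -> list[str]:
    """Return the facts whose last review is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [f["fact"] for f in facts if f["last_verified"] < cutoff]

# Hypothetical facts, each stored once with the date someone last checked it.
facts = [
    {"fact": "Refund window is 30 days", "last_verified": date(2026, 5, 1)},
    {"fact": "Promo code SPRING10 is active", "last_verified": date(2025, 3, 15)},
]

# Passing the date in (rather than calling date.today()) keeps the check repeatable.
overdue = stale_facts(facts, today=date(2026, 6, 5))
print(overdue)
```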

Test 4: Is it findable?

The last test is whether the AI can actually surface and read the answer. A passage can be captured, grounded and current, and still be invisible because the article covers five topics at once, hides the answer under a paragraph of preamble, or uses internal jargon no customer would ever type.

There's a structural side too. You have to think about how your data is built and how readable it is for the AI: if a key fact is locked inside an image with no alt text, or a table with no real structure, the AI can't ingest it.

This is its own craft, writing each article so it's retrievable (atomic articles, the answer first, the customer's own words, a text description under every screenshot). It's worth doing properly, and big enough to deserve its own guide rather than a paragraph here.
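One of those checks is easy to automate. As a rough sketch, assuming your articles can be exported as HTML, Python's standard `html.parser` can list the images that have no alt text. The sample article and file names are made up, and this only catches the missing-description case, not buried answers or jargon.

```python
from html.parser import HTMLParser

class ImageAltAudit(HTMLParser):
    """Collect the src of every <img> tag with missing or blank alt text."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # A missing alt attribute and an empty one are equally unreadable here.
        if not (attr_map.get("alt") or "").strip():
            self.missing_alt.append(attr_map.get("src", "(no src)"))

def images_missing_alt(html: str) -> list[str]:
    audit = ImageAltAudit()
    audit.feed(html)
    return audit.missing_alt

# Hypothetical article body.
article = (
    '<p>Open Settings to cancel your plan.</p>'
    '<img src="settings.png">'
    '<img src="empty.png" alt=" ">'
    '<img src="billing.png" alt="Billing page with the Cancel plan button">'
)

flagged = images_missing_alt(article)
print(flagged)
```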

One handy shortcut: let the agent tell you what's missing. The questions your AI couldn't answer this week are your "not captured yet" list for next week, and a self-learning agent surfaces them for you automatically.
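If your tool lets you export the questions the AI couldn't answer, turning them into a ranked to-do list is a few lines of Python. This is a hand-rolled sketch, not how a self-learning agent works internally: the normalisation is deliberately crude, the sample questions are invented, and `min_count` is a hypothetical knob for ignoring one-off questions.

```python
from collections import Counter

def normalize(question: str) -> str:
    """Lowercase, trim end punctuation and collapse whitespace so near-duplicates match."""
    return " ".join(question.lower().strip(" ?!.").split())

def backlog(unanswered: list[str], min_count: int = 1) -> list[tuple[str, int]]:
    """Rank unanswered questions by how often they came up, most frequent first."""
    counts = Counter(normalize(q) for q in unanswered)
    return [(q, n) for q, n in counts.most_common() if n >= min_count]

# Hypothetical export of this week's unanswered questions.
unanswered = [
    "Do you ship to Canada?",
    "do you ship to canada",
    "Can I pay by invoice?",
]

todo = backlog(unanswered)
recurring = backlog(unanswered, min_count=2)
print(todo)
print(recurring)
```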

What this looks like in real rollouts

⚡

TL;DR: Across our rollouts, the big resolution-rate jumps came from connecting more sources and data. Edel Optics went from about 25 percent to 79 percent that way, and the model was rarely the lever.

The pattern across our rollouts is steady: the jumps in resolution rate come from connecting more of the customer's own knowledge and data. A cleverer model is rarely the thing that moves it.

Take Edel Optics, who went from around 25 percent to 79 percent. The European eyewear retailer connected their Zendesk help center, uploaded supplemental FAQs, then added the User Data API so the agent could see order, delivery, return and tracking info.

That last step (surfacing live data) did most of the work, taking resolution close to that 79 percent almost overnight. The Edel Optics case study has the rest: about 92 percent AI CSAT, and around 150 hours saved a month. The lever, again, was the sources they connected.

Edel Optics statistics: AI resolution rose from about 25% to 79% after connecting more sources and live data, at 92% AI CSAT.

Self-learning then compounds the gains. When we switch it on, customers typically see a 40 to 60 percent drop in the questions their agent couldn't answer, plus around a 5 percent lift in overall resolution rate. The effect is biggest in the first few weeks and tapers after, but it keeps the corpus growing from real customer questions.

And you don't need to start with good docs. Some of the teams that benefit most walked in with almost no help center at all.

Historic-ticket training drafts starter articles from past conversations, and self-learning fills in the rest over the following weeks. The real unlock is that any company can have a working agent, even without a docs site.

What to do this week

⚡

TL;DR: Connect every source (not just the help center), run the Answer Test on your top 20 questions, fix the gaps, and switch on the weekly improvement loop. Knowledge alone gets most teams to 40 to 70 percent.

If you want to train your AI on your knowledge base properly, here's the order I'd run it in. Most teams land somewhere between 40 and 70 percent resolution on knowledge alone, before touching anything fancier.

Video: Train an AI Agent on Your Help Docs

1. Connect every source you've got. Help center, website, saved replies and macros, and any internal playbook or SOP. The connectors take minutes; gathering the macros and internal docs is the real work (we'll happily connect them with you on a call). Budget about an hour. No help center to connect? Start with historic-ticket training and generate one from your past tickets instead.

2. Run the Answer Test on your top 20 questions. For each one, check whether the answer is captured, grounded, current and findable. Score, don't fix yet. About 90 minutes, and at the end you know exactly which questions fail and why.

3. Fix the "not captured" gaps first. Write the missing answers, or auto-draft them from resolved tickets. This is where most of the resolution-rate lift hides, in our experience. Half a day, then re-test.

4. Switch on the improvement loop and review it weekly. Self-learning drafts new articles from the questions the AI couldn't answer, and you review them for accuracy before they go live (don't trust auto-added knowledge blindly, since transcription and interpretation can slip). Budget about 30 minutes a week, and watch resolution climb over the next month.
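For the scoring step, a scorecard can be as simple as one row of yes/no checks per question. The sketch below is one way to tally it, with invented questions and scores. Counting only the first failing check per question is my own simplification, chosen because a question that isn't captured can't meaningfully pass the other three.

```python
from typing import Optional

CHECKS = ("captured", "grounded", "current", "findable")

def first_failure(scores: dict[str, bool]) -> Optional[str]:
    """Return the first check a question fails, in Answer Test order, or None if it passes."""
    for check in CHECKS:
        if not scores.get(check, False):
            return check
    return None

def tally_failures(scorecard: dict[str, dict[str, bool]]) -> dict[str, int]:
    """Count how many questions fall at each check."""
    tally = {check: 0 for check in CHECKS}
    for scores in scorecard.values():
        failed = first_failure(scores)
        if failed is not None:
            tally[failed] += 1
    return tally

# Hypothetical scores for three questions.
scorecard = {
    "Where is my invoice?": dict(captured=True, grounded=True, current=True, findable=True),
    "Can I change my plan?": dict(captured=False, grounded=False, current=False, findable=False),
    "What is the refund window?": dict(captured=True, grounded=True, current=False, findable=True),
}

tally = tally_failures(scorecard)
print(tally)
```

The tally tells you where to spend the half day: a pile-up under "captured" means writing, a pile-up under "current" means reviewing.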

Once knowledge is solid, the next big jump is usually connecting live customer data through an API. That's often only a couple of hours of work (you can point Claude or ChatGPT at your codebase and have it scaffold a read-only lookup endpoint), and you do it once and benefit forever. It's a bigger topic than knowledge alone, but it's the natural next step after this.
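To give a feel for what "read-only lookup" means, here is a minimal sketch of the logic such an endpoint might wrap. Everything in it is hypothetical: the order record, the field names and the email check stand in for whatever your own system uses, and a real endpoint would sit behind your web framework and authentication.

```python
from typing import Optional

# Hypothetical order store; in practice this would be a query against your own database.
ORDERS = {
    "A1001": {
        "email": "sam@example.com",
        "status": "shipped",
        "tracking": "TRK123",
        "card_last4": "4242",
    },
}

# Only these fields are ever returned to the agent.
PUBLIC_FIELDS = ("status", "tracking")

def lookup_order(order_id: str, customer_email: str) -> Optional[dict]:
    """Return the public fields of an order, or None if it doesn't belong to this customer."""
    order = ORDERS.get(order_id)
    if order is None or order["email"].lower() != customer_email.lower():
        return None
    return {field: order[field] for field in PUBLIC_FIELDS}

print(lookup_order("A1001", "sam@example.com"))
print(lookup_order("A1001", "someone-else@example.com"))
```

Two choices matter more than the code: the function never writes anything, and it returns a short allow-list of fields rather than the whole record.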

How do I run the Answer Test with AI?

If you'd rather not score 20 articles by hand, paste this prompt into ChatGPT, Claude or Gemini and let it do the first pass:

You are auditing a customer-support knowledge base for an AI agent. I'll give you our top customer questions and the sources we've connected. For each question, run "The Answer Test" and score it on four checks:

1. Captured — is the answer written down in one of the connected sources? (yes / partial / no)
2. Grounded — could the agent point to a specific verified source, or would it have to guess? (yes / no)
3. Current — is the source likely up to date, or does it carry dates, prices or policies that may have changed? (current / stale / unknown)
4. Findable — is the answer easy to retrieve: one topic per article, answer near the top, in the customer's own words, nothing important trapped in images or unstructured tables? (yes / partial / no)

Connected sources: [list yours: help center, website, past tickets, macros, internal docs, live-data APIs]
Top questions: [paste your top 20 customer questions]

Return a table with one row per question, a column per check, and a final "biggest gap" column naming the single thing to fix first. Where you can't tell from what I've given you, write "need to check" instead of guessing.

It won't judge answer quality the way a real test on your own tickets does, but it's a fast way to find the questions that fail Captured or Current before you ever go live.

When "just connect your knowledge" isn't enough

⚡

TL;DR: Three things knowledge alone can't fix: live-data questions that need an API, multi-brand knowledge you have to keep separate, and the tickets you should route to a human on purpose.

The Answer Test covers most support questions, but not all of them. Three honest limits are worth naming.

Some answers live in a database. "Where's my order?" and "what plan am I on?" can't come from any article, however well written, because the answer is different for every customer. Those need live data through an API, and no amount of help-center work stands in for it (connecting knowledge gets you to roughly 40 to 70 percent; this is the leap past it).

The second limit is multi-brand knowledge. If you run several brands or products, dumping every brand's docs into one agent doesn't keep the answers apart; it makes them bleed into each other.

The right setup depends on count: for a handful of brands you run a separate agent per brand, and past that (or for marketplaces and listing sites) you pass the brand or product context through an API so the agent only looks up the knowledge that belongs to that customer. Connecting more knowledge naively makes this worse.
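The second option, passing brand context so the agent only sees that brand's knowledge, boils down to filtering before lookup. A minimal sketch, assuming each article carries a hypothetical `brand` tag and using two invented brands with different return windows:

```python
def sources_for_brand(articles: list[dict], brand: str) -> list[dict]:
    """Keep only the articles that belong to the customer's brand."""
    return [a for a in articles if a["brand"] == brand]

# Hypothetical articles from two brands with conflicting policies.
articles = [
    {"brand": "north", "title": "Returns", "body": "Return items within 30 days."},
    {"brand": "south", "title": "Returns", "body": "Return items within 14 days."},
    {"brand": "north", "title": "Shipping", "body": "Ships in 2 business days."},
]

# The brand would arrive with the conversation, for example from your own API.
visible = sources_for_brand(articles, brand="south")
print([a["body"] for a in visible])
```

Because the filter runs before any retrieval, the agent can't quote the other brand's 30-day policy to this customer.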

The third limit is knowing where to stop. Some tickets should go to a human by design: account-security questions, refunds beyond policy, anything genuinely complex or upsetting.

A resolution number you hit by making it hard to reach a person isn't worth having. We count a conversation as resolved when the AI handled it without escalating, and that only stays trustworthy because escalating to a human is deliberately easy.
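That definition is simple enough to write down. As a sketch, assuming each conversation record carries a hypothetical `escalated` flag:

```python
def resolution_rate(conversations: list[dict]) -> float:
    """Share of conversations the AI handled without escalating to a human."""
    if not conversations:
        return 0.0
    resolved = sum(1 for c in conversations if not c["escalated"])
    return resolved / len(conversations)

# Hypothetical week of four conversations, one of which went to a human.
conversations = [
    {"id": 1, "escalated": False},
    {"id": 2, "escalated": False},
    {"id": 3, "escalated": True},
    {"id": 4, "escalated": False},
]

rate = resolution_rate(conversations)
print(f"{rate:.0%}")
```

The formula is only as honest as the `escalated` flag: if reaching a person is hard, fewer conversations get flagged and the number flatters you.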

The takeaway

⚡

TL;DR: Training AI on your knowledge base means connecting the right sources and keeping each answer captured, grounded, current and findable, then running the weekly loop. The work compounds, because every answer you add serves every future customer.

You don't train an AI on your knowledge base the way you train a model. You build up the corpus, connect the right sources, and make sure each answer is captured, grounded, current and findable. Do that, keep the weekly loop running, and "training" turns out to be mostly a connect-and-check job rather than a data-science one (which is how we'd describe it to anyone who asks).

The part most teams get wrong is treating their knowledge base as just the help center. It's wider than that: published docs, past tickets, saved replies, internal know-how and live data, each one a source the AI can answer from. The single most useful thing you can do this week is connect every one of those and run the Answer Test on your top 20 questions.

And the work compounds. Unlike training a person, who can leave and take the knowledge with them, every answer you add stays in the agent and keeps working for every future customer.

FAQs

How do I train an AI on my own data?

In practice, we connect your data as sources the AI retrieves from, rather than training a model on it. We point the agent at your help center, website, past tickets, macros and internal docs, and it pulls the relevant passage at answer time. There's no model-retraining step, and changes to your data take effect on the next reply.

Do I need to fine-tune a model to train AI on my knowledge base?

Almost never. Fine-tuning bakes knowledge into a model as a snapshot, so you'd redo it every time your docs change. We use retrieval instead, which keeps the AI current and lets you fix answers by editing the source (fine-tuning is really only worth it for controlling tone or style, and even then prompts get you most of the way with far more flexibility).

How do I add AI to my customer support without replacing my helpdesk?

You don't replace a thing. We connect to your existing helpdesk (Intercom, Zendesk, Freshdesk, Gorgias or HubSpot) and your knowledge sources, and the AI works inside the inbox your team already uses. Your macros, tags and routing rules stay exactly as they are.

How do I stop the AI giving wrong answers to customers?

Use an agent that answers only from your verified sources, then fix the sources. Most "wrong answers" we see come from a stale or ambiguous article the AI repeated faithfully. When a reply looks off, open the conversation, see which source it used, and correct it (usually a five-minute fix in the source).

What counts as a knowledge base for AI?

More than your help center, which is the bit most teams get wrong. It's the whole stack of places your answers live: published docs, past resolved tickets, saved replies and macros, internal playbooks and SOPs, and live system data through APIs. We'd say the most common mistake is connecting only the help center and calling it done.

Can an AI agent learn from past support tickets?

Yes. It's the fastest way to start if your docs are thin. Historic-ticket training reads your past resolved conversations and drafts starter articles from them, so you're not writing a help center from a blank page. We backfill up to 5,000 tickets by default to get you going.

How does the AI keep learning after launch?

Through a self-learning loop. The agent tracks the questions it couldn't answer, drafts new articles to cover them, and surfaces them for your team to review before they go live. We deliberately don't add knowledge off a single human reply (so one off-hand answer doesn't reshape the agent); the agent waits until a gap shows up more than once.

How long does it take to train an AI on a knowledge base?

Starting from knowledge alone (help centers and websites), you can be live within minutes to hours. From there, most teams reach 40 to 70 percent resolution on knowledge before adding anything else. The bigger gains from connecting live data or setting up actions take longer and lean on your own team's availability, but I'd stress the knowledge step itself is fast.
