This blog outlines six key optimisations for Retrieval-Augmented Generation (RAG) systems on the Katonic AI Platform, including system prompt configuration, chunk size optimisation, retrieval parameter tuning, vision indexing for complex documents, metadata filtering, and embedding model selection. Each improvement addresses specific challenges in enterprise AI deployments, with practical implementation steps for technical decision-makers.
Ever asked your enterprise AI assistant a question only to receive a vague, irrelevant answer? You’re not
alone. While Retrieval-Augmented Generation (RAG) has revolutionised how AI systems access
knowledge, the difference between a mediocre implementation and an exceptional one is night and day.
At Katonic AI, we’ve spent years refining our RAG capabilities to deliver enterprise-grade results. Today,
I’m sharing six powerful improvements you can make to transform your RAG system from merely
functional to genuinely impressive.
Before diving into the technical improvements, let’s talk about why this matters. Poorly optimised RAG systems return vague or irrelevant answers, miss details buried in complex documents, and erode users’ confidence in the assistant.
Each of these issues directly impacts user adoption, trust, and ultimately the ROI of your AI investment.
The good news? Most RAG issues can be solved with the right configuration.
Think of system prompts as the invisible instruction manual for your AI assistant. When properly configured, they establish the assistant’s role, the scope of what it should and shouldn’t answer, and how it should respond when the knowledge base doesn’t contain an answer.
On the Katonic Platform, you can easily configure system prompts by navigating to: ACE →
Configuration → Prompt Personalisation → Search Knowledge Prompt → System Prompt
But system prompts are only half the story. Your AI’s personality—its tone, style, and communication approach—is defined by persona prompts. A well-crafted persona keeps that voice consistent across conversations, whether that means formal and precise for a legal team or friendly and plain-spoken for customer support.
To configure persona prompts in the Katonic Platform: ACE → Persona Management → Create new
persona
Once created, users can switch between personas using the dropdown at the top of the chat interface—
perfect for different departments or use cases.
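To make the mechanics concrete, here is a minimal sketch of how a system prompt and a persona prompt might be layered into a single request. The prompt texts, the build_messages helper, and the message structure are illustrative assumptions, not the platform’s internals:

```python
# Conceptual sketch only: the prompt texts and message structure below are
# illustrative assumptions, not the Katonic platform's internal format.

SYSTEM_PROMPT = (
    "You are an enterprise knowledge assistant. Answer only from the "
    "retrieved context. If the context does not contain the answer, say so."
)

PERSONA_PROMPT = (
    "Persona: concise and formal. Use bullet points for multi-part answers "
    "and avoid speculation."
)

def build_messages(retrieved_chunks: list[str], question: str) -> list[dict]:
    """Assemble the chat payload: behaviour rules and persona first,
    then the retrieved context and the user's question."""
    context = "\n\n".join(retrieved_chunks)
    return [
        {"role": "system", "content": f"{SYSTEM_PROMPT}\n\n{PERSONA_PROMPT}"},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

messages = build_messages(
    ["Refund window: 30 days from delivery."],
    "What is our refund policy?",
)
print(messages[0]["content"])
```

Keeping the behaviour rules and the persona as separate prompts means you can swap personas per department without touching the guardrails.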
Have you ever noticed how some RAG systems nail specific factual questions but struggle with complex
topics? Or conversely, how they sometimes provide general context but miss the precise details you
need? That’s often down to chunk size configuration.
Chunk size refers to how your documents are divided for embedding and retrieval, and its impact on accuracy is significant. Smaller chunks (100-500 tokens) return precise answers to specific factual questions but can miss surrounding context, while larger chunks (1,000+ tokens) capture comprehensive context but risk burying the answer in irrelevant detail.
Just as important is chunk overlap—how much text is shared between adjacent chunks, so that information straddling a boundary isn’t split in half.
For most applications, a 10-20% overlap works well, but complex documents with context spanning
multiple paragraphs may benefit from 20-50% overlap.
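Here is a minimal sketch of fixed-size chunking with overlap, approximating tokens with whitespace-split words; a production pipeline would count tokens with the embedding model’s own tokenizer, and the function name and defaults are illustrative:

```python
# Minimal chunking sketch. Tokens are approximated by whitespace-split words;
# real pipelines would use the embedding model's tokenizer instead.

def chunk_text(text: str, chunk_size: int = 400, overlap_ratio: float = 0.15) -> list[str]:
    """Split text into chunks of roughly chunk_size tokens, with each chunk
    sharing overlap_ratio of its length with the next one."""
    words = text.split()
    step = max(1, int(chunk_size * (1 - overlap_ratio)))  # stride between chunk starts
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]

document = " ".join(f"sentence{i}." for i in range(2000))
chunks = chunk_text(document, chunk_size=400, overlap_ratio=0.15)
print(len(chunks), "chunks;", len(chunks[0].split()), "words in the first")
```

Raising overlap_ratio toward 0.5 trades index size for continuity across chunk boundaries, which is exactly the trade-off behind the 20-50% recommendation for context-heavy documents.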
This often-overlooked parameter, commonly called top-k, controls how many chunks the system retrieves before generating a response. Set it too low and the model lacks the context to answer fully; set it too high and the relevant passage gets diluted by loosely related text.
To adjust this on the Katonic Platform: ACE → Configuration → Application Settings → Chat Accuracy
Settings
One financial services client saw their RAG response accuracy jump from 67% to 89% simply by
optimising this parameter based on their specific document types and query patterns.
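To see what the parameter does mechanically, here is a generic cosine-similarity retrieval sketch in NumPy. It illustrates the concept of capping the candidate set at top_k, not the platform’s actual search code:

```python
import numpy as np

def retrieve(query_vec: np.ndarray, chunk_vecs: np.ndarray,
             chunks: list[str], top_k: int = 5) -> list[str]:
    """Return the top_k chunks whose embeddings are most cosine-similar
    to the query embedding."""
    sims = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    return [chunks[i] for i in np.argsort(sims)[::-1][:top_k]]

rng = np.random.default_rng(0)
chunk_vecs = rng.normal(size=(100, 384))   # stand-in chunk embeddings
chunks = [f"chunk {i}" for i in range(100)]
print(retrieve(rng.normal(size=384), chunk_vecs, chunks, top_k=5))
```

Everything the generator sees passes through this gate, which is why tuning top_k against your real queries moves accuracy so much.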
Standard text-based chunking works well for straightforward documents, but what about complex
structured files, tables, or diagrams? That’s where vision indexing comes in.
The Katonic Vision Reindex feature helps fetch more accurate details from complex structured files by
using AI vision capabilities to understand document layout and structure.
To apply vision indexing: ACE → Knowledge Management → Select knowledge → Knowledge Objects
tab → Preview button → Reindex Using Vision
We’ve seen this make a dramatic difference for clients with complex financial reports, legal documents,
and technical manuals—information that would be lost in standard text chunking is properly preserved
and made retrievable.
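Under the hood, vision indexing amounts to rendering pages as images and letting a vision-capable model transcribe what plain text extraction loses. The sketch below shows the shape of that pipeline; pdf2image is a real library, but describe_page is a hypothetical stand-in for the vision-model call:

```python
# Shape of a vision-indexing pipeline. describe_page() is hypothetical:
# in practice it would call a vision-capable model on each page image.

from pdf2image import convert_from_path  # real library; requires poppler installed

def describe_page(image) -> str:
    """Hypothetical stand-in: send the page image to a vision model and
    return a structured-text transcription (tables as rows, headings kept)."""
    raise NotImplementedError("wire up a vision model here")

def vision_index(pdf_path: str) -> list[str]:
    """Render each page to an image, transcribe it, and return page texts
    ready for chunking and embedding."""
    pages = convert_from_path(pdf_path, dpi=200)  # one PIL image per page
    return [describe_page(page) for page in pages]
```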
Not all knowledge is created equal. Sometimes you need information from specific document types or
categories. Metadata filtering constrains retrieval to the most relevant sources.
Users can select document types directly in ACE Chat or chat with a specific document by typing “@” and
selecting the document name.
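Conceptually, metadata filtering narrows the candidate pool before similarity ranking ever runs. A minimal sketch, where the field names doc_type and category are illustrative assumptions rather than the platform’s schema:

```python
# Sketch of metadata-filtered retrieval. The Chunk fields ("doc_type",
# "category") are illustrative assumptions, not the platform's schema.

from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    doc_type: str   # e.g. "policy", "manual", "report"
    category: str   # e.g. "consumer", "enterprise", "internal"

def filter_chunks(chunks: list[Chunk], **filters: str) -> list[Chunk]:
    """Keep only chunks whose metadata matches every given filter."""
    return [c for c in chunks
            if all(getattr(c, key) == value for key, value in filters.items())]

corpus = [
    Chunk("30-day refunds on consumer plans.", "policy", "consumer"),
    Chunk("Enterprise SLA: 99.9% uptime.", "policy", "enterprise"),
]
candidates = filter_chunks(corpus, category="consumer")
print([c.text for c in candidates])  # similarity ranking then runs on these
```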
A telecommunications client used this feature to create separate knowledge bases for consumer
products, enterprise solutions, and internal policies. When answering customer queries, their support
teams could instantly filter to only the relevant document categories, dramatically improving response
accuracy.
The embedding model you select fundamentally shapes how well your system understands and retrieves information, because it determines how both your documents and your users’ queries are mapped into the vector space where similarity is measured.
To reset your embedding model: ACE → Configuration → Application Settings → Model
Configuration → Reset Embedding Model
Don’t underestimate the impact of the right embedding model. One healthcare client switched from a
general embedding model to a domain-specific one and saw a 43% improvement in retrieval precision
for medical terminology.
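Swapping the embedding model changes which chunks look “close” to a query, which is why it can move retrieval precision so dramatically. A sketch using the sentence-transformers library; the model names are public examples, and a domain-specific model slots in the same way:

```python
# Sketch: the same query and corpus ranked by two different embedding models.
# Model names are public examples; a domain-tuned model is swapped in the same way.

from sentence_transformers import SentenceTransformer, util

corpus = ["Myocardial infarction treatment protocol.",
          "Quarterly revenue grew 12% year over year."]
query = "heart attack care guidelines"

for model_name in ["all-MiniLM-L6-v2", "all-mpnet-base-v2"]:
    model = SentenceTransformer(model_name)
    hits = util.semantic_search(model.encode(query), model.encode(corpus), top_k=1)
    print(model_name, "->", corpus[hits[0][0]["corpus_id"]])
```

Note that changing the embedding model invalidates the existing index: every document must be re-embedded with the new model before retrieval behaves correctly.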
Even with all these optimisations, sometimes users don’t ask questions in the most effective way. Query
rephrasing automatically reformulates questions to better match how information is stored.
Users can trigger this on the Katonic Platform by typing their original question in ACE chat and pressing ALT + L.
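Behind a shortcut like that sits a reformulation step along the lines of this sketch, which asks an LLM to rewrite the question before retrieval runs. The openai client call, model choice, and prompt are illustrative assumptions about the general technique, not the shortcut’s implementation:

```python
# Sketch of query rephrasing before retrieval. The prompt and model choice
# are illustrative; any chat-completion LLM works the same way.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def rephrase(question: str) -> str:
    """Rewrite a conversational question into the declarative, keyword-rich
    phrasing that documents tend to use, improving embedding match."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Rewrite the user's question as a concise, keyword-rich "
                        "search query matching how technical documents are written."},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(rephrase("hey so how do I get my money back if the thing broke?"))
# e.g. -> "refund policy for defective products"
```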
These improvements aren’t just technical tweaks. Clients who apply them see fewer support escalations, higher user-satisfaction scores, faster time-to-information, and markedly higher AI adoption, all of which translates into measurable business value.
The best part about these RAG improvements? They don’t require data science expertise to implement.
The Katonic AI Platform provides intuitive interfaces to make these adjustments with just a few clicks.
Whether you’re just starting your RAG journey or looking to optimise an existing implementation,
focusing on these six areas will yield significant improvements in accuracy, relevance, and user
satisfaction.
What is Retrieval-Augmented Generation and why is it critical for enterprise AI?
Retrieval-Augmented Generation (RAG) is a technique that enhances AI systems by enabling them to access and reference external knowledge sources before generating responses. It’s critical for enterprise AI because it improves accuracy, reduces hallucinations, provides up-to-date information, and allows AI to reference company-specific knowledge that wasn’t in its training data.
How does chunk size affect RAG performance?
Chunk size significantly impacts RAG performance by determining the granularity of information retrieval. Smaller chunks (100-500 tokens) provide precise answers for specific questions but may miss context, while larger chunks (1,000+ tokens) capture comprehensive context but might include irrelevant information. The optimal chunk size depends on your typical query complexity—shorter for factual queries, longer for complex reasoning tasks.
What business results can companies expect from optimising their RAG systems?
Companies that optimise their RAG systems typically see reduced support costs (up to 37% fewer escalations), higher user satisfaction (increases from 3.6/5 to 4.7/5), faster time-to-information (from minutes to seconds), and increased AI adoption (up to 215% higher usage rates). These improvements directly translate to better ROI on AI investments and more effective knowledge management.
How can vision indexing be implemented for complex documents?
Vision indexing for complex documents can be implemented on the Katonic AI Platform by navigating to Knowledge Management, selecting the knowledge base, clicking on the Knowledge Objects tab, using the Preview button, and selecting “Reindex Using Vision.” This process uses AI vision capabilities to understand document layout and structure, making complex financial reports, legal documents, and technical manuals more accurately retrievable.
Katonic AI's award-winning platform lets companies build enterprise-grade Generative AI apps and traditional ML models.