Advances in artificial intelligence (AI) have enabled its widespread use, particularly for automating customer support with AI-based chatbots. These chatbots provide real-time assistance around the clock while reducing operational costs and dependence on human agents. Because they handle large volumes of inquiries efficiently, they are scalable and cost-effective. As digital transformation progresses, AI-driven chatbots are becoming key tools for improving service quality and operational efficiency.
This Master's thesis explores how Retrieval-Augmented Generation (RAG) systems can improve responses to complex user queries within large, semi-structured documentation systems. RAG combines the generative power of large language models with the targeted retrieval of relevant information from external sources, a hybrid approach that is particularly effective in specialised and frequently changing knowledge domains. The thesis also addresses the challenges posed by the diverse and inconsistent structure of technical documentation, which is often authored by multiple contributors and exists in formats such as HTML, PDF, and databases. This underlines the need for intelligent systems that can interpret and present information in a clear and contextualised manner.
The core of the thesis is an introduction to RAG systems and the development of a prototype RAG-based chatbot designed to answer software-related questions based on HTML documentation. Three segmentation methods (sentence-based, paragraph-based, and HTML heading-based) are evaluated using quantitative metrics and expert feedback. The findings indicate that the effectiveness of each segmentation method, along with its advantages and disadvantages, varies significantly with the type and structure of the document, which highlights the need for adaptive segmentation approaches that adjust dynamically to different formats and content styles. Tailoring segmentation strategies to the characteristics of the source material enables chatbot systems to generate more accurate, relevant, and context-aware responses. Adaptive methods are therefore essential for optimising chatbot performance and improving user access to complex knowledge bases across diverse domains.
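
The three segmentation methods named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, which is not shown here. It assumes well-formed HTML in which body text sits in `<p>` elements under `<h1>` to `<h6>` headings, and it uses a naive punctuation-based regular expression for sentence splitting. All names (`parse_blocks`, `paragraph_chunks`, `sentence_chunks`, `heading_chunks`, `DOC`) are hypothetical and chosen for illustration only.

```python
import re
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class BlockCollector(HTMLParser):
    """Collects (kind, text) pairs; kind is "heading" or "para"."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._open_tag = None  # heading or <p> tag currently being read
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS or tag == "p":
            self._open_tag, self._buf = tag, []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._open_tag:
            return
        text = " ".join("".join(self._buf).split())  # normalise whitespace
        if text:
            kind = "heading" if tag in HEADING_TAGS else "para"
            self.blocks.append((kind, text))
        self._open_tag = None


def parse_blocks(html):
    collector = BlockCollector()
    collector.feed(html)
    collector.close()
    return collector.blocks


def paragraph_chunks(html):
    """Paragraph-based: one chunk per <p> element."""
    return [text for kind, text in parse_blocks(html) if kind == "para"]


def sentence_chunks(html):
    """Sentence-based: split each paragraph after '.', '!' or '?' (naive)."""
    chunks = []
    for para in paragraph_chunks(html):
        chunks.extend(s for s in re.split(r"(?<=[.!?])\s+", para) if s)
    return chunks


def heading_chunks(html):
    """Heading-based: one chunk per heading, holding all text up to the next."""
    chunks = []
    for kind, text in parse_blocks(html):
        if kind == "heading":
            chunks.append({"heading": text, "text": ""})
        elif not chunks:
            # Text that appears before the first heading.
            chunks.append({"heading": "", "text": text})
        else:
            sep = " " if chunks[-1]["text"] else ""
            chunks[-1]["text"] += sep + text
    return chunks


# Hypothetical sample document, invented for this sketch.
DOC = """
<h1>Install</h1>
<p>Download the installer. Run it as administrator.</p>
<h2>Configure</h2>
<p>Open the settings page.</p>
<p>Set the license key. Restart the service.</p>
"""

if __name__ == "__main__":
    print(sentence_chunks(DOC))
    print(paragraph_chunks(DOC))
    print(heading_chunks(DOC))
```

On the sample document the sketch yields five sentence chunks, three paragraph chunks, and two heading chunks. Smaller chunks match a query more precisely but carry less context, while heading-based chunks keep related text together but depend on how consistently the source uses headings.
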
| Date of Award | 2025 |
|---|---|
| Original language | German (Austria) |
| Supervisor | Ulrich Bodenhofer (Supervisor) |
- Information Engineering and Management
Entwicklung eines Chatbots zum Finden von Informationen in Produktdokumentationen (Development of a Chatbot for Finding Information in Product Documentation)
Dizdarevic, H. (Author). 2025
Student thesis: Master's Thesis