Abstract
The quality of source code is a critical factor in software development and significantly influences maintainability, security, and efficiency. Despite modern tools such as GitHub Copilot or ChatGPT, structural issues such as anti-patterns and code smells continue to occur regularly and often go unnoticed. Because source code is often considered intellectual property and is protected by copyright, this work addresses the detection and correction of such issues with a prototype that uses locally executed Large Language Models (LLMs) to analyze Python code. Python is examined because existing research predominantly focuses on object-oriented languages such as Java, and systematic analyses of Python code in the context of LLM-based pattern recognition remain rare.
The developed prototype identifies problematic code patterns in Python and generates concrete suggestions for improvement. To integrate expert knowledge, In-Context Learning is employed: a lightweight, easily extensible method that requires no training or fine-tuning of existing models and conveys knowledge about anti-patterns and code smells directly via prompts. Particular emphasis is placed on protecting sensitive data and on independence from cloud-based services: the entire process is based on open-source components and runs locally on consumer hardware. The evaluation used a custom dataset that was initially generated with ChatGPT-4o and subsequently reviewed and refined manually. This dataset served to compare local models such as Qwen2.5, IBM Granite, and Codestral with cloud-based models such as ChatGPT-4o and DeepSeek v3.
The results show that models such as ChatGPT-4o and Qwen2.5 achieve high detection accuracy while maintaining acceptable response times. Notably, Qwen2.5 performs comparably to the cloud-based model ChatGPT-4o, which indicates that locally executable LLMs are an effective and privacy-friendly alternative to commercial AI-based code analysis tools.
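The In-Context Learning approach described above, in which knowledge about anti-patterns and code smells is conveyed through the prompt rather than through training, can be illustrated with a minimal sketch. This is not the prototype's implementation: the `SmellExample` class, the `CATALOG` entries, the `build_prompt` function, and the prompt wording are all hypothetical and chosen only for illustration. The sketch assembles a few-shot prompt as plain text and would hand it to whichever locally running model is used; the model call itself is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmellExample:
    """One hand-written demonstration of a code smell (hypothetical catalog entry)."""
    name: str
    bad_code: str
    suggestion: str


# Hypothetical catalog; the knowledge base used in the thesis is not shown in the abstract.
CATALOG = [
    SmellExample(
        name="Mutable default argument",
        bad_code="def add(item, bucket=[]):\n    bucket.append(item)\n    return bucket",
        suggestion="Use None as the default and create the list inside the function.",
    ),
    SmellExample(
        name="Bare except",
        bad_code="try:\n    run()\nexcept:\n    pass",
        suggestion="Catch a specific exception type and handle or log it.",
    ),
]


def build_prompt(code: str, examples: list[SmellExample]) -> str:
    """Assemble a few-shot prompt: instructions, demonstrations, then the code under review."""
    parts = [
        "You are a Python code reviewer. "
        "Identify anti-patterns and code smells and suggest a fix."
    ]
    # Each demonstration pairs a problematic snippet with the expected advice.
    for ex in examples:
        parts.append(
            f"### Example: {ex.name}\nCode:\n{ex.bad_code}\nSuggestion: {ex.suggestion}"
        )
    # The code under review comes last so the model continues after "Findings:".
    parts.append(f"### Code to review\n{code}\nFindings:")
    return "\n\n".join(parts)


prompt = build_prompt("def f(x, cache={}):\n    cache[x] = x\n    return cache", CATALOG)
```

Extending the catalog with a further entry changes what the model is shown without any retraining, which is the property the abstract attributes to In-Context Learning.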
| Date of Award | 2025 |
|---|---|
| Original language | German (Austria) |
| Supervisor | Harald Lampesberger (Supervisor) |
Study program
- Secure Information Systems