Secret Agents: AI Red Team Automation

  • Manuel Achhorner

    Student thesis: Master's Thesis

    Abstract

The recent rise of AI usage in cybersecurity has led some researchers to
examine its viability for red teaming. This research has treated red teaming as a
monolithic process and has relied heavily on the AI having access to the Internet
or to handcrafted knowledge databases. While this has shown promising results in
related work, a closer examination of AI automation along the chain of steps
described by the Cyber Kill Chain has yet to be done. Using the Internet further
increases the risk of incidentally harming third parties, and handcrafted knowledge
databases make the aforementioned research nearly impossible to reproduce. Breaking
red teaming down into applications that more closely resemble the atomic steps of
the Cyber Kill Chain may reveal problem areas that can be improved upon individually.
This thesis therefore conceptualizes a theoretical framework and develops two
independent processes, modeled after the reconnaissance and the
weaponization/exploitation/delivery steps of the Cyber Kill Chain respectively,
without relying on non-reproducible knowledge databases or the Internet. These
processes use agentic workflows and supplemental resources such as predefined
tools, automatic code execution, and retrieval-augmented generation (RAG), and
were implemented on top of the AG2 framework. The processes were evaluated by
solving tasks in Linux privilege escalation and website vulnerability exploitation
scenarios. These tests yielded positive results for the
weaponization/exploitation/delivery steps, while the results of the reconnaissance
process were generally negative. They further revealed fundamental problems with
predefined tool use in the vast majority of open-source LLMs, even though such
tools were integral to both proposed processes. Through the developed applications,
this thesis demonstrates the viability of viewing red team automation as a set of
smaller, atomic tasks and provides a theoretical framework that standardizes the
development of such applications. It further found the AG2 LLM agent framework,
in its current state, inadequate for future use in AI red team automation, due to
its limited customizability of both the tool and code execution environments.
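The tool-calling weakness noted in the abstract can be illustrated with a minimal, self-contained sketch of the kind of dispatch step an agentic tool loop performs. This is not AG2's actual API: the registry, the `run_command` tool, and the stubbed model replies are all hypothetical stand-ins used only to show why a malformed reply from the model breaks predefined tool use.

```python
import json

# Hypothetical tool registry, standing in for a framework's
# tool-registration machinery (e.g. functions exposed to the agent).
TOOLS = {
    "run_command": lambda cmd: f"(simulated) output of: {cmd}",
}

def dispatch(model_reply: str) -> str:
    """Parse a model reply as a JSON tool call and execute it.

    Open-source LLMs frequently emit free-text or malformed 'calls';
    the except branch models how an agent loop must cope with that
    failure mode instead of executing a tool.
    """
    try:
        call = json.loads(model_reply)
        fn = TOOLS[call["tool"]]
    except (json.JSONDecodeError, KeyError, TypeError):
        return "error: reply was not a well-formed tool call"
    return fn(*call.get("args", []))

# A well-formed, schema-conforming call is dispatched to the tool ...
print(dispatch('{"tool": "run_command", "args": ["id"]}'))
# ... while the conversational reply many open models produce is not.
print(dispatch("Sure! I will now run the `id` command for you."))
```

In a real agent framework the registry, schema validation, and error feedback to the model are handled internally; the sketch only makes visible where a model that cannot reliably produce structured calls derails the loop.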
Date of Award: 2025
Original language: English
Supervisor: Harald Lampesberger (Supervisor)

Study program

    • Secure Information Systems
