Evolving the Embedding Space of Diffusion Models in the Field of Visual Arts

Marcel Salvenmoser*, Michael Affenzeller*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedings › Conference contribution › peer-review

Abstract

This paper presents a novel method to guide image generation by optimizing the embedding space of diffusion models using evolutionary algorithms. Instead of relying on traditional prompt engineering, the approach directly evolves the prompt embeddings that condition text-to-image generation. Evolutionary operators, such as crossover and mutation, are applied to iteratively refine the embeddings, which are then fed into the diffusion model to generate an image. The fitness of each embedding is determined by the resulting image. Using the SDXL-Turbo model as a test case, a genetic algorithm is employed to optimize its prompt embeddings, leading to improvements in fitness as measured by the LAION Aesthetics Predictor V2. Results show that over generations, the optimized embeddings yield significant gains in fitness scores compared to the initial training images. The underlying framework is publicly available and executable in a Jupyter Notebook, allowing for further experimentation and adaptation to various generative tasks.
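The loop described in the abstract (evolve embeddings with crossover and mutation, score each one, keep the fitter candidates) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper's framework: the function names, the uniform crossover, the Gaussian mutation, the tournament selection, the elitism, and all parameter values are hypothetical. In particular, `toy_fitness` is only a stand-in that scores the vector directly, whereas the paper generates an image with SDXL-Turbo and scores that image with the LAION Aesthetics Predictor V2.

```python
import random


def toy_fitness(embedding, target):
    # Stand-in fitness (assumption): negative squared distance to a fixed target.
    # In the paper, the embedding conditions a diffusion model and the resulting
    # image is scored by an aesthetics predictor instead.
    return -sum((e - t) ** 2 for e, t in zip(embedding, target))


def uniform_crossover(a, b, rng):
    # Each component is taken from either parent with equal probability.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def gaussian_mutation(vec, rng, rate=0.1, sigma=0.1):
    # Each component is perturbed with probability `rate` by Gaussian noise.
    return [x + rng.gauss(0.0, sigma) if rng.random() < rate else x for x in vec]


def tournament(population, scores, rng, k=3):
    # Pick k random individuals and return the fittest of them.
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def evolve(fitness, dim=16, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    # Random initial population of embedding vectors.
    population = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    history = []  # best fitness per generation
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        best_idx = max(range(pop_size), key=lambda i: scores[i])
        history.append(scores[best_idx])
        # Elitism: carry the best individual over unchanged.
        next_gen = [population[best_idx]]
        while len(next_gen) < pop_size:
            parent_a = tournament(population, scores, rng)
            parent_b = tournament(population, scores, rng)
            child = uniform_crossover(parent_a, parent_b, rng)
            next_gen.append(gaussian_mutation(child, rng))
        population = next_gen
    # Score the final population as well.
    scores = [fitness(ind) for ind in population]
    best_idx = max(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best_idx])
    return population[best_idx], history


target = [0.5] * 16  # hypothetical optimum for the toy fitness
best, history = evolve(lambda e: toy_fitness(e, target))
print(f"best fitness: first generation {history[0]:.3f}, last {history[-1]:.3f}")
```

Because the best individual is carried over unchanged and the toy fitness is deterministic, the best score in this sketch never decreases from one generation to the next. Swapping `toy_fitness` for a function that renders an image from the embedding and scores it would give the structure the abstract describes, although the operators the authors actually used may differ.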

Original language: English
Title of host publication: Artificial Intelligence in Music, Sound, Art and Design - 14th International Conference, EvoMUSART 2025, Held as Part of EvoStar 2025, Proceedings
Editors: Penousal Machado, Colin Johnson, Iria Santos
Publisher: Springer
Pages: 402-416
Number of pages: 15
ISBN (Print): 9783031901669
Publication status: Published - 2025
Event: 14th International Conference on Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2025, held as part of EvoStar 2025 - Trieste, Italy
Duration: 23 Apr 2025 - 25 Apr 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15611 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th International Conference on Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2025, held as part of EvoStar 2025
Country/Territory: Italy
City: Trieste
Period: 23.04.2025 - 25.04.2025

Keywords

  • Aesthetics
  • Diffusion Models
  • Embedding Space
  • Evolutionary Algorithms
  • Generative AI
  • Genetic Algorithms
  • Image Generation
