TY - GEN
T1 - Evolving the Embedding Space of Diffusion Models in the Field of Visual Arts
AU - Salvenmoser, Marcel
AU - Affenzeller, Michael
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
AB - This paper presents a novel method to guide image generation by optimizing the embedding space of diffusion models using evolutionary algorithms. Instead of relying on traditional prompt engineering, the approach directly evolves the prompt embeddings that condition text-to-image generation. Evolutionary operators, such as crossover and mutation, are applied to iteratively refine the embeddings, which are then fed into the diffusion model to generate an image. The fitness of each embedding is determined by the resulting image. Using the SDXL-Turbo model as a test case, a genetic algorithm is employed to optimize its prompt embeddings, leading to improvements in fitness as measured by the LAION Aesthetics Predictor V2. Results show that over generations, the optimized embeddings yield significant gains in fitness scores compared to the initial training images. The underlying framework is publicly available and executable in a Jupyter Notebook, allowing for further experimentation and adaptation to various generative tasks.
KW - Aesthetics
KW - Diffusion Models
KW - Embedding Space
KW - Evolutionary Algorithms
KW - Generative AI
KW - Genetic Algorithms
KW - Image Generation
UR - http://www.scopus.com/inward/record.url?scp=105003926550&partnerID=8YFLogxK
DO - 10.1007/978-3-031-90167-6_27
M3 - Conference contribution
AN - SCOPUS:105003926550
SN - 9783031901669
T3 - Lecture Notes in Computer Science
SP - 402
EP - 416
BT - Artificial Intelligence in Music, Sound, Art and Design - 14th International Conference, EvoMUSART 2025, Held as Part of EvoStar 2025, Proceedings
A2 - Machado, Penousal
A2 - Johnson, Colin
A2 - Santos, Iria
PB - Springer
T2 - 14th International Conference on Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2025, held as part of EvoStar 2025
Y2 - 23 April 2025 through 25 April 2025
ER -