TY - JOUR
T1 - Using FPGA devices to accelerate the evaluation phase of tree-based genetic programming
T2 - an extended analysis
AU - Crary, Christopher
AU - Piard, Wesley
AU - Stitt, Greg
AU - Hicks, Benjamin
AU - Bean, Caleb
AU - Burlacu, Bogdan
AU - Banzhaf, Wolfgang
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
PY - 2025/6
Y1 - 2025/6
N2 - This paper establishes the potential of accelerating the evaluation phase of tree-based genetic programming through contemporary field-programmable gate array (FPGA) technology. This exploration stems from the fact that FPGAs can sometimes leverage increased levels of both data and function parallelism, as well as superior power/energy efficiency, when compared to general-purpose CPU/GPU systems. In this investigation, we introduce a fixed-depth, tree-based architecture that can fully parallelize tree evaluation for type-consistent primitives that are unrolled and pipelined. We show that our accelerator on a 14nm FPGA achieves an average speedup of 43× when compared to a recent open-source GPU solution, TensorGP, implemented on 8nm process-node technology, and an average speedup of 4,902× when compared to a popular baseline GP software tool, DEAP, running parallelized across all cores of a 2-socket, 28-core (56-thread), 14nm CPU server. Despite our single-FPGA accelerator being 2.4× slower on average when compared to the recent state-of-the-art Operon tool executing on the same 2-processor, 28-core CPU system, we show that this single-FPGA system is 1.4× better than Operon in terms of performance-per-watt. Importantly, we also describe six future extensions that could provide at least a 64–192× speedup over our current design. Therefore, our initial results provide considerable motivation for the continued exploration of FPGA-based GP systems. Overall, any success in significantly improving runtime and energy efficiency could potentially enable novel research efforts through faster and/or less costly GP runs, similar to how GPUs unlocked the power of deep learning during the past fifteen years.
AB - This paper establishes the potential of accelerating the evaluation phase of tree-based genetic programming through contemporary field-programmable gate array (FPGA) technology. This exploration stems from the fact that FPGAs can sometimes leverage increased levels of both data and function parallelism, as well as superior power/energy efficiency, when compared to general-purpose CPU/GPU systems. In this investigation, we introduce a fixed-depth, tree-based architecture that can fully parallelize tree evaluation for type-consistent primitives that are unrolled and pipelined. We show that our accelerator on a 14nm FPGA achieves an average speedup of 43× when compared to a recent open-source GPU solution, TensorGP, implemented on 8nm process-node technology, and an average speedup of 4,902× when compared to a popular baseline GP software tool, DEAP, running parallelized across all cores of a 2-socket, 28-core (56-thread), 14nm CPU server. Despite our single-FPGA accelerator being 2.4× slower on average when compared to the recent state-of-the-art Operon tool executing on the same 2-processor, 28-core CPU system, we show that this single-FPGA system is 1.4× better than Operon in terms of performance-per-watt. Importantly, we also describe six future extensions that could provide at least a 64–192× speedup over our current design. Therefore, our initial results provide considerable motivation for the continued exploration of FPGA-based GP systems. Overall, any success in significantly improving runtime and energy efficiency could potentially enable novel research efforts through faster and/or less costly GP runs, similar to how GPUs unlocked the power of deep learning during the past fifteen years.
KW - Domain-specific architecture
KW - Field-programmable gate array
KW - Hardware acceleration
KW - Tree-based genetic programming
UR - http://www.scopus.com/inward/record.url?scp=85214230948&partnerID=8YFLogxK
U2 - 10.1007/s10710-024-09505-2
DO - 10.1007/s10710-024-09505-2
M3 - Article
AN - SCOPUS:85214230948
SN - 1389-2576
VL - 26
JO - Genetic Programming and Evolvable Machines
JF - Genetic Programming and Evolvable Machines
IS - 1
M1 - 8
ER -