Abstract
Language models are designed to process and generate text. During their extensive training, however, they also acquire arithmetic skills and the ability to write programming code. In this work, we investigate whether it is possible to identify the neurons in these models that are responsible for a specific skill. To this end, we generate synthetic datasets of arithmetic tasks, let a language model solve them via text completion, and extract the activation states of its neurons. We then attempt to reconstruct the task results from the activations of individual groups of neurons using regression models in order to find the groups relevant to solving the tasks. Linear regression models, regression trees, and support vector regression are used to uncover possible relationships. We find that pairs of neurons, rather than individual neurons, can be identified as responsible for specific arithmetic behavior in the LLM. We also find that several distinct neuron pairs in the GPT-2 XL model support its arithmetic capabilities, indicating a redundant encoding of these capabilities. In the future, this could allow smaller, task-specific models to be extracted from larger ones.
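As a rough illustration of the probing setup the abstract describes, the sketch below extracts neuron activations from GPT-2 XL on synthetic addition prompts and scores candidate neuron pairs with linear regression. The checkpoint name, the probed layer, the prompt format, and the pair-search loop are assumptions for illustration, not the authors' exact pipeline.

```python
# Hedged sketch: test whether pairs of neuron activations in GPT-2 XL
# linearly encode the result of simple addition prompts.
import itertools
import random

import torch
from sklearn.linear_model import LinearRegression
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-xl")
model = GPT2LMHeadModel.from_pretrained("gpt2-xl", output_hidden_states=True)
model.eval()

# Synthetic arithmetic dataset: prompts such as "17 + 5 =" with known targets.
tasks = [(a, b) for a in range(10, 30) for b in range(1, 10)]
random.seed(0)
random.shuffle(tasks)
tasks = tasks[:100]

activations, targets = [], []
with torch.no_grad():
    for a, b in tasks:
        inputs = tokenizer(f"{a} + {b} =", return_tensors="pt")
        out = model(**inputs)
        layer = 24  # assumed middle layer of the 48-layer GPT-2 XL
        # Hidden state of the last prompt token at that layer.
        activations.append(out.hidden_states[layer][0, -1].numpy())
        targets.append(a + b)

# Search candidate neuron pairs; keep those whose two activations alone
# reconstruct the result well (high R^2 under linear regression).
candidate_neurons = range(50)  # restricted subset for tractability
best = []
for i, j in itertools.combinations(candidate_neurons, 2):
    X = [[act[i], act[j]] for act in activations]
    score = LinearRegression().fit(X, targets).score(X, targets)
    if score > 0.9:
        best.append((i, j, score))

print(sorted(best, key=lambda t: -t[2])[:10])
```

The other probes named in the abstract slot into the same loop: swapping `LinearRegression` for scikit-learn's `DecisionTreeRegressor` or `SVR` is a one-line change.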
Original language | English
---|---
Pages | 1-7
Number of pages | 7
DOIs |
Publication status | Published - Sept 2024
Event | 23rd International Conference on Modelling and Applied Simulation (MAS) - La Laguna, Tenerife, Spain. Duration: 18 Sept 2024 → 20 Sept 2024. https://www.msc-les.org/mas2024/
Conference

Conference | 23rd International Conference on Modelling and Applied Simulation (MAS)
---|---
Country/Territory | Spain
City | Tenerife
Period | 18 Sept 2024 → 20 Sept 2024
Internet address | https://www.msc-les.org/mas2024/
Keywords
- Large Language Models (LLMs)
- Deep learning
- Explainable AI