TY - GEN
T1 - Using LLMs and Websearch in Order to Perform Fact Checking on Texts Generated by LLMs
AU - Sandler, Simone
AU - Krauss, Oliver
AU - Stöckl, Andreas
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025/4/25
Y1 - 2025/4/25
N2 - Finding out whether a given text contains false information is not an easy task. On large corpora of data, such as the tremendous amount of text generated by LLMs, fact checking is a prohibitively expensive task. To address this challenge, we propose a novel approach that combines fact checking by LLMs with web search. The method applies not only to single sentences, which are assigned a true, false, or unknown label, but also to whole paragraphs, for which a truthfulness score in the range 0..1 is calculated from the sentence labels. The process begins by extracting claims from the text using GPT3. These claims are then validated, with Google search results used to supplement the GPT3 results. When validating our method against a corpus of 122 LLM-generated text samples, we achieve an accuracy of 0.79. To compare our work with other approaches, we also applied our fact checking to the FEVER dataset, achieving an accuracy of 0.78, which is similar to the current best accuracy of 0.79 on that dataset. This demonstrates the potential of our proposed approach for automated fact checking.
AB - Finding out whether a given text contains false information is not an easy task. On large corpora of data, such as the tremendous amount of text generated by LLMs, fact checking is a prohibitively expensive task. To address this challenge, we propose a novel approach that combines fact checking by LLMs with web search. The method applies not only to single sentences, which are assigned a true, false, or unknown label, but also to whole paragraphs, for which a truthfulness score in the range 0..1 is calculated from the sentence labels. The process begins by extracting claims from the text using GPT3. These claims are then validated, with Google search results used to supplement the GPT3 results. When validating our method against a corpus of 122 LLM-generated text samples, we achieve an accuracy of 0.79. To compare our work with other approaches, we also applied our fact checking to the FEVER dataset, achieving an accuracy of 0.78, which is similar to the current best accuracy of 0.79 on that dataset. This demonstrates the potential of our proposed approach for automated fact checking.
KW - Fact Checking
KW - Large-Language-Models
KW - Natural-Language-Processing
UR - http://www.scopus.com/inward/record.url?scp=105004255331&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-82957-4_27
DO - 10.1007/978-3-031-82957-4_27
M3 - Conference contribution
SN - 9783031829598
VL - 15173
T3 - Lecture Notes in Computer Science
SP - 326
EP - 332
BT - Computer Aided Systems Theory – EUROCAST 2024 - 19th International Conference, 2024, Revised Selected Papers
A2 - Quesada-Arencibia, Alexis
A2 - Affenzeller, Michael
A2 - Moreno-Díaz, Roberto
ER -