TY - GEN
T1 - Evaluating Novel Features for Aggressive Language Detection
AU - Schuh, Tina
AU - Dreiseitl, Stephan
N1 - Publisher Copyright: © 2018, Springer Nature Switzerland AG.
PY - 2018
Y1 - 2018
N2 - The widespread use and abuse of social media and other platforms to voice opinions online has necessitated the development of tools to regulate this exchange of opinions in light of ethical and legal considerations. In this work, we aim to detect patterns of aggressive language to gain insight into what differentiates it from non-inflammatory language. Of particular interest are features of comments that, taken together, allow this distinction to be made automatically. To that end, we employ feature selection techniques to find optimal feature subsets. We apply the feature selection and model evaluation process to two independent datasets. Depending on the dataset and model type, between 3 and 19 features are enough to outperform the full set of 68 features. Overall, the best F1 scores per dataset are 89.4%, using 35 features with a Gaussian SVM and 82.7%, using 17 features with a linear SVM.
KW - Aggressive language detection
KW - Feature selection
KW - Hate speech
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=85053808349&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-99579-3_60
DO - 10.1007/978-3-319-99579-3_60
M3 - Conference contribution
SN - 9783319995786
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 585
EP - 595
BT - Speech and Computer - 20th International Conference, SPECOM 2018, Proceedings
A2 - Potapova, Rodmonga
A2 - Jokisch, Oliver
A2 - Karpov, Alexey
PB - Springer
T2 - 20th International Conference on Speech and Computer, SPECOM 2018
Y2 - 18 September 2018 through 22 September 2018
ER -