Evaluating Novel Features for Aggressive Language Detection

Tina Schuh, Stephan Dreiseitl

Research output: Chapter in Book/Report/Conference proceedings > Conference contribution > peer-review

Abstract

The widespread use and abuse of social media and other platforms to voice opinions online has necessitated the development of tools to regulate this exchange of opinions in light of ethical and legal considerations. In this work, we aim to detect patterns of aggressive language to gain insight into what differentiates it from non-inflammatory language. Of particular interest are features of comments that, taken together, allow this distinction to be made automatically. To that end, we employ feature selection techniques to find optimal feature subsets. We apply the feature selection and model evaluation process to two independent datasets. Depending on the dataset and model type, between 3 and 19 features are enough to outperform the full set of 68 features. Overall, the best F1 scores per dataset are 89.4%, using 35 features with a Gaussian SVM and 82.7%, using 17 features with a linear SVM.

Original language: English
Title of host publication: Speech and Computer - 20th International Conference, SPECOM 2018, Proceedings
Editors: Rodmonga Potapova, Oliver Jokisch, Alexey Karpov
Publisher: Springer
Pages: 585-595
Number of pages: 11
ISBN (Print): 9783319995786
Publication status: Published - 2018
Event: 20th International Conference on Speech and Computer, SPECOM 2018 - Leipzig, Germany
Duration: 18 Sept 2018 to 22 Sept 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11096 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 20th International Conference on Speech and Computer, SPECOM 2018
Country/Territory: Germany
City: Leipzig
Period: 18.09.2018 to 22.09.2018

Keywords

  • Aggressive language detection
  • Feature selection
  • Hate speech
  • Machine learning
