ProperBERT - Proactive Recognition of Offensive Phrasing for Effective Regulation

Research output: Chapter in Book/Report/Conference proceedings > Conference contribution > peer-review

Abstract

This work discusses and contains content that may be offensive or unsettling. Hateful communication has always been part of human interaction, even before the advent of social media. Today, offensive content spreads faster and more widely through digital communication channels. To help improve the regulation of hate speech, we introduce ProperBERT, a fine-tuned BERT model for detecting hate speech and offensive language in English. To ensure the portability of our model, five datasets from the literature were combined to train ProperBERT. The pooled dataset contains racist, homophobic, misogynistic, and generally offensive statements. Because these statements vary widely, differing mainly in the target of the hate and in how overt it is, the model trained on them is sufficiently robust. ProperBERT remains stable on datasets that were not used for training, while its compact size keeps it efficient to use. Portability tests on datasets not used for fine-tuning show that fine-tuning on large-scale, varied data increases model portability.

Original language: English
Title of host publication: International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-6
Number of pages: 6
ISBN (Electronic): 9781665470957
ISBN (Print): 978-1-6654-7096-4
Publication status: Published - 2022
Event: 2022 International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2022 - Male, Maldives
Duration: 16 Nov 2022 - 18 Nov 2022

Publication series

Name: International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2022

Conference

Conference: 2022 International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2022
Country/Territory: Maldives
City: Male
Period: 16.11.2022 - 18.11.2022

Keywords

  • BERT
  • hate speech detection
  • machine learning
