Manara - Qatar Research Repository

The Landscape of Toxicity in Reddit: From Users to Conversations to Moderators

thesis
submitted on 2024-10-28, 07:06 and posted on 2024-11-04, 09:43 authored by Hind Ali Almerekhi
Online platforms like Reddit enable users to build communities and converse about diverse topics and interests. However, an increasing number of users publish disturbing posts and comments containing profanity, harassment, and hate speech, collectively known as toxic content. The toxic behavior of such users can change as they participate in multiple communities. Within communities, conversations can show early signs of toxicity when they contain causes (i.e., triggers) of toxicity. When toxicity increases, moderators often struggle to keep conversations in their communities safe. To address these issues, we first analyzed toxicity in the form of toxic user behavior. We found that 16.11% of cross-community users publish toxic posts and 13.28% of cross-community users publish toxic comments. In addition, 30.68% of users publishing posts and 81.67% of users publishing comments exhibit changes in their toxicity across different communities, indicating that users adapt their behavior to the communities’ norms. Next, we extracted a set of sentiment shift, topical shift, and context-based features from 991,806 conversation threads and used them to build a dual embedding biLSTM neural network that achieved an AUC score of 0.789. Our analysis showed that specific triggering keywords, like ‘racist’ and ‘women’, are common across all communities. Lastly, we performed a mixed-method study on a collection of 1,827 responses from Reddit moderators. The survey analysis found specific themes, such as experience and style, views on toxicity, and adherence to community guidelines, that influence both the toxicity of moderators and how they handle toxicity. This dissertation presents our approach, which builds on state-of-the-art toxic comment and toxicity trigger detection methods, and reports our findings from investigating toxicity across users and moderators on Reddit.


Language

  • English

Publication Year

  • 2022

License statement

© The author. The author has granted HBKU and Qatar Foundation a non-exclusive, worldwide, perpetual, irrevocable, royalty-free license to reproduce, display and distribute the manuscript in whole or in part in any form to be posted in digital or print format and made available to the public at no charge. Unless otherwise specified in the copyright statement or the metadata, all rights are reserved by the copyright holder. For permission to reuse content, please contact the author.

Institution affiliated with

  • Hamad Bin Khalifa University
  • College of Science and Engineering - HBKU

Degree Date

  • 2022

Degree Type

  • Doctorate

Advisors

J. Jansen Bernard ; Haewoon Kwak

Committee Members

Yin Yang ; Brahim Belhaouari Samir ; Muammer Koc ; S. Al-Maadeed Sumaya Ali ; M. Neves Joselia

Department/Program

College of Science & Engineering
