Ethics & Society
AI Based on Graphical Structure Identifies Inappropriate Language Online
A new AI method aims to identify inappropriate language online by exploiting the relationships between messages rather than the text alone. The research focuses on situations where large language models fall short in accuracy or efficiency and proposes a structure-based approach instead.
The work targets the English-language Wikipedia community, where incivility—such as toxicity, aggression, and personal attacks—burdens both users and volunteer moderators. Current automatic detection systems often lag in both accuracy and computational efficiency.
The researchers introduce a graph neural network, an AI model that treats comments not in isolation but as part of a broader network. Each user comment is represented as a node, and edges between nodes are defined by the similarity of the texts. In this way, the model learns simultaneously from the content of the language and from how comments relate to one another.
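To make the idea concrete, the following minimal sketch (not the authors' implementation; the comments, similarity threshold, and TF-IDF features are illustrative assumptions) builds a comment graph from text similarity and applies one graph-convolution step so each comment's representation mixes with those of its neighbours:

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

comments = [
    "Thanks for the helpful edit!",
    "This revert is vandalism, stop it.",
    "You clearly have no idea what you're doing.",
    "Let's discuss the sources on the talk page.",
]

# Node features: TF-IDF vectors for each comment (the paper may use richer text embeddings).
X = TfidfVectorizer().fit_transform(comments).toarray()

# Edges: connect comments whose cosine similarity exceeds a chosen threshold (assumed value).
sim = cosine_similarity(X)
A = (sim > 0.1).astype(float)
np.fill_diagonal(A, 1.0)          # self-loops so each node keeps its own features

# One graph-convolution step: H = relu(D^-1 A X W), i.e. mean aggregation over neighbours.
D_inv = np.diag(1.0 / A.sum(axis=1))
rng = np.random.default_rng(0)
W = rng.normal(size=(X.shape[1], 16))   # learnable weights in a real model
H = np.maximum(D_inv @ A @ X @ W, 0.0)

# H now holds structure-aware comment representations; a classifier head on top
# would predict labels such as toxicity, aggression, or personal attack.
print(H.shape)

In a full model such layers would be stacked and trained end to end on labelled comments, so the prediction for one comment also reflects the comments it is connected to.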
This approach allows incivility to be assessed not only from individual words or sentences but from the structure of the entire discussion. The goal is to support the moderation of online communities with a system that detects toxicity, aggression, and personal attacks more reliably and at lower computational cost.
The research thus suggests an alternative direction for automatic moderation at a time when attention has largely focused on large language models. The graph neural network builds the structure of the discussion into the prediction task, which can be decisive in large and active online communities.
Source: When Large Language Models Do Not Work: Online Incivility Prediction through Graph Neural Networks, arXiv (AI).
This text was generated with AI assistance and may contain errors. Please verify details from the original source.
Original research: When Large Language Models Do Not Work: Online Incivility Prediction through Graph Neural Networks
Publisher: arXiv (AI)
Authors: Zihan Chen, Lanyu Yu
December 28, 2025