TLDR: This research introduces a computational framework to understand how domestic violence (DV) victims disclose their experiences on social media and receive support. It leverages Large Language Models (LLMs) to detect self-disclosure posts, clusters them into thematic topics, summarizes these topics, and extracts corresponding support provisions from community comments. The study, using Reddit data, identified 20 distinct DV topics and various forms of online support, aiming to enable more effective digital interventions for victims.
Domestic Violence (DV) is a widespread public health issue, characterized by patterns of coercive and abusive behavior within intimate relationships. In recent years, social media has become a crucial platform for DV victims to share their experiences, making online self-disclosure a vital, yet often overlooked, avenue for seeking help.
Despite the growing attention to DV, there has been a lack of comprehensive understanding regarding how victims disclose their experiences, the types of support available, and the connections between them. To address these gaps, researchers have developed a novel computational framework designed to model DV support-seeking behavior and community support mechanisms.
This innovative framework consists of four main components:
Self-Disclosure Detection
The first step involves identifying social media posts where individuals are personally disclosing their experiences with domestic violence. This is framed as a binary classification task, where a Large Language Model (LLM) is trained to distinguish between self-disclosure and non-self-disclosure content. The study utilized Flan-T5, an open-source LLM, for this purpose, achieving an average accuracy of over 82% in detecting such posts.
Post Clustering
Once self-disclosure posts are identified, they are grouped based on their semantic similarity to uncover meaningful thematic patterns. The Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) algorithm was employed, along with Sentence-BERT and UMAP for encoding and dimensionality reduction, allowing the identification of clusters with varying densities and the automatic classification of outliers.
Topic Summarization
To make the clustered posts interpretable, topic summarization is performed for each group. This step uses the GPT-4o model to generate concise summaries that capture the main themes or topics related to DV within each cluster. This helps translate raw data into actionable insights for intervention strategies and support services.
Also Read:
- Large Language Models Transform Security Operations Centers: A Survey of Capabilities and Future Directions
- Automating the Search for Conversational AI Errors
Support Extraction and Mapping
The final component focuses on identifying the types of support provided by community members in response to the DV self-disclosure posts. GPT-4o is again used to extract and summarize supportive content from user comments. This process involves filtering comments by karma scores to prioritize informative content, and then mapping these extracted support types back to the identified DV topics.
The researchers collected data from Reddit, a social media platform known for its anonymous nature, which encourages users to share sensitive experiences. They identified twelve subreddits thematically aligned with DV discourse, such as r/domesticviolence and r/abusiverelationships. A dataset of over 9,000 posts was curated, with a subset manually annotated to train the self-disclosure detection model.
The study successfully identified 20 distinct thematic groups of DV-related topics from self-disclosure posts. These included emotional turmoil in intimate relationships, custody and legal battles, dynamics of abuse and victimization, law enforcement involvement, internal conflict and self-doubt, and the emotional aftermath of love and loss. Smaller clusters also emerged, covering topics like gaslighting, pregnancy during abuse, and narcissistic abuse.
Correspondingly, various forms of online support were identified and mapped to these topics. Common support provisions included prioritizing safety and healing, seeking professional help, building a strong support system, empowering through knowledge, letting go of shame, and asserting legal and parental rights. Other supports were more specific, such as reclaiming independence, trusting one’s instincts, stopping the rationalization of abuse, and establishing boundaries.
This framework not only advances existing knowledge on DV self-disclosure and online support but also paves the way for victim-centered digital interventions. Future research aims to incorporate multimodal data (images, audio), extend the framework to other social platforms and languages, and integrate these insights into practical tools like AI-driven chatbots for real-time support. You can read the full research paper here.


