TL;DR: RedOne is a new domain-specific large language model designed for social networking services (SNS). It uses a three-stage training process (continue pretraining, supervised fine-tuning, and preference optimization) with large-scale real-world SNS data. RedOne significantly improves performance across various SNS tasks, such as content management and user interaction, while maintaining strong general language capabilities. Online tests show it reduces harmful content exposure and boosts user engagement in search.
In the rapidly evolving landscape of modern information dissemination, social networking services (SNS) have become central to how we communicate, share knowledge, and express emotions. However, the unique characteristics of SNS data—its informality, context-sensitivity, and often emotionally charged nature—pose significant challenges for traditional systems for content management and interaction-quality improvement.
While large language models (LLMs) have shown immense potential, existing solutions often focus on isolated tasks, struggling to adapt flexibly to the diverse, real-world contexts of social media. This limitation highlights a crucial gap: the inability of current SNS domain-specific models to incorporate a broader range of domain knowledge during their training.
Introducing RedOne: A Specialized LLM for Social Media
To address these challenges, researchers from Xiaohongshu Inc. have introduced RedOne, a groundbreaking domain-specific LLM designed to overcome the performance bottlenecks of single-task baselines and establish a comprehensive foundation for SNS. RedOne is built through a meticulous three-stage post-training strategy, leveraging a massive dataset derived from real-world social media interactions.
The Three-Stage Training Strategy
RedOne’s development involves a sophisticated pipeline to ensure it excels in the SNS domain while retaining strong general language capabilities:
1. Continue Pretraining (CPT): This initial stage focuses on enriching the model’s understanding of nuanced SNS field knowledge. It involves collecting and constructing data from both general high-quality open-source corpora (to preserve foundational generalization abilities) and large-scale SNS-specific domain data. The SNS data captures diverse communication patterns, including informal discussions, short-form comments, sarcasm, and emotionally charged content. Crucially, user interaction data is incorporated to guide the training process, naturally clustering semantically related SNS content. A rigorous data-filtering pipeline is then applied to ensure high-quality data for training.
2. Supervised Fine-Tuning (SFT): Following pretraining, this stage sharpens RedOne’s ability to follow instructions for specific real-world SNS applications. It utilizes extensive user-generated content from public platforms, such as notes, comments, queries, and interaction logs, preserving the typical linguistic style of SNS. The SFT process consolidates six core capabilities essential for SNS: content understanding, information extraction, semantic matching, user behavior modeling, dialogue and persona simulation, and translation. To prevent catastrophic forgetting and maintain generalization, open-source instruction data covering general tasks is also incorporated. A two-step mixed fine-tuning strategy is employed, initially combining SNS and general data, then focusing more heavily on SNS data to enhance domain-critical tasks.
3. Preference Optimization (PO): The final stage addresses the challenge that SNS tasks often admit multiple plausible outputs of varying quality. While SFT improves instruction-following, it doesn’t fully exploit implicit preference signals. RedOne uses Direct Preference Optimization (DPO) to align the model’s behavior with human preferences and leverage information embedded in data labels. For subjective tasks like emotional dialogue, domain experts create preference annotations, which are then scaled up using high-performing judge models. For objective tasks with definitive answers, preference pairs are constructed from the inherent structure of questions (correct answers vs. incorrect options) and from model errors, using the ground truth as the positive example and incorrect predictions as negatives.
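The article doesn’t specify what the CPT stage’s “rigorous data-filtering pipeline” checks, but a minimal sketch, assuming simple length bounds and exact-duplicate removal as stand-ins for the real quality criteria, could look like this:

```python
import hashlib

def filter_corpus(docs, min_chars=20, max_chars=20000):
    """Illustrative quality filter for continue-pretraining data:
    keep documents within a length window and drop exact duplicates.
    The real pipeline's criteria are not described in the article;
    these heuristics are assumptions for illustration only."""
    seen, kept = set(), []
    for doc in docs:
        text = doc.strip()
        if not (min_chars <= len(text) <= max_chars):
            continue  # too short (likely noise) or too long
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate of an earlier document
        seen.add(digest)
        kept.append(text)
    return kept
```

A production pipeline would add near-duplicate detection, language identification, and model-based quality scoring on top of such basic heuristics.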
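The two-step mixed fine-tuning strategy from the SFT stage can be sketched as a data-mixing schedule. The 0.5/0.8 ratios, the `budget` parameter, and the function name below are illustrative assumptions, not values reported for RedOne:

```python
import random

def build_sft_mixture(sns_data, general_data, step, budget=1000, seed=0):
    """Sketch of a two-step mixed fine-tuning schedule: step 1 blends
    SNS and general instruction data to guard against catastrophic
    forgetting; step 2 up-weights SNS data to enhance domain-critical
    tasks. The 0.5/0.8 ratios are hypothetical."""
    sns_ratio = 0.5 if step == 1 else 0.8
    n_sns = min(len(sns_data), int(budget * sns_ratio))
    n_gen = min(len(general_data), budget - n_sns)
    rng = random.Random(seed)
    mix = rng.sample(sns_data, n_sns) + rng.sample(general_data, n_gen)
    rng.shuffle(mix)  # interleave the two sources for training
    return mix
```

For example, with a budget of 100 examples, step 1 yields a 50/50 blend while step 2 shifts to 80 SNS examples against 20 general ones.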
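For the objective-task side of preference optimization, building DPO pairs from question structure and model errors can be sketched as below; the field names (`question`, `answer`, `options`, `model_predictions`) are assumptions about the data layout, not the paper’s schema:

```python
def build_preference_pairs(example):
    """Construct DPO-style (prompt, chosen, rejected) triples for an
    objective task: the ground-truth answer is the preferred response,
    and each incorrect option or wrong model prediction becomes a
    rejected response. Field names are illustrative assumptions."""
    prompt = example["question"]
    chosen = example["answer"]
    rejected = set(example.get("options", [])) - {chosen}
    rejected |= {p for p in example.get("model_predictions", []) if p != chosen}
    return [
        {"prompt": prompt, "chosen": chosen, "rejected": r}
        for r in sorted(rejected)
    ]
```

Each resulting triple can feed a standard DPO trainer, which nudges the model to assign higher likelihood to the chosen response than to the rejected one.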
Remarkable Performance and Real-World Impact
Through extensive experiments, RedOne has demonstrated exceptional effectiveness. It not only maintains strong general capabilities, often surpassing its base models on general tasks, but also achieves an average improvement of up to 14.02% across 8 major SNS tasks and 7.56% in SNS bilingual evaluation benchmarks compared to base models.
The practical utility of RedOne has been validated through online testing in real-world SNS scenarios. In harmful content detection, RedOne reduced the exposure rate of harmful notes by 11.23%, significantly enhancing platform security. For post-view search recommendations, the model delivered a 14.95% increase in click page rate, indicating improved content discovery and user engagement. These results underscore RedOne’s robustness and promising applicability in real-world social media environments.
Furthermore, comparisons show that RedOne provides a stronger foundation for task-specific fine-tuning than general-purpose large models, consistently outperforming them. This indicates that domain-specific post-training is a powerful approach for improving both zero-shot capabilities and fine-tuned performance.
RedOne represents a significant step forward in developing specialized LLMs for social media, offering a comprehensive and robust baseline for future SNS applications. You can learn more about this research in the RedOne research paper.


