Advancing User Profiling with Confidence-Driven AI

TLDR: CONF-PROFILE is a new AI framework for user profiling that infers user attributes like age, gender, and occupation without needing extensive pre-labeled data. It uses a two-stage process: first, it synthesizes high-quality “pseudo-labels” with confidence scores from an advanced AI model, then refines a smaller AI model using confidence-guided reinforcement learning. The framework also introduces ProfileBench, a real-world benchmark from a video platform, to standardize evaluation. This approach significantly improves profiling accuracy and reliability, especially in data-scarce environments.

Understanding users is crucial for personalized services, from recommendations to social network analysis. This process, known as user profiling, involves inferring structural attributes like age, gender, industry, and life stage from various user information. While Large Language Models (LLMs) show great promise in this area, their application has been hampered by a lack of comprehensive benchmarks and the difficulty of obtaining large-scale, real-world ground-truth labels.

A recent research paper introduces CONF-PROFILE, a framework designed to tackle these challenges by enabling label-free, reliable user profiling. The paper also presents ProfileBench, an industry-level benchmark derived from a real-world video platform, Douyin. The benchmark provides a rich dataset of heterogeneous user information and a well-structured taxonomy for profiling.

Addressing the Label Scarcity Problem

The core innovation of CONF-PROFILE lies in its two-stage, confidence-driven approach. The first stage focuses on synthesizing high-quality labels without relying on extensive human annotations. It begins by using an advanced LLM (referred to as the “teacher LLM”) to generate initial profile tags, along with confidence scores and supporting evidence. To enhance reliability, the framework employs parallel sampling, where the teacher LLM makes multiple inferences for each user. These multiple predictions are then combined using confidence-weighted voting, ensuring that the final decision is influenced more by high-confidence inferences. Furthermore, a confidence calibration step is applied to balance the distribution of confidence scores, making them more informative for the profiling task. The high-quality pseudo-labels, calibrated confidence scores, and rationales are then used to fine-tune a more lightweight “student LLM” through Supervised Fine-Tuning (SFT), making the system efficient for deployment.
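To make the voting step concrete, here is a minimal sketch of confidence-weighted voting over several sampled inferences. It is an illustration only, not the paper's implementation: the function name, the (tag, confidence) input format, and the example values are assumptions made for this article.

```python
from collections import defaultdict


def confidence_weighted_vote(samples):
    """Combine (tag, confidence) pairs from several samples into one label.

    Hypothetical helper: each tag's score is the sum of the confidences
    of the samples that voted for it. Returns the winning tag and its
    share of the total confidence mass.
    """
    if not samples:
        raise ValueError("need at least one sample")
    weights = defaultdict(float)
    for tag, conf in samples:
        weights[tag] += conf
    total = sum(weights.values())
    best = max(weights, key=weights.get)
    share = weights[best] / total if total > 0 else 0.0
    return best, share


# Three sampled inferences for one user (made-up values).
# "B" has more votes, but "A" carries more confidence in total.
samples = [("A", 0.9), ("B", 0.3), ("B", 0.3)]
label, share = confidence_weighted_vote(samples)
print(label, round(share, 2))
```

In this example a plain majority vote would pick "B", while the weighted vote picks "A", which is the sense in which high-confidence inferences influence the final decision more.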

Enhancing Reasoning with Confidence-Guided Learning

The second stage of CONF-PROFILE further refines the student LLM’s reasoning abilities using a confidence-guided unsupervised reinforcement learning (RL) approach. In a real-world setting, where ground-truth labels are scarce, this stage is critical. Confidence plays a multifaceted role here: it’s used for difficulty filtering, allowing the system to focus on informative, moderately difficult samples for training. It also helps generate “quasi-groundtruth” labels on the fly by aggregating multiple rollouts of the model’s predictions using confidence-weighted voting. Finally, a unique confidence-guided reward mechanism is introduced. This mechanism makes the model more sensitive to its own certainty, applying larger rewards or penalties to high-confidence predictions, thus strengthening its ability to make reliable inferences. The confidence values used for weighting rewards are “frozen” to prevent the model from artificially inflating its confidence scores.
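The three roles of confidence described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's method: the article does not specify how difficulty is measured, so the sketch uses the winning tag's share of confidence as a stand-in, and the thresholds (0.3 and 0.9), the signed-reward form, and all function names are hypothetical.

```python
from collections import defaultdict


def weighted_vote(rollouts):
    """Return (winning_tag, share of confidence mass) for (tag, confidence) pairs."""
    if not rollouts:
        raise ValueError("need at least one rollout")
    weights = defaultdict(float)
    for tag, conf in rollouts:
        weights[tag] += conf
    total = sum(weights.values())
    best = max(weights, key=weights.get)
    return best, (weights[best] / total if total > 0 else 0.0)


def is_informative(agreement, low=0.3, high=0.9):
    """Assumed difficulty filter: keep samples of moderate agreement.

    Very low agreement suggests the sample is too hard or noisy;
    very high agreement suggests it is too easy to teach anything.
    """
    return low <= agreement <= high


def confidence_guided_reward(pred, quasi_label, confidence):
    """Assumed reward: +confidence if the rollout matches the quasi-label,
    -confidence otherwise. The confidence is treated as a fixed number
    (the "frozen" value), so it only scales the reward and is not itself
    something the policy is optimised to raise.
    """
    sign = 1.0 if pred == quasi_label else -1.0
    return sign * confidence


# Four rollouts for one user and one dimension (made-up values).
rollouts = [("teacher", 0.8), ("teacher", 0.6), ("student", 0.9), ("teacher", 0.5)]
quasi_label, agreement = weighted_vote(rollouts)
keep = is_informative(agreement)
rewards = [confidence_guided_reward(tag, quasi_label, conf) for tag, conf in rollouts]
print(quasi_label, keep, rewards)
```

Note how the confident but wrong rollout ("student", 0.9) receives the largest penalty, while the least confident correct rollout receives the smallest reward.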

ProfileBench: A Realistic Benchmark

ProfileBench is a significant contribution of this research. It’s built from a large-scale industrial video platform and integrates diverse user information, including behavioral cues (keyphrases from watched, searched, or submitted content) and demographic cues (self-reported age, gender, region, signature). The benchmark defines a systematic six-dimensional tagging taxonomy: Gender, Age, Industry, Occupation, Education Level, and Life Stage. Crucially, each dimension includes an “Unknown (NA)” option, allowing the model to abstain from predictions when evidence is insufficient, directly supporting confidence-aware evaluation. This dataset, with its 1,000 human-annotated samples for evaluation, reflects the complexities and challenges of real-world user profiling, making it a robust tool for advancing label-free methods.
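One way to picture the taxonomy is as a fixed set of six dimensions, each of which may fall back to the "Unknown (NA)" option. The sketch below is an assumption-laden illustration: only the dimension names and the NA option come from the description above, while the helper names and the example tag values are hypothetical and are not taken from ProfileBench.

```python
# Dimension names as listed in the article; everything else is illustrative.
DIMENSIONS = ("Gender", "Age", "Industry", "Occupation", "Education Level", "Life Stage")
NA = "Unknown (NA)"


def normalize_profile(raw):
    """Return a profile with all six dimensions, abstaining where no tag is given."""
    unexpected = set(raw) - set(DIMENSIONS)
    if unexpected:
        raise ValueError(f"unexpected dimensions: {sorted(unexpected)}")
    return {dim: raw.get(dim) or NA for dim in DIMENSIONS}


def coverage(profile):
    """Fraction of dimensions for which the model did not abstain."""
    answered = sum(1 for value in profile.values() if value != NA)
    return answered / len(profile)


# Hypothetical model output that commits to only two dimensions.
profile = normalize_profile({"Gender": "Female", "Age": "25-34"})
print(profile["Industry"], coverage(profile))
```

Treating abstention as an explicit value, rather than a missing key, keeps every profile the same shape and makes it straightforward to count how often a model declines to answer.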

Impressive Performance Gains

Experimental results show substantial gains from CONF-PROFILE. The framework improved F1 by 13.97 points on the Qwen3-8B model: the SFT stage alone raised the average F1 by 10.61 points, and the subsequent RL stage added a further 3.36. Both stages therefore contribute, and confidence plays a critical role in achieving robust, accurate multi-dimensional user profiling. The research also shows that confidence is an effective indicator of prediction difficulty, so setting a confidence threshold allows flexible precision-recall trade-offs.
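The threshold trade-off can be illustrated with a small sketch. It assumes a simplified, per-sample notion of precision and recall in which predictions below the threshold become abstentions; it is not the paper's evaluation protocol, and the data and function name are made up for illustration.

```python
def precision_recall(predictions, threshold):
    """Compute precision and recall when low-confidence predictions abstain.

    predictions: list of (predicted_tag, confidence, gold_tag) triples.
    Precision = correct / answered; recall = correct / all samples.
    Returns precision 0.0 when nothing is answered (a convention chosen here).
    """
    answered = [(pred, gold) for pred, conf, gold in predictions if conf >= threshold]
    correct = sum(1 for pred, gold in answered if pred == gold)
    precision = correct / len(answered) if answered else 0.0
    recall = correct / len(predictions) if predictions else 0.0
    return precision, recall


# Made-up predictions: (predicted tag, confidence, gold tag).
data = [
    ("A", 0.9, "A"),
    ("B", 0.8, "B"),
    ("A", 0.6, "B"),
    ("B", 0.4, "B"),
    ("A", 0.3, "B"),
]

for t in (0.0, 0.5, 0.7):
    p, r = precision_recall(data, t)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, raising the threshold from 0.0 to 0.7 lifts precision from 0.60 to 1.00 while recall drops from 0.60 to 0.40, which is the kind of trade-off a confidence threshold makes adjustable.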

This work establishes a strong foundation for user profiling and opens promising directions for applying LLMs to broader user modeling tasks, especially in scenarios where labeled data is scarce. For more technical details, refer to the full research paper.
