TLDR: EZ-Sort is a novel framework designed to make subjective data annotation more efficient. It achieves this by combining a zero-shot CLIP-based pre-ordering system, which uses hierarchical prompting to roughly sort items, with an uncertainty-aware human-in-the-loop MergeSort algorithm. This approach significantly reduces the number of human annotations required (up to 90.5% less than exhaustive comparisons and 19.8% less than previous sorting methods for n=100) while maintaining or improving inter-rater reliability across diverse tasks like face-age estimation, historical image chronology, and retinal image quality assessment.
Subjective data annotation tasks, such as assessing image quality or estimating age from faces, often rely on pairwise comparisons because they offer better reliability than simple ratings. However, traditional exhaustive pairwise comparisons demand a massive number of annotations, scaling quadratically with the number of items (O(n²)), making them impractical for large datasets.
Recent advancements have reduced this burden significantly by using sorting algorithms to actively sample comparisons, bringing the cost down to O(n log n). Building on this, a new framework called EZ-Sort further enhances efficiency by integrating artificial intelligence with human expertise.
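To make that gap concrete, here is a quick back-of-the-envelope comparison. The worst-case MergeSort count below is the standard textbook formula, not a figure from the paper:

```python
import math

def exhaustive_pairs(n: int) -> int:
    """Exhaustive annotation: every unordered pair is compared once, n*(n-1)/2."""
    return n * (n - 1) // 2

def mergesort_worst_case(n: int) -> int:
    """Worst-case MergeSort comparisons: n*ceil(log2 n) - 2**ceil(log2 n) + 1."""
    k = math.ceil(math.log2(n))
    return n * k - 2 ** k + 1

print(exhaustive_pairs(100))      # 4950 comparisons
print(mergesort_worst_case(100))  # 573 comparisons
```

Even before any comparisons are automated, sorting-based sampling needs roughly an order of magnitude fewer human judgments at n=100, and the gap widens as n grows.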
Developed by Yujin Park and Haejun Chung from Hanyang University, and Ikbeom Jang from Hankuk University of Foreign Studies, EZ-Sort introduces two key innovations: first, it roughly pre-orders items using the Contrastive Language-Image Pre-training (CLIP) model in a zero-shot manner, without any task-specific training; second, it automates easy, clear-cut comparisons and reserves human input for only the most uncertain cases. This hybrid approach drastically cuts human annotation cost while maintaining or even improving the quality of the results.
How EZ-Sort Works
The EZ-Sort framework operates in three main stages:
1. CLIP-based Zero-Shot Pre-Ordering: This initial stage uses CLIP, a powerful vision-language model, to perform a rough, semantic pre-ordering of images. It employs a hierarchical prompting strategy, which recursively groups unsorted images using binary prompts. This method mimics how humans might categorize items from coarse to fine, making decisions at multiple levels to improve accuracy.
2. Bucket-Aware Elo Score Initialization: After the hierarchical pre-ordering, the fine-grained groups are merged into a smaller number of coarse ‘buckets’. Each image is then assigned an Elo score, a rating system commonly used in competitive games, based on its bucket ID and the confidence level from the CLIP model. This provides a strong starting point for the sorting process.
3. Uncertainty-Guided Human-in-the-Loop MergeSort: The final stage employs an uncertainty-aware MergeSort algorithm. Instead of asking humans to compare every pair, EZ-Sort selectively routes only high-uncertainty comparisons to human annotators. Comparisons where the model is highly confident are resolved automatically. This intelligent allocation of human effort ensures that the overall process remains efficient, preserving the optimal O(n log n) complexity of MergeSort.
Significant Efficiency Gains and Reliability
EZ-Sort was validated across various datasets, including face-age estimation (FGNET), historical image chronology (DHCI), and retinal image quality assessment (EyePACS). The results were compelling:
- It reduced human annotation cost by an impressive 90.5% compared to exhaustive pairwise comparisons.
- Compared to prior state-of-the-art sorting-based methods, EZ-Sort achieved a 19.8% reduction in human annotation cost for datasets with 100 items.
- Crucially, these efficiency gains were achieved while improving or maintaining inter-rater reliability, especially in ambiguous tasks like retinal image quality assessment.
- An ablation study confirmed that the hierarchical prompting strategy significantly improved correlation with ground truth labels compared to simpler, flat prompting methods.
The framework’s ability to combine strong CLIP-based priors with an intelligent, uncertainty-aware sampling strategy makes it a highly efficient and scalable solution for pairwise ranking tasks, particularly in domains where expert annotation is scarce and costly.
Future Directions and Availability
While EZ-Sort offers substantial benefits, the authors acknowledge its limitations, such as its dependence on the reliability of the underlying vision-language model and potential struggles in domains with very subtle visual distinctions. Future work includes integrating annotator reliability models, validating scalability on even larger datasets, and exploring few-shot fine-tuning for less common domains.
The code for EZ-Sort is publicly available, allowing researchers and practitioners to implement and build upon this innovative approach. You can find more details about this research in the full paper: EZ-Sort: Efficient Pairwise Comparison via Zero-Shot CLIP-Based Pre-Ordering and Human-in-the-Loop Sorting.