TLDR: This research investigates the use of synthetic data to train face recognition models, focusing on accuracy and bias. By generating a demographically balanced dataset (FairFaceGen) with the Flux.1-dev and Stable Diffusion v3.5 (SD35) generators, combined with several identity-augmentation methods, the study found that while synthetic data currently lags behind real data in generalization, it shows significant potential for bias mitigation, especially with SD35. The number and quality of intra-class augmentations also critically affect performance and fairness, suggesting that careful design choices and hybrid training approaches are key to developing fairer and more accurate face recognition systems.
Face recognition technology has become ubiquitous, but its development often faces significant hurdles related to data. Traditional methods rely heavily on large datasets of real facial images, which come with inherent challenges such as privacy concerns, legal restrictions like GDPR, and the potential for embedded biases. Imagine trying to gather millions of diverse, real-world face images while ensuring everyone's privacy is protected and the dataset is perfectly balanced across demographics: it's a monumental task.
This is where synthetic data steps in as a promising alternative. Synthetic data, artificially generated, offers the potential to create vast, diverse datasets without infringing on individual privacy. It also provides a unique opportunity to control demographic attributes, which could be key to mitigating biases in face recognition systems. However, a crucial question remains: can synthetic data truly deliver both high accuracy and fairness?
A recent research paper, "Investigation of Accuracy and Bias in Face Recognition Trained with Synthetic Data," by Pavel Korshunov, Ketan Kotwal, Christophe Ecabert, Vidit Vidit, Amir Mohammadi, and Sébastien Marcel, delves deep into this question. The researchers systematically evaluated the impact of synthetic data on both the performance and fairness of face recognition systems; the full paper is available online.
Generating Fairer Faces
The core of their work involved creating a demographically balanced synthetic dataset called FairFaceGen. To achieve this, they utilized two cutting-edge text-to-image generators: Flux.1-dev and Stable Diffusion v3.5 (SD35). These “seed generators” were used to create distinct identities. To add variety to each identity (like different poses, lighting, and expressions), they combined these with several “identity augmentation methods,” including Arc2Face and various IP-Adapters.
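The paper does not publish its prompt templates, but the idea of a demographically balanced seed set can be sketched in a few lines. The attribute categories and prompt wording below are illustrative assumptions, not the ones used for FairFaceGen; the point is that every demographic combination receives exactly the same number of seed identities:

```python
from itertools import product

# Hypothetical demographic attributes; the actual categories and prompt
# wording used for FairFaceGen are assumptions made for illustration.
GENDERS = ["male", "female"]
ETHNICITIES = ["African", "Asian", "Caucasian", "Indian"]
AGE_GROUPS = ["young adult", "middle-aged", "elderly"]

def balanced_identity_prompts(identities_per_group: int) -> list[str]:
    """Build an equal number of seed prompts for every demographic cell,
    so each (gender, ethnicity, age) combination is represented equally."""
    prompts = []
    for gender, ethnicity, age in product(GENDERS, ETHNICITIES, AGE_GROUPS):
        for i in range(identities_per_group):
            prompts.append(
                f"photo of the face of a {age} {ethnicity} {gender} person, "
                f"unique individual {i}, neutral expression"
            )
    return prompts

prompts = balanced_identity_prompts(identities_per_group=5)
# 2 genders x 4 ethnicities x 3 age groups x 5 identities each = 120 prompts
print(len(prompts))
```

Each prompt would then be fed to a seed generator (Flux.1-dev or SD35) to produce one base image per identity, before the augmentation stage adds pose, lighting, and expression variation.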
A key aspect of their methodology was ensuring fair comparisons. They maintained an equal number of identities across their synthetic and real datasets. This meticulous approach allowed them to accurately assess how synthetic data impacts face recognition performance on standard benchmarks like LFW and AgeDB-30, as well as more challenging ones like IJB-B/C. Bias was specifically evaluated using the Racial Faces in the Wild (RFW) dataset.
Key Findings on Accuracy and Bias
The study yielded several important insights. While synthetic data still lags behind real datasets in terms of generalization, particularly on complex benchmarks like IJB-B/C, the demographically balanced synthetic datasets, especially those generated with SD35, showed significant potential for reducing bias. This suggests that carefully constructed synthetic data can indeed lead to fairer face recognition systems.
Another critical observation was the influence of intra-class augmentations – the variations generated for each identity. The number and quality of these augmentations significantly affected both the accuracy and fairness of the face recognition models. For instance, increasing the number of images per identity from 8 to 16 generally improved performance, but further increases to 24 or 32 images per identity could sometimes lead to a drop in performance on the most challenging benchmarks, particularly for SD35-based data.
When it came to bias mitigation, SD35-based synthetic data consistently achieved better fairness metrics, even outperforming some real datasets in terms of lower standard deviation across racial groups. The researchers suggest this might be because SD35 generates images that look more like "in-the-wild" photos, offering greater visual diversity compared to the more professional-looking portraits generated by Flux.
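The fairness measure referenced above, standard deviation of accuracy across demographic groups, is simple enough to sketch directly. The group names match RFW's categories, but the accuracy numbers below are made up for illustration and are not results from the paper:

```python
import statistics

def fairness_summary(group_accuracy: dict[str, float]) -> tuple[float, float]:
    """Return (mean accuracy, population std across groups).
    A lower std means more uniform performance across demographic
    groups, i.e. a less biased model."""
    accs = list(group_accuracy.values())
    return statistics.mean(accs), statistics.pstdev(accs)

# Illustrative numbers only, not measurements from the paper.
rfw_accuracy = {
    "African": 0.89,
    "Asian": 0.91,
    "Caucasian": 0.95,
    "Indian": 0.92,
}
mean_acc, bias_std = fairness_summary(rfw_accuracy)
print(f"mean accuracy = {mean_acc:.4f}, std across groups = {bias_std:.4f}")
```

Two models with the same mean accuracy can differ sharply on this metric, which is why the paper reports it alongside benchmark accuracy rather than instead of it.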
Looking Ahead: Hybrid Approaches
The findings from this research provide valuable practical guidelines for building fairer face recognition systems using synthetic data. The paper concludes by highlighting the importance of thoughtful design choices for both seed and augmentation generators. It also points towards hybrid training approaches, combining both synthetic and real data, as a promising path forward to achieve the best of both worlds: high performance and reduced bias in face recognition systems.


