
HICOM: A New AI Framework Detects Multi-Face Deepfakes by Learning from Human Perception

TLDR: HICOM is a novel AI framework for detecting multi-face deepfake videos. It’s inspired by human cognitive patterns, leveraging four key cues: scene-motion coherence, inter-face appearance compatibility, interpersonal gaze alignment, and face-body consistency. HICOM significantly outperforms existing methods and even human detection in identifying deepfakes across various scenarios, including unseen datasets and real-world perturbations, while also offering interpretable results.

Deepfake technology has advanced rapidly, making it easier to create convincing fake videos, especially those featuring multiple altered faces in social settings. These multi-face deepfakes pose a significant threat, as they can be used for public manipulation and fraud. Traditional deepfake detection methods often fall short in these complex scenarios because they primarily focus on single faces and overlook crucial contextual information, such as how multiple faces interact within a scene.

Researchers at the National University of Singapore have developed a groundbreaking approach called HICOM (Human-Inspired, Context-Aware, Multi-face Deepfake Detection Framework) to tackle this challenge. Their work is unique because it draws inspiration from how humans naturally detect deepfakes in group settings. To understand human detection strategies, the researchers conducted extensive human studies, systematically observing how people identify fake faces in social contexts.

The quantitative analysis from these studies revealed four key cues that humans instinctively rely on:

Scene-Motion Coherence

This refers to the natural flow and consistency of movement within a scene. Deepfakes often introduce unnatural movements or misalignments between faces and their surroundings, which humans can pick up on.
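One way a scene-motion cue like this could be operationalized is to compare the average motion of a face region against the motion of its surrounding scene; a face drifting against the background's flow is suspicious. The sketch below is illustrative only (the inputs are assumed to come from some optical-flow estimator; this is not the paper's actual M1 implementation):

```python
import numpy as np

def motion_coherence_score(face_flow: np.ndarray, scene_flow: np.ndarray) -> float:
    """Cosine similarity between the average motion vector of a face
    region and that of its surrounding scene.

    face_flow, scene_flow: (H, W, 2) arrays of per-pixel (dx, dy)
    motion vectors (hypothetical inputs, e.g. from optical flow)."""
    face_mean = face_flow.reshape(-1, 2).mean(axis=0)
    scene_mean = scene_flow.reshape(-1, 2).mean(axis=0)
    denom = np.linalg.norm(face_mean) * np.linalg.norm(scene_mean)
    if denom == 0:
        return 1.0  # no motion anywhere: treat as coherent
    return float(face_mean @ scene_mean / denom)

# A face moving with the scene scores near 1; one moving against it scores near -1.
coherent = motion_coherence_score(np.full((8, 8, 2), 1.0), np.full((8, 8, 2), 1.0))
incoherent = motion_coherence_score(np.full((8, 8, 2), 1.0), np.full((8, 8, 2), -1.0))
```

A low score would then feed into a downstream classifier rather than trigger a verdict on its own.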

Inter-Face Appearance Compatibility

Humans notice discrepancies in appearance among faces in a group. Deepfaked faces might have inconsistencies in resolution, color, or lighting compared to genuine faces in the same scene, making them stand out.
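A toy version of this cue might compare simple brightness statistics across all face crops in a frame and flag outliers; a real detector would use much richer appearance features, so treat this as a minimal stand-in, not HICOM's M2:

```python
import numpy as np

def appearance_outlier(faces: list, z_thresh: float = 2.0) -> list:
    """Flag faces whose mean brightness deviates strongly from the
    other faces in the same frame (a toy proxy for inter-face
    appearance compatibility).

    faces: list of (H, W, 3) uint8 face crops."""
    means = np.array([f.mean() for f in faces])
    mu, sigma = means.mean(), means.std()
    if sigma == 0:
        return []  # identical statistics: nothing stands out
    z = np.abs(means - mu) / sigma
    return [i for i, s in enumerate(z) if s > z_thresh]

# Three similarly lit faces and one much brighter one (the last).
faces = [np.full((4, 4, 3), v, dtype=np.uint8) for v in (100, 105, 98, 230)]
flagged = appearance_outlier(faces, z_thresh=1.5)
```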

Interpersonal Gaze Alignment

Gaze direction is a vital social cue. Deepfakes frequently fail to maintain natural eye contact or gaze consistency, leading to mismatches in where a faked face is looking compared to others in the group or the camera.
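As a simplified 2D illustration of an interpersonal-gaze check (the angles, tolerance, and geometry here are assumptions for the sketch, not the paper's M3), one can test whether a face's gaze direction actually points toward the person it should be engaging with:

```python
import math

def gaze_mismatch(pos_a, gaze_a, pos_b, tol_deg: float = 15.0) -> bool:
    """Return True if face A's gaze direction misses face B's position
    by more than tol_deg degrees (toy 2D interpersonal-gaze check).

    pos_a, pos_b: (x, y) face positions; gaze_a: (dx, dy) gaze vector."""
    to_b = (pos_b[0] - pos_a[0], pos_b[1] - pos_a[1])
    ang_gaze = math.atan2(gaze_a[1], gaze_a[0])
    ang_to_b = math.atan2(to_b[1], to_b[0])
    # Wrap the angular difference into [-pi, pi] before comparing.
    diff = abs((ang_gaze - ang_to_b + math.pi) % (2 * math.pi) - math.pi)
    return math.degrees(diff) > tol_deg

# A looking straight at B is aligned; A looking perpendicular is a mismatch.
aligned = gaze_mismatch((0, 0), (1, 0), (5, 0))
mismatched = gaze_mismatch((0, 0), (0, 1), (5, 0))
```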



Face-Body Consistency

Many deepfake generation methods focus solely on the face, neglecting the body. This can result in inconsistencies between the generated face and the body, particularly in terms of age and gender, which humans can detect.
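A face-body consistency check could be sketched as comparing attribute estimates produced by separate face and body classifiers; the attribute dictionaries and tolerance below are hypothetical, not HICOM's actual M4 interface:

```python
def face_body_mismatch(face_attrs: dict, body_attrs: dict,
                       age_tol: int = 15) -> bool:
    """Flag an inconsistency when face-based and body-based attribute
    estimates disagree (hypothetical classifier outputs).

    Each attrs dict has an "age" (int) and a "gender" (str) key."""
    if face_attrs["gender"] != body_attrs["gender"]:
        return True  # gender disagreement is an immediate red flag
    return abs(face_attrs["age"] - body_attrs["age"]) > age_tol

# Close age estimates are consistent; a child's face on an adult body is not.
consistent = face_body_mismatch({"age": 30, "gender": "f"},
                                {"age": 34, "gender": "f"})
mismatch = face_body_mismatch({"age": 12, "gender": "m"},
                              {"age": 45, "gender": "m"})
```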

Guided by these human insights, the HICOM framework was designed with four specialized modules, each corresponding to one of these human-inspired cues. The Scene-Motion Module (M1) analyzes facial and contextual features over time to expose unnatural movements. The Inter-Face Appearance Module (M2) compares different faces within a frame to detect inconsistencies in their visual attributes. The Gaze Module (M3) focuses on eye regions to identify abnormal gaze alignments. Finally, the Body-Face Module (M4) assesses age and gender consistency between the face and body.

The modular design of HICOM ensures robustness; even if one module misses an anomaly, the others can still catch the remaining deepfake cues. The framework also incorporates a large language model (LLM) to provide human-readable explanations for its detection results, making the process more transparent and trustworthy for users.
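The robustness argument can be sketched with a simple fusion rule: flag a video if any single module is confident, so one module's miss cannot hide a cue another module caught. The any-module rule and module names below are illustrative assumptions, not the paper's actual fusion strategy:

```python
def fuse_module_scores(scores: dict, threshold: float = 0.5):
    """Combine per-module fake probabilities with an any-triggers rule.

    scores: module name -> fake probability in [0, 1] (hypothetical).
    Returns (is_fake, list of modules that fired), so the decision
    stays interpretable: you can see WHICH cue raised the alarm."""
    triggered = [name for name, s in scores.items() if s >= threshold]
    return (len(triggered) > 0, triggered)

# M1, M3, M4 see nothing unusual, but M2 spots a lighting mismatch.
is_fake, why = fuse_module_scores({
    "scene_motion": 0.2,
    "inter_face_appearance": 0.9,
    "gaze": 0.3,
    "face_body": 0.4,
})
```

A learned fusion layer would likely replace this hard threshold in practice, but the interpretability benefit (knowing which module fired) is the same.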

Extensive experiments on benchmark datasets demonstrated HICOM’s superior performance. It improved average accuracy by 3.3% in in-dataset detection and 2.8% under real-world perturbations. Crucially, HICOM outperformed existing methods by 5.8% on unseen datasets, highlighting the strong generalization of its human-inspired cues. The research also found that HICOM surpasses human detection capabilities in multi-face scenarios, proving its effectiveness in assisting users against the growing threat of deepfakes. For more details, you can refer to the full research paper: Seeing Through Deepfakes: A Human-Inspired Framework for Multi-Face Detection.

Ananya Rao
Ananya Rao is a tech journalist with a passion for dissecting the fast-moving world of Generative AI. With a background in computer science and a sharp editorial eye, she connects the dots between policy, innovation, and business. Ananya excels in real-time reporting and specializes in uncovering how startups and enterprises in India are navigating the GenAI boom. She brings urgency and clarity to every breaking news piece she writes. You can reach her at: [email protected]
