Teaching Machines to Know When They Don’t Know: A New Approach to AI Trustworthiness

TLDR: The research introduces a self-awareness mechanism for AI, where a supervising neural network (SNN) monitors an underlying convolutional neural network (CNN) ensemble to detect uncertainty in its face and facial expression recognition predictions. When the SNN identifies high uncertainty, it triggers an active learning mode, prompting the system to seek human assistance for clarification. This approach, which includes continuous learning and a novel loss layer with memory, significantly improves recognition accuracy, particularly in challenging “out-of-distribution” scenarios like faces with makeup or occlusions.

Imagine an artificial intelligence system that not only performs tasks but also understands when it’s unsure about its own predictions, much like a human reflecting on their thoughts. This concept, often associated with advanced Artificial General Intelligence (AGI), is being explored even in more focused machine learning applications. A recent research paper, Elements of Active Continuous Learning and Uncertainty Self-Awareness: a Narrow Implementation for Face and Facial Expression Recognition, delves into creating such a self-aware mechanism for face and facial expression recognition.

The core idea involves two interconnected artificial neural networks (ANNs). One is an underlying convolutional neural network (CNN) ensemble, which is the workhorse for identifying faces and their expressions. The other is a supervising ANN, which acts as a ‘self-awareness’ mechanism. This supervisor observes patterns in the activations of the underlying CNN ensemble, looking for signs that indicate high uncertainty in its predictions. When the supervisor detects such uncertainty, it essentially flags the prediction as less trustworthy.

This self-awareness ANN is not static; it has a memory where it stores information about its past performance. Its parameters are continuously adjusted during training to optimize its ability to identify untrustworthy predictions. When a prediction is deemed untrustworthy, it triggers an ‘active learning’ mode. In this mode, the machine learning algorithm gains a degree of agency, actively requesting human assistance – an ‘Oracle’ – to clarify the problematic image or situation. This is particularly valuable in scenarios where the AI is highly confused or uncertain.
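The control flow of this active learning mode can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `ActiveClassifier`, `predict`, `trust_score`, `ask_oracle`, and `oracle_budget` are hypothetical, and the supervisor is reduced to a function that returns a trust score between 0 and 1.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class ActiveClassifier:
    """Hypothetical wrapper: all field names are illustrative assumptions."""
    predict: Callable[[Any], str]        # stands in for the underlying CNN ensemble
    trust_score: Callable[[Any], float]  # stands in for the supervisor, range [0, 1]
    ask_oracle: Callable[[Any], str]     # stands in for the human 'Oracle'
    threshold: float = 0.5               # trustworthiness threshold
    oracle_budget: int = 1               # how many Oracle requests are still allowed

    def classify(self, sample: Any) -> Tuple[str, str]:
        label = self.predict(sample)
        # Trusted prediction: return the model's own answer.
        if self.trust_score(sample) >= self.threshold:
            return label, "trusted"
        # Untrusted prediction: ask the human if the budget allows it.
        if self.oracle_budget > 0:
            self.oracle_budget -= 1
            return self.ask_oracle(sample), "oracle"
        # Budget exhausted: return the answer, but flag it as untrusted.
        return label, "untrusted"
```

The budget reflects the idea that human help is scarce, so the system asks only for the cases it trusts least and otherwise returns its own answer with a warning flag.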

The researchers focused on face recognition (FR) and facial expression recognition (FER) tasks, which are inherently human-centric. State-of-the-art CNN models achieve high face recognition accuracy in ideal lab conditions, but their performance drops significantly on ‘out-of-distribution’ (OOD) data, such as faces with heavy makeup or occlusions. Facial expression recognition algorithms perform even worse, partly because human emotion recognition is highly context-dependent and relies on more than the configuration of facial features.

To address these challenges, the meta-learning supervisor ANN was designed to learn patterns associated with failed predictions in the underlying CNN models. It uses a unique ‘uncertainty shape descriptor’ (USD), which is built by sorting and rearranging the softmax activations from the CNN ensemble. This descriptor provides a class-invariant generalization, meaning it can identify uncertainty regardless of the specific face or expression being recognized.
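One plausible reading of "sorting and rearranging" is shown below. The exact ordering used in the paper is not given here, so this sketch assumes that each ensemble member's softmax vector is sorted from highest to lowest and that the members are then ordered by their sorted scores. The function name `uncertainty_shape_descriptor` is illustrative.

```python
from typing import List


def uncertainty_shape_descriptor(ensemble_softmax: List[List[float]]) -> List[float]:
    """Flatten ensemble softmax outputs into a class-invariant vector.

    Assumption: one row per ensemble member, one column per class.
    """
    # Sort each member's scores from highest to lowest. This discards
    # which class got which score and keeps only the shape of the distribution.
    sorted_rows = [sorted(row, reverse=True) for row in ensemble_softmax]
    # Order the members by their sorted scores (most confident first) so the
    # descriptor does not depend on the order of the ensemble members either.
    sorted_rows.sort(reverse=True)
    # Concatenate the rows into a single input vector for the supervisor.
    return [score for row in sorted_rows for score in row]
```

With this construction, two ensembles that are equally confident about different classes produce the same descriptor, which is the class-invariance property described above. A sharply peaked descriptor suggests agreement and confidence, while a flat one suggests uncertainty.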

A novel aspect of this solution is the ‘loss layer with memory’ within the meta-learning supervisor ANN. This memory collects statistical information about training results, including the prediction, the correct label, and a learnable ‘trustworthiness threshold’. This threshold is dynamically adjusted to optimize the system’s performance, offering a clear explanation for why a prediction was categorized as trusted or not.
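To make the role of the memory concrete, here is a deliberately simplified stand-in. In the paper the threshold is a learnable parameter of the loss layer; this sketch swaps that for a plain search over remembered trust scores, and the names `ThresholdMemory`, `record`, and `best_threshold` are hypothetical.

```python
from collections import deque
from typing import Deque, Tuple


class ThresholdMemory:
    """Hypothetical memory of past (trust score, was the prediction correct) pairs."""

    def __init__(self, capacity: int = 1000, default: float = 0.5) -> None:
        # A bounded deque drops the oldest records once capacity is reached.
        self.records: Deque[Tuple[float, bool]] = deque(maxlen=capacity)
        self.default = default

    def record(self, score: float, correct: bool) -> None:
        self.records.append((score, correct))

    def best_threshold(self) -> float:
        """Pick the remembered score that best separates correct from wrong."""
        if not self.records:
            return self.default

        def agreement(threshold: float) -> int:
            # Count records where "trusted" (score >= threshold) matches "correct".
            return sum((s >= threshold) == c for s, c in self.records)

        # Ties resolve to the lowest candidate because candidates are sorted.
        return max(sorted({s for s, _ in self.records}), key=agreement)
```

Because the threshold is a single number derived from recorded outcomes, it doubles as an explanation: a prediction was trusted because its score cleared the value that best separated past successes from past failures.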

Experiments used the Inception v3 CNN model as the underlying architecture and the BookClub artistic makeup data set, which is designed to exaggerate OOD conditions with varied makeup styles, occlusions, and expressions. The meta-learning SNN raised accuracy metrics for FR tasks by tens of percent and doubled them for FER tasks, especially under challenging OOD conditions. Active learning boosted FR accuracy metrics further, even when only a very small share of predictions (0.1% to 1%) could be referred to the ‘Oracle’, pushing trusted accuracy into the high 90% range at 1% of requests.

While online retraining offered minor improvements for unstructured test data, it proved valuable for structured test data, providing insights not just into uncertainty but also into the quality of the test session itself. This means the model can learn about the difficulty of certain images or even the ‘acting quality’ of expressions in a dataset.

In conclusion, this research demonstrates a practical implementation of self-awareness in narrow machine learning algorithms. By enabling AI to recognize its own uncertainty and actively seek help, the system becomes more robust and trustworthy, particularly in complex real-world scenarios like face and facial expression recognition with challenging visual variations. This work paves the way for more reliable and human-like AI systems that can continuously learn and adapt.

Nikhil Patel (https://blogs.edgentiq.com)
Nikhil Patel is a tech analyst and AI news reporter who brings a practitioner's perspective to every article. With prior experience working at an AI startup, he decodes the business mechanics behind product innovations, funding trends, and partnerships in the GenAI space. Nikhil's insights are sharp, forward-looking, and trusted by insiders and newcomers alike. You can reach him at: [email protected]
