TLDR: A new AI framework called ICAF addresses the challenge of segmenting defects in CdZnTe semiconductors by leveraging multiple image views of the same sample. Unlike traditional methods that process images individually, ICAF mimics human annotators by cross-referencing these views to improve the accuracy of defect detection, especially in low-contrast areas, even with very limited labeled data.
Detecting tiny defects in advanced materials like Cadmium Zinc Telluride (CdZnTe) semiconductors is crucial for quality control, but it’s also incredibly challenging. These materials are used in high-tech applications like X-ray detectors, and their images often show low-contrast defect boundaries. Human annotators typically need to look at multiple images of the same sample, taken from different angles or lighting conditions, to accurately identify a single defect. This “many-to-one” relationship, where several views correspond to one true defect map, is a unique hurdle for automated systems.
Traditional artificial intelligence (AI) methods for semantic segmentation, which involves pixel-level classification to identify objects or defects, usually work on a “one-to-one” basis. This means they expect each image to have its own distinct ground truth label. When applied to CdZnTe images, these methods struggle because they can’t effectively use the complementary information from multiple views. This often leads to errors accumulating, especially in hard-to-see areas, and can even reinforce incorrect predictions, a problem known as confirmation bias.
To overcome these limitations, researchers have developed an innovative solution called the Intra-group Consistency Augmentation Framework (ICAF). This new AI approach is inspired by how human experts tackle the problem. Instead of treating each image in isolation, ICAF processes a “group” of images that all belong to the same semiconductor sample and share a single ground truth defect map. By doing so, it can harness the rich boundary information available across these different views.
How ICAF Works
ICAF operates on two main principles. First, it establishes a “group-oriented baseline” through a process called Intra-group View Sampling (IVS). This involves taking several views from a CdZnTe group and using their inherent consistency as a form of data augmentation. Essentially, because different views of the same defect should lead to the same prediction, enforcing that agreement during training makes the model more robust.
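To make the consistency idea concrete, here is a minimal sketch of an intra-group consistency penalty: per-view predictions for the same sample are pushed toward their group mean. This is an illustrative surrogate in NumPy, not the paper’s actual loss; the array shapes and the squared-deviation form are assumptions.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def intra_group_consistency_loss(view_logits):
    """Penalize disagreement among per-view predictions of one sample group.

    view_logits: array of shape (V, H, W, C) -- V views of the same sample.
    Returns the mean squared deviation of each view's softmax prediction
    from the group-mean prediction (a simple consistency surrogate).
    """
    probs = softmax(view_logits, axis=-1)          # (V, H, W, C)
    mean_pred = probs.mean(axis=0, keepdims=True)  # (1, H, W, C)
    return float(((probs - mean_pred) ** 2).mean())

# Identical views incur zero penalty; disagreeing views incur a positive one.
rng = np.random.default_rng(0)
same = np.repeat(rng.normal(size=(1, 4, 4, 2)), 3, axis=0)
diff = rng.normal(size=(3, 4, 4, 2))
assert intra_group_consistency_loss(same) < 1e-12
assert intra_group_consistency_loss(diff) > 0
```

A term like this would be added to the ordinary supervised loss, so views that should agree are explicitly trained to agree.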
Second, ICAF introduces a Pseudo-label Correction Network (PCN) to significantly improve the quality of the AI’s initial predictions, known as pseudo-labels. The PCN has two key modules:
View Augmentation Module (VAM): This module dynamically synthesizes a “boundary-aware view” by intelligently combining features from multiple input views. Think of it as creating a super-view that highlights all the subtle defect boundaries that might be faint or invisible in individual images. This enhanced view helps refine the pseudo-labels.
View Correction Module (VCM): After the VAM creates the boundary-aware view, the VCM pairs this enhanced view with other original views from the group. It then facilitates an “information interaction” between them, allowing the system to focus on important regions while filtering out noise. This further corrects and refines the pseudo-labels, making them much more accurate.
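The two modules above can be caricatured in a few lines. The sketch below is a toy stand-in, not the published architecture: the attention-style view weighting in `vam_fuse` and the sigmoid gating in `vcm_refine` are assumptions chosen only to illustrate “fuse views into one enhanced map, then let it interact with the originals.”

```python
import numpy as np

def vam_fuse(view_feats):
    """VAM-style sketch: fuse V per-view feature maps into one
    'boundary-aware' map via a softmax weighting over views. The weights
    here come from each view's mean activation -- a stand-in for a
    learned scoring network.

    view_feats: array of shape (V, H, W, C).
    """
    scores = view_feats.mean(axis=(1, 2, 3))             # (V,)
    w = np.exp(scores - scores.max())
    w = w / w.sum()                                      # softmax over views
    return np.tensordot(w, view_feats, axes=([0], [0]))  # (H, W, C)

def vcm_refine(fused, view_feats):
    """VCM-style sketch: the fused view gates each original view
    (sigmoid attention), emphasizing shared salient regions and damping
    noise, then the gated views are averaged.
    """
    gate = 1.0 / (1.0 + np.exp(-fused))   # (H, W, C), values in (0, 1)
    refined = gate[None] * view_feats     # broadcast gate over the V axis
    return refined.mean(axis=0)           # (H, W, C)

feats = np.random.default_rng(1).normal(size=(3, 8, 8, 16))
fused = vam_fuse(feats)                   # one enhanced view from 3 inputs
refined = vcm_refine(fused, feats)        # interaction with the originals
assert fused.shape == refined.shape == (8, 8, 16)
```

In the real PCN these operations would be learned layers inside the network; the point of the sketch is only the data flow of many views in, one corrected map out.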
The entire ICAF framework is used during the training phase to teach the AI model. Crucially, it doesn’t add any extra computational burden during the actual defect detection process (inference), making it practical for industrial applications.
Impressive Results
The ICAF framework was tested on a specially collected dataset of CdZnTe images, called TPO (Twelve images Plus One corresponding Label), which includes groups of 12 images per sample. The results were highly promising, with ICAF consistently outperforming existing state-of-the-art semi-supervised semantic segmentation methods. Notably, ICAF showed significant improvements, especially when very little labeled data was available for training. For instance, with only 5‰ (0.5%) of the data labeled, ICAF achieved a mean Intersection-over-Union (mIoU) of 70.6%, demonstrating its effectiveness in scenarios where manual labeling is extremely costly and time-consuming.
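For readers unfamiliar with the metric, mean Intersection-over-Union (mIoU) averages, over classes, the overlap between predicted and ground-truth masks divided by their union. A minimal reference implementation (my own sketch, not the paper’s evaluation code):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes; classes absent from both masks are skipped."""
    ious = []
    for c in range(num_classes):
        p, t = (pred == c), (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class appears in neither mask
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

pred   = np.array([[0, 0, 1], [1, 1, 0]])
target = np.array([[0, 1, 1], [1, 1, 0]])
# class 0: inter 2 / union 3; class 1: inter 3 / union 4
assert abs(mean_iou(pred, target, 2) - (2/3 + 3/4) / 2) < 1e-9
```

A 70.6% mIoU at 0.5% labels therefore means that, averaged over defect classes, roughly 70% of the predicted and true defect pixels overlap relative to their union.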
This human-inspired approach offers a robust solution for automated defect segmentation in challenging industrial environments, effectively addressing the complexities of low-contrast images and the “many-to-one” data relationship. For more technical details, you can refer to the full research paper here.