TLDR: This paper introduces Adapt-WeldNet, an adaptive AI framework that optimizes welding defect detection in maritime operations by systematically evaluating various pre-trained models, transfer learning strategies, and optimizers. It also proposes the Defect Detection Interpretability Analysis (DDIA) framework, which uses Explainable AI (XAI) techniques like Grad-CAM and LIME, combined with human expert validation, to enhance transparency, trustworthiness, and safety in AI-driven defect detection systems.
Ensuring the safety and reliability of critical infrastructure, especially in demanding environments like offshore oil and gas operations, relies heavily on robust inspection methods. Welding, a fundamental process in constructing and maintaining maritime structures, is prone to defects that can compromise structural integrity. Traditional non-destructive testing (NDT) methods often fall short in detecting subtle or internal flaws, leading to potential failures and costly downtime. While advanced AI-driven techniques have emerged as promising tools for improved defect detection, many existing neural network approaches lack transparency and interpretability, which is a significant concern in high-stakes applications.
To address these challenges, a new research paper introduces an innovative framework called “Adapt-WeldNet” for welding defect detection. This adaptive system systematically evaluates various pre-trained AI models, learning strategies, and optimization techniques to identify the most effective configuration for detecting welding flaws. Beyond just performance, the paper also proposes a novel Defect Detection Interpretability Analysis (DDIA) framework, designed to make AI decisions more understandable and trustworthy. This is achieved by incorporating Explainable AI (XAI) techniques and, crucially, involving human experts in the evaluation process.
Adapt-WeldNet: Optimizing Defect Detection
Adapt-WeldNet is designed to overcome the limitations of using arbitrarily selected AI models for welding defect detection. It aims to find the optimal model and settings for this specific domain, ensuring greater reliability. The framework works by systematically exploring several key areas:
- Adaptive Model Selection: It evaluates eight different neural network architectures, such as ResNet18, DenseNet121, and EfficientNet, which are pre-trained on a vast image dataset (ImageNet).
- Adaptive Transfer Learning: It tests three different strategies for adapting these pre-trained models to the welding defect detection task. These strategies include freezing early layers (keeping initial learned features intact), freezing all layers (only training the final classification part), and fine-tuning all layers (adjusting all parts of the model to the new data).
- Adaptive Optimizer and Hyperparameters: The system also explores various optimizers (algorithms that adjust model weights during training) like Adam and AdamW, along with different learning rates and batch sizes, to find the combination that yields the best performance.
Through this systematic optimization process, Adapt-WeldNet identifies the best-performing combination of model, transfer learning strategy, optimizer, and hyperparameters, so the resulting classifier is tuned to weld inspection for offshore structures rather than chosen arbitrarily.
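The search described above can be pictured as a grid over models, transfer learning strategies, optimizers, and hyperparameters. The following is a minimal sketch, not the paper's implementation: only the architectures named in this article are listed (the paper evaluates eight), the learning rates and batch sizes are hypothetical placeholders, and `evaluate` is a hypothetical callable standing in for a full train-and-validate run.

```python
from itertools import product

# Architectures named in this article; the paper evaluates eight in total.
MODELS = ["resnet18", "densenet121", "efficientnet", "wide_resnet50_2"]
STRATEGIES = ["freeze_early", "freeze_all", "fine_tune_all"]
OPTIMIZERS = ["adam", "adamw"]
LEARNING_RATES = [1e-3, 1e-4]  # hypothetical values, not from the paper
BATCH_SIZES = [16, 32]         # hypothetical values, not from the paper


def trainable_flags(num_layers, strategy, num_early=2):
    """Return one bool per layer (True = trainable).

    The last layer is treated as the classification head. `num_early`
    (how many initial layers count as "early") is an assumption.
    """
    if strategy == "fine_tune_all":
        return [True] * num_layers
    if strategy == "freeze_all":
        # Only the final classification layer is trained.
        return [False] * (num_layers - 1) + [True]
    if strategy == "freeze_early":
        # Keep the first `num_early` layers fixed, train the rest.
        return [i >= num_early for i in range(num_layers)]
    raise ValueError(f"unknown strategy: {strategy}")


def all_configs():
    """Enumerate every combination in the grid as a dict."""
    keys = ("model", "strategy", "optimizer", "lr", "batch_size")
    grid = product(MODELS, STRATEGIES, OPTIMIZERS, LEARNING_RATES, BATCH_SIZES)
    return [dict(zip(keys, values)) for values in grid]


def best_config(evaluate):
    """Return the configuration with the highest score from `evaluate`."""
    return max(all_configs(), key=evaluate)
```

In practice `evaluate` would train the chosen model under the given configuration and return a validation score, which makes an exhaustive grid expensive; the sketch only shows the shape of the search space.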
DDIA: Making AI Decisions Transparent and Trustworthy
Even with high accuracy, an AI model’s inability to explain its decisions can hinder its adoption in critical applications. This is where the Defect Detection Interpretability Analysis (DDIA) framework comes into play. DDIA enhances system transparency and accountability by using Explainable AI (XAI) techniques and integrating a “Human-in-the-Loop” approach.
XAI techniques like Grad-CAM and LIME are employed to provide insights into how the AI classifier makes its predictions. Grad-CAM generates heatmaps that highlight the regions in an image that are most important for the model’s decision, giving a broad visualization of defect areas. LIME, on the other hand, offers more localized, boundary-focused explanations, pinpointing specific fine-grained regions.
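For readers unfamiliar with Grad-CAM, its core computation is short enough to sketch. The code below follows the standard Grad-CAM formulation (channel weights from spatially averaged gradients, a weighted sum of feature maps, then ReLU) and is not taken from the paper. It assumes the feature maps of the last convolutional layer and the gradients of the predicted class score with respect to them have already been extracted as NumPy arrays; the function name is illustrative.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM heatmap.

    activations: array of shape (K, H, W), feature maps of a conv layer.
    gradients:   array of shape (K, H, W), d(class score)/d(activations).
    Returns an (H, W) array scaled to [0, 1].
    """
    # One weight per channel: the spatial average of its gradients.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)
    # Keep only regions that push the class score up.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The resulting map has the resolution of the convolutional layer and is usually upsampled to the input image size before being overlaid as a heatmap, which is one reason Grad-CAM tends to give broad rather than fine-grained highlights.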
A unique aspect of DDIA is its reliance on certified domain experts, such as ASNT NDE Level II auditors. These experts review the XAI outputs (like Grad-CAM heatmaps) and assess critical factors such as detection accuracy, defect visibility, image quality, and prediction confidence. Their feedback is crucial for validating the AI’s decisions, identifying areas for improvement, and ensuring the system aligns with real-world safety requirements. This human oversight fosters trust and ensures the safe deployment of AI systems in high-stakes environments.
Experimental Insights and Performance
The researchers tested their framework using the RIAWELC dataset, which contains over 24,000 X-ray weld images categorized into four types: porosity, lack of penetration, crack, and no defects. The dataset was carefully balanced using data augmentation techniques to ensure fair evaluation.
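As a rough illustration of what balancing by augmentation involves, the sketch below computes how many augmented images each class would need to match the largest class. Both the balancing target and the class counts used in the example are assumptions made for illustration; the article does not give the per-class sizes or the exact procedure.

```python
def augmentation_plan(class_counts):
    """Return how many augmented samples each class needs to reach
    the size of the largest class (assumed balancing target)."""
    target = max(class_counts.values())
    return {label: target - count for label, count in class_counts.items()}


# Hypothetical counts, chosen only to show the calculation.
example_counts = {
    "porosity": 100,
    "lack_of_penetration": 60,
    "crack": 40,
    "no_defect": 100,
}
plan = augmentation_plan(example_counts)
```

Here `plan` maps each class to the number of extra images to generate, for example through flips or rotations of existing X-ray images.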
The experimental results showed that fine-tuning all layers of the pre-trained models consistently led to superior performance. Among the evaluated models, DenseNet121 and WideResNet50-2, when combined with optimizers like Adam or AdamW and lower learning rates, achieved the best results. The optimized classifier, specifically a DenseNet121 model fine-tuned with AdamW, demonstrated strong performance in classifying the different types of welding defects.
In terms of interpretability, both Grad-CAM and LIME proved effective. Grad-CAM was particularly reliable for identifying larger defect regions, while LIME excelled at pinpointing smaller, localized areas. Even under challenging conditions like noisy or underexposed images, Grad-CAM maintained robust defect localization. The expert evaluations within the DDIA framework confirmed that Grad-CAM generally provided clearer defect detection and received higher confidence scores from auditors, indicating that domain experts preferred it and trusted it more.
Furthermore, the paper introduced a novel recall-based evaluation metric to quantitatively assess how well Grad-CAM localizes defects. This metric measures the overlap between the AI’s predicted defect regions and the ground truth annotations provided by experts. The average recall of 0.7722 demonstrated Grad-CAM’s consistent ability to accurately highlight defect areas.
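A recall-style localization score of this kind can be sketched as the fraction of expert-annotated defect pixels that the thresholded heatmap covers. This is one plausible reading, not the paper's exact definition: the pixel-level formulation, the 0.5 threshold, and the function name are all assumptions.

```python
import numpy as np


def localization_recall(heatmap, gt_mask, threshold=0.5):
    """Fraction of ground-truth defect pixels covered by the heatmap.

    heatmap:   (H, W) array with values in [0, 1], e.g. a Grad-CAM map.
    gt_mask:   (H, W) array, nonzero where an expert marked a defect.
    threshold: assumed cutoff for treating a heatmap pixel as "predicted".
    """
    pred = heatmap >= threshold
    gt = gt_mask.astype(bool)
    gt_pixels = gt.sum()
    if gt_pixels == 0:
        # Recall is undefined when there is no annotated defect.
        return float("nan")
    return float((pred & gt).sum() / gt_pixels)
```

Under this reading, a score of 1.0 means every annotated defect pixel falls inside the highlighted region. Recall alone does not penalize a heatmap that highlights too much, so it is best read alongside the expert assessments described above.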
A Step Towards Safer Maritime Operations
This research marks a significant advancement in welding defect detection for offshore environments. By integrating adaptive AI models with explainable AI techniques and a human-in-the-loop validation process, the framework not only improves detection accuracy but also enhances transparency and accountability. This approach supports the principles of Trustworthy AI, fostering confidence in automated decisions for critical operations.


