TLDR: A new framework called SAND leverages self-supervised learning (SSL) for automated feature extraction and neural architecture search (NAS) for adaptive classifier optimization to detect Hardware Trojans (HTs). This approach significantly improves detection accuracy (up to 18.3% over state-of-the-art methods), demonstrates high resilience against evasive Trojans, and offers strong generalization and adaptability to unseen threats with minimal retraining overhead.
The security of electronic devices increasingly depends on the hardware itself. Because the semiconductor supply chain is globalized and components often come from various third-party vendors, a significant threat known as Hardware Trojans (HTs) has emerged. These malicious circuits, often hidden within System-on-Chip (SoC) designs, can lead to severe consequences like information leakage or erroneous system execution. Detecting them is a complex challenge that current methods often struggle with.
Traditional approaches to Hardware Trojan detection, such as formal verification and test generation, face limitations. Formal verification, while thorough, is computationally expensive and impractical for large-scale systems. Test generation methods, which involve applying specific inputs to trigger anomalous behavior, can be time-consuming, especially for well-concealed Trojans.
Machine learning (ML) has shown promise in this field, but existing ML-based techniques also have critical drawbacks. Many rely on manually selected features, which can be inconsistent and lack a universal standard. Furthermore, these models often generalize poorly: they lose accuracy when they encounter new or varied types of Trojans, and they then require costly and frequent retraining.
To address these persistent challenges, researchers have developed a novel framework called SAND: A Self-supervised and Adaptive NAS-Driven Framework for Hardware Trojan Detection. This approach combines two AI techniques, Self-supervised Learning (SSL) and Neural Architecture Search (NAS), to create a more efficient, adaptable, and accurate detection system.
Automated Feature Extraction with Self-supervised Learning
One of SAND’s core contributions is its use of Self-supervised Learning (SSL) to automate feature extraction. Unlike traditional methods that require human experts to identify and select relevant features from hardware designs, SAND employs SSL to learn these features automatically. Specifically, it uses a technique called contrastive learning, which trains a model to understand the intrinsic structure of hardware circuits by distinguishing between similar and dissimilar data points. This process allows the system to capture meaningful representations of hardware benchmarks without any manual intervention, making it highly adaptable to diverse circuit types.
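To make the idea of contrastive learning concrete, here is a minimal sketch of an InfoNCE-style contrastive loss in plain Python. It is an illustration only and not SAND's actual loss: the three-dimensional embeddings, the function names, and the temperature value are all assumptions chosen for readability.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss: small when the anchor is more similar
    to the positive sample than to any negative sample."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Hypothetical embeddings (made up for illustration, not real circuit data).
benign = [1.0, 0.2, 0.0]          # anchor: a benign circuit
benign_variant = [0.9, 0.3, 0.1]  # positive: functionally equivalent variant
trojan = [0.1, 0.2, 1.0]          # negative: Trojan-injected circuit

# Correct pairing: the variant is treated as the positive sample.
loss_good = info_nce(benign, benign_variant, [trojan])
# Wrong pairing: the Trojan is treated as the positive sample.
loss_bad = info_nce(benign, trojan, [benign_variant])

print(f"correct pairing loss: {loss_good:.3f}")
print(f"wrong pairing loss:   {loss_bad:.3f}")
```

The loss is lower for the correct pairing, so minimizing it during training pushes an encoder to place a benign circuit near its variants and away from Trojan-infected circuits.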
Adaptive Classifier Optimization with Neural Architecture Search
The second key innovation in SAND is the integration of Neural Architecture Search (NAS). After the SSL component extracts features, they are fed into a downstream classifier responsible for identifying Trojans, and NAS optimizes the architecture of this classifier. Instead of relying on a fixed, manually designed neural network, NAS explores a large space of possible network configurations to find the most effective one for a specific detection task. Combined with pruning guided by SHAP (SHapley Additive exPlanations) scores, this optimization helps the classifier adapt to unseen benchmarks and different Trojan variants, significantly reducing the need for extensive retraining.
How SAND Works: A Simplified Look
SAND processes hardware circuits by first transforming them into graph representations. To train the self-supervised encoder, it generates ‘positive’ samples (variations of a benign circuit that maintain its logical function) and ‘negative’ samples (benign circuits injected with Trojans). The system then uses a hybrid contrastive loss function, which not only pulls similar samples closer and pushes dissimilar ones apart but also includes a ‘global clustering loss’. This global clustering loss is crucial for grouping circuits of the same category (benign or malicious) into tight, well-separated clusters in the feature space, which is vital for effective Trojan detection given the subtle differences Trojans can introduce.
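The article does not give the formula for the global clustering loss, so the following is only a sketch of one common way to express the same idea: penalize embeddings that sit far from their own class centroid, and penalize class centroids that sit closer together than a margin. The function names, the margin, and the toy two-dimensional embeddings are assumptions, not SAND's implementation. In a hybrid setup, a term like this would typically be added to the pairwise contrastive term with a weighting factor.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]

def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def clustering_loss(embeddings, labels, margin=1.0):
    """Compactness within each class plus a hinge penalty on
    class centroids that are closer together than `margin`."""
    groups = {}
    for emb, lab in zip(embeddings, labels):
        groups.setdefault(lab, []).append(emb)
    centroids = {lab: centroid(pts) for lab, pts in groups.items()}

    # Compactness: mean squared distance of each point to its class centroid.
    compact = sum(
        sq_dist(e, centroids[l]) for e, l in zip(embeddings, labels)
    ) / len(embeddings)

    # Separation: hinge on the distance between every pair of centroids.
    labs = sorted(centroids)
    separation = 0.0
    for i, a in enumerate(labs):
        for b in labs[i + 1:]:
            gap = math.sqrt(sq_dist(centroids[a], centroids[b]))
            separation += max(0.0, margin - gap)
    return compact + separation

labels = ["benign", "benign", "trojan", "trojan"]
# Hypothetical embeddings: well-clustered vs. interleaved classes.
clustered = [[0.0, 0.0], [0.1, 0.0], [2.0, 2.0], [2.1, 2.0]]
interleaved = [[0.0, 0.0], [2.0, 2.0], [0.1, 0.0], [2.1, 2.0]]

loss_clustered = clustering_loss(clustered, labels)
loss_interleaved = clustering_loss(interleaved, labels)
print(loss_clustered, loss_interleaved)
```

The well-clustered embeddings score a much lower loss than the interleaved ones, which is the behavior a clustering term is meant to reward.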
For the adaptive classifier, SAND starts with a large, over-parameterized network called a SuperNet. Through SHAP-based pruning, it intelligently identifies and removes redundant components, resulting in a streamlined, task-specific model that maintains high performance while being more efficient.
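SHAP scores are built on Shapley values, which measure how much each component contributes to an outcome on average. The sketch below computes exact Shapley values for four candidate components of a toy SuperNet and keeps the top two. Everything specific here is hypothetical: the component names, the `toy_accuracy` function (a stand-in for validating a sub-network), and the `keep` budget. Exact enumeration is also only feasible for a handful of components; practical SHAP tooling relies on approximations.

```python
from itertools import permutations

def shapley_values(components, value_fn):
    """Exact Shapley value of each component: its marginal gain
    averaged over every possible ordering of the components."""
    totals = {c: 0.0 for c in components}
    orderings = list(permutations(components))
    for order in orderings:
        active = set()
        prev = value_fn(frozenset(active))
        for c in order:
            active.add(c)
            cur = value_fn(frozenset(active))
            totals[c] += cur - prev
            prev = cur
    return {c: t / len(orderings) for c, t in totals.items()}

def prune(components, value_fn, keep):
    """Keep the `keep` components with the highest Shapley value."""
    scores = shapley_values(components, value_fn)
    ranked = sorted(components, key=lambda c: scores[c], reverse=True)
    return ranked[:keep], scores

def toy_accuracy(active):
    """Hypothetical validation accuracy of a sub-network that uses
    only the components in `active` (made-up numbers)."""
    acc = 0.5
    if "conv3" in active:
        acc += 0.25
    if "attention" in active:
        acc += 0.125
    if "conv3" in active and "attention" in active:
        acc += 0.0625  # the two components help each other
    return acc         # "conv5" and "skip" add nothing in this toy

candidates = ["conv3", "conv5", "attention", "skip"]
kept, scores = prune(candidates, toy_accuracy, keep=2)
print(kept)
print(scores)
```

In this toy setting the redundant components receive a Shapley value of zero and are pruned, while the shared interaction bonus is split evenly between the two useful components.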
Remarkable Performance and Adaptability
Experimental results demonstrate SAND’s superior performance compared to state-of-the-art methods like SVM, AdaTest, and GATE-Net. SAND achieved a significant improvement in detection accuracy, up to 18.3% higher than existing techniques, and consistently delivered strong results across various benchmarks, including standard test suites and real-world designs.
Crucially, SAND exhibits high resilience against evasive Trojans and strong generalization capabilities. When tested on unseen benchmarks, traditional methods experienced significant drops in accuracy (up to 18.3%), while SAND’s accuracy dropped by only 3.9%. This adaptability means that SAND requires minimal fine-tuning when encountering new threats (only 7 epochs of retraining, compared to 22-38 for other methods), making it highly practical for real-world deployment.
Furthermore, SAND demonstrated superior stability, maintaining consistent and reliable detection accuracy across multiple trials, a critical factor for trustworthy security systems.
In conclusion, SAND represents a significant leap forward in hardware Trojan detection. By intelligently combining self-supervised learning for automated feature extraction and neural architecture search for adaptive classifier optimization, it offers an efficient, accurate, and highly adaptable solution to a critical security threat in embedded systems. This research was published in the 2025 IEEE International Conference on Computer Design. You can read the full paper here: SAND: A Self-supervised and Adaptive NAS-Driven Framework for Hardware Trojan Detection.


