TLDR: A new activation function, RCR-AF, is proposed to enhance deep neural network robustness against adversarial attacks and improve generalization. It combines GELU and ReLU properties with unique clipping hyperparameters (α, γ) that control model sparsity and capacity by modulating Rademacher complexity. Experiments show RCR-AF consistently outperforms existing activation functions in both clean accuracy and adversarial robustness.
Deep neural networks have achieved remarkable success across various fields, from computer vision to natural language processing. However, a significant challenge remains: their vulnerability to adversarial attacks. These attacks involve subtle, often imperceptible, changes to input data that can trick even the most advanced AI models, posing serious risks, especially in critical applications like autonomous driving or medical diagnosis.
A recent research paper introduces a novel solution to this problem: the Rademacher Complexity Reduction Activation Function, or RCR-AF. This new activation function is designed to make AI models more robust against these attacks while also improving their overall ability to generalize, meaning they perform better on new, unseen data.
The researchers behind RCR-AF, Yunrui Yu, Kafeng Wang, Hang Su, and Jun Zhu from Tsinghua University, investigated activation functions as a key, yet often overlooked, component for enhancing model resilience. Activation functions are crucial elements within neural networks that determine whether a neuron should be activated or not, essentially introducing non-linearity that allows networks to learn complex patterns.
RCR-AF cleverly combines the best features of two widely used activation functions: GELU and ReLU. GELU is known for its smoothness and ability to retain negative information, which helps in stable gradient flow during training. ReLU, on the other hand, promotes sparsity, making models more efficient, but can sometimes lead to “dead neurons” and discard valuable negative information. RCR-AF integrates GELU’s benefits with ReLU’s desirable monotonicity, ensuring a balanced approach.
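To make the contrast concrete, here is a minimal sketch of the two base functions. The GELU below uses the common tanh approximation; note how it lets small negative values pass through (slightly negative outputs), while ReLU zeroes them entirely.

```python
import math

def relu(x: float) -> float:
    # ReLU zeroes all negative inputs: this promotes sparsity but
    # discards negative information and can produce "dead neurons".
    return max(0.0, x)

def gelu(x: float) -> float:
    # GELU (tanh approximation): smooth everywhere, and it retains a
    # small amount of negative information for inputs just below zero.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                      * (x + 0.044715 * x ** 3)))

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  relu={relu(x):+.4f}  gelu={gelu(x):+.4f}")
```

Running this shows, for example, that at x = -0.5 ReLU outputs exactly 0 while GELU outputs roughly -0.154, which is the "retained negative information" the paragraph above refers to.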
What makes RCR-AF particularly innovative is its built-in clipping mechanism, controlled by two unique hyperparameters, alpha (α) and gamma (γ). These parameters allow for precise control over both the model’s sparsity (how many neurons are active) and its capacity (how complex the model can become). The theoretical foundation of RCR-AF is rooted in Rademacher complexity, a concept used to measure the complexity of a function class. The paper demonstrates that alpha and gamma directly influence this complexity, providing a principled way to enhance robustness.
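The article does not reproduce the paper's exact formula, so the sketch below is only an illustration of how clipping hyperparameters of this kind can work, not the actual RCR-AF definition: here a hypothetical `alpha` scales the input (affecting how quickly units activate, and thus sparsity) and a hypothetical `gamma` caps the output (bounding the function's range, which is the kind of constraint that limits Rademacher complexity).

```python
import math

def gelu(x: float) -> float:
    # Smooth GELU-style base (tanh approximation), defined here so the
    # sketch is self-contained.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                      * (x + 0.044715 * x ** 3)))

def clipped_activation(x: float, alpha: float = 1.0, gamma: float = 6.0) -> float:
    # Illustrative only -- NOT the paper's RCR-AF formula.
    # alpha rescales the input; gamma clips the output from above,
    # so the activation's range is bounded by gamma.
    return min(gelu(alpha * x), gamma)
```

With a large input, the output saturates at `gamma` (e.g. `clipped_activation(100.0, gamma=6.0)` returns 6.0), so shrinking `gamma` directly shrinks the function class the network can represent, which is the intuition behind controlling capacity through a clipping hyperparameter.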
The research team conducted extensive experiments to evaluate RCR-AF’s performance against popular alternatives like ReLU, GELU, and Swish. The results were compelling. Under standard training conditions, RCR-AF consistently achieved higher clean accuracy on test datasets. For instance, on the CIFAR-10 dataset using a ResNet-18 model, RCR-AF achieved 96.50% clean accuracy, outperforming ReLU (95.98%), GELU (95.77%), and Swish (94.99%).
More importantly, in adversarial training scenarios, where models are specifically trained to resist attacks, RCR-AF showed superior robustness. When evaluated against AutoAttack, a strong benchmark for adversarial robustness, RCR-AF-equipped models achieved 51.96% robust accuracy, surpassing ReLU (49.82%), GELU (49.36%), and Swish (47.45%). These findings suggest that RCR-AF can simultaneously improve both the generalization ability and the adversarial resilience of deep learning models.
This breakthrough in activation function design offers a promising path forward for developing more reliable and secure machine learning systems, especially as AI becomes more integrated into safety-critical applications. For more in-depth technical details, see the full research paper.


