TLDR: A new research paper introduces a method to classify malware by converting binaries into 1D signals instead of the traditional 2D images (byteplots). The 1D representation avoids the information loss introduced by 2D conversion, and it can be classified either by adapting existing 2D CNNs or by building dedicated 1D CNNs. The proposed 1D CNN achieved state-of-the-art performance on the MalNet dataset for binary, type, and family level malware classification.
Malware poses a constant threat in cybersecurity, with sophisticated obfuscation techniques making traditional detection methods less effective. While dynamic analysis offers deeper insights, it demands significant resources, limiting its widespread use. For years, a popular approach has involved converting malware binaries into 2D images, known as byteplots, and then using computer vision techniques to classify them. This method has shown promise in detecting complex malware variants.
However, this 2D image conversion process has drawbacks, and it often discards useful information. The loss has two sources: “quantisation noise,” the rounding error introduced when values are converted to integer pixel values, and artificial 2D dependencies that don’t exist in the original binary data. Both can reduce the accuracy of classification models.
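To make the "artificial 2D dependencies" concrete, here is a minimal sketch of how a byteplot might be built. The helper name `byteplot` and the zero-padding of the last row are illustrative assumptions, not details taken from the paper.

```python
def byteplot(data, width):
    """Reshape raw bytes into rows of `width` pixels (hypothetical helper).

    The last row is zero-padded, which is an assumption made for this sketch.
    """
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Use byte values equal to their file offsets so positions are easy to read.
data = bytes(range(12))
image = byteplot(data, width=4)
print(image)

# Pixels stacked vertically in the image sit `width` bytes apart in the file.
# That neighbourhood exists only because of the width that was chosen.
above, below = image[0][1], image[1][1]
print(below - above)
```

Choosing a different width changes which bytes end up as vertical neighbours, even though the file itself is unchanged. A 2D convolution then treats those neighbours as related.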
A New Perspective: 1D Signals for Malware Classification
A recent research paper, “Signal-Based Malware Classification Using 1D CNNs,” by Jack Wilkie, Hanan Hindy, Ivan Andonovic, Christos Tachtatzis, and Robert Atkinson, proposes an innovative solution to these challenges. Instead of converting malware binaries into 2D images, their work focuses on resizing them into 1D signals. This approach fundamentally changes how malware data is represented for machine learning models.
The core advantage of 1D signals is that they avoid the need for heuristic reshaping into a 2D grid, which can distort the original data structure. Furthermore, by storing these signals in a floating-point format, they bypass the quantisation noise that plagues 2D image representations. This means the 1D signals retain significantly more of the original binary’s information, leading to a better signal-to-noise ratio.
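The effect of storing floats rather than integer pixels can be sketched with a toy example. The pairwise-mean downsampler below is a deliberately simple stand-in for a real resampling filter, and the function names and sample bytes are made up for illustration.

```python
import math


def downsample_mean(values, factor):
    """Average consecutive groups of `factor` samples (simple stand-in filter)."""
    return [sum(values[i:i + factor]) / factor
            for i in range(0, len(values) - factor + 1, factor)]


def snr_db(reference, approx):
    """Signal-to-noise ratio in decibels of `approx` relative to `reference`."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - a) ** 2 for r, a in zip(reference, approx))
    return math.inf if noise_power == 0 else 10 * math.log10(signal_power / noise_power)


raw = list(bytes([10, 11, 200, 203, 7, 8, 90, 93]))
as_float = downsample_mean(raw, 2)        # fractional values are kept
as_pixels = [round(v) for v in as_float]  # integer pixels drop the fraction

print(as_float)
print(as_pixels)
print(snr_db(as_float, as_pixels))  # finite: rounding added noise
print(snr_db(as_float, as_float))   # inf: the float copy adds none
```

Rounding moves each resampled value by up to half a pixel level, which is the quantisation noise described above. The floating-point copy keeps the resampled values as they are.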
Adapting and Innovating with 1D Convolutional Neural Networks
The researchers demonstrated that existing 2D Convolutional Neural Network (CNN) architectures, commonly used in computer vision, can be adapted to classify these 1D signals. They developed a method to convert 2D CNNs into 1D equivalents by flattening the convolution kernels and squaring the stride values, so a 3×3 kernel with stride 2 becomes a length-9 kernel with stride 4. This transformation ensures that the adapted 1D models maintain the same number of parameters and computational requirements as their 2D counterparts, yet achieve improved performance.
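The flatten-and-square rule can be checked with simple bookkeeping. The sketch below assumes square kernels and equal strides on both axes. `ConvSpec` and `to_1d` are hypothetical names for this illustration, and biases, padding, and dilation are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvSpec:
    """Shape description of one convolution layer (illustrative only)."""
    in_channels: int
    out_channels: int
    kernel: int  # side length for a 2D kernel, length for a 1D kernel
    stride: int
    dims: int    # 2 for a square 2D convolution, 1 for a 1D convolution

    def weight_count(self):
        return self.out_channels * self.in_channels * self.kernel ** self.dims


def to_1d(spec):
    """Flatten a k x k kernel to length k*k and square the stride."""
    assert spec.dims == 2, "expected a 2D convolution"
    return ConvSpec(spec.in_channels, spec.out_channels,
                    spec.kernel ** 2, spec.stride ** 2, dims=1)


conv2d = ConvSpec(in_channels=64, out_channels=128, kernel=3, stride=2, dims=2)
conv1d = to_1d(conv2d)

print(conv1d.kernel, conv1d.stride)                    # 9 4
print(conv2d.weight_count() == conv1d.weight_count())  # True
```

A 2D stride of s shrinks an image by a factor of s along each axis, so s squared overall. A 1D stride of s squared shrinks the signal by the same factor, which is why the amount of computation stays comparable.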
Beyond adapting existing models, the team also developed a bespoke 1D CNN architecture. This custom model is based on the robust ResNet architecture, enhanced with squeeze-and-excitation layers for improved feature learning and the GELU activation function for smoother decision boundaries. This specialized 1D CNN was rigorously evaluated on the large-scale MalNet dataset, which contains over a million Android malware samples.
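For readers unfamiliar with the two building blocks named above, here is a generic sketch of GELU and a squeeze-and-excitation gate applied to 1D feature maps. It follows the standard textbook formulations and is not the authors' exact layer: the weights, the omission of biases, and the tiny channel counts are all assumptions made for illustration.

```python
import math


def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def squeeze_excite(features, w_reduce, w_expand):
    """Rescale each channel of a 1D feature map by a learned gate.

    features: list of channels, each a list of samples along the signal.
    w_reduce: bottleneck weights, shape [reduced][channels].
    w_expand: gate weights, shape [channels][reduced].
    """
    # Squeeze: global average over the signal length, one number per channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excite: bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_expand]
    # Scale: multiply every sample in a channel by that channel's gate.
    return [[g * v for v in ch] for g, ch in zip(gates, features)]


features = [[2.0, 4.0], [1.0, 3.0]]            # 2 channels, 2 samples each
w_reduce = [[0.5, 0.5]]                        # made-up weights: 2 channels -> 1
w_expand = [[1.0], [-1.0]]                     # made-up weights: 1 -> 2 channels
print(squeeze_excite(features, w_reduce, w_expand))
print([gelu(v) for v in (-1.0, 0.0, 1.0)])
```

The gate lets the network emphasise or suppress whole channels based on a summary of the entire signal, while GELU acts as a smooth alternative to ReLU.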
State-of-the-Art Performance and Future Implications
On MalNet, the proposed 1D signal-based approach achieved state-of-the-art performance across three malware classification tasks: binary, type, and family level classification. Specifically, the bespoke 1D CNN model recorded F1 scores of 0.874 for binary classification, 0.503 for type classification, and 0.507 for family classification. These scores surpass those of leading 2D image-based models, including ResNet, DenseNet, and EfficientNet architectures, as well as the SHERLOCK model.
The research also highlighted that the choice of resampling filter (Lanczos performed best) and signal length significantly impact performance, with longer signals generally retaining more information. Crucially, the 1D models consistently outperformed their 2D equivalents on both Android APK and Windows EXE files, demonstrating the broad applicability of this new paradigm.
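As a rough illustration of resizing a binary to a fixed-length signal with a Lanczos filter, here is a minimal pure-Python sketch. The window size `a=3`, the edge clamping, the weight normalisation, and the function names are assumptions for this example. The paper's actual resampling pipeline may differ.

```python
import math


def lanczos(x, a=3):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, otherwise 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def lanczos_resample(signal, out_len, a=3):
    """Resize a 1D sequence to `out_len` floating-point samples."""
    in_len = len(signal)
    scale = in_len / out_len
    stretch = max(scale, 1.0)  # widen the kernel when shrinking, to limit aliasing
    out = []
    for i in range(out_len):
        centre = (i + 0.5) * scale - 0.5  # output sample position in input space
        lo = math.floor(centre - a * stretch) + 1
        hi = math.floor(centre + a * stretch)
        total = weight_sum = 0.0
        for j in range(lo, hi + 1):
            w = lanczos((j - centre) / stretch, a)
            total += w * signal[min(max(j, 0), in_len - 1)]  # clamp at the edges
            weight_sum += w
        out.append(total / weight_sum)  # normalise so constant inputs stay constant
    return out


binary = list(b"\x7fELF\x02\x01\x01\x00" * 8)  # 64 made-up bytes
resized = lanczos_resample(binary, 16)
print(len(resized))
print(resized[:4])
```

Lanczos output can overshoot the 0 to 255 byte range near sharp transitions. A floating-point signal can store those values as they are, whereas an 8-bit image would have to clip and round them.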
This work marks a significant step forward in malware classification. By demonstrating the superior information retention and classification performance of 1D signal representations, it paves the way for future cybersecurity models to move beyond traditional image-based approaches. The ability to adapt existing 2D CNNs to this 1D modality, coupled with the development of specialized 1D architectures, offers a powerful new tool in the ongoing fight against evolving malware threats.


