MINERVA: A Neural Network Approach to Smarter Feature Selection

TLDR: MINERVA is a novel supervised feature selection method that uses neural networks to estimate mutual information between features and targets. It employs a two-stage process with a specialized loss function and sparsity-inducing regularizers to identify relevant features, especially those involved in complex, higher-order interactions. Experiments on synthetic and real-world fraud datasets demonstrate MINERVA’s superior ability to perform exact feature selection and improve predictive performance compared to existing methods.

In the world of machine learning and data analysis, dealing with vast amounts of information, often called high-dimensional data, is a common challenge. This data frequently contains features that are either irrelevant or redundant, leading to increased storage needs, higher computational costs, and less effective predictive models. The process of identifying and selecting only the most important features is known as feature selection, a crucial step in building efficient and accurate machine learning systems.

Traditional feature selection methods, often called ‘filters,’ typically rely on simple statistical measures to understand how individual features relate to the target variable we’re trying to predict. However, these methods can fall short when the target variable depends on more intricate, higher-order interactions between multiple features, rather than just the contribution of each feature on its own. Imagine trying to predict fraud; a single transaction detail might not be suspicious, but a combination of two seemingly unrelated details (like two independent transactions sharing the same device ID but from different users) could be a strong indicator. Existing methods often struggle to capture such complex dependencies.

Introducing MINERVA: A Novel Approach

To address these limitations, researchers Taurai Muvunza, Egor Kraev, Pere Planell-Morell, and Alexander Y. Shestopaloff have introduced a new method called MINERVA: Mutual Information Neural Estimation Regularized Vetting Algorithm. This innovative approach to supervised feature selection leverages neural networks to estimate the mutual information between features and targets. Mutual information is a powerful concept from information theory that quantifies how much information one random variable provides about another. Essentially, it measures the strength of the relationship between them.
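For reference, the standard textbook definition of mutual information for two discrete random variables X and Y is shown below. The symbols p(x, y), p(x), and p(y) denote the joint and marginal probabilities. This is the general definition, not a formula quoted from the MINERVA paper.

```latex
I(X;Y) \;=\; \sum_{x}\sum_{y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}
```

The quantity is zero exactly when X and Y are independent, and it grows as knowing one variable reduces uncertainty about the other.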

MINERVA’s core strength lies in its ability to approximate mutual information using neural networks. It employs a specially designed loss function, enhanced with ‘sparsity-inducing regularizers,’ which helps in identifying and prioritizing the most relevant features while pushing the weights of less important ones towards zero. As a result, the model concentrates on the features that actually carry information about the target.
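The article does not spell out the loss in closed form. As a rough illustration only, the sketch below assumes a Donsker-Varadhan style lower bound on mutual information (a common basis for neural estimators) plus an L1 penalty on per-feature weights. The names `dv_lower_bound`, `regularized_loss`, `gates`, and `lam` are hypothetical, and the exact form used by MINERVA may differ.

```python
import numpy as np

def dv_lower_bound(t_joint, t_marginal):
    """Donsker-Varadhan bound: E_joint[T] - log E_marginal[exp(T)].

    t_joint:    critic outputs T(x_i, y_i) on paired samples.
    t_marginal: critic outputs T(x_i, y_perm(i)) on shuffled pairs.
    """
    return float(np.mean(t_joint) - np.log(np.mean(np.exp(t_marginal))))

def regularized_loss(t_joint, t_marginal, gates, lam):
    """Negative MI bound plus an L1 penalty on feature gates (assumed form)."""
    l1_penalty = float(np.sum(np.abs(gates)))
    return -dv_lower_bound(t_joint, t_marginal) + lam * l1_penalty

# A critic that outputs zero everywhere gives a bound of exactly zero,
# so the loss reduces to the penalty term alone.
zero_critic = np.zeros(4)
example_loss = regularized_loss(zero_critic, zero_critic,
                                gates=np.array([1.0, -2.0, 0.0]), lam=0.1)
```

In a real training loop the critic would be a neural network, and minimizing this loss would both tighten the bound and shrink the gates of uninformative features.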

A Two-Stage Process for Better Generalization

A key aspect of MINERVA is its two-stage implementation. This design separates the process of learning data representations from the actual feature selection. In the first stage, MINERVA explores the dependencies between all features and the target without any selection constraints, allowing the neural network to learn the underlying relationships stably. In the second stage, the learned knowledge is fine-tuned, and the sparsity-inducing regularizers are introduced to select the important features. This decoupling improves the model’s ability to generalize to new data and provides a more accurate understanding of feature importance.
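The two-stage idea can be sketched on a much simpler model. The example below swaps the mutual information objective for a plain squared-error loss on a linear model, so it is not MINERVA itself. It only shows the pattern of fitting without constraints first, then fine-tuning with an L1 penalty applied through soft-thresholding. The function names, the `gates` vector, and all hyperparameters are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal step for an L1 penalty: shrink each entry toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def two_stage_select(X, y, lam=0.1, steps1=500, steps2=400, lr1=0.1, lr2=0.05):
    n, d = X.shape
    w = np.zeros(d)  # model weights
    g = np.ones(d)   # hypothetical per-feature gates, all open at the start

    # Stage 1: fit on all features with no sparsity penalty.
    for _ in range(steps1):
        r = X @ w - y
        w -= lr1 * (X.T @ r) / n

    # Stage 2: fine-tune weights and gates, with an L1 penalty on the gates.
    for _ in range(steps2):
        r = (X * g) @ w - y
        c = (X.T @ r) / n              # mean of residual times each feature
        grad_w, grad_g = c * g, c * w  # gradients of 0.5 * mean(r ** 2)
        w -= lr2 * grad_w
        g = soft_threshold(g - lr2 * grad_g, lr2 * lam)

    return np.flatnonzero(np.abs(g) > 0), g

# Toy data: only features 0 and 1 influence the target.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 6))
y = 2.0 * X[:, 0] - 3.0 * X[:, 1] + 0.1 * rng.standard_normal(2000)
selected, gates = two_stage_select(X, y)
```

Because stage 1 already places near-zero weight on the irrelevant features, the penalty in stage 2 meets almost no resistance when closing their gates, while the gates of the informative features stay open.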

Capturing Complex Dependencies

The researchers demonstrated MINERVA’s effectiveness through experiments on both synthetic and real-life fraud datasets. On synthetic data, they created scenarios where the target variable depended on subtle interactions, such as whether two independent discrete random variables were equal. This type of dependence is often overlooked by traditional methods that only look at pairwise relationships. MINERVA successfully captured these complex feature-target relationships by evaluating feature subsets as an ensemble, meaning it considers how features work together rather than just individually.
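A small simulation makes the equality example concrete. The sketch below uses a simple plug-in (histogram) estimate of mutual information rather than a neural estimator, and the sample size and number of categories are arbitrary choices for illustration. Each variable alone is independent of the target, so its mutual information with the target is zero in theory, while the pair together determines the target completely.

```python
import numpy as np

def mutual_information(a, b):
    """Plug-in MI estimate (in nats) for two discrete integer-coded arrays."""
    a_vals, a_idx = np.unique(a, return_inverse=True)
    b_vals, b_idx = np.unique(b, return_inverse=True)
    joint = np.zeros((len(a_vals), len(b_vals)))
    np.add.at(joint, (a_idx, b_idx), 1.0)   # count co-occurrences
    joint /= joint.sum()                     # empirical joint distribution
    outer = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / outer[mask])))

rng = np.random.default_rng(0)
n, k = 50_000, 5
x1 = rng.integers(0, k, size=n)
x2 = rng.integers(0, k, size=n)
y = (x1 == x2).astype(int)           # target: are the two variables equal?

mi_x1 = mutual_information(x1, y)    # feature 1 alone
mi_x2 = mutual_information(x2, y)    # feature 2 alone
mi_pair = mutual_information(x1 * k + x2, y)  # both features as one variable
```

With five equally likely categories, the joint estimate should land near the entropy of the target (about 0.5 nats), while the single-feature estimates stay close to zero. A filter that scores features one at a time would therefore discard both.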

In Experiment A, where 30 features were generated and only two (features 3 and 8) were expected to be relevant, MINERVA and FOCI were the only methods to achieve an exact selection. Other benchmark methods like KSG, Boruta, HSIC Lasso, RFE, and Random Forest failed, selecting all 30 features. In Experiment B, involving continuous features and nonlinear functions, MINERVA again performed an exact selection of the 10 expected features, significantly outperforming all baselines. Furthermore, when predictive performance was evaluated using a gradient boosting model, MINERVA achieved the highest out-of-sample R² score of 84.69%, demonstrating its ability to select features that are crucial for accurate prediction.

Real-World Application: Fraud Detection

MINERVA was also tested on a challenging real-world fraud dataset from a financial company, comprising 3 million samples and 214 processed features. This dataset was highly imbalanced, with fraud cases making up only 0.1% of the observations. After addressing the data imbalance using the Synthetic Minority Over-sampling Technique (SMOTE), MINERVA showed strong performance. With a regularization coefficient of 10³, MINERVA selected 160 features and achieved the highest out-of-sample recall of 0.573, indicating its effectiveness in identifying fraudulent transactions. While other methods like KSG and HSIC Lasso also performed well, MINERVA consistently demonstrated robust performance across various metrics.
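To see why recall is the metric of interest at a 0.1% fraud rate, consider the toy example below. The labels are made up for illustration and are not drawn from the dataset described above. A model that never flags fraud scores very high accuracy yet catches nothing.

```python
def recall(y_true, y_pred):
    """Fraction of actual positives that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0

def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy labels with a 0.1% positive rate: 1 fraud case in 1,000 transactions.
y_true = [1] + [0] * 999
always_legit = [0] * 1000            # a model that never flags fraud

acc = accuracy(y_true, always_legit)
rec = recall(y_true, always_legit)
```

Here `acc` is 0.999 while `rec` is 0.0, which is why accuracy alone says little on data this imbalanced.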

Conclusion

MINERVA represents a significant advancement in feature selection, particularly for datasets where target variables depend on complex, higher-order feature interactions. By combining neural estimation of mutual information with a carefully designed two-stage training process and sparsity-inducing regularizers, MINERVA can accurately identify the most informative features. Its results on both synthetic and real-world fraud datasets highlight its potential to enhance predictive performance and reduce the challenges associated with high-dimensional data. For more details, see the full research paper.
