TLDR: AFABench is the first standardized benchmark framework for Active Feature Acquisition (AFA), a technique that dynamically selects informative features to balance predictive performance with acquisition costs. The benchmark includes diverse synthetic and real-world datasets and evaluates various AFA algorithms, including static, greedy, and reinforcement learning-based approaches. Key findings indicate that while non-greedy methods can be powerful for specific data structures, discriminative greedy methods often perform strongly on real-world datasets, and static methods can also be competitive, highlighting important trade-offs in AFA strategy selection.
In today’s data-driven world, acquiring every available feature for every data instance is often costly or impractical. Medical tests can be expensive or invasive, and asking users of a recommendation system for their preferences can intrude on their privacy. This is where Active Feature Acquisition (AFA) comes in. AFA dynamically selects, for each data instance, only a subset of the most informative features, aiming to achieve good predictive performance while keeping acquisition costs low.
While many methods have been proposed for AFA, ranging from simple greedy strategies to more complex reinforcement learning approaches, comparing them fairly has been a significant challenge due to a lack of standardized evaluation tools. To address this, researchers have introduced AFABench, the first comprehensive benchmark framework specifically designed for Active Feature Acquisition.
AFABench is a robust and flexible platform that allows for standardized and fair comparisons of various AFA methods. It includes a diverse collection of both synthetic and real-world datasets, ensuring that different scenarios can be tested. The framework also supports a wide array of acquisition policies and features a modular design, making it easy for researchers to integrate new methods and tasks as the field evolves.
Exploring Different AFA Strategies
The benchmark evaluates representative algorithms from all major categories of AFA. These include:
- Static Methods: These select the same set of features for every data instance, regardless of its unique characteristics. They serve as a baseline to highlight the benefits of dynamic selection.
- Greedy Methods: These approaches acquire features one at a time, each time choosing the feature expected to yield the largest immediate information gain. They can be further divided into generative methods, which model data distributions, and discriminative methods, which directly estimate improvements in prediction.
- Reinforcement Learning (RL)-based Methods: These are non-greedy approaches that frame AFA as a sequential decision-making problem. They learn acquisition policies that aim to maximize long-term rewards, potentially discovering strategies that go beyond immediate gains.
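To make the difference between static and greedy acquisition concrete, here is a minimal Python sketch. It is not AFABench code: the toy population, the helper names (`expected_gain`, `static_policy`, `greedy_policy`), and the budget of two features are all assumptions made for illustration. Because the toy population is fully enumerated, the sketch computes the exact expected drop in label entropy, standing in for the learned estimators that real generative or discriminative greedy methods would use.

```python
import itertools
import numpy as np

# Hypothetical toy population (not an AFABench dataset).
# Columns: [x0, x1, x2, y], with y = 2*x0 + (x1 if x0 == 0 else x2),
# so x0 decides which of the other two features matters.
POP = np.array([[x0, x1, x2, 2 * x0 + (x1 if x0 == 0 else x2)]
                for x0, x1, x2 in itertools.product(range(2), repeat=3)])
N_FEATURES = 3


def label_entropy(rows):
    """Shannon entropy (in bits) of the label column over the given rows."""
    p = np.bincount(rows[:, -1]) / len(rows)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def expected_gain(rows, f):
    """Expected drop in label entropy from observing feature f."""
    after = sum((rows[:, f] == v).mean() * label_entropy(rows[rows[:, f] == v])
                for v in np.unique(rows[:, f]))
    return label_entropy(rows) - after


def predict(rows):
    """Most frequent label among rows consistent with the observations."""
    return int(np.bincount(rows[:, -1]).argmax())


def run(instance, features):
    """Acquire the listed features for one instance, then predict its label."""
    rows = POP
    for f in features:
        rows = rows[rows[:, f] == instance[f]]
    return predict(rows)


def static_policy(budget):
    """Same features for every instance: the top-`budget` by marginal gain."""
    gains = [expected_gain(POP, f) for f in range(N_FEATURES)]
    return sorted(range(N_FEATURES), key=lambda f: -gains[f])[:budget]


def greedy_policy(instance, budget):
    """Per instance: repeatedly acquire the feature with the largest gain,
    conditioning on the values observed so far."""
    rows, acquired = POP, []
    for _ in range(budget):
        remaining = [f for f in range(N_FEATURES) if f not in acquired]
        best = max(remaining, key=lambda f: expected_gain(rows, f))
        acquired.append(best)
        rows = rows[rows[:, best] == instance[best]]
    return acquired


def accuracy(choose):
    """Fraction of population rows predicted correctly under a policy."""
    return float(np.mean([run(r, choose(r)) == r[-1] for r in POP]))


static_acc = accuracy(lambda r: static_policy(budget=2))
greedy_acc = accuracy(lambda r: greedy_policy(r, budget=2))
print(static_acc, greedy_acc)  # 0.75 1.0
```

On this toy data the static policy always acquires x0 and x1, so it must guess whenever x2 is the feature that matters, while the greedy policy reads x0 first and then picks whichever feature x0 points to. An RL-based policy is not sketched here, since it would require a training loop.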
To specifically test the ‘lookahead’ capabilities of AFA policies, the researchers introduced a novel synthetic dataset called AFAContext. This dataset is designed to expose the limitations of greedy selection, where a feature might not seem immediately useful but is crucial for unlocking more informative features later on.
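The kind of structure described here can be illustrated with a small Python sketch. The data below is a hypothetical toy construction, not the actual AFAContext dataset: a context feature with three values points to one of three binary features that equals the label, while the other two are noise. The helper names, the budget of two acquisitions, and the hand-written lookahead plan are likewise assumptions. Since the context alone carries no information about the label, a greedy rule based on immediate information gain does not acquire it first.

```python
import itertools
import numpy as np


def build_population():
    """Enumerate all 24 equally likely rows [context, x0, x1, x2, y].
    Feature x_context equals the label; the other two are noise bits."""
    rows = []
    for c, y, n1, n2 in itertools.product(range(3), range(2), range(2), range(2)):
        noise = iter((n1, n2))
        x = [y if j == c else next(noise) for j in range(3)]
        rows.append([c, *x, y])
    return np.array(rows)


POP = build_population()


def label_entropy(rows):
    """Shannon entropy (in bits) of the label column over the given rows."""
    p = np.bincount(rows[:, -1]) / len(rows)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def expected_gain(rows, f):
    """Expected drop in label entropy from observing feature f."""
    after = sum((rows[:, f] == v).mean() * label_entropy(rows[rows[:, f] == v])
                for v in np.unique(rows[:, f]))
    return label_entropy(rows) - after


def acquire(rows, instance, f):
    """Keep only the rows that agree with the instance on feature f."""
    return rows[rows[:, f] == instance[f]]


def predict(rows):
    """Most frequent label among rows consistent with the observations."""
    return int(np.bincount(rows[:, -1]).argmax())


def greedy_run(instance, budget=2):
    """Myopic policy: always take the feature with the largest immediate gain."""
    rows, acquired = POP, []
    for _ in range(budget):
        remaining = [f for f in range(4) if f not in acquired]
        best = max(remaining, key=lambda f: expected_gain(rows, f))
        acquired.append(best)
        rows = acquire(rows, instance, best)
    return acquired, predict(rows)


def lookahead_run(instance):
    """Hand-written non-greedy plan: read the context (zero immediate gain),
    then acquire the feature it points to."""
    c = int(instance[0])
    rows = acquire(acquire(POP, instance, 0), instance, 1 + c)
    return [0, 1 + c], predict(rows)


greedy_acc = float(np.mean([greedy_run(r)[1] == r[-1] for r in POP]))
lookahead_acc = float(np.mean([lookahead_run(r)[1] == r[-1] for r in POP]))
print(greedy_acc, lookahead_acc)
```

Under these assumptions the greedy run spends its first acquisition on a possibly irrelevant feature and should end up correct on about two thirds of the rows, whereas the two-step plan that starts with the context is always correct. A learned non-greedy policy would have to discover that plan on its own.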
Key Insights from AFABench
The empirical analysis conducted using AFABench revealed several important trade-offs and insights:
- Non-Greedy vs. Greedy: Non-greedy (RL-based) methods can in principle learn more sophisticated, non-myopic policies, but they do not always do so in practice. Their added complexity can lead to training difficulties and instability. Moreover, the results suggest that many real-world datasets may not exhibit the kind of strong non-greedy structure that would significantly benefit from these complex approaches.
- Discriminative Greedy Methods Shine: On many real-world datasets, discriminative greedy methods consistently performed among the best. They offer a strong alternative to more complex RL methods, often with significantly shorter training times.
- Static Methods Remain Competitive: For some datasets, static feature selection methods performed surprisingly well, sometimes comparable to dynamic approaches. This indicates that for certain data, selecting a fixed set of features for all instances can still be effective and more efficient to train.
- Compute Time Considerations: RL methods generally require more computational resources during training, especially those involving recurrent neural networks. However, once trained, their evaluation time can be very fast. Conversely, some oracle-based methods might have minimal training time but can be significantly slower during inference.
The introduction of AFABench marks a significant step forward in the field of Active Feature Acquisition. By providing a standardized and extensible framework, it enables fair comparisons, highlights key trade-offs, and offers actionable insights for future research in cost-sensitive learning and more effective feature acquisition strategies. The benchmark code is openly available for the community to use and extend, fostering reproducibility and further development. The full research paper is titled “AFABench: A Generic Framework for Benchmarking Active Feature Acquisition.”


