TLDR: ToxBench is a new, large-scale dataset of binding affinity labels for Human Estrogen Receptor Alpha (ERα), calculated with the physics-based AB-FEP method. It addresses limitations of previous datasets by focusing on a single target, which encourages machine learning models to learn true protein-ligand interactions. Alongside ToxBench, the authors introduce DualBind, a model that uses a dual-loss framework and outperforms the baselines it was compared against. DualBind is roughly a million times (10^6-fold) faster than AB-FEP per complex, which makes high-throughput screening for drug discovery feasible while retaining competitive accuracy.
Predicting how strongly a small molecule binds to a target protein is a cornerstone of modern drug discovery and toxicity assessment. This process, known as protein-ligand binding affinity prediction, helps scientists identify promising drug candidates and spot potential adverse effects early on. Traditionally, two main approaches have been used: machine learning (ML) models and physics-based methods like Absolute Binding Free Energy Perturbation (AB-FEP).
ML models offer the promise of speed, but their accuracy is often limited by the availability of high-quality, reliable data. Existing datasets, such as PDBBind, have been found to contain biases that can lead ML models to learn shortcuts, focusing on ligand-only or protein-only features rather than the actual protein-ligand interactions. This makes it difficult for these models to generalize to new, unseen molecules.
On the other hand, physics-based methods like AB-FEP are highly accurate, often comparable to experimental results. However, they are incredibly computationally intensive, taking hours or even days to calculate the binding affinity for a single protein-ligand complex. This makes them impractical for screening the vast libraries of molecules needed in high-throughput drug discovery.
Introducing ToxBench: A New Standard for Binding Affinity Data
To bridge this critical gap, researchers have introduced ToxBench, a groundbreaking dataset designed specifically for developing and evaluating ML models for binding affinity prediction. ToxBench is unique because it is the first large-scale dataset to use AB-FEP-calculated labels, ensuring high fidelity and reliability. Crucially, it focuses on a single, pharmaceutically important target: Human Estrogen Receptor Alpha (ERα).
ERα is vital for endocrine signaling and is a key target in both therapeutic development and toxicity assessment, as its modulation is linked to various health issues, including hormone-dependent cancers. By concentrating on a single target and providing 8,770 ERα-ligand complex structures with AB-FEP-calculated binding free energies, ToxBench creates a ‘dense’ dataset. This design encourages ML models to genuinely learn the intricate protein-ligand interactions, rather than relying on dataset-specific biases, thereby improving their ability to predict affinities for novel ligands.
The AB-FEP calculations used to generate ToxBench’s labels were rigorously validated against experimental affinities, showing an RMSE of 1.75 kcal/mol, which is close to the typical uncertainty of experimental assays. This validates the reliability of the data provided by ToxBench.
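To give a feel for what an error of 1.75 kcal/mol means in practice, the sketch below converts a free-energy error into a fold error in the dissociation constant using the standard relation ΔG = RT ln(Kd). The temperature of 298.15 K, the 1 M standard state, and the example value of -10 kcal/mol are assumptions chosen for illustration; none of them come from ToxBench.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def kd_from_dg(dg_kcal_per_mol, temperature_k=298.15):
    """Dissociation constant (molar, 1 M standard state) from dG = RT ln(Kd)."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature_k))

def fold_error_in_kd(dg_error_kcal_per_mol, temperature_k=298.15):
    """Multiplicative error in Kd implied by an additive error in dG."""
    return math.exp(abs(dg_error_kcal_per_mol) / (R_KCAL * temperature_k))

# RMSE reported in the article for AB-FEP versus experiment.
fold = fold_error_in_kd(1.75)

# Hypothetical binding free energy, for illustration only.
example_kd = kd_from_dg(-10.0)

print(f"1.75 kcal/mol corresponds to a ~{fold:.0f}-fold error in Kd")
print(f"-10 kcal/mol corresponds to Kd of about {example_kd * 1e9:.0f} nM")
```

Because the relation is exponential, errors of a kcal/mol or two translate into roughly order-of-magnitude differences in Kd.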
DualBind: A Novel ML Model for Faster, Accurate Predictions
Alongside the ToxBench dataset, the researchers also proposed a new ML model called DualBind. This model employs a novel ‘dual-loss’ strategy, combining a supervised Mean Squared Error (MSE) loss with an unsupervised Denoising Score Matching (DSM) loss. In simpler terms, MSE loss ensures the model’s predictions align with the known binding affinities, while DSM loss helps shape the model’s understanding of the energy landscape, guiding it towards stable protein-ligand structures.
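The way the two loss terms fit together can be sketched on a toy problem. The example below is not DualBind's architecture: it assumes a hypothetical quadratic energy function with hand-set parameters, treats the energy at the clean pose as the predicted affinity, and uses Gaussian noise of width `sigma` for the denoising term. All function names, parameters, and numbers are illustrative.

```python
import random

def energy(coords, center, stiffness, offset):
    """Toy energy: a quadratic well around `center` plus a constant offset."""
    return offset + 0.5 * stiffness * sum((x - c) ** 2 for x, c in zip(coords, center))

def energy_gradient(coords, center, stiffness):
    """Analytic gradient of the toy energy with respect to the coordinates."""
    return [stiffness * (x - c) for x, c in zip(coords, center)]

def mse_loss(predicted, label):
    """Supervised term: squared error between predicted and labelled affinity."""
    return (predicted - label) ** 2

def dsm_loss(coords, center, stiffness, sigma, rng):
    """Denoising score matching: perturb the pose with Gaussian noise and
    compare the model score (-grad E) with the score of the noise kernel."""
    noisy = [x + rng.gauss(0.0, sigma) for x in coords]
    model_score = [-g for g in energy_gradient(noisy, center, stiffness)]
    target_score = [-(xn - x) / sigma ** 2 for xn, x in zip(noisy, coords)]
    return sum((m - t) ** 2 for m, t in zip(model_score, target_score)) / len(coords)

def dual_loss(coords, label, center, stiffness, offset,
              sigma=0.5, dsm_weight=1.0, seed=0):
    """Weighted sum of the supervised and unsupervised terms."""
    rng = random.Random(seed)
    supervised = mse_loss(energy(coords, center, stiffness, offset), label)
    unsupervised = dsm_loss(coords, center, stiffness, sigma, rng)
    return supervised + dsm_weight * unsupervised

pose = [0.0, 1.0, 2.0]  # hypothetical 1-D "coordinates" of a bound pose

# Well centred on the observed pose, curvature 1/sigma^2, offset equal to the label.
loss_good = dual_loss(pose, label=-9.0, center=pose, stiffness=1 / 0.5 ** 2, offset=-9.0)

# Misplaced, too-shallow well with the wrong offset.
loss_bad = dual_loss(pose, label=-9.0, center=[1.0, 1.0, 1.0], stiffness=0.1, offset=-5.0)

print(loss_good, loss_bad)
```

In this toy setting both terms vanish when the energy minimum sits on the observed pose and its value matches the label, which mirrors the intuition in the text: the MSE term pins down the affinity value, and the DSM term shapes the landscape around the bound structure.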
DualBind was benchmarked against other state-of-the-art ML methods, including Chemprop (a ligand-only model) and AEV-PLIG (another interaction-aware model). It outperformed both on every evaluation metric, reaching a Pearson correlation coefficient (Rp) of 0.844 and a Root Mean Square Error (RMSE) of 2.392 kcal/mol.
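The two metrics named above measure different things, and a short sketch makes the distinction concrete. The values below are hypothetical and are not ToxBench data; the function names are illustrative.

```python
import math

def pearson_r(predicted, reference):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return cov / math.sqrt(var_p * var_r)

def rmse(predicted, reference):
    """Root mean square error, in the same units as the inputs."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical binding free energies in kcal/mol, for illustration only.
reference = [-12.0, -10.0, -8.0, -6.0]
shifted = [r + 2.0 for r in reference]  # correct trend, constant 2 kcal/mol offset

rp = pearson_r(shifted, reference)
err = rmse(shifted, reference)
print(f"Rp = {rp:.3f}, RMSE = {err:.3f} kcal/mol")
```

A model can track the trend across ligands perfectly (high Rp) while still being systematically off in absolute terms (non-zero RMSE), which is why benchmarks typically report both.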
A significant finding from the benchmark was the substantial performance difference between the ligand-only Chemprop model and the interaction-aware models like DualBind. Chemprop performed significantly worse, demonstrating that ToxBench’s design effectively prevents models from taking shortcuts and forces them to learn true protein-ligand interactions. This confirms ToxBench’s value as a robust benchmark for developing more generalizable ML methods.
The Future of Drug Discovery: High-Throughput Screening
Perhaps the most exciting implication of this research is the potential for high-throughput binding affinity prediction. While a single AB-FEP calculation can take approximately 35 hours on a powerful GPU, the trained DualBind model can predict the binding affinity for a single complex in milliseconds (around 33 ms with batching). That is a speed-up on the order of 10^6-fold, about a million times faster than a traditional AB-FEP calculation.
This dramatic increase in speed, combined with DualBind’s competitive predictive performance, means that ML models can now approximate the high-quality results of AB-FEP at a tiny fraction of the computational cost. This breakthrough paves the way for rapidly screening vast chemical libraries, significantly accelerating the target-based drug discovery process.
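The timings quoted above can be turned into a rough screening estimate. The per-complex times come from the article; the library size of one million ligands and the assumption of serial processing on a single GPU are hypothetical, chosen only to illustrate the scale.

```python
ABFEP_HOURS_PER_COMPLEX = 35.0        # from the article
DUALBIND_SECONDS_PER_COMPLEX = 0.033  # from the article (33 ms with batching)
SECONDS_PER_HOUR = 3600

def speedup():
    """Ratio of AB-FEP time to DualBind time for a single complex."""
    return ABFEP_HOURS_PER_COMPLEX * SECONDS_PER_HOUR / DUALBIND_SECONDS_PER_COMPLEX

def screening_time_hours(n_ligands, seconds_per_complex):
    """Serial wall-clock time in hours to score `n_ligands` complexes."""
    return n_ligands * seconds_per_complex / SECONDS_PER_HOUR

LIBRARY_SIZE = 1_000_000  # hypothetical library size, not from the article

ratio = speedup()
dualbind_hours = screening_time_hours(LIBRARY_SIZE, DUALBIND_SECONDS_PER_COMPLEX)
abfep_hours = screening_time_hours(
    LIBRARY_SIZE, ABFEP_HOURS_PER_COMPLEX * SECONDS_PER_HOUR
)

print(f"speed-up: about {ratio:.2e}x")
print(f"DualBind: about {dualbind_hours:.1f} GPU-hours for the library")
print(f"AB-FEP:   about {abfep_hours:.2e} GPU-hours for the library")
```

Under these assumptions the ratio comes out at a few million, consistent with a speed-up on the order of 10^6, and a library that would need tens of millions of GPU-hours with AB-FEP fits into well under a day of inference.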
In conclusion, ToxBench and DualBind represent a significant leap forward in ML-driven binding affinity prediction. ToxBench provides a much-needed, high-fidelity dataset that encourages the development of ML models capable of truly understanding protein-ligand interactions. DualBind demonstrates the power of such models to deliver accurate predictions at unprecedented speeds, promising to transform the landscape of drug discovery. The full research paper is titled ‘ToxBench: A Binding Affinity Prediction Benchmark with AB-FEP-Calculated Labels for Human Estrogen Receptor Alpha’.


