TLDR: CoTox is an AI framework that uses large language models (LLMs) with a “Chain-of-Thought” approach to predict drug toxicity. It combines chemical structure information (using human-readable IUPAC names) with biological data like pathways and Gene Ontology terms. This allows CoTox to provide step-by-step, interpretable reasoning for its predictions, outperforming traditional machine learning and other LLM methods, and offering a valuable tool for early drug development.
Drug development is a complex and often challenging process, with toxicity being a major hurdle that can lead to the failure of promising new compounds or even drug withdrawals after market approval. Traditionally, predicting drug toxicity has relied on machine learning and deep learning models. While these models have shown promise, they often require vast amounts of annotated data and frequently lack the ability to explain their predictions, making it difficult to understand the underlying biological mechanisms of toxicity.
Enter CoTox, a new framework that uses Large Language Models (LLMs) to provide more accurate and, crucially, interpretable predictions of molecular toxicity. Developed by Jueon Park, Yein Park, Minju Song, Soyon Park, Donghyeon Lee, Seungheun Baek, and Jaewoo Kang, CoTox addresses the limitations of previous approaches by integrating chemical structure data with biological context and a step-by-step reasoning process.
The CoTox Approach: Bridging Chemistry and Biology
At its core, CoTox is designed to mimic human-like reasoning. Unlike earlier LLM-based methods that often struggled with complex chemical representations like SMILES strings, CoTox utilizes IUPAC names for chemical structures. IUPAC names are the standardized, human-readable nomenclature used in the scientific community, making them easier for LLMs to interpret and connect with biological information. This seemingly small change significantly enhances the model’s ability to reason about molecular structures.
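To make the contrast concrete, here is a minimal sketch that writes one compound (propranolol, which also appears in the case studies) in both notations. The helper `describe_compound` and the wording of the prompt line are illustrative assumptions and are not taken from the CoTox code.

```python
# Two textual representations of the same molecule, propranolol.
PROPRANOLOL = {
    "smiles": "CC(C)NCC(O)COc1cccc2ccccc12",
    "iupac": "1-(naphthalen-1-yloxy)-3-(propan-2-ylamino)propan-2-ol",
}


def describe_compound(compound: dict, use_iupac: bool = True) -> str:
    """Hypothetical helper: render the structure line of an LLM prompt."""
    if use_iupac:
        return f"Compound (IUPAC name): {compound['iupac']}"
    return f"Compound (SMILES): {compound['smiles']}"


print(describe_compound(PROPRANOLOL))
print(describe_compound(PROPRANOLOL, use_iupac=False))
```

The IUPAC line spells out fragments such as "naphthalen-1-yloxy" and "propan-2-ylamino" as words, whereas the SMILES line encodes the same atoms as ring-closure digits and lowercase aromatic symbols.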
Beyond chemical structure, CoTox incorporates rich biological context. It draws information from databases like the Comparative Toxicogenomics Database (CTD) to identify relevant biological pathways and Gene Ontology (GO) terms associated with toxicity. These biological insights are then semantically filtered by an LLM (such as GPT-4o) to ensure only toxicity-related entries are used.
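The filtering step can be pictured as a function that keeps only the annotations a judge marks as toxicity-related. The sketch below is an assumption-laden illustration: `keyword_judge` is a simple offline stand-in for the LLM relevance check, and the hint list and example terms are made up for demonstration rather than drawn from CTD or the CoTox prompts.

```python
from typing import Callable, Iterable, List

# Illustrative hints only; in the approach described above an LLM makes this call.
TOXICITY_HINTS = ("apopto", "oxidative stress", "inflammat", "necro", "dna damage")


def keyword_judge(term: str) -> bool:
    """Stand-in for an LLM relevance check (assumption, runs offline)."""
    lowered = term.lower()
    return any(hint in lowered for hint in TOXICITY_HINTS)


def filter_terms(
    terms: Iterable[str],
    is_relevant: Callable[[str], bool] = keyword_judge,
) -> List[str]:
    """Keep only the terms the judge accepts, preserving input order."""
    return [term for term in terms if is_relevant(term)]


candidate_terms = [
    "Apoptosis",
    "response to oxidative stress",
    "circadian rhythm",
    "inflammatory response",
    "olfactory transduction",
]
toxicity_terms = filter_terms(candidate_terms)
print(toxicity_terms)
```

Passing the judge as a parameter means the keyword stand-in could be swapped for a function that queries an LLM without changing the rest of the pipeline.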
The magic happens through a “Chain-of-Thought” (CoT) prompting strategy. When CoTox makes a prediction, it doesn’t just give a binary answer. Instead, it follows a structured, four-step analytical process for each potential toxicity type. First, it examines relevant pathways, then analyzes GO terms for biological implications, next interprets structural features from the IUPAC name, and finally, synthesizes all this information into a coherent explanation of how a compound might induce toxicity in a specific organ system. This step-by-step reasoning provides a transparent rationale for each prediction, building trust and offering valuable insights for drug developers.
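The four steps described above could be assembled into a prompt along the following lines. This is a sketch under stated assumptions: the function `build_cot_prompt`, the step wording, and the example pathway and GO term are hypothetical, and the actual prompts are the ones published in the project repository.

```python
from typing import Sequence

# Paraphrase of the four reasoning steps; the wording is an assumption.
REASONING_STEPS = (
    "Examine the listed pathways for links to {organ} toxicity.",
    "Analyze the GO terms for their biological implications.",
    "Interpret structural features from the IUPAC name.",
    "Synthesize the findings, explain how the compound might induce "
    "{organ} toxicity, and answer Toxic or Non-toxic.",
)


def build_cot_prompt(
    iupac_name: str,
    pathways: Sequence[str],
    go_terms: Sequence[str],
    organ: str,
) -> str:
    """Hypothetical helper: build one chain-of-thought prompt per toxicity type."""
    lines = [
        f"Compound (IUPAC name): {iupac_name}",
        "Pathways: " + "; ".join(pathways),
        "GO terms: " + "; ".join(go_terms),
        "",
        "Reason step by step:",
    ]
    for number, step in enumerate(REASONING_STEPS, start=1):
        lines.append(f"Step {number}: " + step.format(organ=organ))
    return "\n".join(lines)


prompt = build_cot_prompt(
    "1-(naphthalen-1-yloxy)-3-(propan-2-ylamino)propan-2-ol",
    pathways=["Adrenergic signaling in cardiomyocytes"],  # illustrative input
    go_terms=["regulation of heart rate"],  # illustrative input
    organ="cardiac",
)
print(prompt)
```

Calling the builder once per organ system gives one prompt, and therefore one reasoning trace, for each toxicity type.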
Performance and Interpretability
In rigorous evaluations, CoTox, particularly when powered by models like GPT-4o and Gemini-2.5-Pro, consistently outperformed both traditional machine learning models (like XGBoost and Chemprop) and other LLM prompting strategies that relied solely on chemical structures or lacked explicit reasoning. The integration of biological process information proved crucial, significantly improving prediction accuracy across various toxicity types, especially for hematological and liver toxicity.
The research highlighted that using IUPAC names, which are more linguistically aligned, better supports the multi-step reasoning capabilities of LLMs compared to the more machine-readable SMILES format. This allows the models to extract more meaningful structural clues and link them effectively to biological outcomes.
Real-World Utility: Case Studies
The practical utility of CoTox was demonstrated through case studies. For instance, when predicting the toxicity of Propranolol, CoTox’s reasoning for cardiotoxicity and liver toxicity was consistent with established scientific literature and known mechanisms. Its prediction of renal non-toxicity also matched current understanding.
In another fascinating case, CoTox was used to predict the toxicity of Entecavir, even in the absence of prior biological knowledge, by inputting gene expression changes from organ-specific cell lines. While CoTox correctly predicted liver and pulmonary toxicity, it predicted renal toxicity despite the ground truth label indicating non-toxicity. However, recent clinical evidence has emerged suggesting that Entecavir may indeed pose a risk of renal function decline, indicating that CoTox has the potential to capture subtle, latent toxicity signals not yet fully documented in regulatory information.
A Step Forward for Drug Safety
CoTox represents a significant advancement in in silico toxicity prediction. By combining interpretable chemical structures, rich biological context, and a transparent reasoning process, it offers a powerful tool for early-stage drug safety assessment. This framework not only improves prediction accuracy but also provides the much-needed interpretability that can accelerate pharmaceutical development and enhance patient safety. The code and prompts used in this work are available at https://github.com/dmis-lab/CoTox.


