Unlocking Drug Safety: How CoTox Uses AI for Interpretable Toxicity Prediction

TLDR: CoTox is an AI framework that uses large language models (LLMs) with a “Chain-of-Thought” approach to predict drug toxicity. It combines chemical structure information (using human-readable IUPAC names) with biological data like pathways and Gene Ontology terms. This allows CoTox to provide step-by-step, interpretable reasoning for its predictions, outperforming traditional machine learning and other LLM methods, and offering a valuable tool for early drug development.

Drug development is a complex and often challenging process, with toxicity being a major hurdle that can lead to the failure of promising new compounds or even drug withdrawals after market approval. Traditionally, predicting drug toxicity has relied on machine learning and deep learning models. While these models have shown promise, they often require vast amounts of annotated data and frequently lack the ability to explain their predictions, making it difficult to understand the underlying biological mechanisms of toxicity.

CoTox is a new framework that uses Large Language Models (LLMs) to provide more accurate and, crucially, interpretable predictions of molecular toxicity. Developed by researchers including Jueon Park, Yein Park, Minju Song, Soyon Park, Donghyeon Lee, Seungheun Baek, and Jaewoo Kang, CoTox addresses the limitations of previous approaches by integrating chemical structure data with biological context and a step-by-step reasoning process.

The CoTox Approach: Bridging Chemistry and Biology

At its core, CoTox is designed to mimic human-like reasoning. Unlike earlier LLM-based methods that often struggled with complex chemical representations like SMILES strings, CoTox utilizes IUPAC names for chemical structures. IUPAC names are the standardized, human-readable nomenclature used in the scientific community, making them easier for LLMs to interpret and connect with biological information. This seemingly small change significantly enhances the model’s ability to reason about molecular structures.
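To make the contrast concrete, the sketch below holds both textual representations of propranolol (the compound discussed in the case studies) and prefers the IUPAC name when building the structure line of a prompt. The `Compound` class and `structure_line` function are illustrative names chosen for this example, not part of the CoTox code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Compound:
    """Hypothetical container for two textual representations of a molecule."""
    name: str
    smiles: str
    iupac_name: Optional[str] = None


def structure_line(compound: Compound) -> str:
    """Return the structure description to place in an LLM prompt.

    Prefers the human-readable IUPAC name and falls back to SMILES
    only when no IUPAC name is available.
    """
    if compound.iupac_name:
        return f"IUPAC name: {compound.iupac_name}"
    return f"SMILES: {compound.smiles}"


propranolol = Compound(
    name="Propranolol",
    smiles="CC(C)NCC(O)COc1cccc2ccccc12",
    iupac_name="1-(naphthalen-1-yloxy)-3-(propan-2-ylamino)propan-2-ol",
)

print(structure_line(propranolol))
```

The IUPAC name spells out fragments such as "naphthalen-1-yloxy" and "propan-2-ylamino" in words, whereas the SMILES string encodes the same atoms as a compact symbol sequence.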

Beyond chemical structure, CoTox incorporates rich biological context. It draws information from databases like the Comparative Toxicogenomics Database (CTD) to identify relevant biological pathways and Gene Ontology (GO) terms associated with toxicity. These biological insights are then semantically filtered by an LLM (such as GPT-4o) to ensure only toxicity-related entries are used.
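The filtering step could look roughly like the following sketch. It is an assumption about the general shape of such a filter, not the prompt or code CoTox uses: `build_filter_prompt`, `filter_terms`, and the `ask_llm` callable are hypothetical, the example terms are made up for illustration, and a stub stands in for a real model call so the example runs offline.

```python
from typing import Callable, List


def build_filter_prompt(terms: List[str], toxicity_type: str) -> str:
    """Number the candidate terms and ask which ones are relevant."""
    numbered = "\n".join(f"{i}. {t}" for i, t in enumerate(terms, start=1))
    return (
        "Below are pathway and GO terms associated with a compound.\n"
        f"Return the numbers of the terms relevant to {toxicity_type}, "
        "comma-separated, or 'none'.\n\n"
        f"{numbered}"
    )


def filter_terms(
    terms: List[str],
    toxicity_type: str,
    ask_llm: Callable[[str], str],
) -> List[str]:
    """Keep only the terms whose numbers appear in the model's reply."""
    reply = ask_llm(build_filter_prompt(terms, toxicity_type))
    kept: List[str] = []
    for token in reply.replace(",", " ").split():
        # Ignore anything that is not a valid 1-based index into `terms`.
        if token.isdigit() and 1 <= int(token) <= len(terms):
            term = terms[int(token) - 1]
            if term not in kept:
                kept.append(term)
    return kept


def fake_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call; always selects terms 1 and 3."""
    return "1, 3"


# Illustrative terms only, not taken from CTD for any specific compound.
terms = [
    "cardiac muscle contraction",
    "hair follicle development",
    "regulation of heart rate",
]
selected = filter_terms(terms, "cardiotoxicity", fake_llm)
print(selected)
```

Passing the model in as a plain callable keeps the parsing logic testable without network access; in practice `ask_llm` would wrap whichever LLM client is in use.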

The reasoning itself is driven by a “Chain-of-Thought” (CoT) prompting strategy. When CoTox makes a prediction, it doesn’t just give a binary answer. Instead, it follows a structured, four-step analytical process for each potential toxicity type. First, it examines the relevant pathways. Second, it analyzes GO terms for their biological implications. Third, it interprets structural features from the IUPAC name. Finally, it synthesizes all of this into a coherent explanation of how a compound might induce toxicity in a specific organ system. This step-by-step reasoning gives a transparent rationale for each prediction, which helps build trust and offers useful insights for drug developers.
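The four steps can be sketched as a prompt template. This is a minimal illustration of the idea under stated assumptions, not the actual CoTox prompt (which is in the project repository): the function name `build_cot_prompt`, the step wording, the final "Prediction:" line, and the example pathway and GO term are all hypothetical.

```python
from typing import List

# The four steps follow the order described in the article; the exact
# wording used by CoTox may differ.
REASONING_STEPS = [
    "Examine the relevant pathways.",
    "Analyze the GO terms for their biological implications.",
    "Interpret structural features from the IUPAC name.",
    "Synthesize the above into an explanation of how the compound "
    "might induce toxicity in this organ system.",
]


def build_cot_prompt(
    iupac_name: str,
    pathways: List[str],
    go_terms: List[str],
    toxicity_type: str,
) -> str:
    """Assemble a step-by-step prompt for one toxicity type."""
    steps = "\n".join(
        f"Step {i}: {step}" for i, step in enumerate(REASONING_STEPS, start=1)
    )
    return (
        f"Compound (IUPAC name): {iupac_name}\n"
        f"Pathways: {'; '.join(pathways) or 'none provided'}\n"
        f"GO terms: {'; '.join(go_terms) or 'none provided'}\n\n"
        f"Assess {toxicity_type} by reasoning through these steps in order:\n"
        f"{steps}\n"
        # Assumed output convention so the label can be parsed afterwards.
        "Finish with one line: 'Prediction: Toxic' or 'Prediction: Non-toxic'."
    )


# Illustrative inputs only; the pathway and GO term are not CTD query results.
prompt = build_cot_prompt(
    iupac_name="1-(naphthalen-1-yloxy)-3-(propan-2-ylamino)propan-2-ol",
    pathways=["Adrenergic signaling in cardiomyocytes"],
    go_terms=["regulation of heart rate"],
    toxicity_type="cardiotoxicity",
)
print(prompt)
```

Building one prompt per toxicity type mirrors the article's description of running the four-step analysis separately for each organ system.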

Performance and Interpretability

In the reported evaluations, CoTox, particularly when powered by models like GPT-4o and Gemini-2.5-Pro, consistently outperformed both conventional machine learning and deep learning baselines (such as XGBoost and Chemprop) and other LLM prompting strategies that relied solely on chemical structures or lacked explicit reasoning. The integration of biological process information proved crucial, improving prediction accuracy across various toxicity types, especially for hematological and liver toxicity.

The research highlighted that IUPAC names, which are closer to natural language, better support the multi-step reasoning capabilities of LLMs than the more machine-oriented SMILES format. This allows the models to extract more meaningful structural clues and link them effectively to biological outcomes.

Real-World Utility: Case Studies

The practical utility of CoTox was demonstrated through case studies. For instance, when predicting the toxicity of Propranolol, CoTox’s reasoning for cardiotoxicity and liver toxicity was consistent with established scientific literature and known mechanisms. Its prediction of renal non-toxicity also matched current understanding.

In another fascinating case, CoTox was used to predict the toxicity of Entecavir, even in the absence of prior biological knowledge, by inputting gene expression changes from organ-specific cell lines. While CoTox correctly predicted liver and pulmonary toxicity, it predicted renal toxicity despite the ground truth label indicating non-toxicity. However, recent clinical evidence has emerged suggesting that Entecavir may indeed pose a risk of renal function decline, indicating that CoTox has the potential to capture subtle, latent toxicity signals not yet fully documented in regulatory information.

A Step Forward for Drug Safety

CoTox represents a significant advancement in in silico toxicity prediction. By combining interpretable chemical structures, rich biological context, and a transparent reasoning process, it offers a powerful tool for early-stage drug safety assessment. This framework not only improves prediction accuracy but also provides the much-needed interpretability that can accelerate pharmaceutical development and enhance patient safety. The code and prompts used in this work are available at https://github.com/dmis-lab/CoTox.
