TusoAI: Automating and Enhancing Scientific Computational Methods with Agentic AI

TLDR: TusoAI is an agentic AI system that autonomously develops and optimizes computational methods for scientific tasks. It integrates domain knowledge into a ‘knowledge tree’ and uses iterative, domain-specific optimization and model diagnosis. TusoAI has demonstrated superior performance over existing methods in diverse scientific applications, including single-cell RNA-seq data denoising and satellite-based earth monitoring. Furthermore, it has significantly advanced genetics research by improving existing computational tools and uncovering novel biological insights, such as new disease-T cell subtype associations and variant-gene links.

Scientific discovery, a cornerstone of human progress, often faces a significant bottleneck: the slow and manual development of computational tools. Analyzing complex experimental data requires sophisticated software, and building these tools is a time-consuming process. Scientists typically have to review vast amounts of literature, test assumptions against real data, and then painstakingly implement these insights into efficient code. This iterative and labor-intensive cycle can severely impede the pace of new discoveries.

However, the advent of large language models (LLMs) offers a promising path forward. LLMs have demonstrated remarkable abilities in synthesizing information from literature, reasoning with empirical data, and generating specialized code. These capabilities open new avenues for accelerating the development of computational methods.

Introducing TusoAI: An Agentic System for Scientific Optimization

Existing LLM-based systems focus either on performing scientific analyses with already established methods or on developing general machine learning models. Both kinds often fall short in integrating the unique, often unstructured, knowledge specific to scientific domains. This is where TusoAI steps in.

TusoAI is an agentic AI system designed to autonomously develop and optimize computational methods for specific scientific applications. Given a scientific task description and an evaluation function, TusoAI handles the entire process, from gathering domain knowledge to refining candidate methods. Its core strength lies in its ability to integrate domain-specific knowledge into a structured ‘knowledge tree’ representation. This allows TusoAI to perform iterative, domain-specific optimization and model diagnosis, continuously improving performance across a pool of candidate solutions.
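To make the two inputs concrete, here is a minimal Python sketch of what a task description paired with an evaluation function might look like. Everything in it is an assumption for illustration: `ScientificTask`, `make_toy_denoising_task`, the toy signal, and the negative mean-squared-error score are hypothetical and are not taken from TusoAI's code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A candidate "method" maps a noisy signal to a denoised one.
# These types are hypothetical; they are not TusoAI's actual interface.
Method = Callable[[List[float]], List[float]]


@dataclass
class ScientificTask:
    description: str                     # free-text task description
    evaluate: Callable[[Method], float]  # higher score means a better method


def make_toy_denoising_task() -> ScientificTask:
    clean = [1.0, 2.0, 3.0, 4.0]   # made-up ground truth
    noisy = [1.2, 1.7, 3.4, 3.9]   # made-up noisy observation

    def evaluate(method: Method) -> float:
        output = method(noisy)
        mse = sum((o - c) ** 2 for o, c in zip(output, clean)) / len(clean)
        return -mse  # negate so that larger is better

    return ScientificTask("Denoise a toy 1-D signal.", evaluate)


def identity(signal: List[float]) -> List[float]:
    """Baseline candidate: return the input unchanged."""
    return list(signal)


def shrink_to_mean(signal: List[float]) -> List[float]:
    """Deliberately poor candidate: replace every value with the mean."""
    mean = sum(signal) / len(signal)
    return [mean] * len(signal)


def best_candidate(task: ScientificTask, pool: Dict[str, Method]) -> str:
    """Return the name of the highest-scoring candidate in the pool."""
    return max(pool, key=lambda name: task.evaluate(pool[name]))
```

In this framing, the evaluation function is the only signal the optimizer needs from the scientist: any candidate that can be scored can be compared against the rest of the pool.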

Unprecedented Performance Across Diverse Scientific Tasks

The system’s effectiveness has been rigorously tested and demonstrated through comprehensive benchmark evaluations. TusoAI consistently outperforms state-of-the-art expert methods, general machine learning engineering (MLE) agents, and other scientific AI agents across a wide array of tasks. These include complex challenges such as denoising single-cell RNA-seq data and monitoring Earth using satellite imagery.

Beyond benchmarks, TusoAI has already made significant contributions to open problems in genetics. It has improved existing computational methods, leading to a 40% power enhancement for scDRS in associating cells with diseases in simulations, and a 10.5% enrichment improvement for pgBoost in identifying ground-truth variant-gene pairs. More excitingly, TusoAI has uncovered novel biological insights, including 9 new associations between autoimmune diseases and T cell subtypes (for example, primary biliary cirrhosis with central memory T cells) and 7 previously unreported links between disease variants and their target genes (such as the glucose/HbA1c risk variant rs138917529 with GCK).

The code for TusoAI is publicly available, fostering transparency and further research. Full details are in the research paper, ‘TUSOAI: AGENTIC OPTIMIZATION FOR SCIENTIFIC METHODS’.

How TusoAI Works: A Glimpse into its Methodology

TusoAI’s robust performance stems from its unique three-step approach:

Gathering Domain Knowledge: TusoAI first retrieves and summarizes key scientific papers relevant to the task. An agent creates a concise technical summary from the abstract and refines it using the methods section of each paper, ensuring that optimization instructions are grounded in established best practices and recent advancements.
Building Structured Instructions: It constructs a two-level knowledge tree. The first level consists of categories of optimization strategies (e.g., ‘regularization’, ‘model architectures’, ‘single-cell noise modeling’). The second level contains specific instructions within each category. These categories and instructions are drafted and then refined using insights from the paper summaries, promoting both diversity and scientific rigor. A special diagnostic category is also predefined to guide data logging and model diagnosis.
Iterative Optimization: After initializing a pool of candidate solutions, TusoAI iteratively selects diverse top performers. It then improves these solutions through either instruction-based optimization (using sampled instructions and feedback) or diagnostic-based optimization (collecting diagnostic logs to inform improvements). A Bayesian strategy adaptively samples instruction categories based on past performance, and feedback mechanisms help prevent repetition, encouraging broad exploration of the solution space.
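The second and third steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not TusoAI's implementation. The article says only that ‘a Bayesian strategy’ samples categories, so Beta-Bernoulli Thompson sampling is used here as one plausible stand-in. The category names come from the examples above, while the instruction strings, `CategorySampler`, `optimize`, and the `try_instruction` callback are all hypothetical. The sketch also tracks a single best score rather than a pool of diverse candidates and omits diagnostic-based optimization.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Two-level knowledge tree: category -> instructions.
# Instruction strings are invented for illustration.
KNOWLEDGE_TREE: Dict[str, List[str]] = {
    "regularization": ["add an L2 penalty", "apply early stopping"],
    "model architectures": ["try a low-rank factorization"],
    "single-cell noise modeling": ["model counts with a negative binomial"],
}


@dataclass
class CategorySampler:
    """Thompson sampling with a Beta(1 + wins, 1 + losses) posterior per category."""
    categories: List[str]
    rng: random.Random
    wins: Dict[str, int] = field(default_factory=dict)
    losses: Dict[str, int] = field(default_factory=dict)

    def sample(self) -> str:
        # Draw one plausible success rate per category and pick the largest.
        draws = {
            c: self.rng.betavariate(1 + self.wins.get(c, 0),
                                    1 + self.losses.get(c, 0))
            for c in self.categories
        }
        return max(draws, key=draws.get)

    def update(self, category: str, improved: bool) -> None:
        counts = self.wins if improved else self.losses
        counts[category] = counts.get(category, 0) + 1


def optimize(
    tree: Dict[str, List[str]],
    try_instruction: Callable[[str, str, float], float],
    base_score: float,
    steps: int,
    seed: int = 0,
) -> Tuple[float, CategorySampler, List[Tuple[str, str, bool]]]:
    """Repeatedly sample a category and instruction, keeping any improvement.

    `try_instruction(category, instruction, current_best)` stands in for the
    real work (an LLM applying the instruction, then scoring the result).
    """
    rng = random.Random(seed)
    sampler = CategorySampler(list(tree), rng)
    best = base_score
    history: List[Tuple[str, str, bool]] = []
    for _ in range(steps):
        category = sampler.sample()
        instruction = rng.choice(tree[category])
        score = try_instruction(category, instruction, best)
        improved = score > best
        sampler.update(category, improved)
        if improved:
            best = score
        history.append((category, instruction, improved))
    return best, sampler, history
```

Categories whose instructions have helped before get a posterior concentrated on higher success rates, so they are sampled more often, while the random draws still leave room to revisit less-tried categories.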

Ablation studies confirm the critical role of each component: removing categories, Bayesian sampling, diagnosis, or domain knowledge significantly reduces performance and diversity. Interestingly, TusoAI demonstrates robustness across various LLM backbones, with lower-latency models often performing just as well as more powerful, but costlier, models when integrated into TusoAI’s system design.

Real-World Impact in Genetics

The case studies in genetics highlight TusoAI’s practical utility. For detecting disease-critical cell populations, TusoAI optimized the scDRS method, leading to a 40% increase in power in simulations and identifying 21% more true cell type-disease associations. The optimized scoring function was not only more powerful but also concise and interpretable. Similarly, for linking genetic variants to target genes, TusoAI significantly improved the pgBoost method, achieving higher enrichment of gold-standard links and uncovering novel variant-gene connections, such as linking a glucose/HbA1c risk variant to the GCK gene.

In both cases, TusoAI achieved these improvements by efficiently exploring a vast solution space at a fraction of the time and cost of manual expert effort, demonstrating its potential to accelerate scientific discovery.
