
Self-Adaptive AI: Giving Scientists Control Over Reasoning Processes

TLDR: CLIO (Cognitive Loop via In-Situ Optimization) is a new AI approach that allows large language models (LLMs) to self-adapt their reasoning in real-time without extra training. It significantly improves accuracy on science questions (e.g., 22.37% on HLE biology/medicine with GPT-4.1, a 161.64% relative increase over base GPT-4.1) and provides transparency into its thought process through graph structures and uncertainty monitoring. This gives scientists unprecedented control and understanding of AI’s decision-making, fostering better human-AI collaboration in scientific discovery.

Artificial intelligence is rapidly transforming scientific discovery, but a key challenge remains: giving scientists precise control over how AI models think and reason. Traditional AI development often falls short, either by embedding human-like thought patterns into non-reasoning models or by abstracting away the intricate details of reasoning from the user. This lack of steerability can be a significant hurdle, especially in high-stakes scientific domains where accuracy and transparency are paramount.

Addressing this, a new approach called Cognitive Loop via In-Situ Optimization (CLIO) has been introduced. Developed by Newman Cheng, Gordon Broadbent, and William Chappell from Microsoft Discovery and Quantum, CLIO empowers large language models (LLMs) to self-formulate problem-solving strategies, adapt their behavior when uncertain, and ultimately provide scientists with well-reasoned answers. Unlike methods that rely on extensive post-training, CLIO optimizes thinking in real-time during inference, without requiring additional data or training cycles. This innovative system is designed to be an alternative or complement to reinforcement learning post-training, enhancing non-reasoning models’ ability to tackle complex problems and choose the most effective approach.

One of CLIO’s core strengths is its open design, which allows scientists to observe the model’s uncertainty levels and understand how its final conclusions are reached through graph structures. This transparency is crucial for building trust and enabling human experts to interject corrections when needed. The system’s ability to adapt and self-correct is inspired by the neuroplasticity of the human brain, which can create, modify, or remove neural connections based on experience. CLIO embodies this by dynamically adjusting its internal strategy through editable parameters, particularly to resolve self-recognized uncertainties during execution.
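The idea of an editable internal strategy that reacts to self-recognized uncertainty can be sketched in a few lines. This is an illustrative toy, not the paper's implementation; the state fields, threshold, and adaptation rule are all assumptions for demonstration.

```python
from dataclasses import dataclass, field

@dataclass
class CognitiveState:
    """Hypothetical editable strategy state, loosely inspired by CLIO's
    self-adjusting internal parameters (field names are illustrative)."""
    exploration_breadth: int = 3
    uncertainty_log: list = field(default_factory=list)

def step(state: CognitiveState, uncertainty: float, threshold: float = 0.5) -> CognitiveState:
    """Record self-reported uncertainty; widen exploration when the model
    flags that it is unsure (a stand-in for in-situ self-adaptation)."""
    state.uncertainty_log.append(uncertainty)
    if uncertainty > threshold:
        state.exploration_breadth += 1  # adapt the strategy mid-run
    return state

state = CognitiveState()
for u in [0.2, 0.7, 0.6, 0.3]:  # two readings exceed the threshold
    step(state, u)
print(state.exploration_breadth)  # 3 + 2 = 5
```

The key point the sketch captures is that adaptation happens during execution, driven by the model's own uncertainty signal, rather than by retraining.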

CLIO’s architecture incorporates both breadth-wise and depth-wise exploration capabilities. For breadth, it draws inspiration from existing techniques like chain-of-thought prompting, allowing it to explore many different options. For depth, CLIO introduces a novel recursive mechanism, enabling it to invoke itself and create independent “thought channels.” These clean context windows prevent the pollution of aggregated context with incomplete thoughts, allowing for deep dives into specific areas of exploration. To prevent endless exploration, CLIO includes algorithmic controls over its “cognitive depth,” similar to configurable reasoning effort levels in other models, ensuring efficient and focused problem-solving.
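The recursive "thought channel" idea, where each sub-exploration gets a clean context bounded by a depth budget, can be illustrated with a toy recursion. The `think` callable stands in for an LLM call that proposes sub-questions; the function and its parameters are illustrative assumptions, not CLIO's actual interface.

```python
def explore(question, think, depth=0, max_depth=2, breadth=2):
    """Toy recursive thought channel: each call gets its own fresh
    context list (a 'clean context window'), so half-formed thoughts
    from one branch never pollute another. `max_depth` caps cognitive
    depth, analogous to a configurable reasoning-effort level."""
    context = []  # clean context window for this channel
    for sub in think(question)[:breadth]:  # breadth-wise exploration
        if depth < max_depth:
            # depth-wise exploration: recurse into an independent channel
            context.append(explore(sub, think, depth + 1, max_depth, breadth))
        else:
            context.append(sub)  # depth budget exhausted: keep the leaf thought
    return {"question": question, "thoughts": context}

# Dummy "LLM" that splits a question into two sub-questions.
toy_think = lambda q: [q + ".a", q + ".b"]
tree = explore("q", toy_think, max_depth=1)
print(tree["thoughts"][0]["thoughts"])  # ['q.a.a', 'q.a.b']
```

Because every recursive call builds its own `context` list, deep dives stay isolated until their results are aggregated back up, which is the property the article attributes to CLIO's independent thought channels.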

A significant innovation in CLIO is its method for overcoming the “over-indexing challenge” often faced by agentic optimization approaches that ensemble multiple perspectives. Instead of relying solely on prompt-based reduction, CLIO leverages graph structures to reduce noise and synthesize a balanced perspective. It uses GPT-4.1 to extract entities and relationships from its thought processes, which are then clustered and summarized. This graph representation is then queried to produce a final answer, especially when CLIO is configured for “more thinking,” which involves multiple runs with different configurations to build a comprehensive joint graph of all sampled thought sequences.
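The noise-reduction effect of merging many sampled thought sequences into one joint graph can be shown with a minimal sketch. The (subject, relation, object) triple format and the support threshold are assumptions for illustration; in CLIO the extraction is done by GPT-4.1, not by hand.

```python
from collections import Counter

def build_joint_graph(runs):
    """Merge (subject, relation, object) triples from multiple thought
    sequences into one weighted edge set; claims repeated across runs
    accumulate support, damping noise from any single run."""
    edges = Counter()
    for triples in runs:
        edges.update(triples)
    return edges

def answer_from_graph(edges, min_support=2):
    """Keep only claims supported by at least `min_support` runs."""
    return [edge for edge, count in edges.items() if count >= min_support]

# Two sampled thought sequences agree on one claim and diverge on others.
runs = [
    [("aspirin", "inhibits", "COX-1"), ("aspirin", "treats", "fever")],
    [("aspirin", "inhibits", "COX-1"), ("aspirin", "causes", "ulcers")],
]
print(answer_from_graph(build_joint_graph(runs)))
# -> [('aspirin', 'inhibits', 'COX-1')]
```

The point is structural rather than prompt-based reduction: instead of asking a model to reconcile conflicting ensemble outputs in text, agreement emerges from edge weights in the joint graph.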

In evaluations, CLIO demonstrated impressive performance. When paired with OpenAI’s GPT-4.1, CLIO achieved an accuracy of 22.37% on text-based biology and medicine questions from Humanity’s Last Exam (HLE), without any further post-training. That is a net gain of 13.82 percentage points (a 161.64% relative increase) over the base GPT-4.1 model. Furthermore, CLIO surpassed OpenAI’s o3 model at both high and low reasoning-effort settings, showing that it can raise a completion model’s performance to be on par with reasoning-class models. The system also exhibited greater stability and less variability than o3 across multiple runs, thanks to its graph-based structure and multi-resolution information querying.
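The reported figures are internally consistent, as a quick arithmetic check confirms:

```python
# A 13.82-point net gain over base GPT-4.1 implies a base accuracy of
# 22.37 - 13.82 = 8.55%, and 13.82 / 8.55 ~= 161.64% relative improvement,
# matching the relative increase quoted in the article.
clio_acc, net_gain = 22.37, 13.82
base_acc = clio_acc - net_gain
relative = net_gain / base_acc * 100
print(round(base_acc, 2), round(relative, 2))  # 8.55 161.64
```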

Beyond accuracy, CLIO provides critical insights into its internal workings. The research revealed that oscillations in its internal uncertainty measures are key indicators of result accuracy. For instance, a negative gradient of uncertainty over time often correlates with correct answers, while a positive gradient or high volatility signals incorrect answers or areas where human intervention might be beneficial. This transparency allows scientists to understand when the model’s decisions can be trusted and when experts need to interject, fostering a more effective human-machine collaboration. The chains of thought produced by CLIO were also found to be more similar to human-annotated rationales than those of base models, further enhancing trust and explainability.
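The gradient-of-uncertainty signal can be illustrated with a simple least-squares slope over a recorded uncertainty series. This is an illustrative proxy for the paper's analysis, not its actual metric:

```python
def uncertainty_trend(series):
    """Least-squares slope of an uncertainty time series. Per the
    article, a falling trend (negative gradient) tended to accompany
    correct answers, while a rising or volatile trend flagged answers
    that may warrant expert review."""
    n = len(series)
    mean_x, mean_y = (n - 1) / 2, sum(series) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

# Steadily falling uncertainty: the kind of trace that correlated
# with correct answers in the reported experiments.
print(uncertainty_trend([0.9, 0.7, 0.5, 0.3]) < 0)  # True
```

A monitoring loop built on a signal like this is what lets a scientist decide mid-run whether to trust the model or step in.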

The development of CLIO marks a significant step towards creating AI agents that are not only powerful but also transparent and steerable. By enabling real-time adaptation and exposing its internal belief states, CLIO puts scientists in the driver’s seat, allowing them to correct thought patterns and understand the reasoning process. This is particularly vital for long-running LLM agents engaged in high-value tasks like drug discovery or materials science, where the ability to control and monitor the AI’s reasoning is essential for reliable and defensible scientific outcomes. For more details, you can refer to the full research paper: Cognitive Loop via In-Situ Optimization: Self-Adaptive Reasoning for Science.

Future work on CLIO will focus on optimizing its performance across accuracy, cost, and time. Researchers are exploring how control variables like temperature and depth influence performance, and how CLIO can effectively combine different reasoning and non-reasoning models (e.g., GPT-4.1 with o3, or Microsoft’s Phi-4 and xAI’s Grok-4) to solve problems that individual models cannot. While CLIO’s recursive design demands computational resources, the potential for novel scientific discoveries often outweighs the cost. Early tests also show CLIO’s capacity to autonomously orchestrate scientific tools for extended periods, paving the way for mid-stream steering to influence scientific outcomes directly.

Karthik Mehta
