TLDR: This research introduces a unified framework for building AI knowledge engines, bridging the structured (knowledge graphs) and unstructured (large language models) paradigms. It identifies two core forces: ‘structure formation,’ where language modeling objectives induce interpretable patterns in both kinds of models, and ‘destructuring for plasticity,’ where periodic ‘active forgetting’ of embedding weights helps knowledge graph models generalize to new entities and lets large language models adapt to new languages with less data. The findings advocate a balanced interplay of the two forces to create more adaptable, transparent, and controllable AI systems.
The quest to build truly intelligent machines has long fascinated humanity, much like the Industrial Revolution transformed our physical capabilities. At the heart of this endeavor lies the creation of ‘knowledge engines’ – systems capable of acquiring, consolidating, retrieving, and updating information to navigate our complex world.
Traditionally, artificial intelligence has approached this challenge through two distinct paths: the structured paradigm and the unstructured paradigm. The structured approach, exemplified by knowledge graphs, relies on predefined symbolic representations and explicit relationships. Think of it like a meticulously organized library where every piece of information has a specific place and relationship. This method excels in tasks requiring precision, consistency, and logical reasoning, powering systems like medical expert systems, search engines, and recommendation platforms.
In contrast, the unstructured paradigm, dominated by modern large language models (LLMs) like ChatGPT, thrives on vast amounts of raw, unorganized data, such as web text. These models scale massive transformer architectures to learn implicit patterns and generate flexible responses. They are powerful for tasks demanding creativity, broad understanding, and generative capabilities, from answering diverse questions to creating content.
For a long time, these two paradigms seemed to be at odds, with the unstructured approach gaining significant traction due to its impressive scalability. However, new research from Yihong Chen at University College London suggests that these two paths are not so different after all. In a groundbreaking thesis titled “Structure and Destructure: Dual Forces in the Making of Knowledge Engines”, Chen proposes a unified framework that bridges these seemingly disparate approaches, revealing two fundamental forces at play in all knowledge engines.
The Force of Structure Formation
The first key connection identified is ‘structure formation’. This research demonstrates that regardless of whether data is explicitly organized (like in a knowledge graph) or raw and unorganized (like web text), the act of training models with ‘language modeling’ objectives naturally induces structural patterns within their computations. Language modeling, at its core, involves predicting a token (like a word or an entity) based on its surrounding context.
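To make the objective concrete, here is a minimal sketch of context-based token prediction. It uses a toy bigram count model rather than a neural network (an assumption for brevity; the thesis concerns neural language models), but the training signal is the same: predict the next token from its context.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; any token sequence works.
corpus = "the cat sat on the mat the cat ate".split()

# Count bigram transitions: how often each token follows each context token.
transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def predict_next(context_token):
    """Return the most likely next token given the previous token."""
    counts = transitions[context_token]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" follows "the" most often here
```

A neural language model replaces the count table with learned parameters, but it is optimized for exactly this kind of local prediction.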
For structured data, this means that language modeling can effectively complete missing links in knowledge graphs, helping models learn better representations of entities and relationships. Surprisingly, this local prediction task helps models grasp the global structure of the entire knowledge graph.
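As an illustration of link completion, the sketch below trains TransE-style embeddings (head + relation ≈ tail) on two toy triples and then ranks candidate tails. The TransE scoring function and the gradient-free update rule are assumptions for this sketch, not the thesis's specific model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy knowledge graph: (head, relation, tail) triples.
entities = ["paris", "france", "berlin", "germany"]
relations = ["capital_of"]
triples = [("paris", "capital_of", "france"),
           ("berlin", "capital_of", "germany")]

e_idx = {e: i for i, e in enumerate(entities)}
r_idx = {r: i for i, r in enumerate(relations)}

dim = 16
E = rng.normal(scale=0.1, size=(len(entities), dim))   # entity embeddings
R = rng.normal(scale=0.1, size=(len(relations), dim))  # relation embeddings

def score(h, r, t):
    """TransE score: smaller head+relation-tail distance = more plausible."""
    return -np.linalg.norm(E[e_idx[h]] + R[r_idx[r]] - E[e_idx[t]])

# Nudge embeddings so observed triples satisfy head + relation ~= tail.
lr = 0.05
for _ in range(200):
    for h, r, t in triples:
        diff = E[e_idx[h]] + R[r_idx[r]] - E[e_idx[t]]
        E[e_idx[h]] -= lr * diff
        R[r_idx[r]] -= lr * diff
        E[e_idx[t]] += lr * diff

# Link completion: which tail best completes (paris, capital_of, ?)
best = max(entities, key=lambda t: score("paris", "capital_of", t))
print(best)
```

Predicting the missing tail entity here is structurally the same task as predicting the next token in text, which is the connection the thesis draws.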
For unstructured data, the same principle applies. The thesis shows how language modeling objectives in LLMs lead to the emergence of interpretable ‘n-gram’ structures (patterns of words or tokens). These hidden patterns can be extracted to understand how LLMs work internally, track their learning progress during training, and even analyze the effects of fine-tuning. For instance, the research found that specific parts of an LLM might specialize in grammatical functions, and that the model acquires different types of word patterns at varying speeds during its pretraining phase.
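The thesis extracts these n-gram structures from model internals; as a loose surface-level analogue (an assumption for illustration, not the thesis's extraction method), counting frequent n-grams in text shows what such patterns look like:

```python
from collections import Counter

def top_ngrams(tokens, n, k=3):
    """Return the k most frequent n-grams in a token sequence."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return grams.most_common(k)

tokens = "to be or not to be that is the question".split()
print(top_ngrams(tokens, 2, k=1))  # ('to', 'be') occurs twice
```

The interesting finding is that comparable patterns emerge inside the model's computations, where they can be used to track training progress and the effects of fine-tuning.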
The Force of Destructuring for Plasticity
While structure is essential, too much rigidity can be a hindrance. The second crucial force is ‘destructuring for plasticity’. The research highlights that ‘embeddings’ – the numerical representations of symbols (like words or entities) – act as a cache for past computations. While this caching is vital for learning, excessive reliance on these fixed structures can prevent models from adapting to new, unseen information.
To counteract this, the thesis introduces ‘active forgetting’. This mechanism involves periodically resetting parts of the learned embeddings during training. Imagine an AI system intentionally clearing its short-term memory to make space for new learning. This ‘destructuring’ forces the model to focus on local, immediate information rather than over-relying on outdated or overly specific cached knowledge.
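The training-loop shape of active forgetting can be sketched as follows. The reset interval, layer sizes, and the stand-in for the model body are all assumptions chosen for illustration; the key idea from the thesis is only that the embedding table is periodically re-initialized while the rest of the model keeps training.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, dim = 100, 32
embeddings = rng.normal(scale=0.02, size=(vocab_size, dim))
body = rng.normal(scale=0.02, size=(dim, dim))  # stands in for transformer layers

reset_every = 1000  # hypothetical interval; the real schedule is a tuning choice

def reset_embeddings():
    """Active forgetting: re-initialize only the embedding table,
    leaving the model body untouched."""
    return rng.normal(scale=0.02, size=(vocab_size, dim))

for step in range(1, 5001):
    # ... one ordinary gradient step on the language-modeling loss here ...
    if step % reset_every == 0:
        embeddings = reset_embeddings()  # destructure: clear the "cache"
```

Because the body repeatedly has to rebuild useful embeddings from scratch, it learns to do so quickly – which is why, later, adapting to a new language mainly means relearning the small embedding table rather than the whole model.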
This approach has profound implications for both paradigms. In structured learning, active forgetting helps knowledge graph models generalize to entirely new entities not seen during initial training. In the unstructured world of LLMs, active forgetting during pretraining significantly improves the model’s ‘linguistic plasticity’. This means LLMs can adapt much faster and with less data to new languages, especially those vastly different from the language they were originally trained on.
Towards General Knowledge Engines
By understanding and balancing these dual forces of structure and destructure, we can move towards building more general, adaptable, and controllable AI knowledge engines. This unified perspective suggests that true intelligence in machines will come not just from building vast knowledge bases, but also from the ability to flexibly update, discard, and relearn information as reality constantly changes.
This work offers a fresh perspective on AI development, moving beyond the surface-level debate of structured versus unstructured data. It emphasizes that the dynamic interplay between forming and dismantling knowledge is key to creating AI systems that are not only powerful but also transparent, ethical, and capable of evolving alongside human needs. For more details, refer to the full research paper.