TLDR: This research introduces a unified framework for building AI knowledge engines, bridging the structured (knowledge graphs) and unstructured (large language models) paradigms. It identifies two core forces: ‘structure formation,’ where language modeling objectives induce interpretable patterns in both kinds of models, and ‘destructuring for plasticity,’ where periodic ‘active forgetting’ of embedding weights helps knowledge graph models generalize to new entities and lets large language models adapt to new languages with less data. The findings advocate a balanced interplay of the two forces to create more adaptable, transparent, and controllable AI systems.
The quest to build truly intelligent machines has long fascinated humanity, much like the Industrial Revolution transformed our physical capabilities. At the heart of this endeavor lies the creation of ‘knowledge engines’ – systems capable of acquiring, consolidating, retrieving, and updating information to navigate our complex world.
Traditionally, artificial intelligence has approached this challenge through two distinct paths: the structured paradigm and the unstructured paradigm. The structured approach, exemplified by knowledge graphs, relies on predefined symbolic representations and explicit relationships. Think of it like a meticulously organized library where every piece of information has a specific place and relationship. This method excels in tasks requiring precision, consistency, and logical reasoning, powering systems like medical expert systems, search engines, and recommendation platforms.
In contrast, the unstructured paradigm, dominated by modern large language models (LLMs) like ChatGPT, thrives on vast amounts of raw, unorganized data, such as web text. These models scale massive transformer architectures to learn implicit patterns and generate flexible responses. They are powerful for tasks demanding creativity, broad understanding, and generative capabilities, from answering diverse questions to creating content.
For a long time, these two paradigms seemed to be at odds, with the unstructured approach gaining significant traction due to its impressive scalability. However, new research from Yihong Chen at University College London suggests that these two paths are not so different after all. In a groundbreaking thesis titled “Structure and Destructure: Dual Forces in the Making of Knowledge Engines”, Chen proposes a unified framework that bridges these seemingly disparate approaches, revealing two fundamental forces at play in all knowledge engines.
The Force of Structure Formation
The first key connection identified is ‘structure formation’. This research demonstrates that regardless of whether data is explicitly organized (like in a knowledge graph) or raw and unorganized (like web text), the act of training models with ‘language modeling’ objectives naturally induces structural patterns within their computations. Language modeling, at its core, involves predicting a token (like a word or an entity) based on its surrounding context.
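To make the objective concrete, here is a minimal sketch of context-based token prediction. It uses a toy bigram count model rather than a neural network (an assumption for brevity; the thesis concerns neural language models), but the training signal is the same: predict the next token from its context.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; any token sequence works.
corpus = "the cat sat on the mat the cat ate".split()

# Count bigram transitions: how often each token follows each context token.
transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def predict_next(context_token):
    """Return the most likely next token given the previous token."""
    counts = transitions[context_token]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" follows "the" most often here
```

A neural language model replaces the count table with learned parameters, but it is optimized for exactly this kind of local prediction.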
For structured data, this means that language modeling can effectively complete missing links in knowledge graphs, helping models learn better representations of entities and relationships. Surprisingly, this local prediction task helps models grasp the global structure of the entire knowledge graph.
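As an illustration of link completion, the sketch below trains TransE-style embeddings (head + relation ≈ tail) on two toy triples and then ranks candidate tails. The TransE scoring function and the gradient-free update rule are assumptions for this sketch, not the thesis's specific model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy knowledge graph: (head, relation, tail) triples.
entities = ["paris", "france", "berlin", "germany"]
relations = ["capital_of"]
triples = [("paris", "capital_of", "france"),
           ("berlin", "capital_of", "germany")]

e_idx = {e: i for i, e in enumerate(entities)}
r_idx = {r: i for i, r in enumerate(relations)}

dim = 16
E = rng.normal(scale=0.1, size=(len(entities), dim))   # entity embeddings
R = rng.normal(scale=0.1, size=(len(relations), dim))  # relation embeddings

def score(h, r, t):
    """TransE score: smaller head+relation-tail distance = more plausible."""
    return -np.linalg.norm(E[e_idx[h]] + R[r_idx[r]] - E[e_idx[t]])

# Nudge embeddings so observed triples satisfy head + relation ~= tail.
lr = 0.05
for _ in range(200):
    for h, r, t in triples:
        diff = E[e_idx[h]] + R[r_idx[r]] - E[e_idx[t]]
        E[e_idx[h]] -= lr * diff
        R[r_idx[r]] -= lr * diff
        E[e_idx[t]] += lr * diff

# Link completion: which tail best completes (paris, capital_of, ?)
best = max(entities, key=lambda t: score("paris", "capital_of", t))
print(best)
```

Predicting the missing tail entity here is structurally the same task as predicting the next token in text, which is the connection the thesis draws.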
For unstructured data, the same principle applies. The thesis shows how language modeling objectives in LLMs lead to the emergence of interpretable ‘n-gram’ structures (patterns of words or tokens). These hidden patterns can be extracted to understand how LLMs work internally, track their learning progress during training, and even analyze the effects of fine-tuning. For instance, the research found that specific parts of an LLM might specialize in grammatical functions, and that the model acquires different types of word patterns at varying speeds during its pretraining phase.
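The thesis extracts these n-gram structures from model internals; as a loose surface-level analogue (an assumption for illustration, not the thesis's extraction method), counting frequent n-grams in text shows what such patterns look like:

```python
from collections import Counter

def top_ngrams(tokens, n, k=3):
    """Return the k most frequent n-grams in a token sequence."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return grams.most_common(k)

tokens = "to be or not to be that is the question".split()
print(top_ngrams(tokens, 2, k=1))  # ('to', 'be') occurs twice
```

The interesting finding is that comparable patterns emerge inside the model's computations, where they can be used to track training progress and the effects of fine-tuning.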
The Force of Destructuring for Plasticity
While structure is essential, too much rigidity can be a hindrance. The second crucial force is ‘destructuring for plasticity’. The research highlights that ‘embeddings’ – the numerical representations of symbols (like words or entities) – act as a cache for past computations. While this caching is vital for learning, excessive reliance on these fixed structures can prevent models from adapting to new, unseen information.
To counteract this, the thesis introduces ‘active forgetting’. This mechanism involves periodically resetting parts of the learned embeddings during training. Imagine an AI system intentionally clearing its short-term memory to make space for new learning. This ‘destructuring’ forces the model to focus on local, immediate information rather than over-relying on outdated or overly specific cached knowledge.
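The training-loop shape of active forgetting can be sketched as follows. The reset interval, layer sizes, and the stand-in for the model body are all assumptions chosen for illustration; the key idea from the thesis is only that the embedding table is periodically re-initialized while the rest of the model keeps training.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, dim = 100, 32
embeddings = rng.normal(scale=0.02, size=(vocab_size, dim))
body = rng.normal(scale=0.02, size=(dim, dim))  # stands in for transformer layers

reset_every = 1000  # hypothetical interval; the real schedule is a tuning choice

def reset_embeddings():
    """Active forgetting: re-initialize only the embedding table,
    leaving the model body untouched."""
    return rng.normal(scale=0.02, size=(vocab_size, dim))

for step in range(1, 5001):
    # ... one ordinary gradient step on the language-modeling loss here ...
    if step % reset_every == 0:
        embeddings = reset_embeddings()  # destructure: clear the "cache"
```

Because the body repeatedly has to rebuild useful embeddings from scratch, it learns to do so quickly – which is why, later, adapting to a new language mainly means relearning the small embedding table rather than the whole model.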
This approach has profound implications for both paradigms. In structured learning, active forgetting helps knowledge graph models generalize to entirely new entities not seen during initial training. In the unstructured world of LLMs, active forgetting during pretraining significantly improves the model’s ‘linguistic plasticity’. This means LLMs can adapt much faster and with less data to new languages, especially those vastly different from the language they were originally trained on.
Towards General Knowledge Engines
By understanding and balancing these dual forces of structure and destructure, we can move towards building more general, adaptable, and controllable AI knowledge engines. This unified perspective suggests that true intelligence in machines will come not just from building vast knowledge bases, but also from the ability to flexibly update, discard, and relearn information as reality constantly changes.
This work offers a fresh perspective on AI development, moving beyond the surface-level debate of structured versus unstructured data. It emphasizes that the dynamic interplay between forming and dismantling knowledge is key to creating AI systems that are not only powerful but also transparent, ethical, and capable of evolving alongside human needs. For more details, refer to the full research paper.