TLDR: A research paper by Elisabetta Rocchetti proposes a novel method to understand how Large Language Models (LLMs) learn by modeling Transformers as evolving complex networks. By representing attention heads and MLPs as nodes and causal influence as edges, the study tracks the Pythia-14M model’s training. It reveals distinct learning phases (exploration, consolidation, refinement) and identifies stable “information spreaders” and dynamic “information gatherers” and “gatekeepers,” demonstrating how the model’s internal communication architecture self-organizes to form functional circuits.
Large Language Models (LLMs) have revolutionized many fields, but their internal workings often remain a mystery. Understanding how these complex AI systems learn and develop their impressive capabilities is a major challenge in the field of mechanistic interpretability. A new research paper by Elisabetta Rocchetti introduces a novel approach to shed light on this “black box” by viewing Transformers, the architecture behind many LLMs, as evolving complex networks.
Unpacking the LLM Black Box
The ability of LLMs to learn new tasks from examples within a prompt, known as in-context learning (ICL), is a fascinating emergent property. Previous research has identified specific “circuits,” like induction heads, that are crucial for ICL and form during distinct “phase changes” early in training. While these microscopic details are being uncovered, a broader, macroscopic understanding of how the model’s overall architecture changes during these learning phases has been missing.
This is where Complex Network Theory (CNT) comes in. CNT has been successfully used to analyze other neural networks, but its application to Transformers has mostly focused on token-level interactions. Rocchetti’s work takes a different path, focusing on the internal computational components of an LLM – the attention heads and MLP (Multi-Layer Perceptron) blocks – and how they organize themselves into a functional network.
Mapping the Transformer’s Internal Connections
The core of this research involves representing a Transformer-based LLM as a directed, weighted graph. Imagine the model’s key computational units, like its attention heads and MLP blocks, as “nodes” in this network. The “edges” connecting these nodes aren’t just based on parameter weights; instead, they represent the causal influence one component has on another’s output.
To measure this influence, the researcher used an intervention-based ablation technique. Essentially, the output of a component in a normal “clean run” was compared with its output when a preceding component’s contribution was temporarily “zeroed out” or removed. The change in output, quantified by cosine similarity between the clean and ablated outputs, determined both whether an edge exists and how heavily it is weighted: the larger the change, the higher the edge weight. This process was repeated for 143 training checkpoints of the Pythia-14M model, allowing for a detailed look at how the network evolves over time as the model learns a specific induction task.
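The ablation idea can be illustrated with a minimal sketch. It is not the paper's implementation: the weight formula (one minus cosine similarity), the `threshold` value, the helper names, and the two-dimensional toy component are all assumptions made for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def edge_weight(clean_output, ablated_output, threshold=0.01):
    """Assumed scoring rule: weight = 1 - cosine similarity.

    Returns the weight if it exceeds the (hypothetical) threshold,
    otherwise None to signal that no edge is drawn.
    """
    weight = 1.0 - cosine_similarity(clean_output, ablated_output)
    return weight if weight > threshold else None

def toy_target(residual):
    """Hypothetical downstream component: a fixed nonlinear map of its input."""
    return [max(0.0, residual[0] - residual[1]), residual[0] + residual[1]]

# Toy values: the target reads the sum of the embedding and the source's output.
embedding = [1.0, 0.5]
source_out = [0.0, 2.0]

# Clean run: the source component contributes normally.
clean = toy_target([e + s for e, s in zip(embedding, source_out)])
# Ablated run: the source's contribution is zeroed out.
ablated = toy_target(embedding)

weight = edge_weight(clean, ablated)
print("edge weight source -> target:", weight)
```

In a real model, the clean and ablated outputs would come from forward passes with and without the upstream component's contribution, repeated for every candidate pair of components at each checkpoint.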
A Dynamic Learning Landscape: Key Discoveries
The analysis of these evolving networks revealed distinct phases in the model’s learning journey:
- Exploration, Consolidation, and Refinement: Early in training, the network shows rapid growth in active nodes and connections, an “exploratory” phase. This is followed by “consolidation,” where less effective components are pruned, and then a “refinement” phase where the network discovers more specialized and efficient circuits.
- Stable Information Spreaders: The study identified a remarkably stable hierarchy of “information spreaders” – components that broadcast foundational features widely. These tend to be the embedding layer and early-layer MLP blocks, establishing a fundamental pattern of information flow early on.
- Dynamic Information Gatherers: In contrast, “information gatherers” – components that integrate inputs from many predecessors – showed dynamic reconfiguration. Their roles shifted at key learning junctures, indicating the model actively discovers more efficient computational pathways as it refines its solution.
- Evolving Gatekeepers: Components acting as critical “gatekeepers” or bridges, controlling information flow, also showed dynamic rewiring. While a stable core of gatekeepers emerged early, the specific components fulfilling this role changed over time, suggesting the network actively re-routes information flow to optimize for the task.
- Increased Spreading Efficiency: Overall, the component-graph became progressively more integrated and globally efficient throughout training. More nodes acquired higher “closeness centrality,” meaning information could propagate faster and more effectively across the network.
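To make these roles concrete, here is a small sketch that computes a few graph metrics on a toy component graph. The graph, its weights, and the node names are invented for illustration. Mapping "spreaders" to total outgoing weight and "gatherers" to total incoming weight is an assumption, as is measuring closeness with unweighted hop counts along outgoing edges; the paper's exact definitions may differ.

```python
from collections import deque

# Hypothetical toy component graph: edges[src][dst] = causal-influence weight.
edges = {
    "embed": {"mlp0": 0.9, "head1.0": 0.6, "mlp1": 0.3},
    "mlp0": {"head1.0": 0.5, "mlp1": 0.4},
    "head1.0": {"mlp1": 0.7},
    "mlp1": {},
}

def out_strength(graph):
    """Total outgoing weight per node (assumed proxy for 'information spreaders')."""
    return {node: sum(nbrs.values()) for node, nbrs in graph.items()}

def in_strength(graph):
    """Total incoming weight per node (assumed proxy for 'information gatherers')."""
    totals = {node: 0.0 for node in graph}
    for nbrs in graph.values():
        for dst, w in nbrs.items():
            totals[dst] += w
    return totals

def outward_closeness(graph, source):
    """Reachable-node count divided by the sum of hop distances from source.

    Distances are unweighted hop counts found by breadth-first search.
    Returns 0.0 for a node that reaches no other node.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    reached = len(dist) - 1
    total = sum(dist.values())
    return reached / total if total else 0.0

spread = out_strength(edges)
gather = in_strength(edges)
top_spreader = max(spread, key=spread.get)
top_gatherer = max(gather, key=gather.get)
print("top spreader:", top_spreader)
print("top gatherer:", top_gatherer)
print("closeness:", {n: outward_closeness(edges, n) for n in edges})
```

Recomputing such metrics at every checkpoint and comparing the rankings over time is one way to see which roles stay fixed and which get reassigned during training.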
A New Lens for Understanding LLMs
These findings demonstrate that a component-level network perspective offers a powerful way to visualize and understand the self-organizing principles that drive the formation of functional circuits in LLMs. By tracking macroscopic metrics like node degree and centrality, researchers can gain tangible insights into the model’s learning process, from broad exploration to the formation and refinement of specialized computational circuits.
While this study provides a proof-of-concept using a smaller model and a specific task, it opens exciting avenues for future research. Applying this methodology to larger models and different tasks, and exploring various input dependencies, could further deepen our understanding of how LLMs truly learn and adapt.


