TLDR: A new study reveals that human brains and deep neural networks process visual information through a remarkably similar two-pathway organization. A medial-ventral stream handles scene structure, while a lateral-dorsal stream, centered on a lateral occipitotemporal cortex (LOTC) hub, specializes in social and biological content, suggesting that both systems converge on the same computational solution for visual encoding, driven by the structure of the environment.
Have you ever wondered if everyone sees the world in the same way? Or if the way our brains process visual information is similar to how artificial intelligence (AI) systems “see”? A recent study delves into these fascinating questions, revealing that human brains and advanced AI models share a surprisingly similar approach to understanding the visual world around us.
For a long time, scientists have debated what truly shapes our visual perception: is it the inherent structure of the world itself, or is it primarily influenced by the unique architecture of our brains? While individual differences certainly play a role, research has shown that when people look at the same things, their brains often show similar patterns of activity. This suggests there might be a common way our brains represent what we see.
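To make "similar patterns of activity" concrete, here is a minimal sketch of one common way to quantify it: correlate two subjects' voxel response patterns to the same images. The data, region size, and noise level below are all synthetic placeholders, not values from the study.

```python
# A minimal, illustrative sketch of inter-subject similarity:
# if two people view the same images, how correlated are their
# response patterns in a matched brain region? (Synthetic data.)
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: responses of two subjects to the same 100 images,
# each summarized as a 50-voxel pattern from the same region.
subj_a = rng.standard_normal((100, 50))
subj_b = subj_a + 0.5 * rng.standard_normal((100, 50))  # shared signal + noise

def pattern_corr(x, y):
    """Pearson correlation between two z-scored response patterns."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    return float(np.mean(x * y))

# Correlate the two subjects' pattern for each image, then average.
isc = np.mean([pattern_corr(a, b) for a, b in zip(subj_a, subj_b)])
print(f"Mean inter-subject pattern correlation: {isc:.2f}")
```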
This new research, led by Pablo Marcos-Manchón and Lluís Fuentemilla, introduces a novel framework to trace how visual information flows from basic sensory input to more complex, high-level understanding in both human brains and deep neural networks (DNNs). They applied this framework to three different fMRI datasets, which capture brain activity while people view visual scenes.
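The standard technique for this kind of brain-to-model comparison is representational similarity analysis (RSA): summarize each system's "representational geometry" as a matrix of pairwise dissimilarities between stimuli, then correlate the two geometries. The hedged sketch below uses synthetic data and is not the authors' code, but it shows the core computation for one brain region and one DNN layer.

```python
# A hedged RSA sketch (synthetic data; the paper's actual framework
# may differ in details such as distance metric and noise correction).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n_images = 60

# Hypothetical inputs for the same 60 images: fMRI voxel patterns
# from a region of interest, and activations from one DNN layer.
brain_patterns = rng.standard_normal((n_images, 200))  # images x voxels
layer_acts = rng.standard_normal((n_images, 512))      # images x units

# Each representational dissimilarity matrix (RDM) holds pairwise
# correlation distances between images (condensed vector form).
brain_rdm = pdist(brain_patterns, metric="correlation")
layer_rdm = pdist(layer_acts, metric="correlation")

# Brain-model alignment = rank correlation between the two geometries.
rho, _ = spearmanr(brain_rdm, layer_rdm)
print(f"Brain-layer representational alignment: rho = {rho:.2f}")
```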
The findings are quite remarkable. The study uncovered a brain-wide network, consistent across different individuals, that is organized into two main pathways. One pathway, located in the medial-ventral part of the brain, appears to be specialized in processing the structure of scenes, like recognizing landscapes or rooms. The other, a lateral-dorsal pathway, is tuned to detect social and biological content, such as people or animals.
What’s even more intriguing is that this functional organization is mirrored in the hierarchies of vision-based deep neural networks. These AI models, trained on vast amounts of images, process visual information in a way that aligns with how our brains do. However, language models, which are trained on text, did not show this same alignment, reinforcing that this specific visual-to-semantic transformation is unique to visual processing systems.
The researchers found that areas in the early visual cortex, responsible for basic features, aligned with the shallowest layers of AI models. A “ventral hub” in the brain, involved in scene and object recognition, showed alignment across many layers of the AI models, suggesting it integrates both simple visual features and more abstract information. Most notably, a “lateral occipitotemporal cortex (LOTC) hub” showed strong alignment only with the deepest, most semantic layers of the vision models. This LOTC hub was particularly sensitive to the presence of biological agents (people or animals) in the images, confirming its role in processing social and animate content.
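To see how such depth-specific profiles can be read out, here is a toy extension of the RSA sketch above, again with synthetic data and hypothetical layer names: correlating one region's RDM against every layer of a vision model yields an alignment profile whose peak indicates whether that region tracks shallow visual features or deep semantic ones.

```python
# A toy layer-wise alignment profile (assumed setup, not the paper's code):
# correlate one region's RDM with the RDM of every DNN layer.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_images = 60
region_rdm = pdist(rng.standard_normal((n_images, 200)), "correlation")

# Hypothetical activations from five layers of a vision model.
layer_acts = {f"layer_{i}": rng.standard_normal((n_images, 256))
              for i in range(1, 6)}

# An early-visual region would peak at shallow layers; an LOTC-like
# hub would peak only at the deepest, most semantic layers.
for name, acts in layer_acts.items():
    rho, _ = spearmanr(region_rdm, pdist(acts, "correlation"))
    print(f"{name}: rho = {rho:+.2f}")
```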
This discovery provides strong support for the idea of a “third visual pathway” in the brain, specifically dedicated to social perception. It suggests that both human and artificial vision systems have converged on a similar computational solution for encoding visual information, largely driven by the inherent structure of the external world.
The study also demonstrated the robustness of these findings across different tasks and stimulus sets. Whether participants were performing a memory task or a valence judgment task, and whether they viewed complex scenes or isolated objects, the core principle of shared representational geometry between brains and models held, though how strongly higher-order areas were engaged varied with the complexity of the visual input.
This work offers a powerful, data-driven approach to understanding how information is processed in the human brain. By combining insights from how brains represent information across individuals, how they align with AI model hierarchies, and by breaking down the content encoded in specific brain regions, scientists can gain a more complete picture of brain function. This research not only deepens our understanding of visual perception but also highlights the potential of deep neural networks as valuable tools for exploring the mysteries of the human mind. You can read the full research paper here.


