TLDR: A new research paper argues that Artificial Intelligence’s reliance on optimization methods inherently leads to fundamental flaws like catastrophic forgetting and overfitting, preventing the development of true artificial cognition. The paper formally proves these limitations and proposes that world-modelling approaches, such as Synthetic Cognition, offer a more promising path for achieving Artificial General Intelligence by building abstract representations of the world rather than solely optimizing for specific tasks.
The field of Artificial Intelligence (AI) has made incredible strides, achieving feats once thought exclusive to human cognition, such as passing the Turing Test. However, a new research paper titled “Optimisation Is Not What You Need” by Alfredo Ibias from Avatar Cognition argues that the very foundation of much of modern AI – optimization methods – contains fundamental flaws that prevent the development of true artificial cognition.
The Core Problem with Optimization
The paper highlights two major issues inherent to optimization methods: catastrophic forgetting and overfitting. These problems, the author formally proves, are not merely bugs to be patched but intrinsic limitations of any AI system that relies on adjusting weights to minimize a loss function.
Catastrophic forgetting occurs when an AI model, after learning a new task, abruptly and drastically forgets previously acquired information. Imagine a student who, after learning algebra, completely forgets how to do basic arithmetic. This is a significant hurdle for developing AI that can continuously learn and adapt over time, similar to how humans acquire knowledge throughout their lives.
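To make the failure concrete, here is a minimal sketch (illustrative, not from the paper) in plain NumPy: a logistic regression learns one synthetic task, then a second task whose optimum conflicts with the first, and its accuracy on the first task collapses because the same weights are overwritten.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(centre):
    """Two Gaussian blobs: class 0 around -centre, class 1 around +centre."""
    X = np.vstack([rng.normal(-centre, 1.0, (200, 2)),
                   rng.normal(+centre, 1.0, (200, 2))])
    y = np.array([0.0] * 200 + [1.0] * 200)
    return X, y

def train(w, b, X, y, epochs=300, lr=0.1):
    """Plain logistic-regression gradient descent; each step overwrites w and b."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

def accuracy(w, b, X, y):
    return float(np.mean(((X @ w + b) > 0) == (y == 1.0)))

# Task B mirrors task A's classes, so B's optimum conflicts with the
# weights that solve A.
X_a, y_a = make_task(np.array([2.0, 0.0]))
X_b, y_b = make_task(np.array([-2.0, 0.0]))

w, b = np.zeros(2), 0.0
w, b = train(w, b, X_a, y_a)
print("accuracy on A after learning A:", accuracy(w, b, X_a, y_a))  # ~1.0

w, b = train(w, b, X_b, y_b)  # sequential training, no replay of task A
print("accuracy on A after learning B:", accuracy(w, b, X_a, y_a))  # ~0.0
```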
Overfitting, on the other hand, describes a situation where an AI model becomes too specialized in the data it was trained on, leading to poor performance when encountering new, unseen data. It’s like a student who memorizes answers for a test but doesn’t truly understand the concepts, failing when faced with slightly different questions. The paper points out the paradox: the better an optimization method solves its training problem, the less useful its results become for real-world generalization.
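Overfitting is just as easy to reproduce. The sketch below (again illustrative, not from the paper) fits polynomials of two degrees to a handful of noisy samples from a sine curve; the high-degree fit drives training error to almost zero while error on unseen points grows, exactly the paradox described above.

```python
import numpy as np

rng = np.random.default_rng(1)

def target(x):
    """Ground truth the model should generalize to."""
    return np.sin(2 * np.pi * x)

x_train = rng.uniform(0, 1, 12)
y_train = target(x_train) + rng.normal(0, 0.15, 12)  # 12 noisy observations
x_test = np.linspace(0, 1, 200)
y_test = target(x_test)

for degree in (3, 11):
    # polyfit minimizes squared error on the training points alone;
    # degree 11 can interpolate all 12 points, noise included.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```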
Why Optimization Methods Struggle
The paper explains that these issues stem from the weighted nature of most machine learning and deep learning algorithms, including popular ones like Artificial Neural Networks and Transformers. When these systems learn new information, they modify their internal weights. If a new task requires different weight adjustments, the old knowledge, tied to previous weight configurations, can be lost or corrupted. This is particularly problematic when an AI needs to handle multiple, distinct tasks.
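The interference is visible directly in the gradients. In the toy setup below (an illustration under simplifying assumptions, not the paper's proof), the loss gradients of two conflicting tasks, evaluated at weights that already solve the first task, point in nearly opposite directions: every update that helps task B actively undoes task A, and nothing in the update rule protects the old knowledge.

```python
import numpy as np

rng = np.random.default_rng(2)

def grad(w, X, y):
    """Gradient of the logistic loss with respect to a shared weight vector."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / len(y)

X = rng.normal(0, 1, (500, 2))
y_a = (X[:, 0] > 0).astype(float)  # task A: label is the sign of feature 0
y_b = 1.0 - y_a                    # task B: the opposite labelling

w = np.array([3.0, 0.0])           # weights that already solve task A well
g_a, g_b = grad(w, X, y_a), grad(w, X, y_b)
cos = g_a @ g_b / (np.linalg.norm(g_a) * np.linalg.norm(g_b))
print(f"cosine similarity of task gradients: {cos:.2f}")  # strongly negative
```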
A New Path: World-Modelling Alternatives
Moving beyond critique, the research explores alternative approaches, specifically “world-modelling” frameworks, which do not treat intelligence as an optimization problem. Instead, they focus on building abstract representations and models of the world from perceived inputs. The paper highlights Synthetic Cognition and JEPA (Joint-Embedding Predictive Architecture) as promising examples.
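To give a flavour of the idea, here is a deliberately tiny sketch of embedding-space prediction in the spirit of JEPA. Everything in it (the random linear encoders, the masking, the least-squares predictor) is a simplifying assumption for illustration, not the actual architecture: the point is only that the learning signal lives in representation space rather than in task-output space.

```python
import numpy as np

rng = np.random.default_rng(3)

d_in, d_emb = 16, 4
enc_context = rng.normal(0, 1, (d_emb, d_in))  # toy frozen context encoder
enc_target = rng.normal(0, 1, (d_emb, d_in))   # toy frozen target encoder

X = rng.normal(0, 1, (500, d_in))                  # observations
X_masked = X * (rng.uniform(size=X.shape) > 0.25)  # partially masked view

Z_context = X_masked @ enc_context.T  # embedding of the masked view
Z_target = X @ enc_target.T           # embedding of the full input

# The "predictor" maps the context embedding onto the target embedding;
# the objective compares representations, not raw outputs or labels.
W, *_ = np.linalg.lstsq(Z_context, Z_target, rcond=None)
mse = np.mean((Z_context @ W - Z_target) ** 2)
print(f"embedding-space prediction error: {mse:.3f}")
```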
The author empirically demonstrates the resilience of a world-modelling algorithm, Unsupervised Cognition (part of the Synthetic Cognition framework), against catastrophic forgetting and overfitting. In experiments, the system was trained on two different datasets sequentially (Wisconsin Breast Cancer and Pima Indians Diabetes) and consistently retained its accuracy on the first task even after learning the second. Similarly, repeated training on the same dataset did not lead to overfitting, as the system reached a stable internal representation and stopped making further changes.
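The paper's Unsupervised Cognition implementation is not reproduced here, but the evaluation protocol itself is easy to sketch. The baseline below is an assumption: it uses scikit-learn's SGDClassifier on two synthetic stand-in datasets (in place of Wisconsin Breast Cancer and Pima Indians Diabetes, which have different feature counts) and scores task A before and after sequentially learning task B. A weight-based learner typically loses accuracy at the second measurement, which is precisely the degradation the paper reports its system avoiding.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

def split(task):
    return train_test_split(*task, test_size=0.3, random_state=42)

# Two stand-in binary tasks with a shared feature dimensionality.
Xa_tr, Xa_te, ya_tr, ya_te = split(
    make_classification(n_samples=600, n_features=8, random_state=0))
Xb_tr, Xb_te, yb_tr, yb_te = split(
    make_classification(n_samples=600, n_features=8, random_state=7))

clf = SGDClassifier(loss="log_loss", random_state=0)
clf.partial_fit(Xa_tr, ya_tr, classes=np.array([0, 1]))
for _ in range(20):                 # phase 1: learn task A
    clf.partial_fit(Xa_tr, ya_tr)
acc_before = clf.score(Xa_te, ya_te)

for _ in range(20):                 # phase 2: learn task B, no replay of A
    clf.partial_fit(Xb_tr, yb_tr)
acc_after = clf.score(Xa_te, ya_te)

print(f"task A accuracy before learning B: {acc_before:.2f}")
print(f"task A accuracy after learning B:  {acc_after:.2f}")
```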
Broader Implications for AI Development
The paper also discusses why the optimization problem setup might be an “ill-posed proxy” for developing intelligence. True cognitive agents don’t always seek optimal solutions; they reason about possibilities. Furthermore, because optimization methods must produce an answer for every input, even when uncertain, they can yield unreliable and even dangerous outcomes, such as susceptibility to adversarial attacks. An AI’s ability to say “I do not know” is presented as a crucial cognitive trait that optimization methods often lack.
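In practice, abstention has to be bolted onto an optimization-trained classifier, typically as a confidence threshold over its output probabilities, rather than emerging natively. A minimal sketch with hypothetical softmax outputs:

```python
import numpy as np

def predict_or_abstain(probs, threshold=0.8):
    """Return a class only when the top probability clears the threshold;
    otherwise abstain. Note the model itself still produced an answer:
    the refusal is an external rule layered on top of it."""
    return [int(np.argmax(p)) if np.max(p) >= threshold else "I do not know"
            for p in probs]

# Hypothetical softmax outputs: two confident, one near-uniform.
probs = np.array([[0.95, 0.05], [0.10, 0.90], [0.55, 0.45]])
print(predict_or_abstain(probs))  # [0, 1, 'I do not know']
```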
Looking Ahead
The conclusion is clear: while weight-based optimization methods have yielded impressive results on specific tasks, they are fundamentally limited as a path to Artificial General Intelligence. The paper urges the AI community to look beyond these methods and explore new conceptual frameworks and non-optimization approaches, such as those based on world-modelling, to truly advance artificial cognition. For more details, you can read the full research paper here.