TLDR: A research paper by Alexander Michael Rusnak explores how deep learning systems can recognize and represent beauty, arguing for an objective, “hylomorphic” basis for aesthetics rather than purely subjective or socially constructed views. The study finds that beautiful images lead to more aligned representations across different AI models, suggesting a universal underlying structure. It proposes that human creative acts, by implicitly curating content towards beauty, help machines apprehend these universal representations, which gives human-machine co-creation a foundational role in understanding and cultivating beauty.
What does it truly mean for a machine to recognize beauty? This profound question lies at the heart of a new research paper by Alexander Michael Rusnak from the Digital Humanities Laboratory at École Polytechnique Fédérale de Lausanne. The paper, titled “Representing Beauty: Towards a Participatory but Objective Latent Aesthetics,” delves into the capacity of neural networks to model aesthetic judgment, suggesting that beauty might have a more objective, realist basis than previously thought.
For a long time, beauty has been considered a subjective and culturally constructed concept. However, deep learning systems are increasingly demonstrating an ability to predict image aesthetics with surprising accuracy. Rusnak’s research explores this phenomenon by examining how different AI models represent aesthetic content. The key finding is that beautiful images produce more similar and aligned representations across models trained on distinct data and modalities, whereas unaesthetic images do not. This suggests that there’s an underlying, formal structure to beautiful images that AI can detect, implying a realist foundation for aesthetics rather than just a reflection of societal values.
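To make “aligned representations” concrete, here is a minimal sketch of one way such alignment could be measured. The summary above does not say which metric the paper uses, so the mutual k-nearest-neighbor overlap below is an assumption, and the function names and toy embeddings are hypothetical illustrations rather than anything taken from the paper. The idea: embed the same images with two models, find each image's nearest neighbors in each embedding space, and check how much the two neighbor sets overlap.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_indices(embeddings, i, k):
    """Indices of the k items most similar to item i (ties broken by index)."""
    sims = [(cosine(embeddings[i], e), j)
            for j, e in enumerate(embeddings) if j != i]
    sims.sort(key=lambda t: (-t[0], t[1]))
    return {j for _, j in sims[:k]}


def mutual_knn_alignment(emb_a, emb_b, k):
    """Average overlap of k-nearest-neighbor sets across two embedding spaces.

    emb_a[i] and emb_b[i] are assumed to be two models' embeddings of the
    same image i. Returns a score in [0, 1]; 1 means identical neighborhoods.
    """
    if len(emb_a) != len(emb_b):
        raise ValueError("both models must embed the same set of images")
    n = len(emb_a)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of images")
    total = 0.0
    for i in range(n):
        shared = knn_indices(emb_a, i, k) & knn_indices(emb_b, i, k)
        total += len(shared) / k
    return total / n


# Toy data (illustrative only, not results from the paper).
model_a = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
# A rotated and rescaled copy of model_a: same neighborhood structure.
model_b_aligned = [(-2.0 * y, 2.0 * x) for x, y in model_a]
# A model that groups the same four images differently.
model_b_unaligned = [(1.0, 0.0), (0.0, 1.0), (0.9, 0.1), (0.1, 0.9)]

aligned_score = mutual_knn_alignment(model_a, model_b_aligned, k=1)
unaligned_score = mutual_knn_alignment(model_a, model_b_unaligned, k=1)
```

Under this sketch, the article's claim would correspond to sets of beautiful images yielding higher alignment scores across models than sets of unaesthetic images. Neighbor overlap is used here because it compares the structure of two embedding spaces without requiring them to share dimensions or coordinates.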
The paper introduces the concept of a “universal representation hypothesis,” which posits that different neural network models, regardless of their architecture or training data, tend to converge on similar ways of representing reality. This convergence is driven by the models’ increasing success in apprehending the world. Rusnak argues that transcendent concepts like beauty act as “binders” for this universal latent space, due to their central role in human perception and culture.
Philosophically, the paper leans towards an Aristotelian view of “hylomorphism” rather than a Platonic one. In hylomorphism, form and matter are intrinsically linked; form is an implicit order within the physical world, not a detached ideal. This perspective aligns with the idea that AI learns these “ideal” representations from the constraints of the natural environment and human-generated data. Beauty, in this context, is seen as symbolic of natural systematicity – a unifying order that connects diverse elements harmoniously, much like how beautiful objects contain a multitude of different shapes and structures unified in an organic fashion.
A crucial aspect of this research is the role of human creativity and perception. The paper argues that deep learning systems are not merely mimicking human aesthetic judgment. Instead, human perceptual and creative acts play a central role in shaping the latent spaces of these systems. When humans create, curate, and disseminate digital content – from labeling an image to constructing a Gothic cathedral – they implicitly filter it through an aesthetic lens. This “natural telos towards beauty” in human culture provides a rich, curated dataset that helps machines efficiently apprehend universal representations. Essentially, human creativity acts as a guide, distilling and projecting refined semantic meaning into the world, which then simplifies the path for AI to understand these universal forms.
The implications for human-machine co-creation are profound. If beauty were purely subjective, AI would only facilitate individual tastes. If it were purely objective and detached from human experience, machines might eventually render human artists redundant. However, Rusnak proposes a symbiotic framework: beauty has an objective (material) basis that is subjectively apprehended (phenomenological) by humans, and then projected back into the world through creative acts. Machines, by perceiving these representations at scale, can contribute meaningfully by refining, interrogating, and expanding human conceptions of beauty. This suggests a future where humans and machines collaborate to cultivate the beautiful, each bringing unique strengths to the process.
For more details, you can read the full research paper here.


