TLDR: “Negative Shanshui” is an interactive AI system that reinterprets traditional Chinese ink painting (shanshui) to highlight ecological crises. It uses a fine-tuned Stable Diffusion model, gaze-driven interaction, and virtual reality to transform serene landscapes into crisis imagery in real time, prompting viewers to reflect on human impact on the environment. Technical optimizations keep the experience smooth and immersive, and audience feedback reveals a spectrum of responses, from empathy and urgency to detachment, with some viewers moving from despair to hope.
Traditional Chinese landscape ink painting, known as Shanshui, often depicts a harmonious relationship between humans and nature, with small human figures humbly situated within vast natural scenes. This art form embodies a Daoist philosophy of oneness, emphasizing respect for the environment and a non-dominating approach to nature.
However, in our modern world, particularly with rapid economic and industrial growth, this humility towards nature is often overlooked. Human activities have become the dominant force shaping our planet, leading to significant and often irreversible impacts on geological and ecological systems. This era, often referred to as the Anthropocene, highlights the profound influence of humanity on Earth’s ecosystems.
In response to these pressing environmental challenges, a new interactive AI synthesis approach called “Negative Shanshui” has emerged. This innovative project reinterprets traditional shanshui through real-time interactive AI, allowing viewers to experience a powerful transformation from serene landscapes to crisis imagery, driven by their own gaze. The project aims to spark critical reflection on our environmental impact and encourage dialogue on sustainable coexistence.
How Negative Shanshui Works
The core of Negative Shanshui lies in its ability to generate imagery representing environmental degradation within the context of traditional shanshui paintings. It achieves this through several key components:
- Custom AI Synthesizer: A fine-tuned Stable Diffusion model, trained on a dataset of historical eco-crisis events, generates vivid imagery of Anthropocene crises.
- Inpainting and Frame Interpolation: As a viewer gazes at a part of the painting, the system uses “inpainting” to erase that area and fill it with AI-generated crisis content, seamlessly merging it into the original. “Frame interpolation” then creates smooth, dynamic morphing animations between these changes, making the transformation feel continuous.
- Gaze-based Interaction: The viewer’s gaze, tracked through a VR headset, directly drives these real-time transformations. This makes the interaction natural and immersive, placing the viewer not as a distant observer but as an embodied participant in the unfolding environmental narrative.
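The three components above can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's implementation: the function names, the soft circular mask, and the placeholder images are all assumptions, and the linear cross-fade is only a simple stand-in for the frame interpolation the system actually uses.

```python
import numpy as np

def gaze_mask(height, width, gaze_xy, radius):
    """Soft circular mask: 1.0 at the gaze point, falling to 0.0 at `radius`."""
    ys, xs = np.mgrid[0:height, 0:width]
    gx, gy = gaze_xy
    dist = np.sqrt((xs - gx) ** 2 + (ys - gy) ** 2)
    return np.clip(1.0 - dist / radius, 0.0, 1.0)

def composite(original, generated, mask):
    """Blend generated content into the original wherever the mask is non-zero."""
    m = mask[..., None]  # add a channel axis so the mask broadcasts over RGB
    return original * (1.0 - m) + generated * m

def crossfade(frame_a, frame_b, steps):
    """Linear cross-fade between two frames (assumed stand-in for interpolation)."""
    return [frame_a * (1.0 - t) + frame_b * t for t in np.linspace(0.0, 1.0, steps)]

# Placeholder data: a black "painting" and white "crisis" content.
painting = np.zeros((64, 64, 3))
crisis = np.ones((64, 64, 3))

mask = gaze_mask(64, 64, gaze_xy=(32, 32), radius=10)
target = composite(painting, crisis, mask)
frames = crossfade(painting, target, steps=5)
```

In a real pipeline, `crisis` would come from the inpainting model rather than a constant image, and the mask would follow the gaze position reported by the headset.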
The experience begins with a serene shanshui masterpiece displayed in a VR headset. As the viewer’s gaze is detected, the AI synthesis erases gazed-at areas and replaces them with anthropogenic crisis imagery. This visual progression metaphorically represents the detrimental impacts of industrialization and consumerism on natural landscapes. Animated transitions immerse the viewer in this dramatic change, prompting reflection on personal and collective roles in environmental degradation, while an AI-generated voice-over narrates contextual information. When the VR headset is removed, the distortions reverse and the image returns to the serene shanshui. The transformation is recorded for later viewing.
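One way to picture the record-and-reverse behavior is a session object that keeps every displayed frame. The sketch below is hypothetical: the `Session` class and its method names are illustrative assumptions, and strings stand in for image frames.

```python
class Session:
    """Keeps every displayed frame so the transformation can be reversed and replayed."""

    def __init__(self, initial_frame):
        self.history = [initial_frame]

    def on_gaze_update(self, new_frame):
        """Store the frame shown after a gaze-driven change."""
        self.history.append(new_frame)

    def on_headset_removed(self):
        """Return the frames in reverse order, ending on the original painting."""
        return list(reversed(self.history))

session = Session("serene")
session.on_gaze_update("partly transformed")
session.on_gaze_update("crisis")
rewind = session.on_headset_removed()
```

Because `on_headset_removed` returns a reversed copy, `history` itself stays intact and can serve as the recording for later viewing.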
Technical Innovations
The system is built with a Unity-based frontend for immersive rendering and gaze tracking, and a Python-based backend for image synthesis. Significant optimizations were implemented to ensure real-time performance: precomputing text embeddings for the AI model, using reduced-precision floating-point computation, and applying a masking and cropping strategy that focuses processing on relevant areas. Frame interpolation techniques ensure smooth animations at approximately 31 frames per second, creating a fluid and responsive experience.
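The masking and cropping idea can be illustrated as follows. This is a sketch under stated assumptions rather than the backend's actual code: `mask_bbox`, `process_cropped`, the padding value, and `fake_synthesize` (a stand-in for the diffusion inpainting call) are all hypothetical. It shows how only a padded bounding box around the mask is sent for synthesis and then pasted back, with the image stored in half precision.

```python
import numpy as np

def mask_bbox(mask, pad, threshold=0.0):
    """Padded bounding box (top, bottom, left, right) of the masked pixels, clamped to the image."""
    ys, xs = np.nonzero(mask > threshold)
    if ys.size == 0:
        return None  # nothing is masked, so there is nothing to process
    h, w = mask.shape
    top = max(int(ys.min()) - pad, 0)
    bottom = min(int(ys.max()) + 1 + pad, h)
    left = max(int(xs.min()) - pad, 0)
    right = min(int(xs.max()) + 1 + pad, w)
    return top, bottom, left, right

def process_cropped(image, mask, synthesize, pad=8):
    """Run `synthesize` on the cropped region only and paste the result back."""
    box = mask_bbox(mask, pad)
    if box is None:
        return image
    t, b, l, r = box
    out = image.copy()
    out[t:b, l:r] = synthesize(image[t:b, l:r], mask[t:b, l:r])
    return out

def fake_synthesize(patch, patch_mask):
    """Hypothetical stand-in for the inpainting model: paints masked pixels white."""
    return np.where(patch_mask[..., None] > 0, 1.0, patch)

image = np.zeros((128, 128, 3), dtype=np.float16)  # reduced-precision storage
mask = np.zeros((128, 128))
mask[40:60, 50:70] = 1  # the region to be replaced

box = mask_bbox(mask, pad=8)
result = process_cropped(image, mask, fake_synthesize)
```

Here the model would see a 36 by 36 patch instead of the full 128 by 128 image, which is where the saving comes from. The padding gives the synthesis some surrounding context so the new content can blend into the original.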
Audience Engagement and Impact
During its exhibition, Negative Shanshui elicited a wide range of emotional and conceptual responses from participants. Some expressed strong empathy and a desire to act, feeling empowered by the experience to consider making changes. Others reported emotional detachment, likening the experience to constant news cycles, suggesting a form of “everyday crisis fatigue.” There were also reflections on the ephemeral nature of such immersive experiences and even playful interpretations, highlighting the work’s openness to diverse meaning-making.
Notably, some viewers experienced a powerful emotional arc, moving from initial despair at the destruction to hope when the landscape reverted to its serene state. This cyclical narrative structure echoes a progression described in environmental psychology: grief, then acceptance, then forward-looking care.
Negative Shanshui stands at the intersection of contemporary art, generative AI, and immersive environmental aesthetics. It redefines shanshui as a critical tool for ecological reflection, moving beyond mere stylistic imitation to engage deeply with planetary crises. Its interaction is currently human-centric, a limitation the project acknowledges; future developments may explore non-human data inputs and expand the aesthetic framework to express resilience and interconnectedness.
This project, detailed further in the research paper available at arxiv.org/pdf/2508.16612, offers a powerful example of how art and technology can converge to foster critical environmental inquiry and inspire new perspectives on our relationship with the natural world.


