Unpacking How Sound Effects Shape Our Feelings in Music

TLDR: A new study investigates how common audio effects like distortion, reverberation, and chorus systematically alter emotional responses in music. Leveraging advanced foundation models, researchers found that effects significantly shift predicted emotions, with distortion notably increasing ‘Anger’ and decreasing ‘Calmness’. The study also revealed how different AI models react to these effects and demonstrated that real-world artistic effect chains create the most profound and coherent emotional shifts, highlighting the intentional design behind music production.

Music is a powerful medium that deeply connects with our emotions. While we often focus on elements like melody, rhythm, and lyrics, a new study examines a less explored aspect: how audio effects (FX) systematically alter our emotional responses to music. Researchers from the National Technical University of Athens investigated how common audio effects such as reverberation, distortion, modulation, and dynamic range processing affect perceived emotion, using large pre-trained AI models known as foundation models.

Traditionally, studies have looked at how basic audio features correlate with emotions. However, audio effects are intentional tools used in music production to sculpt the aesthetic and emotional feel of a track. For instance, longer reverberation can evoke feelings of awe or nostalgia, while distortion might heighten arousal, leading to excitement or even aggression. Despite these individual observations, a comprehensive understanding of how diverse audio effects, alone or in combination, influence our emotional perception has been lacking.

This research bridges that gap by leveraging foundation models such as MERT, CLAP, and Qwen. These large-scale neural networks are pre-trained on vast amounts of multimodal data, allowing them to capture complex relationships between musical structure, timbre, and emotional meaning. By analyzing how these models interpret music with various effects applied, the study offers a powerful framework for understanding the emotional consequences of sound design.

The Experiment: Manipulating Sound, Measuring Emotion

The researchers conducted experiments using three diverse datasets (EMOPIA, DEAM, and witheFlow) that provide both categorical (e.g., Excitement, Anger, Sadness, Calmness) and dimensional (valence and arousal) emotional annotations. They applied six common audio effects from the pedalboard library—reverb, delay, distortion, EQ, chorus, and phaser—each scaled in intensity from 1 to 10. These effects simulate various real-world sound manipulations, from creating acoustic space with reverb to adding harmonic saturation with distortion.
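To make the idea of an intensity scale concrete, here is a minimal sketch of a distortion effect driven by a level from 1 to 10. It does not call the pedalboard library used in the study; it stands in with a plain tanh soft clipper from the Python standard library. The linear mapping from level to drive and the 40 dB ceiling are assumptions for illustration, since the article does not say how each level maps to effect parameters.

```python
import math

def intensity_to_drive_db(level, max_drive_db=40.0):
    """Map an intensity level in 1..10 linearly onto a drive in dB.

    Hypothetical mapping: the article does not specify the study's scaling.
    """
    if not 1 <= level <= 10:
        raise ValueError("level must be between 1 and 10")
    return max_drive_db * level / 10.0

def distort(samples, drive_db):
    """Apply gain, then soft-clip with tanh (a simple distortion model)."""
    gain = 10.0 ** (drive_db / 20.0)
    return [math.tanh(gain * s) for s in samples]

def crest_factor(samples):
    """Peak-to-RMS ratio; heavier clipping pushes it toward 1."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak / rms

# One second of a quiet 440 Hz sine at an 8 kHz sample rate.
sr = 8000
sine = [0.1 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]

for level in (1, 5, 10):
    out = distort(sine, intensity_to_drive_db(level))
    print(level, round(crest_factor(out), 3))
```

As the level rises, the sine is squashed toward a square wave, so the crest factor falls. That flattening is the harmonic saturation the text describes.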

The study focused on four key areas: how audio manipulations impact model accuracy, how emotion predictions shift, how the models’ internal representations (embeddings) are altered, and how real-world audio effect chains influence emotion estimation.

Key Findings: How Effects Reshape Feelings

Applying audio effects generally reduced the models’ emotion-recognition performance, which suggests that the effects alter the emotional cues the models rely on. Phaser and distortion caused the largest declines across multiple intensity levels, and all three foundation models showed similar trends as effect intensity increased.

When examining shifts in emotion predictions, the findings were particularly insightful. High levels of distortion consistently increased “Anger” predictions while simultaneously decreasing “Calmness” across all models. In contrast, delay and chorus effects introduced more variability, indicating a greater ambiguity in the models’ emotional interpretations. Interestingly, increasing the chorus effect tended to boost “Calmness” predictions in the CLAP and MERT models, while increasing the delay effect raised “Anger” predictions in the CLAP and Qwen models.
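One simple way to quantify such a shift is to average, per emotion class, the change in predicted probability between the clean and the processed version of each clip. The sketch below assumes that approach; the article does not state the exact metric the researchers used, and the probabilities are made-up numbers for illustration, not results from the study.

```python
EMOTIONS = ("Excitement", "Anger", "Sadness", "Calmness")

def mean_shift(clean_probs, fx_probs):
    """Average per-class change in predicted probability (processed minus clean)."""
    if not clean_probs or len(clean_probs) != len(fx_probs):
        raise ValueError("need the same non-zero number of clips in both lists")
    n = len(clean_probs)
    return {
        emotion: sum(fx[i] - clean[i] for clean, fx in zip(clean_probs, fx_probs)) / n
        for i, emotion in enumerate(EMOTIONS)
    }

# Made-up class probabilities for two clips (illustrative only, not study data).
clean = [[0.40, 0.10, 0.10, 0.40], [0.20, 0.20, 0.20, 0.40]]
distorted = [[0.40, 0.30, 0.10, 0.20], [0.20, 0.40, 0.20, 0.20]]

shift = mean_shift(clean, distorted)
for emotion, delta in shift.items():
    print(f"{emotion}: {delta:+.2f}")
```

A positive value means the effect pushed predictions toward that emotion, a negative value away from it. Because each row is a probability distribution, the shifts across classes sum to roughly zero.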

The study also visualized how the models’ internal “embedding space”—where emotional information is encoded—was altered by the effects. CLAP embeddings showed large, structured displacements, especially for delay, chorus, and distortion, indicating its strong sensitivity to timbral changes. Qwen also exhibited noticeable shifts, though with less consistent patterns. MERT, however, remained relatively stable, suggesting a robustness to such manipulations, likely due to its training on music-specific tasks.
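A "structured" displacement can be read as one where the embeddings of many clips move in roughly the same direction when an effect is applied. The sketch below measures that with a coherence ratio: the length of the average displacement divided by the average displacement length. This is an assumed measure chosen for illustration; the article only says that the study visualized the shifts. The two-dimensional vectors are toy values, not real model embeddings.

```python
import math

def displacement(clean, fx):
    """Vector from the clean embedding to the processed embedding."""
    return [f - c for c, f in zip(clean, fx)]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def coherence(clean_embs, fx_embs):
    """Ratio ||mean displacement|| / mean ||displacement||, between 0 and 1.

    Near 1: all clips move the same way (structured shift).
    Near 0: clips move in unrelated or opposing directions.
    """
    deltas = [displacement(c, f) for c, f in zip(clean_embs, fx_embs)]
    mean_norm = sum(norm(d) for d in deltas) / len(deltas)
    if mean_norm == 0.0:
        return 0.0  # nothing moved at all
    dim = len(deltas[0])
    mean_delta = [sum(d[i] for d in deltas) / len(deltas) for i in range(dim)]
    return norm(mean_delta) / mean_norm

# Toy 2-D embeddings (illustrative only).
clean = [[1.0, 0.0], [0.0, 1.0]]
same_direction = [[2.0, 0.0], [1.0, 1.0]]       # both clips move by (+1, 0)
opposite_direction = [[2.0, 0.0], [-1.0, 1.0]]  # clips move by (+1, 0) and (-1, 0)

print(coherence(clean, same_direction))
print(coherence(clean, opposite_direction))
```

Under this measure, a model described as stable would show small displacement lengths overall, while a model with large but inconsistent shifts would show large lengths and a low coherence ratio.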

Perhaps most compelling were the results from real-world scenarios. The researchers tested effect chains designed to mimic iconic sounds from artists like Pink Floyd, U2 (known for reverb and delay), and Rage Against the Machine (heavy distortion with moderate chorus). These artistically crafted chains produced even larger and more coherent shifts in emotion predictions compared to isolated effects. The distortion-heavy chain, for example, caused nearly unidirectional shifts, while reverb- and delay-heavy chains showed more complex paths. This highlights how intentional artistic design in sound production profoundly shapes emotional perception.
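An effect chain is simply a sequence of effects applied one after another, and the order matters. The following sketch builds a chain from two toy effects written with the standard library, a tanh distortion and a single-tap delay. The effect implementations and every parameter value are hypothetical; they are not the settings used in the study or by any of the artists mentioned.

```python
import math
from functools import reduce

def distortion(drive_db):
    """Return an effect that applies gain and then tanh soft clipping."""
    gain = 10.0 ** (drive_db / 20.0)
    def apply(samples):
        return [math.tanh(gain * s) for s in samples]
    return apply

def delay(delay_samples, mix):
    """Return an effect that blends the signal with one delayed copy (no feedback)."""
    def apply(samples):
        out = []
        for n, dry in enumerate(samples):
            wet = samples[n - delay_samples] if n >= delay_samples else 0.0
            out.append((1.0 - mix) * dry + mix * wet)
        return out
    return apply

def chain(*effects):
    """Compose effects so that the first argument is applied first."""
    def apply(samples):
        return reduce(lambda signal, effect: effect(signal), effects, samples)
    return apply

impulse = [1.0] + [0.0] * 7

# Hypothetical parameter values, chosen only to make the result easy to read.
distort_then_delay = chain(distortion(30.0), delay(3, 0.5))
delay_then_distort = chain(delay(3, 0.5), distortion(30.0))

print(distort_then_delay(impulse))
print(delay_then_distort(impulse))
```

Swapping the order changes the output: distorting first clips the impulse and then halves it in the dry/wet blend, while delaying first lets the distortion push both the direct sound and its echo back up toward full scale.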

Conclusion: A Deeper Understanding of Music’s Emotional Core

This research clearly demonstrates that audio effects are not just technical additions but fundamental elements that substantially alter the estimated emotion in music. Distortion and phaser, in particular, strongly increase “Anger” and reduce “Calmness,” while chorus and delay introduce more nuanced variability. The way these effects reshape the internal representations of foundation models provides a quantifiable measure of their emotional impact. While MERT showed resilience, CLAP and Qwen were more sensitive to these manipulations.

The findings have significant implications for music cognition, performance, and affective computing, offering a deeper understanding of how production practices influence our emotional experience of music. For more detailed information, you can read the full research paper here.

Rhea Bhattacharya
Rhea Bhattacharya is an AI correspondent with a keen eye for cultural, social, and ethical trends in Generative AI. With a background in sociology and digital ethics, she delivers high-context stories that explore the intersection of AI with everyday lives, governance, and global equity. Her news coverage is analytical, human-centric, and always ahead of the curve. You can reach her at: [email protected]
