TLDR: A new study challenges the common assumption that observed emotions can reliably predict what people remember from conversations, especially in group settings. Researchers found that third-party annotations of group emotions (arousal, valence, intensity) did not significantly align with group memorability when compared to temporally shuffled emotional data. This suggests that the way emotions are observed and measured in Affective Computing may not accurately reflect the internal processes linked to memory, highlighting a crucial gap for future intelligent systems.
For a long time, it’s been a common belief that our emotions are closely tied to what we remember. Moments that are highly emotional are often thought to be highly memorable. This idea has been a cornerstone for developing intelligent systems, especially in areas like meeting support, memory augmentation, and summarization, where understanding what a user finds relevant and memorable is key.
Intelligent systems, particularly in Affective Computing (AC), often try to recognize human emotions to improve interactions. They use cues like facial expressions and speech patterns to infer emotional states. The assumption is that these emotional responses signal moments of high personal relevance, which in turn should be memorable.
The Crucial Disconnect: Observed vs. Experienced Emotions
However, a new study titled “The Emotion-Memory Link: Do Memorability Annotations Matter for Intelligent Systems?” by Maria Tsfasman, Ramin Ghorbani, Catholijn M. Jonker, and Bernd Dudzik, delves into a critical question: Do the emotion annotations typically used in Affective Computing, which are often based on third-party observations, truly reflect a user’s personal memorability?
Traditional research in cognitive science has indeed shown a strong link between emotional experiences and memory. When we experience something emotional, our brains are more likely to encode and retain that memory. But these studies usually rely on self-reports or physiological signals – essentially, the ‘experienced’ emotion.
The challenge for Affective Computing is that it frequently uses ‘observed’ emotions. This means a third-party annotator watches a video and labels the emotional behavior of participants. This approach is practical, but it raises a question: Can an external observer accurately capture the internal emotional relevance that leads to memory formation? What if someone hides their true feelings due to social norms, or if the observed group emotion doesn’t perfectly reflect individual experiences?
Addressing Key Gaps in Research
The researchers identified three main gaps in existing knowledge that their study aimed to address:
- Annotation Perspectives: Is there a difference between how experienced emotions (linked to memory) and observed emotions (used in AC) relate to memorability?
- Continuous Conceptualization: AC systems often use continuous, time-based emotion data. Does the emotion-memory link hold when emotions and memorability are measured continuously, rather than as static, retrospective reports?
- Group-based Analysis: Many AC applications operate in group settings (like meetings), but most emotion-memory research focuses on individuals. How do social dynamics in groups affect this link?
To investigate these questions, the study used the MeMo dataset, which contains recordings of 45-minute online group conversations about the Covid-19 pandemic. This dataset includes continuous annotations of both perceived group emotions (arousal and valence) and group memorability. Memorability was determined by participants themselves, who reported and timestamped moments they remembered after each session.
The Surprising Findings
The researchers conducted three computational experiments to compare the alignment between emotion and memorability annotations, using metrics like PATE (Proximity-Aware Time Series Evaluation), Euclidean distance, and Dynamic Time Warping (DTW). They generated synthetic ‘null hypothesis’ data to serve as a baseline for comparison:
- Experiment 1 (Random Uniform): Compared real data to completely random emotion data.
- Experiment 2 (Random with Observed Range): Compared real data to random emotion data that mimicked the actual range of emotions observed in each video.
- Experiment 3 (Temporal Shuffle): Compared real data to actual emotion data that was randomly shuffled in time, destroying any temporal alignment with memory.
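To make the logic of these baseline comparisons concrete, here is a minimal, hypothetical sketch of the temporal-shuffle idea (Experiment 3) in Python. The signals, parameters, and toy data below are illustrative stand-ins for the MeMo annotations, and PATE is omitted; only plain Euclidean distance and a textbook DTW are shown. The `shuffle_baseline` helper is an assumption about how such a permutation test could be structured, not the authors' actual code.

```python
import numpy as np

def euclidean(a: np.ndarray, b: np.ndarray) -> float:
    """Pointwise Euclidean distance between two equal-length series."""
    return float(np.linalg.norm(a - b))

def dtw(a: np.ndarray, b: np.ndarray) -> float:
    """Classic O(n*m) Dynamic Time Warping distance."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return float(cost[n, m])

def shuffle_baseline(emotion, memorability, metric, n_shuffles=100, seed=0):
    """Permutation test: fraction of temporally shuffled emotion series
    that align with memorability at least as well as the real series.
    A small p-value means the real temporal alignment beats chance."""
    rng = np.random.default_rng(seed)
    real = metric(emotion, memorability)
    null = [metric(rng.permutation(emotion), memorability)
            for _ in range(n_shuffles)]
    p = (1 + sum(d <= real for d in null)) / (1 + n_shuffles)
    return real, p

# Toy example: two noisy copies of the same oscillation, standing in
# for a continuous emotion annotation and a memorability signal.
rng = np.random.default_rng(42)
t = np.linspace(0, 1, 100)
emotion = np.sin(2 * np.pi * 3 * t) + 0.3 * rng.standard_normal(100)
memorability = np.sin(2 * np.pi * 3 * t) + 0.3 * rng.standard_normal(100)

for name, metric in [("euclidean", euclidean), ("dtw", dtw)]:
    real, p = shuffle_baseline(emotion, memorability, metric)
    print(f"{name}: distance={real:.2f}, permutation p={p:.3f}")
```

The study's key result corresponds to the opposite outcome of this toy run: for the real annotations, the p-value of such a comparison was not small, i.e., the observed emotion series did not align with memorability better than its shuffled copies.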
While the first two experiments showed some significant differences between the real annotations and fully random data, the most striking result came from Experiment 3. The real emotion annotations aligned with memorability no better than their temporally shuffled counterparts: the difference was not statistically significant. In other words, whatever alignment exists between observed group emotions and memory is no stronger than what random chance would produce once temporal structure is taken into account.
Implications for Intelligent Systems
This surprising finding suggests that, contrary to common assumptions, third-party observed group emotion annotations may not be reliable proxies for conversational memorability. The study highlights several reasons for this disconnect:
- Observer Bias: Third-party observations might not capture the true internal emotional experience that drives memory. Social norms can lead people to mask their emotions, and group-level observations might not reflect individual feelings.
- Temporal Mismatch: Continuous, real-time emotion annotations might not align with how memories are formed and recalled, which can involve retrospective biases and reconstruction.
- Group Complexity: Group memorability, aggregated from individual reports, might not align with observed group affect, which is an emergent property. Emotional convergence within groups could also dilute individual emotional expressions that are linked to memory.
In conclusion, while emotions and memory are conceptually linked in cognitive science, this study suggests that this relationship doesn’t automatically translate to the way emotions are typically measured and used in Affective Computing applications. The researchers emphasize the need for future research to develop dedicated models for memorability, accounting for the crucial differences in how emotions and memory are defined and measured (e.g., first-party vs. third-party, individual vs. group, static vs. continuous).