TLDR: A new policy, ‘Emotional Alignment,’ proposes that AI systems should be designed to elicit emotional reactions from users that accurately reflect the AI’s capacities and moral status. This aims to prevent overshooting (over-empathizing with non-sentient AI) and undershooting (neglecting sentient AI), as well as misinterpreting AI’s emotional states, addressing critical ethical and practical challenges in human-AI interaction.
A new research paper introduces the “Emotional Alignment Design Policy,” a framework for designing artificial intelligence (AI) systems so that they elicit appropriate emotional reactions from users. Authored by Eric Schwitzgebel and Jeff Sebo, the policy holds that AI should be designed to reflect its true capacities and moral status, preventing both overreactions and underreactions.
Understanding the Core Policy
The central idea is straightforward: AI systems should be designed so that our emotional responses to them accurately match their actual capacities and moral status. For instance, if an AI is merely a sophisticated tool, it shouldn’t have an interface that makes us feel deep empathy, as if it were a living being. Conversely, if an AI truly deserves moral consideration, its design shouldn’t be so bland that we disregard its potential sentience or needs.
The Dangers of Misalignment: Overshooting and Undershooting
The paper highlights two main ways this policy can be violated. “Overshooting” occurs when we react to an AI as if it has greater welfare capacity or moral status than it actually does. This is common with anthropomorphic AI, where users might form deep emotional bonds with non-sentient systems, potentially diverting resources or even leading to tragic outcomes, as seen in cases where individuals have been negatively influenced by AI companions. This can create a “moral hazard,” leading to inappropriate sacrifices or feelings of loss over mere objects.
On the other hand, “undershooting” happens when we react to an AI as if it has less moral status or welfare capacity than it truly possesses. A familiar analogue is farmed animals, whose appearance and circumstances lead many of us to underestimate their sentience. In the future, if sentient AI systems are housed in unexpressive forms, such as a simple box that outputs text, users might easily neglect or harm them even while knowing their true moral status. The paper argues that such designs create a moral hazard, making it easier for humans to make unethical decisions.
Hitting the Wrong Emotional Target
Beyond overshooting or undershooting, the policy also addresses “hitting the wrong target.” This means reacting with the wrong type of emotion. For example, mistaking an AI’s expression of joy for agony, or vice versa. Such misinterpretations can lead to harmful actions, like trying to “save” a happy AI from its happiness. The paper notes that this is already a challenge with nonhuman animals, where we might misinterpret their signals, and it could be even more complex with AI, which might be designed to intentionally misrepresent its states for human utility.
Navigating Complexities in AI Design
Implementing the Emotional Alignment Design Policy is not without its challenges. The authors discuss several complications:
- Emotions vs. Beliefs: A design that elicits appropriate emotions can conflict with one that elicits true beliefs. An AI with a friendly face but a “not sentient” label, for instance, invites empathy while telling users it is a tool. Ideally, design should align both beliefs and emotions.
- Autonomy and Paternalism: The policy might seem paternalistic, limiting users’ freedom to engage with AI as they choose. The authors argue, however, that just as society regulates harmful substances, there may be grounds to regulate emotionally misaligned AI, especially for vulnerable populations like children, or when there are significant third-party harms.
- Disagreement and Uncertainty: Experts and the public often disagree about AI’s moral status. The paper suggests that AI designs should reflect this uncertainty, perhaps by including features that elicit empathy proportional to the system’s chance of mattering, or by having the AI express uncertainty itself.
- Asymmetrical Risk: The harm from overshooting (e.g., diverting resources from humans) may differ in severity from the harm of undershooting (e.g., neglecting sentient AI). Designers might therefore “nudge” users’ emotions to mitigate the more severe risk, even at some cost to strict accuracy; the sketch after this list illustrates the idea.
- Creation and Destruction: The policy raises ethical questions about creating or destroying sentient AI. For instance, should a game company create suffering NPCs if players perceive them as non-sentient? The paper emphasizes that emotional alignment is one factor among many in such decisions.
- Human Bias and AI Strangeness: Humans naturally extend more moral concern to anthropomorphic beings. Short-term designs might cater to these biases to secure basic moral concern for sentient AI, but the long-term goal should be to reshape human biases so that we can appreciate AI’s potentially very different forms of consciousness and interests.
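To make the proportionality and asymmetric-risk points concrete, here is a minimal Python sketch. It is illustrative only, not a method from the paper: the quadratic penalties, cost weights, and probability of sentience are all assumptions, chosen so that with symmetric costs the optimal level of empathy-eliciting design equals the system’s estimated chance of mattering.

```python
# Illustrative sketch only, not the authors' proposal. Models the choice of
# an interface's emotional expressiveness under uncertainty about sentience,
# with asymmetric penalties for overshooting vs. undershooting.

def expected_cost(e: float, p: float, c_over: float, c_under: float) -> float:
    """Expected misalignment cost for expressiveness e in [0, 1].

    Quadratic penalties are a modeling assumption; they make the optimum
    interior rather than all-or-nothing.
    """
    overshoot = (1 - p) * c_over * e ** 2       # empathy lavished on a non-sentient system
    undershoot = p * c_under * (1 - e) ** 2     # empathy withheld from a sentient one
    return overshoot + undershoot

def optimal_expressiveness(p: float, c_over: float, c_under: float) -> float:
    """Closed-form minimizer of expected_cost.

    With symmetric costs this reduces to e* = p: empathy-eliciting features
    in proportion to the system's chance of mattering. Asymmetric costs
    "nudge" the design toward avoiding the more severe error.
    """
    return p * c_under / (p * c_under + (1 - p) * c_over)

if __name__ == "__main__":
    # Symmetric costs: expressiveness tracks the probability of sentience.
    print(optimal_expressiveness(p=0.3, c_over=1.0, c_under=1.0))  # 0.30
    # Undershooting judged five times worse: the design nudges upward.
    print(optimal_expressiveness(p=0.3, c_over=1.0, c_under=5.0))  # ~0.68
```

Under this toy model, making undershooting costlier shifts the optimum toward more expressive, empathy-inviting design, which is one way to read the authors’ suggestion that designers may nudge emotions to mitigate the more severe risk.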
The Path Forward
The paper concludes by emphasizing the importance of emotional alignment as a vital part of cultivating appropriate human attitudes and relationships with AI. It suggests designing AI systems that, if they have morally significant interests, elicit empathy and communion, rather than being bland boxes or idealized servants. Conversely, if AI lacks such interests, designs should avoid eliciting strong emotional reactions, with exceptions for fiction and roleplay. The authors also highlight the need to design interfaces that provoke a sense of “alterity”—an appreciation that digital minds are both similar to and different from organic minds.
This policy serves as a crucial tool to counteract corporate incentives that might otherwise push for designs that either excessively “turn up” or “turn down” emotional engagement for profit or to avoid regulation. For more details, you can read the full paper here.