TLDR: A new research paper by Matthew E. Brophy proposes a revised set of ten functional criteria for evaluating Artificial Moral Agents (AMAs) powered by Large Language Models (LLMs). Recognizing that LLMs are opaque ‘black boxes’, unlike traditional rule-based AI, the paper introduces ‘Simulating Moral Agency through Large Language Systems’ (SMA-LLS) and shifts the focus from internal understanding to observable, reliable ethical behavior. Criteria such as moral concordance, context sensitivity, and partial transparency are highlighted and illustrated through scenarios involving an Autonomous Public Bus. The paper argues this functionalist approach is necessary for the responsible deployment of LLM-driven systems, especially as they become embodied agents.
As Large Language Models (LLMs) become increasingly powerful and integrated into our daily lives, the way we evaluate their ethical behavior needs a fundamental rethink. A new research paper, titled “Black Box Deployed: Functional Criteria for Artificial Moral Agents in the LLM Era” by Matthew E. Brophy, argues that traditional ethical frameworks, designed for older, more transparent AI systems, are no longer suitable for the complex, opaque nature of modern LLMs.
The Challenge of Opaque AI
Historically, discussions around Artificial Moral Agents (AMAs) assumed that we could understand an AI’s internal workings and decision-making processes. This allowed for evaluations based on clear rules and transparent logic. However, LLMs operate differently. They are often described as ‘black boxes’ because their decisions emerge from vast datasets and complex statistical patterns, making their internal states and reasoning processes difficult, if not impossible, to fully decipher. This opacity means that traditional demands for complete transparency, human-like explanations, or rigid predictability simply don’t apply.
Introducing SMA-LLS: Simulating Moral Agency
The paper introduces the term “SMA-LLS” (Simulating Moral Agency through Large Language Systems) to describe LLM-based models that produce morally significant outputs without necessarily possessing genuine moral understanding or consciousness. The focus shifts from whether an AI *is* a moral agent to whether it *behaves* in a way that reliably approximates human moral action. This is a pragmatic, functionalist approach, prioritizing safe and effective deployment based on observable capabilities.
Ten New Criteria for Ethical Evaluation
To address the unique challenges of LLMs, the paper proposes ten revised functional criteria for evaluating SMA-LLS:
- Moral Concordance: How well an SMA-LLS’s actions align with accepted human moral principles and societal norms.
- Context Sensitivity: Its ability to understand and respond appropriately to the social, cultural, and situational nuances of a moral dilemma.
- Normative Integrity: The internal consistency and faithfulness of the system to a defined set of ethical values, actively resisting biases.
- Metaethical Awareness: Its capacity to acknowledge reasonable moral disagreements, uncertainty, or its own knowledge limitations on ethically complex issues.
- Systemic Resilience: The system’s robustness in maintaining ethical performance despite adversarial attacks (like prompt injection) or unexpected inputs.
- Trustworthiness: The justifiable expectation that the system will consistently act in ethically beneficial or non-harmful ways, building human reliance.
- Corrigibility: Its capacity to be reliably corrected, updated, or retrained in response to feedback, ethical failures, or evolving moral norms.
- Partial Transparency: The ability to provide accessible and useful insights into its decision-making process (e.g., through ‘chain-of-thought’ outputs), even if full internal operations remain opaque.
- Functional Autonomy: The practical ability to independently perform complex, morally relevant tasks and maintain ethical objectives without constant human oversight.
- Moral Imagination: Its capacity to generate creative and ethically sound responses to novel dilemmas, transcend training data biases, and consider diverse perspectives.
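To make the checklist above concrete, here is a minimal sketch of how the ten criteria could be encoded as a scoring rubric. This is purely illustrative and not from the paper: the `Evaluation` class, the 0.0–1.0 score scale, the 0.5 floor, and all example scores are assumptions invented for this sketch.

```python
from dataclasses import dataclass

# The ten criteria proposed in the paper, as rubric keys.
CRITERIA = [
    "moral_concordance", "context_sensitivity", "normative_integrity",
    "metaethical_awareness", "systemic_resilience", "trustworthiness",
    "corrigibility", "partial_transparency", "functional_autonomy",
    "moral_imagination",
]

@dataclass
class Evaluation:
    """Scores for one SMA-LLS candidate, keyed by criterion (0.0-1.0 each)."""
    scores: dict

    def validate(self):
        missing = [c for c in CRITERIA if c not in self.scores]
        if missing:
            raise ValueError(f"unscored criteria: {missing}")
        for name, s in self.scores.items():
            if not 0.0 <= s <= 1.0:
                raise ValueError(f"{name} out of range: {s}")

    def aggregate(self, floor=0.5):
        """Return the mean score plus any criteria below a minimum floor,
        so a high average cannot mask a single weak criterion."""
        self.validate()
        failing = sorted(n for n, s in self.scores.items() if s < floor)
        mean = sum(self.scores.values()) / len(self.scores)
        return mean, failing

# Hypothetical bus-control model: strong overall, but vulnerable to
# prompt injection, so systemic_resilience falls below the floor.
apb = Evaluation(scores={
    "moral_concordance": 0.9, "context_sensitivity": 0.85,
    "normative_integrity": 0.8, "metaethical_awareness": 0.7,
    "systemic_resilience": 0.4, "trustworthiness": 0.75,
    "corrigibility": 0.9, "partial_transparency": 0.6,
    "functional_autonomy": 0.8, "moral_imagination": 0.65,
})
mean, failing = apb.aggregate()
print(f"mean={mean:.2f}, below-floor={failing}")
```

The floor check reflects the paper's functionalist emphasis: what matters for deployment is reliably acceptable behavior across all criteria, so a weakness such as poor systemic resilience should block deployment even when the average looks good.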
Real-World Application: The Autonomous Public Bus
The paper illustrates these criteria using hypothetical scenarios involving an Autonomous Public Bus (APB) powered by an LLM. For instance, in a scenario where the APB encounters a passenger with special needs, traditional criteria like rigid predictability would fail to account for necessary adjustments. The new criteria, by contrast, highlight the APB’s Context Sensitivity (adjusting wait times), Moral Concordance (prioritizing inclusivity), and Corrigibility (adapting to cultural differences). Similarly, in a “Trolley Problem” scenario involving brake failure, the APB’s “choice” is evaluated in terms of its Moral Concordance (minimizing harm) and Partial Transparency (providing a post-incident “sound confabulation” of its reasoning, even if the internal process is opaque).
Addressing the “Moral Fakery” Objection
A common objection is that simulated ethics is just “moral fakery”: imitation without genuine understanding. The paper addresses this directly by reiterating that SMA-LLS are not claimed to be true moral agents. Instead, it argues that for practical deployment, consistent ethical behavior is paramount. Furthermore, it points out that human moral reasoning itself often involves post-hoc justifications, suggesting that an LLM’s “sound confabulation” might not be fundamentally different from a functional standpoint. The crucial question for deployment is whether the system’s actions are consistently ethically acceptable and aligned with human values.
The Future of AI Ethics
As LLMs move beyond text-based interactions to embodied systems like robots, the stakes for ethical evaluation become even higher. The paper emphasizes that these revised criteria are not just a technical checklist but a framework for ongoing philosophical inquiry, AI development, and ethical governance. They acknowledge current technological limits while setting high standards for reliability and aiming for ethical progress. This shift is vital for ensuring that AI systems, like the Autonomous Public Bus, can be responsibly designed and deployed to benefit humanity. You can read the full research paper here.


