SoftMimic: Enabling Humanoid Robots to Interact Gently and Safely

TLDR: SoftMimic is a new framework that teaches humanoid robots to respond compliantly to external forces while performing tasks. Unlike traditional stiff controllers, SoftMimic uses a unique data augmentation approach with inverse kinematics and reinforcement learning to train policies that can absorb disturbances, generalize to varied tasks, and interact safely with their environment by adjusting their ‘stiffness’. This allows robots to move gracefully and safely even when encountering unexpected physical contact, without sacrificing motion quality.

Humanoid robots are becoming increasingly capable, learning to perform complex human-like motions through imitation. However, a significant challenge remains: how do these robots safely and effectively interact with the unpredictable real world, where unexpected bumps, pushes, or varied object sizes are common?

Traditional methods often train robots to rigidly follow a reference motion. While impressive for dynamic displays, this approach leads to stiff, aggressive corrections when the robot encounters an obstacle or an external force. Imagine a robot trying to pick up a box, but the box is slightly misplaced. A stiff robot might exert uncontrolled forces, potentially damaging itself, the object, or even a person.

Introducing SoftMimic: Learning Gentle Control

A new framework called SoftMimic addresses this critical issue by teaching humanoid robots to respond compliantly to external forces while maintaining balance and overall posture. Developed by Gabriel B. Margolis, Michelle Wang, Nolan Fey, and Pulkit Agrawal from the Improbable AI Lab at MIT, SoftMimic allows robots to controllably deviate from a reference motion based on a user-specified ‘stiffness’. This means a robot can be programmed to be very ‘soft’ and yield to forces, or more ‘stiff’ and resist them, depending on the task.
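One way to picture a user-specified stiffness is as a virtual spring between the robot and its reference pose. The sketch below assumes a simple linear spring law (deviation = force / stiffness), which the article does not state explicitly. The function name `compliant_offset` and all numbers are illustrative only.

```python
def compliant_offset(force_n, stiffness_n_per_m):
    """Displacement (m) of a linear virtual spring under a steady external force.

    Assumption: the commanded 'stiffness' behaves like a spring constant k,
    so the desired deviation from the reference is force / k.
    """
    if stiffness_n_per_m <= 0:
        raise ValueError("stiffness must be positive")
    return force_n / stiffness_n_per_m


# The same 20 N push under a 'soft' and a 'stiff' setting (illustrative values).
soft = compliant_offset(20.0, 100.0)
stiff = compliant_offset(20.0, 1000.0)
print(f"soft: {soft:.3f} m, stiff: {stiff:.3f} m")
```

Under this assumption, a tenfold increase in stiffness shrinks the deviation tenfold, which matches the article's description of a robot that either yields to forces or resists them.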

How SoftMimic Works

Instead of directly asking a reinforcement learning (RL) policy to discover compliant behaviors, which can be difficult, SoftMimic takes a learning-from-examples approach. First, it uses an inverse kinematics (IK) solver to generate a large dataset of ‘augmented’ motions. These augmented motions explicitly show how the robot *should* comply with various external forces while still preserving the overall style and balance of the original movement.
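The augmentation step can be sketched on a toy problem. The example below is not the SoftMimic pipeline: it assumes a planar 2-link arm with closed-form IK, a spring-like displacement (force / stiffness), and hypothetical names such as `augment` and `compliant_joints`. It ignores the style and balance constraints that a whole-body solver would need to handle.

```python
import math

L1, L2 = 0.3, 0.25  # hypothetical link lengths (m) of a planar 2-link arm


def forward_kinematics(q1, q2):
    """Hand position (x, y) for shoulder angle q1 and elbow angle q2."""
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y


def inverse_kinematics(x, y):
    """Closed-form IK for the 2-link arm (one of the two elbow solutions)."""
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2 * L1 * L2)
    c2 = max(-1.0, min(1.0, c2))  # clamp so acos stays defined for unreachable targets
    q2 = math.acos(c2)
    q1 = math.atan2(y, x) - math.atan2(L2 * math.sin(q2), L1 + L2 * math.cos(q2))
    return q1, q2


def augment(reference_xy, forces_xy, stiffness):
    """Pair every reference hand position with every sampled force and solve IK
    for the displaced target: target = reference + force / stiffness."""
    dataset = []
    for rx, ry in reference_xy:
        for fx, fy in forces_xy:
            tx, ty = rx + fx / stiffness, ry + fy / stiffness
            dataset.append({
                "reference": (rx, ry),
                "force": (fx, fy),
                "compliant_joints": inverse_kinematics(tx, ty),
            })
    return dataset


reference = [(0.35, 0.10), (0.40, 0.15)]          # toy reference hand path (m)
forces = [(0.0, 0.0), (10.0, 0.0), (0.0, -10.0)]  # sampled external forces (N)
data = augment(reference, forces, stiffness=200.0)
print(len(data), "augmented samples")
```

Because everything here runs offline and needs no physics simulation, generating many force and stiffness combinations is cheap, which is the property the article attributes to the data generation stage.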

During training, the robot’s RL policy observes its own state and the original, non-compliant reference motion. However, it is rewarded for tracking the *pre-computed compliant trajectory* from the augmented dataset. This unique setup forces the policy to learn to infer external forces from its own sensors and react with the desired compliant behavior. This offline data generation process is highly efficient, allowing for rapid creation of diverse compliant scenarios.
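The asymmetry between what the policy sees and what it is rewarded for can be sketched as follows. The exponential reward shape, the `sigma` value, the inclusion of the commanded stiffness in the observation, and the function names are all assumptions made for illustration; the article does not give the actual reward terms.

```python
import math


def tracking_reward(joint_pos, compliant_target, sigma=0.25):
    """Exponential tracking reward in (0, 1]; equals 1 when the robot matches
    the pre-computed compliant target exactly. Shape and sigma are assumed."""
    sq_err = sum((q - t) ** 2 for q, t in zip(joint_pos, compliant_target))
    return math.exp(-sq_err / (sigma ** 2))


def make_observation(joint_pos, joint_vel, reference_joints, stiffness):
    """Policy input: own state, the ORIGINAL reference, and the commanded
    stiffness. Neither the compliant target nor the external force is included."""
    return list(joint_pos) + list(joint_vel) + list(reference_joints) + [stiffness]


reference_joints = [0.50, 1.00]  # what the policy observes
compliant_target = [0.40, 1.10]  # what the reward is measured against (hidden)

obs = make_observation([0.40, 1.10], [0.0, 0.0], reference_joints, stiffness=200.0)
r_yielding = tracking_reward([0.40, 1.10], compliant_target)   # robot yields to the push
r_rigid = tracking_reward(reference_joints, compliant_target)  # robot rigidly holds the reference
print(r_yielding, r_rigid)
```

In this toy setup, rigidly tracking the visible reference earns less reward than yielding, so the only way for a policy to score well is to infer the disturbance from its own state, which is the effect the article describes.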

Key Benefits and Real-World Impact

SoftMimic offers several significant advantages:

Enhanced Safety: The compliant controller is much safer when encountering unexpected contacts. Experiments show that a compliant policy can reduce collision forces by nearly half compared to standard stiff controllers, preventing damage to the robot or its environment.
Improved Generalization: A single motion reference can be generalized to handle variations in a task. For example, a robot trained with SoftMimic can pick up boxes of different sizes using the same motion, adapting gently to the object’s dimensions without prior knowledge.
Robustness to Disturbances: Whether it’s brushing against a wall, clipping an obstacle, or dealing with a misplaced object, SoftMimic policies handle disturbances gracefully, absorbing impacts rather than rigidly fighting them.
Controllable Stiffness: Users can command different stiffness levels, allowing the robot to perform delicate tasks (low stiffness) or exert more force when needed (high stiffness).
Maintains Motion Quality: Crucially, when there are no external forces, SoftMimic policies still achieve motion tracking performance comparable to state-of-the-art stiff baselines.

The framework has been validated through extensive simulations and real-world experiments on a Unitree G1 humanoid robot, demonstrating that these benefits transfer effectively to physical hardware. This work paves the way for humanoids to operate more safely and effectively alongside people and in complex, unstructured environments. For more details, see the full research paper.
