SoftMimic: Enabling Humanoid Robots to Interact Gently and Safely

TLDR: SoftMimic is a new framework that teaches humanoid robots to respond compliantly to external forces while performing tasks. Unlike traditional stiff controllers, SoftMimic uses a unique data augmentation approach with inverse kinematics and reinforcement learning to train policies that can absorb disturbances, generalize to varied tasks, and interact safely with their environment by adjusting their ‘stiffness’. This allows robots to move gracefully and safely even when encountering unexpected physical contact, without sacrificing motion quality.

Humanoid robots are becoming increasingly capable, learning to perform complex human-like motions through imitation. However, a significant challenge remains: how do these robots safely and effectively interact with the unpredictable real world, where unexpected bumps, pushes, or varied object sizes are common?

Traditional methods often train robots to rigidly follow a reference motion. While impressive for dynamic displays, this approach leads to stiff, aggressive corrections when the robot encounters an obstacle or an external force. Imagine a robot trying to pick up a box, but the box is slightly misplaced. A stiff robot might exert uncontrolled forces, potentially damaging itself, the object, or even a person.

Introducing SoftMimic: Learning Gentle Control

A new framework called SoftMimic addresses this critical issue by teaching humanoid robots to respond compliantly to external forces while maintaining balance and overall posture. Developed by Gabriel B. Margolis, Michelle Wang, Nolan Fey, and Pulkit Agrawal from the Improbable AI Lab at MIT, SoftMimic allows robots to controllably deviate from a reference motion based on a user-specified ‘stiffness’. This means a robot can be programmed to be very ‘soft’ and yield to forces, or more ‘stiff’ and resist them, depending on the task.
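To make the idea of a commanded stiffness concrete, here is a minimal Python sketch. It assumes a simple linear spring model, in which the allowed deviation from the reference equals the external force divided by the stiffness. The spring model, the function name `compliant_offset`, and the numeric values are illustrative assumptions, not details taken from the SoftMimic paper.

```python
def compliant_offset(force_n, stiffness_n_per_m):
    """Deviation from the reference (metres, per axis) under an ASSUMED linear spring model."""
    if stiffness_n_per_m <= 0:
        raise ValueError("stiffness must be positive")
    # Hooke's law rearranged: displacement = force / stiffness
    return tuple(f / stiffness_n_per_m for f in force_n)


# Hypothetical 10 N push along x, applied to the robot's hand.
push = (10.0, 0.0, 0.0)

soft = compliant_offset(push, 50.0)     # low stiffness: the hand yields a lot
stiff = compliant_offset(push, 1000.0)  # high stiffness: the hand barely moves

print("soft offset (m): ", soft)
print("stiff offset (m):", stiff)
```

Under this assumed model, the same push moves the ‘soft’ robot twenty times farther than the ‘stiff’ one, which is the trade-off the stiffness setting exposes.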

How SoftMimic Works

Directly asking a reinforcement learning (RL) policy to discover compliant behaviors is difficult, so SoftMimic learns from examples instead. First, an inverse kinematics (IK) solver generates a large dataset of ‘augmented’ motions. These augmented motions explicitly show how the robot *should* comply with various external forces while still preserving the overall style and balance of the original movement.
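The augmentation step can be illustrated with a toy example. The sketch below is not the authors' pipeline: it swaps the humanoid's whole-body IK for a two-link planar arm with a closed-form IK solution, and it assumes the linear spring model (offset = force / stiffness) for the compliant hand target. The link lengths, function names, forces, and stiffness values are all hypothetical.

```python
import math

L1, L2 = 0.3, 0.3  # hypothetical link lengths in metres


def forward_kinematics(q1, q2):
    """Hand position of a planar two-link arm for joint angles (q1, q2)."""
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y


def inverse_kinematics(x, y):
    """Closed-form IK for the two-link arm (branch with q2 in [0, pi])."""
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2.0 * L1 * L2)
    c2 = max(-1.0, min(1.0, c2))  # clamp: out-of-reach targets use the nearest valid elbow angle
    q2 = math.acos(c2)
    q1 = math.atan2(y, x) - math.atan2(L2 * math.sin(q2), L1 + L2 * math.cos(q2))
    return q1, q2


def augment_frame(q_ref, force_xy, stiffness):
    """Shift the reference hand position by force/stiffness, then re-solve IK."""
    x, y = forward_kinematics(*q_ref)
    target = (x + force_xy[0] / stiffness, y + force_xy[1] / stiffness)
    return inverse_kinematics(*target)


def augment_motion(reference, force_xy, stiffness):
    """Produce a compliant version of every frame in a reference motion."""
    return [augment_frame(q, force_xy, stiffness) for q in reference]


# Hypothetical three-frame reference motion (joint angles in radians).
reference_motion = [(0.3, 0.8), (0.4, 0.9), (0.5, 1.0)]

# A 5 N push in -x with stiffness 100 N/m moves each hand target by 5 cm.
augmented = augment_motion(reference_motion, force_xy=(-5.0, 0.0), stiffness=100.0)
for q_ref, q_aug in zip(reference_motion, augmented):
    print(q_ref, "->", tuple(round(v, 4) for v in q_aug))
```

Repeating this over many sampled forces and stiffness values would yield a dataset of paired reference and compliant motions. A real humanoid version would additionally need constraints for balance and motion style, which this toy omits.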

During training, the robot’s RL policy observes its own state and the original, non-compliant reference motion. However, it is rewarded for tracking the *pre-computed compliant trajectory* from the augmented dataset. Because the compliant target is not part of what the policy observes, the policy must learn to infer external forces from its own sensors and react with the desired compliant behavior. Generating the augmented data offline is efficient, so diverse compliant scenarios can be created rapidly.
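The split between what the policy sees and what it is rewarded for can be sketched in a few lines of Python. The exponential tracking kernel below is a common choice in motion-imitation work and is used here as an assumption; the article does not state SoftMimic's actual reward. The function names, the `sigma` value, and the example poses are hypothetical.

```python
import math


def build_observation(robot_state, reference_pose):
    """The policy input: its own state plus the ORIGINAL (non-compliant) reference."""
    return list(robot_state) + list(reference_pose)


def tracking_reward(actual_pose, compliant_target, sigma=0.25):
    """Reward is highest when the robot matches the pre-computed COMPLIANT target.

    Assumed form: exp(-squared_error / sigma^2), not taken from the paper.
    """
    sq_err = sum((a - t) ** 2 for a, t in zip(actual_pose, compliant_target))
    return math.exp(-sq_err / sigma ** 2)


reference_pose = [0.30, 0.80]    # what the policy is shown
compliant_target = [0.22, 0.95]  # hypothetical augmented pose for the current push

obs = build_observation(robot_state=[0.25, 0.90, 0.0, 0.0], reference_pose=reference_pose)

# A policy that rigidly follows the reference during the push earns less reward
# than one that yields to the compliant target.
stiff_reward = tracking_reward(reference_pose, compliant_target)
soft_reward = tracking_reward(compliant_target, compliant_target)
print("observation:", obs)
print("stiff reward:", round(stiff_reward, 3), "soft reward:", round(soft_reward, 3))
```

Since the compliant target never appears in `obs`, the only way for the policy to earn the higher reward is to detect the push from its own state and adjust accordingly.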

Key Benefits and Real-World Impact

SoftMimic offers several significant advantages:

  • Enhanced Safety: The compliant controller is much safer when encountering unexpected contacts. Experiments show that a compliant policy can reduce collision forces by nearly half compared to standard stiff controllers, preventing damage to the robot or its environment.
  • Improved Generalization: A single motion reference can be generalized to handle variations in a task. For example, a robot trained with SoftMimic can pick up boxes of different sizes using the same motion, adapting gently to the object’s dimensions without prior knowledge.
  • Robustness to Disturbances: Whether it’s brushing against a wall, clipping an obstacle, or dealing with a misplaced object, SoftMimic policies handle disturbances gracefully, absorbing impacts rather than rigidly fighting them.
  • Controllable Stiffness: Users can command different stiffness levels, allowing the robot to perform delicate tasks (low stiffness) or exert more force when needed (high stiffness).
  • Maintains Motion Quality: Crucially, when there are no external forces, SoftMimic policies still achieve motion tracking performance comparable to state-of-the-art stiff baselines.

The framework has been validated through extensive simulations and real-world experiments on a Unitree G1 humanoid robot, demonstrating that these benefits transfer effectively to physical hardware. This work paves the way for humanoids to operate more safely and effectively alongside people and in complex, unstructured environments. For more details, see the full research paper.

Ananya Rao
https://blogs.edgentiq.com
Ananya Rao is a tech journalist with a passion for dissecting the fast-moving world of Generative AI. With a background in computer science and a sharp editorial eye, she connects the dots between policy, innovation, and business. Ananya excels in real-time reporting and specializes in uncovering how startups and enterprises in India are navigating the GenAI boom. She brings urgency and clarity to every breaking news piece she writes. You can reach her at: [email protected]
