TLDR: Researchers have developed a new dual-arm robotic system that can manipulate crumpled and suspended clothing in the air. It combines advanced vision, which understands garment parts even with occlusions and provides confidence estimates, with tactile sensing that learns and validates grasp points. This “confidence-aware” approach allows the robot to react to uncertainty, adapting its folding and hanging strategies for more robust and human-like garment manipulation.
Manipulating soft, deformable objects like clothing has long been a significant challenge for robots. Unlike rigid items, garments have complex, ever-changing shapes, variable material properties, and frequently hide their own features through self-occlusion, especially when crumpled or suspended. Traditional robotic systems often simplify this problem by flattening clothes first or requiring key features to be perfectly visible.
However, a team of researchers from the Massachusetts Institute of Technology, Prosper AI, and Boston Dynamics has introduced a dual-arm robotic system designed to tackle these complexities head-on. Their framework, detailed in the paper “Reactive In-Air Clothing Manipulation with Confidence-Aware Dense Correspondence and Visuotactile Affordance,” allows robots to manipulate crumpled and suspended garments directly in mid-air, a capability not widely demonstrated before.
A Smarter Way to See and Touch
The core of this innovative system lies in its integration of advanced vision and tactile sensing, coupled with a reactive planning approach. It’s built on two main pillars:
First, a **confidence-aware dense visual correspondence** model. This sophisticated vision system is trained on a custom, high-fidelity simulated dataset of shirts, capturing intricate details like seams and hems. Unlike previous methods that struggle with ambiguities, this model uses a special “distributional loss” during training. This allows it to understand pixel-wise correspondences between a crumpled shirt and a flat, canonical version, even when dealing with garment symmetries (like two sleeves looking similar) or heavy occlusions. Crucially, it generates confidence estimates for each correspondence, telling the robot how certain it is about what it’s seeing. This uncertainty is vital for the robot’s decision-making.
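The paper’s implementation isn’t reproduced here, but the core idea can be sketched. Below is a minimal PyTorch sketch, assuming a simple linear head over backbone features; `CorrespondenceHead`, `distributional_loss`, and the grid-cell canonical space are illustrative stand-ins, not the authors’ actual architecture or loss:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CorrespondenceHead(nn.Module):
    """Map per-pixel features of a crumpled garment to a probability
    distribution over cells of a flat, canonical garment template."""

    def __init__(self, feat_dim: int = 64, canon_cells: int = 32 * 32):
        super().__init__()
        self.proj = nn.Linear(feat_dim, canon_cells)

    def forward(self, pixel_features: torch.Tensor):
        # pixel_features: (batch, num_pixels, feat_dim) from any image backbone.
        logits = self.proj(pixel_features)
        probs = F.softmax(logits, dim=-1)  # (B, N, canon_cells)
        # Confidence from the distribution's peakedness: a symmetric or
        # occluded region yields a flat/multimodal distribution, i.e. high
        # entropy and low confidence.
        entropy = -(probs * probs.clamp_min(1e-9).log()).sum(dim=-1)
        confidence = 1.0 - entropy / math.log(probs.shape[-1])
        return probs, confidence

def distributional_loss(probs, target_heatmaps):
    """Cross-entropy against a (possibly multimodal) target heatmap, so both
    of two look-alike sleeves can carry probability mass without penalty."""
    return -(target_heatmaps * probs.clamp_min(1e-9).log()).sum(dim=-1).mean()

# Example: random features for 100 query pixels.
probs, conf = CorrespondenceHead()(torch.randn(1, 100, 64))
print(probs.shape, conf.shape)  # torch.Size([1, 100, 1024]) torch.Size([1, 100])
```

The design point is that supervising full distributions rather than single best matches lets symmetric regions share probability mass, and the leftover entropy doubles as a free per-pixel confidence signal.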
Second, a **visuotactile grasp affordance network**. This network determines which regions of a garment are physically graspable. It’s initially trained in simulation and then fine-tuned using real-world, high-resolution tactile feedback from the robot’s grippers. This self-supervised learning ensures that the robot not only sees where to grasp but also understands if a grasp will actually succeed in picking up fabric. The same tactile classifier is used during execution to validate grasps in real-time.
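A minimal sketch of how such a classifier might be wired in is shown below, assuming hypothetical robot interfaces (`gripper.close()`, `tactile_sensor.read()`) that are not part of the paper:

```python
import torch
import torch.nn as nn

class TactileGraspClassifier(nn.Module):
    """Binary classifier: does this tactile image show fabric in the gripper?"""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1),
        )

    def forward(self, tactile_image: torch.Tensor) -> torch.Tensor:
        # tactile_image: (B, 3, H, W); returns P(fabric grasped) per sample.
        return torch.sigmoid(self.net(tactile_image)).squeeze(-1)

def validate_grasp(classifier, gripper, tactile_sensor, threshold=0.5):
    """Close the gripper, read the tactile sensor, and keep or abort the grasp.
    The same classifier that labels affordance training data is reused here
    to check, in real time, that fabric actually ended up in the gripper."""
    gripper.close()
    reading = tactile_sensor.read()  # -> (3, H, W) tensor (assumed interface)
    p_fabric = classifier(reading.unsqueeze(0)).item()
    if p_fabric < threshold:
        gripper.open()  # failed grasp: release so the system can retry
    return p_fabric >= threshold
```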
Reactive and Adaptive Manipulation
These two components work together within a **reactive state machine**. This means the robot doesn’t follow a rigid, pre-programmed sequence of actions. Instead, it dynamically adapts its folding or hanging strategies based on the real-time confidence estimates from its vision system and the feedback from its tactile sensors. If the system has low confidence in a potential grasp point, it can defer the action, rotate the garment to get a better view, and re-evaluate. This ability to wait for reliable visual information allows the system to handle highly occluded configurations, both on a table and in the air.
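In code, a confidence-gated state machine of this kind might look like the following sketch; all `robot.*` methods are hypothetical stand-ins for the system’s real perception and control modules, and the paper’s actual state machine is richer:

```python
from enum import Enum, auto

class State(Enum):
    PERCEIVE = auto()
    ROTATE = auto()
    GRASP = auto()
    MANIPULATE = auto()
    DONE = auto()

def step(state: State, robot, conf_threshold: float = 0.8) -> State:
    """One tick of a confidence-gated state machine (illustrative only)."""
    if state is State.PERCEIVE:
        robot.target, robot.conf = robot.best_correspondence()
        # Low visual confidence: don't commit to a grasp; get a better view.
        return State.GRASP if robot.conf >= conf_threshold else State.ROTATE
    if state is State.ROTATE:
        robot.rotate_garment()
        return State.PERCEIVE  # re-evaluate after changing the view
    if state is State.GRASP:
        # Tactile validation decides whether to proceed or perceive again.
        return State.MANIPULATE if robot.attempt_grasp(robot.target) else State.PERCEIVE
    if state is State.MANIPULATE:
        robot.execute_fold_or_hang()
    return State.DONE
```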
For instance, when folding, the robot picks up the shirt, then queries different canonical regions (shoulder, sleeve, bottom) to find high-confidence, graspable points. If a grasp fails or confidence is low, it rotates the garment and tries again. Once two confident grasp points are secured, the robot can even tension the shirt using tactile feedback and perform the rest of the folding motions, aligning corners with vision.
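To make that flow concrete, here is an illustrative version of the grasp-acquisition loop, again with hypothetical `robot` methods wrapping the correspondence, affordance, and tactile modules described above:

```python
CANONICAL_REGIONS = ["shoulder", "sleeve", "bottom"]
CONF_THRESHOLD = 0.8

def acquire_confident_grasps(robot, num_needed=2, max_rotations=8):
    """Collect grasp points by querying canonical regions, deferring on low
    confidence and rotating the garment when nothing viable is visible."""
    grasps = []
    rotations = 0
    while len(grasps) < num_needed and rotations < max_rotations:
        found_one = False
        for region in CANONICAL_REGIONS:
            pixel, conf = robot.query_correspondence(region)  # vision + confidence
            if conf < CONF_THRESHOLD or not robot.graspable(pixel):
                continue  # defer: point is uncertain or not physically graspable
            if robot.attempt_grasp(pixel):  # tactile classifier validates
                grasps.append((region, pixel))
                found_one = True
                if len(grasps) == num_needed:
                    return grasps
        if not found_one:
            robot.rotate_garment()  # change the view and try again
            rotations += 1
    return grasps
```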
Key Contributions and Promising Results
The researchers highlight several key technical contributions: a parametrizable simulated dataset with realistic garment features; a dense correspondence representation trained with a distributional loss; self-supervised visuotactile affordance learning; and a complete reactive manipulation system.
In evaluations, the distributional correspondence model consistently outperformed traditional contrastive methods, especially on symmetric regions, and the tactile grasp classifier achieved over 98% accuracy. The combined system successfully performed folding and hanging tasks, grasping viable points even in challenging configurations. Notably, it showed promising zero-shot generalization, successfully manipulating shirts with features (like hoods or buttons) absent from its training data, indicating a robust understanding of garment structure.
Towards More Human-Like Robot Interaction
This work represents a significant step towards more flexible and human-like garment manipulation by robots. By integrating confidence-aware perception and tactile feedback, the system can operate directly on crumpled, suspended clothes, overcoming many limitations of prior approaches. Beyond specific tasks, the dense, confidence-aware representation also serves as a generalizable intermediate layer, potentially enabling robots to learn grasp targets directly from human video demonstrations or interface with vision-language models for more semantically informed manipulation.