Enabling Real-Time Robotic Grasping on Low-Power Edge Devices

TLDR: This research explores deploying a sophisticated robotic grasping perception model, Heatmap-Guided Grasp Detection (HGGD), onto the GreenWaves Technologies GAP9 RISC-V System-on-Chip, a low-power microcontroller. By employing hardware-aware optimizations like input size reduction, model partitioning, and quantization, the team successfully achieved real-time 6-DoF grasp detection. The study validates the feasibility of fully on-chip inference, demonstrating competitive performance on the GraspNet-1Billion dataset while significantly reducing memory footprint and maintaining practical inference speeds, highlighting the potential of MCUs for autonomous robotic manipulation at the edge.

Robotic grasping, the ability of robots to reliably pick up and manipulate objects of various shapes, sizes, and orientations, is a complex challenge. While humans perform this task effortlessly, replicating it in robots requires precise perception and control. Traditional deep learning models for grasp synthesis often demand significant computational power, typically relying on cloud or high-performance computing.

Bringing AI to the Edge

A recent trend in Deep Learning is Edge AI, which involves shifting computation from the cloud to resource-constrained devices like Microcontroller Units (MCUs) at the network’s edge. This approach enables low-latency, low-power inference, making real-time operations feasible in environments with limited resources. However, deploying complex deep learning models for real-time robotic grasping on MCUs presents significant challenges due to strict constraints on memory, processing power, and energy availability.

The Heatmap-Guided Grasp Detection (HGGD) Approach

Researchers from Maastricht University have explored the feasibility of performing robotic grasping inference directly at the edge. Their work focuses on implementing Heatmap-Guided Grasp Detection (HGGD), an end-to-end framework for detecting 6-Degrees-of-Freedom (6-DoF) grasp poses, on the GreenWaves Technologies GAP9 RISC-V System-on-Chip. The GAP9 is a low-power processor designed for efficient signal processing and optimized for Neural Network workloads, featuring RISC-V cores and a dedicated AI accelerator.

The HGGD architecture generates grasps by encoding grasp heatmaps from RGB-Depth (RGBD) images, which then guide the grasp generation process. It consists of two main models: AnchorNet, which extracts semantic features and generates heatmaps, and LocalNet, which uses these heatmaps to generate precise 6-DoF grasps. This architecture is known for its state-of-the-art performance and faster inference speed compared to other models.

Optimizing for Resource-Constrained Devices

To enable real-time execution on the GAP9 MCU, the researchers applied several hardware-aware optimization techniques:

Reducing Input Size: The input image resolution was reduced from 640×360 to 320×160 pixels. This cut the pixel count from 230,400 to 51,200, a reduction of roughly 78%, which lowers memory usage and shortens convolution latency.
Pipeline Execution: The large model was partitioned into four smaller, functionally equivalent sub-models: ResNet-MCU, AnchorNet-MCU, PointNet-MCU, and LocalNet-MCU. These sub-models are processed sequentially, allowing for efficient data flow and reduced peak memory usage, as each sub-model is executed independently.
Quantization: The model’s weights were converted from float32 to int8 using the GAP9’s scaled quantization method. This reduced memory usage by a factor of four with minimal practical impact on performance.
Targeting Hardware Features: The team leveraged the GAP9’s NE16 Neural Network accelerator and parallel processing capabilities, optimizing data layout and minimizing transfer latency to maximize inference performance.
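
The arithmetic behind the first three optimizations can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the researchers' code or the GAP9 toolchain: the quantizer is a generic symmetric per-tensor scheme standing in for the GAP9's scaled quantization, the example weights and per-stage memory sizes are hypothetical placeholders, and the peak-memory function assumes each sub-model's buffers are released before the next one runs.

```python
def pixel_reduction(old_wh, new_wh):
    """Fraction of pixels removed when shrinking from old_wh to new_wh."""
    old_pixels = old_wh[0] * old_wh[1]
    new_pixels = new_wh[0] * new_wh[1]
    return 1.0 - new_pixels / old_pixels


def quantize_int8(weights):
    """Generic symmetric quantization: w is approximated by scale * q, q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate float values."""
    return [scale * v for v in q]


def peak_memory(stage_sizes, sequential=True):
    """Peak buffer memory: largest stage if run one at a time, otherwise the sum."""
    return max(stage_sizes) if sequential else sum(stage_sizes)


if __name__ == "__main__":
    # Resolutions quoted in the article.
    print(f"pixels removed: {pixel_reduction((640, 360), (320, 160)):.1%}")

    # Hypothetical weights, chosen only to show the round trip.
    weights = [0.5, -1.27, 0.0, 1.0]
    q, scale = quantize_int8(weights)
    print("int8 codes:", q, "scale:", scale)
    print("reconstructed:", dequantize(q, scale))

    # float32 uses 4 bytes per weight, int8 uses 1 byte per weight.
    print("weight storage ratio:", 4 // 1)

    # Hypothetical per-stage footprints in kB, not measurements from the study.
    stages_kb = [300, 200, 450, 150]
    print("peak kB, sequential:", peak_memory(stages_kb))
    print("peak kB, all resident:", peak_memory(stages_kb, sequential=False))
```

With symmetric quantization of this kind, the rounding error on each weight is bounded by half the scale, which is one reason the accuracy cost of int8 conversion can stay small when the weight range is well behaved.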

Experimental Validation and Results

The optimized model, named HGGD-MCU, was evaluated on the GraspNet-1Billion benchmark dataset. Experiments confirmed that the reduced pipeline delivered competitive results, performing closely to state-of-the-art approaches despite the lower input data resolution. The optimization techniques significantly reduced flash, RAM, and L2 memory consumption, ensuring the model fit within the GAP9’s limited memory capacity.

Regarding inference time, the GAP9 processed the HGGD-MCU model with an average latency of 740.47 milliseconds per inference. PointNet-MCU was identified as a computational bottleneck due to a hardcoded batch size, but the overall inference speed is considered adequate for many robotic grasping tasks, especially where robotic arms operate at slower speeds. This study highlights the potential of the GAP9 as a reliable and efficient platform for edge AI tasks in robotics.
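
To put the reported 740.47 ms figure in perspective, it can be converted into an update rate. The short sketch below does only that conversion; the function name is hypothetical, and it assumes inferences run back to back with no additional sensing or actuation overhead.

```python
def updates_per_second(latency_ms: float) -> float:
    """Convert a per-inference latency in milliseconds to a rate in Hz."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


if __name__ == "__main__":
    rate = updates_per_second(740.47)
    print(f"{rate:.2f} grasp predictions per second")
```

Under that assumption the pipeline yields about 1.35 grasp predictions per second, which is consistent with the article's point that the speed suits slower-moving arms rather than fast dynamic tasks.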

This research demonstrates the feasibility of deploying advanced robotic grasping perception models on low-power, resource-constrained MCUs, paving the way for more autonomous, real-time manipulation in a range of applications. For more technical details, refer to the full research paper.
