TLDR: Researchers have developed a new Deep Reinforcement Learning strategy called Contact-Aided Navigation (CAN) for flexible robotic endoscopes. This method uses real-time contact force feedback to help endoscopes navigate the complex, dynamic environment of the human stomach, achieving high success rates and precision even in challenging, unpredictable conditions. The approach significantly enhances navigation performance over prior methods by leveraging contact with deformable stomach walls.
Navigating the intricate and constantly moving environment of the human stomach with a flexible robotic endoscope (FRE) presents a significant challenge for medical procedures. Traditional manual endoscopy often struggles to reach certain areas due to the stomach’s natural tissue deformation, dynamic movements, and the non-intuitive nature of manual manipulation. This difficulty has spurred the development of FREs, which are a type of continuum robot designed for inherent compliance, allowing them to adapt their shape to the body’s complex internal pathways.
A key limitation of current FRE navigation is that the endoscope tip has restricted reach unless it leverages contact with the stomach walls. Imagine trying to push a flexible rod to a specific point in a large, open space; it’s difficult to get precise control. However, if the rod can brace or push against a surface, that contact provides a stable point, or a ‘fulcrum,’ to extend its reach and improve stability. This fundamental insight is at the heart of a new strategy called Contact-Aided Navigation (CAN).
Researchers have introduced a novel deep reinforcement learning (DRL) based CAN strategy that specifically uses contact force feedback to enhance the endoscope’s motion stability and navigation precision. DRL is a powerful artificial intelligence technique where a system learns to make optimal decisions through trial-and-error interactions within an environment, much like how humans learn from experience.
The team built a training environment using a physics-based finite element method (FEM) simulation of a deformable stomach. The simulation incorporates dynamic disturbances such as breathing, heartbeat, and subtle body movements, which are crucial for mimicking real-world clinical scenarios. The FRE agent was trained with the Proximal Policy Optimization (PPO) algorithm, a widely used policy-gradient method for DRL.
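For readers unfamiliar with PPO, the sketch below computes its standard clipped surrogate objective with NumPy. It illustrates the general algorithm only: the function name, the argument names, and the clipping value of 0.2 are assumptions for illustration, not settings reported by the researchers.

```python
import numpy as np


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    clip_eps=0.2 is a common default, assumed here; the article does not
    state the value used in the study.
    """
    new_log_probs = np.asarray(new_log_probs, dtype=float)
    old_log_probs = np.asarray(old_log_probs, dtype=float)
    advantages = np.asarray(advantages, dtype=float)

    # Probability ratio between the updated policy and the data-collecting policy.
    ratio = np.exp(new_log_probs - old_log_probs)

    # Unclipped and clipped surrogate terms, one per sampled action.
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages

    # The element-wise minimum removes any incentive to move the policy
    # far from the one that generated the data.
    return float(np.mean(np.minimum(unclipped, clipped)))
```

In practice this quantity is maximized by gradient ascent on the policy parameters, usually alongside a value-function loss and an entropy bonus, which are omitted here.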
The core innovation lies in integrating contact force feedback directly into the DRL framework. This means the endoscope doesn’t just try to avoid obstacles; it actively uses contact forces and interaction dynamics as critical guidance signals. The endoscope’s ‘state’ during learning includes its position relative to the target, its speed, cable lengths (which control its movement), a binary indicator of whether it’s in contact, and the actual contact force vector. This rich information allows the FRE to intelligently adapt its motion based on its interaction with the stomach wall.
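The state described above can be sketched as a flat observation vector. This is a minimal illustration, not the researchers' code: the function name, the ordering of components, the three-dimensional vectors, and the rule that derives the contact flag from the force magnitude are all assumptions.

```python
import numpy as np


def build_observation(tip_pos, target_pos, tip_vel, cable_lengths,
                      contact_force, force_eps=1e-6):
    """Assemble a flat state vector from the quantities listed in the article.

    Hypothetical layout: [relative position, velocity, cable lengths,
    contact flag, contact force]. force_eps is an assumed threshold; a
    simulator may report the contact flag directly instead.
    """
    tip_pos = np.asarray(tip_pos, dtype=float)
    target_pos = np.asarray(target_pos, dtype=float)
    tip_vel = np.asarray(tip_vel, dtype=float)
    cable_lengths = np.asarray(cable_lengths, dtype=float)
    contact_force = np.asarray(contact_force, dtype=float)

    # Position of the target relative to the endoscope tip.
    relative_pos = target_pos - tip_pos

    # Binary contact indicator: 1.0 if the force magnitude exceeds the threshold.
    in_contact = float(np.linalg.norm(contact_force) > force_eps)

    return np.concatenate(
        [relative_pos, tip_vel, cable_lengths, [in_contact], contact_force]
    )
```

With three-dimensional vectors and, say, four cables, this yields a 14-element observation. A contact-free baseline of the kind the article mentions could be approximated by dropping the last four entries.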
The experimental results were highly promising. In both static (non-moving) and dynamic (moving) stomach environments, the CAN agent achieved an impressive 100% success rate, with an average error of just 1.6 mm between the endoscope’s tip and the target. This significantly outperformed baseline policies that did not utilize contact or force feedback. Even in challenging, unseen scenarios with stronger external disturbances, the CAN agent maintained an 85% success rate, demonstrating its remarkable adaptability and robustness.
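The two metrics reported above, success rate and average tip-to-target error, could be computed from a batch of evaluation episodes roughly as follows. The success radius is a hypothetical parameter, since the article does not state the threshold used, and averaging the error over all episodes rather than only successful ones is likewise an assumption.

```python
import numpy as np


def evaluate_episodes(final_tip_positions, target_positions, success_radius_mm):
    """Return (success_rate, mean_error_mm) over a batch of episodes.

    success_radius_mm is an assumed success criterion: an episode counts as
    successful if the final tip position lies within this distance of the target.
    """
    tips = np.asarray(final_tip_positions, dtype=float)
    targets = np.asarray(target_positions, dtype=float)

    # Euclidean distance between tip and target at the end of each episode.
    errors = np.linalg.norm(tips - targets, axis=1)

    success_rate = float(np.mean(errors <= success_radius_mm))
    mean_error = float(np.mean(errors))
    return success_rate, mean_error
```

Both position arrays are expected to have shape (episodes, 3), in millimetres.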
A visual demonstration of the navigation process showed how the endoscope actively maneuvers towards a target. As the robot approaches the target, the contact forces between the endoscope and the stomach wall increase, highlighting the robot’s reliance on this contact-based guidance. This inverse relationship between distance and force (the closer the tip is to the target, the greater the contact force) underscores the importance of force information as a navigation signal.
While this DRL-based CAN method offers substantial advantages, the researchers acknowledge some limitations. Training in complex environments takes longer to converge, which increases computational cost. Future research will focus on extending this state-based DRL policy to a multi-modal framework, integrating both force and visual information for even more robust navigation. Additionally, developing safe navigation algorithms to minimize the risk of tissue injury is an essential area for continued investigation, ensuring that CAN is both effective and safe for tissue.
This work paves the way for more autonomous and precise endoscopic procedures, potentially transforming diagnostics and treatment in the gastrointestinal tract.


