TLDR: A new AI-assisted Augmented Reality system uses deep learning-based object recognition to identify assembly components and display step-by-step instructions directly in the physical workspace. It highlights parts with bounding boxes, shows their target placement, and automatically advances steps, demonstrated effectively with LEGO sculpture assembly using a Microsoft HoloLens 2.
Imagine building a complex model or assembling furniture without ever having to consult a paper manual or a tiny digital diagram. Researchers have developed an innovative system that uses Artificial Intelligence and Augmented Reality to guide users through assembly tasks, making the process intuitive and efficient.
The research paper, titled “AI Assisted AR Assembly: Object Recognition and Computer Vision for Augmented Reality Assisted Assembly,” introduces a new workflow that leverages deep learning to identify different assembly components. This system then displays step-by-step instructions directly in your physical workspace, eliminating the need for manual searching or sorting of parts.
How Does It Work?
At its core, the system employs object recognition, a form of computer vision, to detect individual parts. For each step of an assembly, it highlights the necessary component by showing a bounding box around its current location and another indicating where it should be placed. Once the system detects that a step is completed, it automatically advances to the next instruction.
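The paper does not publish its control code, but the loop described above — highlight the needed part, wait for the placement to be detected, advance — can be sketched in a few lines. All names here (`run_assembly`, `detect_parts`, `is_placed`, `highlight`) are illustrative assumptions, not the authors' API:

```python
def run_assembly(steps, detect_parts, is_placed, highlight):
    """Hypothetical sketch of the AR guidance loop.

    steps        -- ordered list of dicts, e.g. {"part_type": "2x4 brick"}
    detect_parts -- callable returning current detections from the video feed
    is_placed    -- callable that checks whether a step's part sits in its
                    target location among the latest detections
    highlight    -- callable that draws the two bounding boxes (current
                    location of the part, and where it should go)
    """
    completed = []
    for step in steps:
        # Show the user which part to pick up and where it belongs.
        highlight(step, detect_parts())
        # Block until the system sees the part in its target position;
        # in practice this re-runs detection on each new video frame.
        while not is_placed(step, detect_parts()):
            pass
        # Step detected as complete -- advance automatically.
        completed.append(step["part_type"])
    return completed
```

With stub callbacks this runs end to end, advancing one step at a time, which is the behavior the paper demonstrates.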
The system is trained on synthetic data: a YOLOv5 object recognition model learns from images of LEGO components rendered in varied orientations and lighting conditions, which helps detection generalize to real-world scenes. At runtime, a Microsoft HoloLens 2 AR headset captures video of the physical workspace and streams it to a server, where the YOLOv5 model detects 2D bounding boxes around the components; these detections are then projected into the AR environment as 3D instructions.
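Lifting a 2D detection into a 3D instruction requires mapping pixels back into camera space. A minimal sketch, assuming a standard pinhole camera model with known intrinsics and a depth estimate (the HoloLens 2 has depth sensing, though the paper's exact projection pipeline may differ):

```python
def unproject(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) at a given depth to camera-space 3D coordinates
    using the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def box_center_3d(box, depth, intrinsics):
    """Anchor a 3D marker at the center of a YOLO-style (x1, y1, x2, y2)
    bounding box. `intrinsics` is (fx, fy, cx, cy) -- illustrative values."""
    x1, y1, x2, y2 = box
    u, v = (x1 + x2) / 2, (y1 + y2) / 2
    return unproject(u, v, depth, *intrinsics)
```

A detection centered on the principal point maps to a 3D point directly on the camera axis, which is the sanity check one would use when calibrating such a pipeline.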
A Practical Demonstration with LEGO
To showcase the system’s capabilities, the researchers conducted a case study involving the assembly of LEGO sculptures. They successfully built two distinct LEGO models—an ellipsoidal egg and a twisted wall—without needing any traditional 2D paper drawings or 3D digital models. This demonstration highlights the feasibility of using object recognition for AR-assisted assembly.
The interface is designed to reduce cognitive load. It only visualizes the geometry of the current layer being assembled, and 3D bounding boxes are shown only for components relevant to the current step. These boxes are also annotated with the component type, helping users quickly identify both the location and nature of the part.
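The filtering that keeps cognitive load down — show annotated boxes only for parts relevant to the current step — amounts to a simple predicate over the detection stream. The detection format below (a dict with `label` and `box`) is an assumption for illustration, not the paper's data structure:

```python
def visible_annotations(detections, current_step_parts):
    """Keep only detections whose component type belongs to the current
    assembly step, preserving the label so the AR overlay can annotate
    each 3D bounding box with the part type."""
    return [
        {"label": d["label"], "box": d["box"]}
        for d in detections
        if d["label"] in current_step_parts
    ]
```

Everything else detected in the workspace is simply not rendered, so the user only ever sees boxes for the parts the current step needs.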
The Future of Assembly
This research, conducted by Alexander Htet Kyaw, Haotian Ma, Sasa Zivkovic, and Jenny Sabin, represents a significant step forward in AR-assisted assembly. By connecting assembly instructions with the real-time location of components, it streamlines complex tasks and offers a glimpse into a future where assembly is more accessible and less prone to error. Future work aims to explore more complex assembly tasks, improve AR projection accuracy, and even integrate designs generated by 3D generative AI.
You can read the full research paper here: AI Assisted AR Assembly: Object Recognition and Computer Vision for Augmented Reality Assisted Assembly.