Developing a Smart System for Identifying Door Types in Floor Plans

TLDR: Researchers developed DoorDet, a semi-automated system that uses object detection and large language models to efficiently create a multi-class door detection dataset from floor plans. It first detects all doors, then an LLM classifies their functional types based on visual and contextual cues, followed by human review for accuracy, significantly reducing manual annotation effort and improving data quality for building analysis applications.

A new research paper introduces a method for building a dataset that helps computers identify and categorize the different types of doors in architectural floor plans. This capability matters for applications such as automated building compliance checks and the analysis of indoor environments.

Traditionally, creating such a detailed dataset involves extensive manual labor. Annotators draw a bounding box around each door in a floor plan and then assign its specific type. This manual approach is both time-consuming and costly. To overcome these challenges, the researchers developed a semi-automated system named DoorDet, which aims to streamline the data creation process.

The DoorDet pipeline operates in a three-stage sequence. Initially, it employs a sophisticated object detection model, specifically Co-DETR, to locate all door instances within a given floor plan image. At this stage, all doors are treated as a single, unified category. This particular model is well-suited for the task because of its strong performance in detecting small and densely packed objects, which is characteristic of door symbols in architectural drawings.
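As a rough illustration of the single-category setup in this first stage, the sketch below collapses every door label in a COCO-style annotation dictionary into one "door" class before detector training. The COCO-style layout, the function name `collapse_to_single_class`, and the toy file and category names are assumptions made for this example; the article does not say how the authors stored or converted their labels.

```python
from copy import deepcopy


def collapse_to_single_class(coco: dict, name: str = "door") -> dict:
    """Return a copy of a COCO-style dict where every annotation
    belongs to a single category with id 1."""
    merged = deepcopy(coco)  # leave the caller's data untouched
    merged["categories"] = [{"id": 1, "name": name}]
    for ann in merged["annotations"]:
        ann["category_id"] = 1
    return merged


# Toy input (hypothetical): two doors with different type labels.
toy = {
    "images": [{"id": 0, "file_name": "plan_0.png", "width": 800, "height": 600}],
    "categories": [
        {"id": 1, "name": "bedroom door"},
        {"id": 2, "name": "bathroom door"},
    ],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 1, "bbox": [120, 80, 30, 12]},
        {"id": 1, "image_id": 0, "category_id": 2, "bbox": [300, 210, 12, 30]},
    ],
}

single = collapse_to_single_class(toy)
print(single["categories"])
```

Keeping the original dictionary unchanged means the fine-grained labels remain available for the later classification stage.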

Following the initial detection, a large language model (LLM) equipped with vision capabilities, such as GPT-4.1, takes over to classify each identified door. The LLM’s strength lies in its ability to go beyond just the visual appearance of the door. It analyzes the broader context, including the rooms the door connects and any accompanying text annotations on the floor plan. This contextual understanding allows the LLM to infer the functional type of the door, distinguishing, for example, between a “bedroom door” and an “emergency exit door.” To facilitate this, the system provides the LLM with both a magnified view of the specific door region and the complete floor plan image, along with clear instructions for classification.
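The two inputs described above, a magnified door region and a set of classification instructions, can be sketched as follows. This is a minimal sketch under stated assumptions: the enlargement factor, the prompt wording, and the helper names `magnified_region` and `build_prompt` are hypothetical, and the category list contains only the five types named in this article rather than all ten. Sending the images and prompt to a vision-capable model is left out, since the article does not describe that interface.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels

# Partial list: only the door types named in the article.
EXAMPLE_CATEGORIES = [
    "main entry door",
    "bedroom door",
    "bathroom door",
    "kitchen door",
    "emergency exit door",
]


def magnified_region(box: Box, image_size: Tuple[int, int], scale: float = 3.0) -> Box:
    """Grow a door box around its centre by `scale`, clamped to the image."""
    x0, y0, x1, y1 = box
    width, height = image_size
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * scale / 2
    half_h = (y1 - y0) * scale / 2
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(float(width), cx + half_w),
        min(float(height), cy + half_h),
    )


def build_prompt(categories: Sequence[str]) -> str:
    """Assemble hypothetical classification instructions for the model."""
    options = "\n".join(f"- {name}" for name in categories)
    return (
        "You are given a full floor plan and a magnified crop of one door.\n"
        "Use the rooms the door connects and any nearby text labels.\n"
        "Answer with exactly one of these door types:\n" + options
    )


crop = magnified_region((100, 100, 120, 140), image_size=(500, 500))
prompt = build_prompt(EXAMPLE_CATEGORIES)
print(crop)
print(prompt)
```

Enlarging the crop beyond the tight bounding box keeps some of the adjoining walls and room labels in view, which is the kind of context the classification relies on.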

The final stage of the DoorDet process involves a “human-in-the-loop” refinement. Recognizing that even advanced AI models can make errors, human annotators are brought in to verify and correct the AI’s predictions. Their tasks include rectifying mislabeled doors, adding any doors that the AI might have missed, and fine-tuning the bounding box positions for greater accuracy. This human oversight ensures the high quality of the resulting data while drastically reducing the overall time and effort compared to starting the annotation process from scratch.
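The three reviewer actions described above (relabel, add, adjust) could be applied to the model's predictions roughly as shown below. The `Door` and `Correction` structures and the `apply_corrections` function are hypothetical names chosen for this sketch; the article does not describe the annotation tool or data format the authors used.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass(frozen=True)
class Door:
    box: Box
    label: str


@dataclass(frozen=True)
class Correction:
    """One reviewer action. `index` is None when adding a missed door."""
    index: Optional[int] = None
    label: Optional[str] = None
    box: Optional[Box] = None


def apply_corrections(predicted: List[Door], corrections: List[Correction]) -> List[Door]:
    """Return a refined copy of `predicted` with reviewer edits applied in order."""
    refined = list(predicted)
    for c in corrections:
        if c.index is None:
            # Reviewer adds a door the model missed: both fields are required.
            if c.box is None or c.label is None:
                raise ValueError("a new door needs both a box and a label")
            refined.append(Door(c.box, c.label))
        else:
            # Reviewer relabels and/or adjusts the box of an existing door.
            door = refined[c.index]
            refined[c.index] = Door(
                box=c.box if c.box is not None else door.box,
                label=c.label if c.label is not None else door.label,
            )
    return refined


predicted = [
    Door((10, 10, 30, 20), "bedroom door"),
    Door((50, 50, 60, 80), "kitchen door"),
]
corrections = [
    Correction(index=1, label="bathroom door"),                # fix a wrong label
    Correction(index=0, box=(11, 10, 31, 20)),                 # nudge a box
    Correction(label="main entry door", box=(0, 100, 8, 130)), # add a missed door
]
refined = apply_corrections(predicted, corrections)
print(refined)
```

Recording each edit as a separate correction, rather than overwriting the predictions, also makes it possible to count how often reviewers had to intervene.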

The DoorDet dataset was built from floor plans in the CubiCasa5K dataset, which offers a diverse collection of architectural styles. The final dataset comprises 4,991 floor plan images annotated with labels for 10 distinct door categories, including main entry doors, bedroom doors, bathroom doors, kitchen doors, and emergency exit doors. Each image contains an average of 7.81 door instances.
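Taken together, the image count and the per-image average imply a rough total number of annotated doors. The short calculation below is only an estimate derived from the two figures quoted here; the exact total is not stated in this article, and the result is sensitive to rounding in the reported average.

```python
# Figures quoted in the article.
num_images = 4991
avg_doors_per_image = 7.81

# Estimated total door instances (rounded, since the average is itself rounded).
approx_total_doors = round(num_images * avg_doors_per_image)
print(approx_total_doors)  # prints 38980
```

In other words, the dataset contains on the order of 39,000 labeled door instances.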

What sets the DoorDet dataset apart is its emphasis on the functional classification of doors, a level of detail often absent in existing architectural datasets. This fine-grained categorization is vital for advanced applications like automated building code compliance and sophisticated indoor semantic analysis. The dataset also presents a realistic challenge for AI models due to its class imbalance, where certain door types appear less frequently than others.
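To make the class imbalance concrete, the sketch below counts labels and derives inverse-frequency class weights, a common way to compensate for rare classes during training. The label counts are toy values invented for illustration, not statistics from the DoorDet dataset, and the article does not say whether the authors used any reweighting.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def imbalance_report(labels: Iterable[str]) -> Tuple[Counter, Dict[str, float], float]:
    """Return per-class counts, inverse-frequency weights, and the
    ratio between the most and least frequent class."""
    counts = Counter(labels)
    total = sum(counts.values())
    n_classes = len(counts)
    # weight = total / (n_classes * count): rare classes get weights above 1.
    weights = {name: total / (n_classes * n) for name, n in counts.items()}
    ratio = max(counts.values()) / min(counts.values())
    return counts, weights, ratio


# Toy labels (hypothetical): common bedroom doors, one rare emergency exit.
toy_labels = (
    ["bedroom door"] * 6
    + ["bathroom door"] * 3
    + ["emergency exit door"] * 1
)

counts, weights, ratio = imbalance_report(toy_labels)
print(counts.most_common())
print(weights)
print(ratio)
```

With these toy counts the rarest class receives the largest weight, so a training loss that uses the weights would penalize mistakes on emergency exit doors more heavily than mistakes on bedroom doors.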

The experiments showed that integrating the LLM reduced annotation time compared with purely manual labeling. The human-in-the-loop refinement, in turn, improved the accuracy of both door detection and classification. The study observed that the more challenging classification tasks benefited most from human intervention, underscoring the value of this hybrid approach.

The DoorDet dataset and its semi-automated creation pipeline show how combining deep learning for object detection with the multimodal reasoning of large language models can efficiently produce valuable datasets for complex, real-world problems. The authors expect this work to support further advances in automated building analysis. More details are available in the full paper.
