AI-Powered Shoulder Disorder Diagnosis with Everyday Cameras

TLDR: This research introduces a low-cost, accessible method for preliminary shoulder disorder diagnosis using videos from consumer-grade cameras and Multimodal Large Language Models (MLLMs). The Hybrid Motion Video Diagnosis (HMVDx) framework uses two models: an MLLM for action understanding (Gemini-1.5-Pro) and a reasoning LLM for disease diagnosis (DeepSeek-R1), improving diagnostic accuracy by 79.6% compared to direct video diagnosis. The study also proposes a new “Usability Index” to evaluate MLLMs in medical contexts, highlighting the potential of this technology for early detection, especially in areas with limited medical resources.

Shoulder disorders, such as frozen shoulder, are common conditions that affect many people globally, particularly the elderly and those who perform repetitive shoulder tasks. In areas where medical resources are scarce, getting an early and accurate diagnosis can be very challenging. This highlights a critical need for affordable and easily scalable diagnostic solutions.

A recent research paper introduces an innovative approach to address this problem: using videos captured by everyday consumer-grade cameras as the foundation for diagnosis. This method significantly reduces costs for users and makes preliminary diagnosis more accessible. The core of this research lies in the application of Multimodal Large Language Models (MLLMs) for diagnosing shoulder disorders.

Introducing HMVDx: A Hybrid Approach to Diagnosis

The researchers propose a framework called Hybrid Motion Video Diagnosis (HMVDx). It divides the complex task of diagnosis into two distinct parts, action understanding and disease diagnosis, and assigns each part to a different model. This division of labor reduces the complexity each model has to handle, which improves diagnostic accuracy and reliability.

In the HMVDx framework, one MLLM (specifically, Gemini-1.5-Pro) is responsible for converting video information into detailed text descriptions of a patient’s actions. Following this, a separate reasoning large language model (DeepSeek-R1) takes these descriptions and, based on pre-set diagnostic rules, makes a judgment about the presence of a shoulder disorder. This sequential processing mimics how a medical professional might observe a patient’s movements before making a diagnosis.
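The two-stage flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function and parameter names (`run_hybrid_diagnosis`, `describe`, `diagnose`) are hypothetical, and the two stand-in functions below only mimic the roles that Gemini-1.5-Pro and DeepSeek-R1 play in the paper. No real model API is called.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions, not from the paper):
#   describe(video_path, action_prompt) -> text description of the patient's actions
#   diagnose(description, rules)        -> verdict text
DescribeFn = Callable[[str, str], str]
DiagnoseFn = Callable[[str, str], str]


@dataclass
class HybridResult:
    description: str  # output of the action-understanding stage
    verdict: str      # output of the diagnosis stage


def run_hybrid_diagnosis(
    video_path: str,
    describe: DescribeFn,
    diagnose: DiagnoseFn,
    action_prompt: str,
    rules: str,
) -> HybridResult:
    """Run the two stages in sequence: video -> text, then text -> verdict."""
    description = describe(video_path, action_prompt)
    verdict = diagnose(description, rules)
    return HybridResult(description=description, verdict=verdict)


# Stand-ins for the two models, used only to make the sketch runnable.
def fake_describe(video_path: str, action_prompt: str) -> str:
    return "When raised forward, the right hand stops below shoulder level."


def fake_diagnose(description: str, rules: str) -> str:
    # Placeholder rule for illustration only, not a clinical criterion.
    return "refer" if "below shoulder level" in description else "no finding"


result = run_hybrid_diagnosis(
    "patient_001.mp4",
    fake_describe,
    fake_diagnose,
    action_prompt="Describe the arm movements.",
    rules="Illustrative rules text.",
)
print(result.verdict)
```

Keeping each stage behind a plain callable means either model could be swapped without touching the other stage, and the intermediate description stays available for review.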

Key Innovations and Contributions

Beyond the HMVDx framework itself, the research introduces several other important contributions:

  • Motion Trajectories Prompt Framework: This framework helps MLLMs understand human actions by analyzing orthopedic popular science videos and summarizing judgment actions and standards. It replaces numerical measurements with relative position descriptions to improve accuracy, making it easier for medical practitioners to adapt general LLMs for specific medical uses at a low cost.

  • Usability Index: Recognizing the limitations of traditional evaluation metrics in the medical field, the study proposes a new metric called the Usability Index. This index evaluates the effectiveness of MLLMs from the perspective of the entire medical diagnostic pathway, considering action recognition, movement diagnosis, and final diagnosis. It provides a more comprehensive tool for assessing the applicability of MLLMs in preliminary diagnosis.
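To make the idea of replacing numerical measurements with relative position descriptions concrete, here is a small sketch of a prompt builder. Everything in it is an assumption for illustration: the `MovementCheck` structure, the movement names, and the landmark lists are hypothetical placeholders, not the judgment actions or standards the paper derives from orthopedic videos.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MovementCheck:
    name: str                   # movement the patient is asked to perform
    moving_part: str            # body part whose end position is described
    landmarks: tuple[str, ...]  # relative reference points offered to the model


def build_action_prompt(checks: list[MovementCheck]) -> str:
    """Build a prompt that asks for landmark-relative positions, not angles."""
    lines = [
        "Describe the patient's movements in the video.",
        "Do not estimate angles or distances; "
        "report positions relative to body landmarks.",
    ]
    for i, check in enumerate(checks, start=1):
        options = ", ".join(check.landmarks)
        lines.append(
            f"{i}. {check.name}: state which landmark the "
            f"{check.moving_part} reaches, choosing from: {options}."
        )
    return "\n".join(lines)


# Illustrative placeholders only; not clinical standards.
example_checks = [
    MovementCheck("Forward arm raise", "hand", ("waist", "shoulder", "top of head")),
    MovementCheck("Reach behind back", "thumb", ("hip", "lower back", "shoulder blade")),
]

prompt = build_action_prompt(example_checks)
print(prompt)
```

Asking the model to choose among named landmarks gives it a small, checkable set of answers rather than a numeric angle it would have to estimate from the video.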

Performance and Future Potential

Experimental comparisons showed that HMVDx significantly improved the accuracy of diagnosing shoulder joint injuries by 79.6% compared to direct video diagnosis methods. This demonstrates a substantial technical advancement for applying MLLMs to video understanding in medicine.

While HMVDx performed well in scenarios that focus on the final judgment and on logical consistency, the researchers acknowledge that current methods still fall short of the stringent requirements of real-world medical applications, particularly when the entire diagnostic process is evaluated (the paper's Scenario 3). This points to areas for future optimization, such as improving dynamic video analysis, body positioning accuracy, and action decomposition.

The study highlights the immense potential of low-cost MLLMs in assisting medical practitioners with diagnosing shoulder disorders. Future research could integrate advanced techniques like Supervised Fine-Tuning (SFT), Retrieval Augmented Generation (RAG), and Agentic AI to further enhance performance. Expanding research into multilingual environments and leveraging more diverse medical imaging and video data will also broaden the application scope and improve the model’s generalization ability.

This research represents a significant step towards creating accessible and affordable preliminary diagnostic tools for shoulder disorders, particularly beneficial for regions with limited medical resources. For more details, you can read the full paper here.

Meera Iyer (https://blogs.edgentiq.com)
Meera Iyer is an AI news editor who blends journalistic rigor with storytelling elegance. Formerly a content strategist in a leading tech firm, Meera now tracks the pulse of India's Generative AI scene, from policy updates to academic breakthroughs. She's particularly focused on bringing nuanced, balanced perspectives to the fast-evolving world of AI-powered tools and media. You can reach her at: [email protected]
