TLDR: BenCao is a ChatGPT-based multimodal AI assistant for Traditional Chinese Medicine (TCM) that integrates structured knowledge, diagnostic data, and expert feedback through natural language instruction tuning. It excels in TCM diagnostics, herb recognition, and constitution classification, outperforming general-domain LLMs. Deployed on the OpenAI GPTs Store, BenCao offers interpretable reasoning and scenario-based interactions, demonstrating a practical framework for aligning generative AI with traditional medical reasoning.
Traditional Chinese Medicine (TCM), with its rich history spanning over two millennia, continues to be a vital part of global healthcare. However, integrating modern artificial intelligence, particularly large language models (LLMs), into TCM has presented unique challenges. TCM relies heavily on holistic reasoning, implicit logic, and diverse diagnostic cues like visual and tactile information, which go beyond simple text-based understanding.
Existing LLMs designed for TCM have made strides in processing textual information but often fall short in incorporating multimodal data, providing clear explanations, and being practically applicable in clinical settings. To address these limitations, researchers have developed BenCao, a ChatGPT-based multimodal assistant tailored specifically for Traditional Chinese Medicine. The full research paper is titled "BenCao: An Instruction-Tuned Large Language Model for Traditional Chinese Medicine."
BenCao stands out because it was adapted through natural language instruction tuning rather than by retraining model parameters. This approach helps align the model’s reasoning with expert-level TCM practices and ethical standards. The system is built upon a comprehensive knowledge base containing over 1,000 classical and modern TCM texts. It also features a scenario-based instruction framework that supports diverse interactions, a ‘chain-of-thought’ simulation mechanism for transparent reasoning, and a feedback refinement process involving licensed TCM practitioners.
Key Features of BenCao
BenCao integrates with external APIs for tasks such as tongue-image classification and multimodal database retrieval, enabling dynamic access to crucial diagnostic resources. This multimodal capability is a significant advancement, as TCM diagnosis often relies on visual cues like tongue appearance, which previous models struggled to incorporate.
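As a rough illustration of how an assistant might hand work to external tools, here is a minimal sketch in Python. Every name in it (`classify_tongue_image`, `search_knowledge_base`, `TOOLS`, `call_tool`) is hypothetical, and both tools are local stubs returning placeholder data; the article does not describe BenCao's actual API surface.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ToolResult:
    tool: str
    payload: dict


def classify_tongue_image(image_bytes: bytes) -> dict:
    """Hypothetical stand-in for a remote tongue-image classifier."""
    # A real service would run a vision model; this stub returns a placeholder.
    return {"label": "PLACEHOLDER", "bytes_received": len(image_bytes)}


def search_knowledge_base(query: str) -> dict:
    """Hypothetical stand-in for multimodal database retrieval."""
    return {"query": query, "hits": []}


# Registry mapping a tool name to the function that serves it (assumed design).
TOOLS: Dict[str, Callable[..., dict]] = {
    "tongue_classifier": classify_tongue_image,
    "kb_search": search_knowledge_base,
}


def call_tool(name: str, **kwargs) -> ToolResult:
    """Look up a tool by name, run it, and wrap the result."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return ToolResult(tool=name, payload=TOOLS[name](**kwargs))
```

A call such as `call_tool("tongue_classifier", image_bytes=data)` returns a `ToolResult` whose payload the assistant could fold into its reply as a preliminary reference.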
The system operates across four fundamental scenarios:
- Learning of TCM Theory: Designed for students and enthusiasts, this mode explains core concepts by citing authoritative classical sources, ensuring reliability and traceability.
- Conditioning for Mild Health Discomforts: For common ailments like headaches or insomnia, BenCao provides preliminary analysis based on TCM syndrome differentiation and offers lifestyle, dietary, and rest recommendations. It always includes safety disclaimers, advising users to seek professional medical care if symptoms persist or worsen.
- Constitution Assessment and Tongue Diagnosis: Users can engage in an interactive questionnaire to understand their TCM constitution type. They can also upload tongue images for analysis, with results provided as preliminary references, not professional diagnoses.
- Daily Health Preservation and Seasonal Wellness Guidance: This scenario offers personalized health advice aligned with seasonal changes, drawing from TCM principles of harmony between nature and humans.
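The four scenarios above can be pictured as a small routing step that picks a scenario and prepends its instruction to the user's query. The sketch below is only an assumption about how such a framework could look: the keyword lists, the `route` and `build_prompt` helpers, and the instruction wording are illustrative, not taken from BenCao itself.

```python
from enum import Enum


class Scenario(Enum):
    THEORY = "Learning of TCM Theory"
    MILD_DISCOMFORT = "Conditioning for Mild Health Discomforts"
    CONSTITUTION = "Constitution Assessment and Tongue Diagnosis"
    WELLNESS = "Daily Health Preservation and Seasonal Wellness Guidance"


# Illustrative per-scenario instructions paraphrased from the list above.
INSTRUCTIONS = {
    Scenario.THEORY: "Explain core concepts and cite authoritative classical sources.",
    Scenario.MILD_DISCOMFORT: "Give a preliminary analysis with lifestyle, dietary, "
                              "and rest suggestions; include a safety disclaimer.",
    Scenario.CONSTITUTION: "Run the constitution questionnaire or analyze the tongue "
                           "image; present results as preliminary references.",
    Scenario.WELLNESS: "Offer health advice aligned with the current season.",
}

# Hypothetical keyword triggers; a real system would likely use the LLM itself.
KEYWORDS = [
    (Scenario.CONSTITUTION, ("constitution", "tongue")),
    (Scenario.MILD_DISCOMFORT, ("headache", "insomnia")),
    (Scenario.WELLNESS, ("season", "winter", "summer")),
]


def route(query: str) -> Scenario:
    """Return the first scenario whose keywords appear in the query."""
    text = query.lower()
    for scenario, words in KEYWORDS:
        if any(word in text for word in words):
            return scenario
    return Scenario.THEORY  # default: treat the query as a theory question


def build_prompt(query: str) -> str:
    """Prefix the user's query with the chosen scenario and its instruction."""
    scenario = route(query)
    return f"[{scenario.value}]\n{INSTRUCTIONS[scenario]}\nUser: {query}"
```

Keeping the instructions in a plain dictionary reflects the article's point that behavior is shaped by natural language text rather than by retrained weights.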
Interpretable Reasoning and Expert Refinement
To mimic the reasoning of experienced TCM clinicians, BenCao incorporates a Chain-of-Thought (CoT) simulation mechanism. This allows the model to present a structured and interpretable reasoning process, proactively asking for more information when needed. The reasoning follows stages like symptom recognition, pattern differentiation, treatment principle reasoning, and lifestyle recommendation generation.
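A staged reasoning flow of this kind can be sketched as a function that either asks a follow-up question or emits a prompt listing the stages in order. The stage names come from the paragraph above; the `REQUIRED_FIELDS` tuple and the `next_turn` helper are hypothetical and exist only to illustrate the idea.

```python
from typing import Dict, List

# Stage names as listed in the article.
STAGES = [
    "Symptom recognition",
    "Pattern differentiation",
    "Treatment principle reasoning",
    "Lifestyle recommendation generation",
]

# Hypothetical minimum information needed before reasoning starts.
REQUIRED_FIELDS = ("main_complaint", "duration")


def missing_fields(intake: Dict[str, str]) -> List[str]:
    """Return required fields that are absent or empty in the intake."""
    return [name for name in REQUIRED_FIELDS if not intake.get(name)]


def next_turn(intake: Dict[str, str]) -> str:
    """Ask for missing information first; otherwise lay out the staged prompt."""
    missing = missing_fields(intake)
    if missing:
        return "Before reasoning further, please tell me: " + ", ".join(missing)
    lines = [f"{i}. {stage}:" for i, stage in enumerate(STAGES, start=1)]
    return "Reason step by step using these stages:\n" + "\n".join(lines)
```

With only a complaint supplied, `next_turn` asks for the duration; once both fields are present it returns the four-stage outline for the model to fill in.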
A unique aspect of BenCao’s development is its human feedback-guided instruction refinement. Instead of retraining parameters, expert TCM physicians provided natural language feedback to optimize the model’s reasoning structure, linguistic precision, and ethical compliance. This iterative process ensures that BenCao’s responses align with professional TCM logic and standards.
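One way to picture refinement without parameter updates is an instruction document that accumulates expert notes over successive review rounds. The `InstructionSet` class below is a hypothetical sketch under that assumption, not the authors' tooling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InstructionSet:
    """Base instruction text plus natural language revisions from experts."""
    base: str
    revisions: List[str] = field(default_factory=list)

    def apply_feedback(self, feedback: str) -> None:
        """Record a piece of expert feedback, skipping blanks and duplicates."""
        note = feedback.strip()
        if note and note not in self.revisions:
            self.revisions.append(note)

    def render(self) -> str:
        """Produce the instruction text that would be given to the model."""
        if not self.revisions:
            return self.base
        bullets = "\n".join(f"- {r}" for r in self.revisions)
        return f"{self.base}\n\nExpert revisions:\n{bullets}"
```

Each review round changes only the rendered text, so the underlying model weights stay untouched, which matches the article's description of the process.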
Performance and Deployment
Evaluations show that BenCao achieved superior accuracy compared to general-domain and other TCM-domain models across various tasks. It performed particularly well in diagnostics, herb recognition (82.18% accuracy), and constitution classification (63.42% accuracy). These results highlight its enhanced ability to grasp domain-specific reasoning patterns in TCM.
BenCao has been deployed as an interactive intelligent agent on the OpenAI GPTs Store, making it accessible to a global audience. As of October 2025, it has facilitated nearly 1,000 user interaction sessions, demonstrating its practical applicability and ease of use.
Future Outlook
While BenCao represents a significant step forward, it is currently a research prototype. Its diagnostic accuracy and prescription capabilities are intentionally limited to prevent misuse. Future work aims to expand its multimodal integration, incorporating physiological signals and electronic medical records for more comprehensive reasoning. The development of open benchmarks and standardized evaluation protocols will also be crucial for ensuring safety, transparency, and reproducibility in TCM-domain LLMs.
In conclusion, BenCao showcases the potential of generative AI to promote medical diversity and drive global healthcare innovation by providing a trustworthy, knowledge-grounded, and human-aligned large language model for Traditional Chinese Medicine. It offers a practical and scalable framework for adapting foundation models to specialized medical domains.


