
TianHui: A New Large Language Model Advancing Traditional Chinese Medicine Applications

TL;DR: TianHui is a newly developed domain-specific Large Language Model (LLM) for Traditional Chinese Medicine (TCM). It addresses limitations of previous TCM LLMs by integrating extensive TCM data and employing efficient training techniques such as QLoRA and FlashAttention2. Built on DeepSeek-R1-Distill-Qwen-14B, TianHui performed strongly across 12 diverse TCM application scenarios, outperforming other LLMs in tasks ranging from diagnosis to knowledge Q&A. The model and its resources are open-sourced, with future plans to add multimodal capabilities and scale up parameters.

Large Language Models (LLMs) have transformed many fields, and their application in healthcare, particularly Traditional Chinese Medicine (TCM), holds immense promise. However, existing TCM LLMs often face limitations, primarily focusing on clinical practice and medical education, struggling with complex research tasks, and lacking comprehensive evaluation datasets and sufficient computational resources.

Introducing TianHui: A Specialized LLM for TCM

To overcome these challenges, researchers have developed TianHui, a new domain-specific LLM designed for diverse Traditional Chinese Medicine scenarios. TianHui aims to enhance the accuracy and professionalism of TCM knowledge processing and provide an intelligent solution for the systematic inheritance and large-scale application of TCM knowledge.

How TianHui Was Built

The development of TianHui involved a meticulous process of data collection, pre-processing, and a phased training strategy. A vast amount of TCM data was gathered, including academic literature, published books, and online public data. This extensive dataset was then pre-processed to create a substantial unsupervised dataset and over 600,000 question-and-answer pairs for supervised training.
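The paper does not publish the exact schema of those 600,000+ question-and-answer pairs, but SFT corpora are commonly stored as one JSON record per line. The sketch below shows that pattern with invented TCM examples; the field names (`instruction`, `output`) follow a widespread convention and are assumptions, not TianHui's confirmed format:

```python
import json

# Hypothetical TCM question-answer pairs. The field names
# ("instruction", "output") follow a common SFT convention,
# not a format confirmed by the TianHui paper.
qa_pairs = [
    {"question": "What are the primary functions of ginseng in TCM?",
     "answer": "Tonifies qi, generates fluids, and calms the spirit."},
    {"question": "Which meridian does the acupoint LI4 (Hegu) belong to?",
     "answer": "The Large Intestine meridian of Hand-Yangming."},
]

def to_sft_jsonl(pairs):
    """Serialize QA pairs into one JSON record per line for SFT."""
    lines = []
    for pair in pairs:
        record = {"instruction": pair["question"], "output": pair["answer"]}
        # ensure_ascii=False keeps Chinese characters readable in the file
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)

print(to_sft_jsonl(qa_pairs))
```

Each line is an independent JSON object, so a training pipeline can stream the file without loading all records into memory.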

The training strategy involved two main phases: Pre-Training (PT) and Supervised Fine-Tuning (SFT). To optimize computational resources and ensure training stability, three key technologies were integrated: Quantized Low-Rank Adaptation (QLoRA) for efficient fine-tuning, DeepSpeed Stage2 for distributed training optimization, and FlashAttention2 for accelerated computation. The base model chosen for TianHui was DeepSeek-R1-Distill-Qwen-14B, selected after evaluating several general models for its superior language understanding and reasoning capabilities.
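To see why QLoRA keeps fine-tuning a 14B model tractable, the sketch below counts trainable parameters for a low-rank adapter on a single projection matrix. The hidden size and rank here are illustrative assumptions, not values reported for TianHui:

```python
# LoRA freezes a weight matrix W (d_out x d_in) and trains two
# small matrices instead: A (r x d_in) and B (d_out x r), so the
# trainable parameter count is r * (d_in + d_out).

def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    return rank * (d_in + d_out)

def full_params(d_in: int, d_out: int) -> int:
    return d_in * d_out

# Illustrative numbers only: a 5120-wide square projection and
# rank 16 (assumptions, not TianHui's reported configuration).
d, r = 5120, 16
frozen = full_params(d, d)                   # weights kept frozen
trainable = lora_trainable_params(d, d, r)   # weights actually trained
print(f"trainable fraction: {trainable / frozen:.4%}")
```

Quantizing the frozen base weights to 4-bit (the "Q" in QLoRA) further cuts memory, since gradients and optimizer state are only needed for the small adapter matrices.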

Evaluating TianHui’s Performance

TianHui’s performance was rigorously evaluated on 12 benchmark test datasets spanning a wide array of TCM application scenarios:

- answer prediction
- TCM case diagnosis
- entity extraction
- herb or formula recommendation
- acupuncture point recommendation
- herbal chemical composition analysis
- generation of Chinese patent medicine instructions
- description of herbal pharmacological effects
- TCM knowledge questions and answers
- TCM reading comprehension
- topic-led abstract writing
- abstract-driven topic generation

The results were impressive: TianHui performed strongly across all 12 scenarios, ranking in the top three on every evaluation metric for six test datasets (APQ, TCMCD, HFR, HCCA, DHPE, and TLAW) and achieving the best score on every metric for the remaining six (TCMEE, APR, GCPMI, TCMKQA, TCMRC, and ADTG).


Comparison and Future Directions

When compared to other LLMs, including general, Chinese medical, and TCM-specific models, TianHui consistently showed superior overall performance. The research also included an ablation study to understand the impact of various hyperparameters on TianHui’s performance, leading to optimal settings for its configuration.

While TianHui represents a significant leap forward, the researchers acknowledge areas for future improvement. These include integrating multimodal processing capabilities to better align with real-world TCM diagnostic methods (observation, auscultation-olfaction, inquiry, and palpation), which involve visual, auditory, textual, and tactile data. Additionally, plans are in place to expand the model’s parameter scale by integrating more computational resources and training data, aiming for even better performance.

The code, data, and models for TianHui are open-sourced on GitHub and HuggingFace, fostering further research and development in the field. You can read the full research paper here: TianHui Research Paper.

Karthik Mehta
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
