ZhongJing-TCM-Benchmark

A comprehensive benchmark dataset for evaluating Traditional Chinese Medicine (TCM) common sense knowledge in Large Language Models.

Overview

ZhongJing-TCM is a pioneering dataset designed to evaluate Large Language Models' proficiency in Traditional Chinese Medicine. Named after the renowned physician Zhang ZhongJing, this benchmark comprises 12,000 clinically relevant questions spanning 175 topics across 9 TCM categories, stratified into three difficulty levels.

Key Features

Comprehensive Coverage: 12,000 clinically relevant questions
Diverse Topics: 175 unique topics across 9 TCM categories
Multiple Question Types: Single-choice, multiple-choice, and open-ended questions
Difficulty Levels: Three-tiered stratification
Expert Validation: Verified by multiple TCM experts
High-Quality Data: Generated using a three-stage synthetic data generation strategy

Dataset Structure

ZhongJing-TCM-Benchmark/
├── data/
│   ├── train/
│   ├── validation/
│   └── test/
├── metadata/
│   ├── categories.json
│   └── topics.json
└── evaluation/
    ├── metrics/
    └── baselines/
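
As a quick sanity check after cloning, the layout above can be inspected with a few lines of standard-library Python. This is only a sketch: the helper name describe_layout is hypothetical, and because the file format inside each split directory and the schema of the two metadata files are not documented here, the code just counts files and parses the JSON as-is.

```python
import json
from pathlib import Path

# Split directory names taken from the tree above.
SPLITS = ("train", "validation", "test")


def describe_layout(root):
    """Summarize which expected split directories and metadata files exist under root.

    Hypothetical helper, not part of the zhongjing_tcm package.
    """
    root = Path(root)
    summary = {"splits": {}, "metadata": {}}

    for split in SPLITS:
        split_dir = root / "data" / split
        # Count the regular files in each split; 0 if the directory is missing.
        if split_dir.is_dir():
            summary["splits"][split] = sum(1 for p in split_dir.iterdir() if p.is_file())
        else:
            summary["splits"][split] = 0

    for name in ("categories.json", "topics.json"):
        path = root / "metadata" / name
        # Parse the file only if it exists; its schema is an unknown here.
        if path.is_file():
            summary["metadata"][name] = json.loads(path.read_text(encoding="utf-8"))
        else:
            summary["metadata"][name] = None

    return summary
```

Calling describe_layout("ZhongJing-TCM-Benchmark") returns a dictionary with per-split file counts and the parsed metadata, or None for any metadata file that is absent.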

Usage

from zhongjing_tcm import TCMDataset

# Load the dataset
dataset = TCMDataset(split='train')

# Get a sample question
question = dataset[0]
print(question.text)
print(question.options)
print(question.answer)
print(question.explanation)
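
For the single-choice and multiple-choice questions, one simple way to score a model is exact-match accuracy over the selected option letters. The sketch below is an illustration under stated assumptions, not the benchmark's official metric: it assumes answers are option letters given either as a string ("A", or "A, C" with commas or spaces between letters) or as a list of letters, and the function names normalize and exact_match_accuracy are hypothetical. Open-ended questions need a different scoring method and are not covered.

```python
def normalize(answer):
    """Convert an answer such as 'a', 'A, C', or ['C', 'A'] to a frozenset of uppercase letters.

    Assumption: multiple letters in a string are separated by commas or whitespace.
    """
    if isinstance(answer, str):
        answer = answer.replace(",", " ").split()
    return frozenset(letter.strip().upper() for letter in answer)


def exact_match_accuracy(predictions, references):
    """Fraction of questions where the predicted option set equals the reference set."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # A multiple-choice prediction counts only if every selected option matches.
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)
```

In practice, the references would be collected from question.answer for each item in a split and the predictions from the model under test, after confirming that the stored answer format matches the assumption above.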

Contributing

We welcome contributions to improve the dataset and evaluation metrics. Please feel free to submit issues and pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

We acknowledge the contributions of ancient Chinese medicine physicians, notably Zhang ZhongJing, after whom our dataset is named. Special thanks to the nonprofit organization Future Medicine Philosophy (Ful-Phil) and all collaborating physicians who contributed to this research.

Citation

@article{anonymous2024zhongjing,
  title={ZhongJing-TCM: A Benchmark for Evaluating Traditional Chinese Medicine Common Sense Knowledge in Large Language Models},
  author={Anonymous},
  journal={ArXiv},
  year={2024}
}
