```python
from datasets import load_dataset

ds = load_dataset("lavita/medical-eval-sphere")

# load the benchmark split into a pandas DataFrame
df = ds["medical_qa_benchmark_v1.0"].to_pandas()
```
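A quick sanity check after loading (the exact shape and column names depend on the benchmark release, so this only inspects rather than assumes them):

```python
# Inspect the loaded benchmark.
print(df.shape)              # number of rows and columns
print(df.columns.tolist())   # available fields
print(df.head(3))            # preview the first few examples
```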
The repository includes several Jupyter notebooks demonstrating the analyses and preprocessing steps. They are located in the `notebooks` folder:

- `data-preprocessing.ipynb`: Preprocesses queries, including deduplication, identifying medical questions, and filtering by language.
- `difficulty-level-analysis.ipynb`: Analyzes the difficulty levels of medical questions.
- `llm-as-a-judge.ipynb`: Evaluates medical questions using an LLM-as-a-judge framework.
- `similarity-analysis.ipynb`: Performs inter- and intra-dataset similarity analysis and semantic deduplication (see the sketch below).
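To illustrate the kind of semantic deduplication done in `similarity-analysis.ipynb`, here is a minimal sketch assuming sentence-transformers embeddings and a cosine-similarity threshold; the model name and threshold are illustrative, and the actual notebook may differ:

```python
# Minimal semantic-deduplication sketch: drop questions whose embedding is
# too similar to one we have already kept.
# Assumes the sentence-transformers package; the model is an assumption.
from sentence_transformers import SentenceTransformer
import numpy as np

def semantic_dedup(questions, threshold=0.9):
    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode(questions, normalize_embeddings=True)
    kept_idx, kept_emb = [], []
    for i, e in enumerate(emb):
        # On normalized embeddings, cosine similarity is a dot product.
        if kept_emb and float(np.max(np.stack(kept_emb) @ e)) >= threshold:
            continue  # near-duplicate of an earlier question; drop it
        kept_idx.append(i)
        kept_emb.append(e)
    return [questions[i] for i in kept_idx]
```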
Create a `.env` file in the root directory of the project and add the following lines:

```
ANTHROPIC_API_KEY=your_api_key_here
OPENAI_API_KEY=your_api_key_here
LABELBOX_API_KEY=your_api_key_here
```
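The keys can then be read from the environment at runtime. A minimal sketch using python-dotenv (an assumption; the notebooks may load them differently):

```python
# Read the .env file and expose the keys via os.environ.
# Assumes the python-dotenv package: pip install python-dotenv
import os
from dotenv import load_dotenv

load_dotenv()  # looks for a .env file in the current working directory

anthropic_key = os.environ["ANTHROPIC_API_KEY"]
openai_key = os.environ["OPENAI_API_KEY"]
labelbox_key = os.environ["LABELBOX_API_KEY"]
```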
If you use this benchmark, please cite:

```bibtex
@article{hosseini2024benchmark,
  title={A Benchmark for Long-Form Medical Question Answering},
  author={Hosseini, Pedram and Sin, Jessica M and Ren, Bing and Thomas, Bryceton G and Nouri, Elnaz and Farahanchi, Ali and Hassanpour, Saeed},
  journal={arXiv preprint arXiv:2411.09834},
  year={2024}
}
```