REVO-LION: Evaluating and Refining Vision-Language Instruction Tuning Datasets
Ning Liao, Shaofeng Zhang, Renqiu Xia, Bo Zhang, Min Cao, Yu Qiao, Junchi Yan
The dataset will be released soon.
We propose the tune-cross-evaluation paradigm, which is the first to systematically analyze Vision-Language Instruction Tuning (VLIT) datasets. Building on this holistic evaluation, we construct REVO-LION (REfining VisiOn-Language InstructiOn tuNing), a comprehensive dataset derived from public VLIT datasets. REVO-LION includes a training set, which can be used to develop an all-powerful VLIT model, and an evaluation set, which can serve as a stable yet convenient benchmark.