
RichHF-18K dataset contains rich human feedback labels we collected for our CVPR'24 paper: https://arxiv.org/pdf/2312.10240, along with the file name of the associated labeled images (no urls or images are included in this dataset).


google-research-datasets/richhf-18k


RichHF-18K Dataset

RichHF-18K dataset contains rich human feedback labels we collected for our CVPR'24 paper: https://arxiv.org/pdf/2312.10240, along with the original file name of the associated labeled images.

To cite our paper:

@inproceedings{richhf,
  title={Rich Human Feedback for Text-to-Image Generation},
  author={Youwei Liang and Junfeng He and Gang Li and Peizhao Li and Arseniy Klimovskiy and Nicholas Carolan and Jiao Sun and Jordi Pont-Tuset and Sarah Young and Feng Yang and Junjie Ke and Krishnamurthy Dj Dvijotham and Katie Collins and Yiwen Luo and Yang Li and Kai J Kohlhoff and Deepak Ramachandran and Vidhya Navalpakkam},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2024},
}

The labels include subjective scores (e.g., aesthetics score), human-labeled heatmaps (e.g., artifact regions of distorted pixels), and misalignment tokens in the text prompts. The dataset contains 17,760 examples in TensorFlow Example format: 15,810 training examples, 995 development examples, and 955 test examples. The dataset does not contain the original images, only their filenames, which you can use to find the corresponding images in the original Pick-a-pic dataset.

The labels are annotated on generated images from Pick-a-pic v1: https://stability.ai/research/pick-a-pic

The tfrecord files are stored with Git Large File Storage (LFS), so you might need to install LFS before cloning the repo: https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage

The tfrecord file can be loaded by tf.data.TFRecordDataset directly.

Each example contains the following 8 fields:

  • filename: The original image filename, which can be mapped to images in the Pick-a-pic v1 dataset.
  • aesthetics_score: Aesthetics score.
  • artifact_score: Artifact score.
  • misalignment_score: Text-image misalignment score.
  • overall_score: Overall score.
  • artifact_map: Artifact heatmap.
  • misalignment_map: Misalignment heatmap.
  • prompt_misalignment_label: Token-level labels for misaligned tokens in the prompt.
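
As a minimal sketch of loading the data, the snippet below reads a tfrecord file with tf.data.TFRecordDataset and decodes each record into a plain Python dict. It deliberately does not assume the storage type of any field: it asks each feature which kind it holds (bytes, float, or int64), so you can inspect the 8 fields before writing a fixed feature spec. The helper names and the file path in the usage comment are hypothetical, not part of this repo.

```python
import tensorflow as tf


def example_to_dict(serialized):
    """Decode one serialized tf.train.Example into a dict of Python lists."""
    example = tf.train.Example.FromString(serialized)
    decoded = {}
    for name, feature in example.features.feature.items():
        # Each Feature holds exactly one of: bytes_list, float_list, int64_list.
        kind = feature.WhichOneof("kind")
        decoded[name] = list(getattr(feature, kind).value) if kind else []
    return decoded


def read_examples(path, limit=None):
    """Yield decoded examples from a TFRecord file, at most `limit` of them."""
    dataset = tf.data.TFRecordDataset(path)
    if limit is not None:
        dataset = dataset.take(limit)
    for raw in dataset:
        yield example_to_dict(raw.numpy())


# Usage (the path is an assumption; point it at a tfrecord file from this repo):
# for record in read_examples("path/to/train.tfrecord", limit=1):
#     for name, values in record.items():
#         print(name, type(values[0]).__name__ if values else None)
```

Once the printed types are known, the same records can be parsed with tf.io.parse_single_example and a matching feature description, as the official code linked at the end of this README does.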

For all scores, higher is better. For example, a higher overall score indicates higher overall quality, and a higher artifact score indicates fewer artifacts in the image.
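
Because a higher artifact score means fewer artifacts, the most heavily artifacted images are the ones with the lowest artifact_score. A small sketch of that ordering, assuming the examples have already been decoded into dicts with one numeric value per score field (the helper name and the sample values are made up for illustration):

```python
def most_artifacted(records, k=1):
    """Return the k records with the lowest artifact_score (most artifacts)."""
    return sorted(records, key=lambda r: r["artifact_score"])[:k]


# Made-up records for illustration only; these are not values from the dataset.
sample = [
    {"filename": "a.png", "artifact_score": 0.9},
    {"filename": "b.png", "artifact_score": 0.2},
    {"filename": "c.png", "artifact_score": 0.6},
]
worst = most_artifacted(sample, k=2)
print([r["filename"] for r in worst])
```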

For how to parse the tfrecord files and match the misalignment labels to each token in the prompt, see the code at https://github.com/google-research/google-research/tree/master/richhf_18k
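
To give a rough idea of what matching labels to tokens involves, here is a simplified sketch. It assumes the label is a space-separated string of integers with one entry per whitespace-delimited token of the prompt; that format, the tokenization, and the function name are assumptions made for illustration, so treat the official code linked above as the reference, including for which label value marks a misaligned token.

```python
def pair_tokens_with_labels(prompt, label_string):
    """Pair each whitespace-delimited prompt token with its integer label.

    Assumes (hypothetically) one space-separated label per token.
    """
    tokens = prompt.split()
    labels = [int(value) for value in label_string.split()]
    if len(tokens) != len(labels):
        raise ValueError(
            f"{len(tokens)} tokens but {len(labels)} labels; "
            "the assumed one-label-per-token format does not hold"
        )
    return list(zip(tokens, labels))


# Made-up prompt and label string for illustration only.
pairs = pair_tokens_with_labels("a red cat", "0 1 0")
```

Raising on a length mismatch, rather than silently truncating, makes it obvious when the assumed format does not match the actual data.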
