CCA-images-text

Canonical Correlation Analysis for joint representations of images and text based on [1].

Finds a common representation space for images and tags. Uses the MS COCO dataset, in particular the captions provided for the training data.

  • main/preprocess.py : computes tags for the training images and extracts image features with the VGG16 network
  • main/pca_cca.py : runs a PCA on the training data, then a CCA to find the projection matrices
  • main/image_to_tags.py : finds the corresponding tags for images in the validation data
  • main/tag_to_image.py : finds the corresponding images, using the COCO categories as the tags
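
The PCA, CCA and retrieval steps listed above can be sketched roughly as follows. This is a minimal NumPy illustration, not the code in `main/`: the helper names (`pca_fit`, `cca_fit`, `nearest`), the regularization value, the dimensions, and the cosine-similarity lookup are all assumptions, and random arrays stand in for the VGG16 features and tag vectors.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the mean of X and its top principal directions (rows = samples)."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, sorted by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def cca_fit(X, Y, n_components, reg=1e-6):
    """Return projections Wx, Wy and canonical correlations for centered X, Y."""
    n = X.shape[0]
    # Regularized covariances (reg is an assumed value that keeps them invertible).
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    # Whiten each view with its Cholesky factor, then take an SVD of the
    # whitened cross-covariance; singular values are the canonical correlations.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map the singular vectors back to the original (unwhitened) coordinates.
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return Wx, Wy, s[:n_components]

def nearest(queries, database, k=5):
    """Indices of the k database rows most cosine-similar to each query row."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    d = database / np.linalg.norm(database, axis=1, keepdims=True)
    return np.argsort(-(q @ d.T), axis=1)[:, :k]

# Synthetic stand-ins for image features and tag vectors (hypothetical data):
# both views are noisy linear functions of the same 3-dimensional latent signal.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
image_feats = latent @ rng.normal(size=(3, 20)) + 0.1 * rng.normal(size=(500, 20))
tag_feats = latent @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(500, 10))

# PCA on the image view, then center the tag view.
img_mean, img_pcs = pca_fit(image_feats, n_components=8)
X = (image_feats - img_mean) @ img_pcs
Y = tag_feats - tag_feats.mean(axis=0)

# CCA gives one projection matrix per view into the shared space.
Wx, Wy, corrs = cca_fit(X, Y, n_components=3)
img_emb = X @ Wx
tag_emb = Y @ Wy

# Image-to-tags direction: for the first 5 images, the 3 closest tag vectors.
top_tags = nearest(img_emb[:5], tag_emb, k=3)
print(corrs)
print(top_tags)
```

Swapping the arguments, `nearest(tag_emb[:5], img_emb, k=3)`, gives the tag-to-image direction. Whitening followed by an SVD is one standard way to solve CCA; the scripts in this repository may solve it differently.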

[1] Yunchao Gong, Qifa Ke, Michael Isard, Svetlana Lazebnik. A Multi-View Embedding Space for Modeling Internet Images, Tags, and their Semantics. International Journal of Computer Vision, Volume 106 Issue 2, January 2014, Pages 210-233.