Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016 with some modifications.
A dataset with approximately 800000 synthetic scene-text images generated with this code can be found here.
Segmentation and depth-maps are required to use new images as background. Sample scripts for obtaining these are available here.
- `predict_depth.m`: MATLAB script to regress a depth mask for a given RGB image; uses the network of Liu et al. However, more recent works (e.g., this) might give better results.
- `run_ucm.m` and `floodFill.py`: for obtaining segmentation masks using gPb-UCM.
For an explanation of the fields in `dset.h5` (e.g., `seg`, `area`, `label`), please check this comment.
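As a quick orientation, the sketch below builds a tiny mock `dset.h5` with the layout described in that comment and reads the fields back with `h5py`. The group names (`image`, `depth`, `seg`) and the `area`/`label` attributes reflect my reading of that layout and are an assumption, not a guaranteed schema; the data values are placeholders.

```python
import h5py
import numpy as np

# Assumed layout of dset.h5 (see the linked comment):
#   /image/<name> : RGB background image
#   /depth/<name> : per-pixel depth map
#   /seg/<name>   : per-pixel region labels, with 'area' and 'label' attributes
with h5py.File('dset_demo.h5', 'w') as f:
    name = 'sample.jpg'
    f.create_dataset(f'image/{name}', data=np.zeros((4, 4, 3), dtype=np.uint8))
    f.create_dataset(f'depth/{name}', data=np.ones((4, 4), dtype=np.float32))
    seg = f.create_dataset(f'seg/{name}',
                           data=np.array([[1, 1], [2, 2]], dtype=np.uint16))
    seg.attrs['label'] = np.array([1, 2])  # region ids present in the seg map
    seg.attrs['area'] = np.array([2, 2])   # pixel count of each region

# Read the fields back, pulling values out before the file closes.
with h5py.File('dset_demo.h5', 'r') as f:
    names = list(f['seg'].keys())
    labels = f['seg'][names[0]].attrs['label'].tolist()
    areas = f['seg'][names[0]].attrs['area'].tolist()

print(names, labels, areas)
```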
The 8,000 background images used in the paper, along with their segmentation and depth masks, have been uploaded here: `http://zeus.robots.ox.ac.uk/textspot/static/db/<filename>`, where `<filename>` can be:

- `imnames.cp` [180K]: names of filtered files, i.e., those files which do not contain text
- `bg_img.tar.gz` [8.9G]: compressed image files (more than 8,000, so use only the filtered ones listed in `imnames.cp`)
- `depth.h5` [15G]: depth maps
- `seg.h5` [6.9G]: segmentation maps
Note: I do not own the copyright to these images.
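Since `imnames.cp` is a pickle file holding the list of filtered image names, a minimal loading sketch might look like the following. The file format (a pickled list of filename strings written by Python 2's `cPickle`) is an assumption; the mock write step only stands in for the real download, and `encoding='latin1'` is the usual option for unpickling Python 2 files under Python 3.

```python
import pickle

# Mock stand-in for the real imnames.cp: a pickled list of the
# background-image names that passed the no-text filter (assumed format).
filtered = ['hiking_125.jpg', 'beach_003.jpg']
with open('imnames_demo.cp', 'wb') as f:
    pickle.dump(filtered, f)

# Loading: pass encoding='latin1' when reading the real Python 2
# pickle under Python 3.
with open('imnames_demo.cp', 'rb') as f:
    imnames = pickle.load(f, encoding='latin1')

# Only these names should be looked up in bg_img / depth.h5 / seg.h5.
print(len(imnames), imnames[0])
```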
```
wget http://www.robots.ox.ac.uk/~ankush/data.tar.gz
tar -xvf data.tar.gz
```