Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016 with some modifications.
A dataset with approximately 800000 synthetic scene-text images generated with this code can be found here.
Segmentation and depth-maps are required to use new images as background. Sample scripts for obtaining these are available here.
- `predict_depth.m`: MATLAB script to regress a depth mask for a given RGB image; uses the network of Liu et al. However, more recent works (e.g., this) might give better results.
- `run_ucm.m` and `floodFill.py`: for obtaining segmentation masks using gPb-UCM.
For an explanation of the fields in `dset.h5` (e.g., `seg`, `area`, `label`), please check this comment.
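As a quick orientation, the sketch below builds a tiny mock `dset.h5` with the layout described in that comment and reads the fields back with `h5py`. The group names (`image`, `depth`, `seg`) and the `area`/`label` attributes reflect my reading of that layout and are an assumption, not a guaranteed schema; the data values are placeholders.

```python
import h5py
import numpy as np

# Assumed layout of dset.h5 (see the linked comment):
#   /image/<name> : RGB background image
#   /depth/<name> : per-pixel depth map
#   /seg/<name>   : per-pixel region labels, with 'area' and 'label' attributes
with h5py.File('dset_demo.h5', 'w') as f:
    name = 'sample.jpg'
    f.create_dataset(f'image/{name}', data=np.zeros((4, 4, 3), dtype=np.uint8))
    f.create_dataset(f'depth/{name}', data=np.ones((4, 4), dtype=np.float32))
    seg = f.create_dataset(f'seg/{name}',
                           data=np.array([[1, 1], [2, 2]], dtype=np.uint16))
    seg.attrs['label'] = np.array([1, 2])  # region ids present in the seg map
    seg.attrs['area'] = np.array([2, 2])   # pixel count of each region

# Read the fields back, pulling values out before the file closes.
with h5py.File('dset_demo.h5', 'r') as f:
    names = list(f['seg'].keys())
    labels = f['seg'][names[0]].attrs['label'].tolist()
    areas = f['seg'][names[0]].attrs['area'].tolist()

print(names, labels, areas)
```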
The 8,000 background images used in the paper, along with their segmentation and depth masks, have been uploaded here: `http://zeus.robots.ox.ac.uk/textspot/static/db/<filename>`, where `<filename>` can be:

- `imnames.cp` [180K]: names of filtered files, i.e., those files which do not contain text
- `bg_img.tar.gz` [8.9G]: compressed image files (more than 8,000, so use only the filtered ones listed in `imnames.cp`)
- `depth.h5` [15G]: depth maps
- `seg.h5` [6.9G]: segmentation maps
Note: I do not own the copyright to these images.
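Since `imnames.cp` is a pickle file holding the list of filtered image names, a minimal loading sketch might look like the following. The file format (a pickled list of filename strings written by Python 2's `cPickle`) is an assumption; the mock write step only stands in for the real download, and `encoding='latin1'` is the usual option for unpickling Python 2 files under Python 3.

```python
import pickle

# Mock stand-in for the real imnames.cp: a pickled list of the
# background-image names that passed the no-text filter (assumed format).
filtered = ['hiking_125.jpg', 'beach_003.jpg']
with open('imnames_demo.cp', 'wb') as f:
    pickle.dump(filtered, f)

# Loading: pass encoding='latin1' when reading the real Python 2
# pickle under Python 3.
with open('imnames_demo.cp', 'rb') as f:
    imnames = pickle.load(f, encoding='latin1')

# Only these names should be looked up in bg_img / depth.h5 / seg.h5.
print(len(imnames), imnames[0])
```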
```
wget http://www.robots.ox.ac.uk/~ankush/data.tar.gz
tar -xvf data.tar.gz
```