SPIN

1 Introduction

This repository contains an implementation of SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition (AAAI 2021). Chromatic difficulties in complex scenes have received little attention. We introduce a new learnable geometric-unrelated module, the Structure-Preserving Inner Offset Network (SPIN), which allows the color manipulation of source data within the network.

2 Preparing Dataset

Train Dataset

Dataset Samples Description Release
MJSynth 8919257 Scene text recognition synthetic data set Link
SynText 7266164 Synthetic scene text dataset; the text instances are cropped from the larger images Link

Validation Dataset

Test Set Instance Number Note
IIIT5K 3000 regular
SVT 647 regular
IC03_860 860 regular
IC13_857 857 regular
IC15_1811 1811 irregular
SVTP 645 irregular
CUTE80 288 irregular

Test Dataset

Test Set Instance Number Note
IIIT5K 3000 regular
SVT 647 regular
IC03_860 860 regular
IC13_857 857 regular
IC15_1811 1811 irregular
SVTP 645 irregular
CUTE80 288 irregular

3 Getting Started

Preparation

For a quick start, use the LMDB-formatted datasets listed above, which contain the full benchmarks for scene text recognition tasks, organized as follows.

Data Type: LMDB

File storage format:
   |-- train           
   |   |-- MJ
   |   |-- ST
   |-- validation
   |   |-- mixture
   |-- evaluation
   |   |-- mixture
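
Before training, it can help to confirm that the data directory matches the tree above. The following is a minimal sketch, not part of the repository: the function name `check_layout` and the data-root argument are assumptions made for illustration, while the sub-paths are taken from the tree.

```shell
# Report which of the expected LMDB directories exist under a data root.
# check_layout and its argument are hypothetical; the sub-paths mirror the tree above.
check_layout() (
  root="$1"
  missing=0
  for sub in train/MJ train/ST validation/mixture evaluation/mixture; do
    if [ -d "$root/$sub" ]; then
      echo "ok       $sub"
    else
      echo "missing  $sub"
      missing=$((missing + 1))
    fi
  done
  # The exit status is non-zero when at least one directory is absent.
  [ "$missing" -eq 0 ]
)
```

For example, `check_layout /path/to/data` prints one line per expected directory and returns a non-zero status if any of them is missing.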

Training

Run the following commands in a terminal:

cd .
bash ./train.sh 

We provide an implementation of online validation. To turn it off and save training time, modify the startup script to add the --no-validate option.

Evaluation

Run the following scripts to compare the different rectification modules:

cd .
bash ./test_scripts/test_affine.sh
cd .
bash ./test_scripts/test_tps.sh
cd .
bash ./test_scripts/test_spin.sh
cd .
bash ./test_scripts/test_gaspin.sh
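
To compare all four modules in one pass, the scripts above can be wrapped in a loop. This is a sketch under stated assumptions: the function name `run_all_evals`, the `./eval_logs` log directory, and the per-module log files are hypothetical conveniences; only the script names come from the commands above.

```shell
# Run each rectification-module test script in turn and keep one log per module.
# run_all_evals, its arguments, and the log directory are assumed names.
run_all_evals() (
  script_dir="${1:-./test_scripts}"
  log_dir="${2:-./eval_logs}"
  mkdir -p "$log_dir"
  failed=0
  for module in affine tps spin gaspin; do
    script="$script_dir/test_$module.sh"
    if [ ! -f "$script" ]; then
      echo "skip $module: $script not found"
      failed=$((failed + 1))
      continue
    fi
    # Capture stdout and stderr so each module's output can be inspected later.
    if bash "$script" > "$log_dir/$module.log" 2>&1; then
      echo "done $module"
    else
      echo "fail $module (see $log_dir/$module.log)"
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every script was found and exited cleanly.
  [ "$failed" -eq 0 ]
)
```

Calling `run_all_evals` with no arguments from the repository root uses `./test_scripts` and writes logs to `./eval_logs`.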

4 Results

Evaluation

Methods            Regular Text                   Irregular Text          Download
Name               IIIT5K   SVT    IC03   IC13    IC15   SVTP   CUTE80    Config   Model
Affine(Report)     -        -      -      -       -      -      -         -        -
Affine             94.0     87.9   93.6   94.4    81.2   80.9   83.0      Config   pth [Link] (Access Code: 5nr1)
TPS(Report)        87.9     87.5   94.9   93.6    77.6   79.2   74.0      -        -
TPS                94.2     90.4   94.5   95.0    82.1   82.6   83.7      Config   pth [Link] (Access Code: 024F)
SPIN(Report)       94.7     87.6   93.4   91.5    79.1   79.7   85.1      -        -
SPIN               94.7     89.8   93.4   94.1    80.7   82.2   83.7      Config   pth [Link] (Access Code: M45z)
GA-SPIN(Report)    94.7     90.3   94.4   92.8    82.2   82.8   87.5      -        -
GA-SPIN            94.6     89.0   93.3   94.2    80.7   83.0   84.7      Config   pth [Link] (Access Code: 12q3)
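
The table reports one accuracy per test set. If a single summary figure is wanted, one option is an instance-weighted mean over the seven sets, using the instance numbers from the dataset tables. The sketch below is an assumption about how one might aggregate the scores; it is not a metric reported by this repository, and `weighted_accuracy` is a hypothetical helper name.

```shell
# Instance-weighted mean accuracy over several test sets.
# Input on stdin: one line per set, "<name> <instances> <accuracy>".
# weighted_accuracy is a hypothetical helper, not part of the repository.
weighted_accuracy() {
  awk '{ n += $2; s += $2 * $3 } END { if (n > 0) printf "%.2f\n", s / n }'
}

# Usage with the SPIN row of the table and the instance numbers listed earlier.
weighted_accuracy <<'EOF'
IIIT5K    3000 94.7
SVT        647 89.8
IC03_860   860 93.4
IC13_857   857 94.1
IC15_1811 1811 80.7
SVTP       645 82.2
CUTE80     288 83.7
EOF
```

Weighting by instance count keeps small sets such as CUTE80 from influencing the summary as much as IIIT5K; an unweighted mean would treat all seven sets equally.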

Visualization

The figures below visualize recognition results and bad cases.

[Figure: visualization]

[Figure: bad_cases]

Citation

@article{SPIN,
  title={SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition},
  author={Chengwei Zhang and Yunlu Xu and Zhanzhan Cheng and Shiliang Pu and Yi Niu and Fei Wu},
  journal={AAAI},
  year={2021}
}

License

This project is released under the Apache 2.0 license.

Contact

If you have any suggestions or problems, please feel free to contact the author at [email protected].