
Current Experiments on the Chinese/English dataset

Jetic Gu edited this page Jul 16, 2017 · 2 revisions


0. Machine

CPU: 3.5 GHz 6-core Intel Xeon E5 with 12 MB L3 cache (Turbo Boost up to 3.9 GHz)

Memory: 32 GB 1866 MHz DDR3 ECC

1. Experiments on Chinese/English datasets

Data size: 20k sentences for training, 2k sentences for testing.

Source words/tags: [48549, 33]

Target words/tags: [25535, 54]

IBM1

Time: 197 sec (Python), 133 sec (Cython)
Decode: 898/1243 (Python/Cython) sentences per sec
Precision = 0.508552336309
Recall    = 0.433699072893
AER       = 0.531847499254
F-score   = 0.468152500746
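
The IBM1 timings above cover EM training over the 20k-sentence bitext. As a reference point, the standard IBM Model 1 EM loop (a minimal sketch of the textbook algorithm, not necessarily this repository's implementation; the `train_ibm1` name and the `"NULL"` token convention are assumptions) looks like:

```python
from collections import defaultdict

def train_ibm1(bitext, iterations=10):
    """Train IBM Model 1 translation probabilities t(f | e) with EM.

    bitext: list of (source_tokens, target_tokens) pairs; "NULL" is
    prepended to the source side so target words can stay unaligned.
    """
    # Uniform initialisation over the source vocabulary.
    src_vocab = {w for src, _ in bitext for w in src} | {"NULL"}
    t = defaultdict(lambda: 1.0 / len(src_vocab))

    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f, e)
        total = defaultdict(float)   # marginals c(e)
        # E-step: collect expected alignment counts
        for src, tgt in bitext:
            src = ["NULL"] + src
            for f in tgt:
                z = sum(t[(f, e)] for e in src)  # normaliser
                for e in src:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: re-estimate t(f | e)
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t
```

Each EM iteration is O(sentences × source length × target length), which is why the Cython port pays off at this corpus size.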

IBM1 With Alignment Type

Stage1:  580 sec (Python),  408 sec (Cython)
Stage2: 1156 sec (Python),  875 sec (Cython)
Total:  1736 sec (Python), 1283 sec (Cython)
Decode: 138/177 (Python/Cython) sentences per sec
Precision = 0.589088914377
Recall    = 0.502381559922
AER       = 0.457708816306
F-score   = 0.542291183694

HMM

Stage1: 204 sec (Python), 118 sec (Cython)
Stage2: 377 sec (Python), 292 sec (Cython)
Total:  581 sec (Python), 410 sec (Cython)
Decode: 193/210 (Python/Cython) sentences per sec
Precision = 0.69972164231
Recall    = 0.513141107425
AER       = 0.407919917562
F-score   = 0.592080082438
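
The HMM decode rates above come from Viterbi search over source positions. A minimal sketch of Viterbi decoding for a jump-width HMM alignment model in the style of Vogel et al. (1996) — the `viterbi_align` name, the uniform initial distribution, and the probability floor are simplifying assumptions, and the repository's actual model may differ:

```python
import math

def viterbi_align(src, tgt, t, jump_prob, floor=1e-12):
    """Viterbi decoding for a jump-width HMM alignment model.

    t[(f, e)]    : translation probability t(f | e)
    jump_prob[d] : probability of a jump of d source positions between
                   the alignments of consecutive target words
    Returns the source index aligned to each target position.
    """
    I = len(src)
    logt = lambda f, e: math.log(t.get((f, e), floor))
    logj = lambda d: math.log(jump_prob.get(d, floor))

    # delta[j][i]: best log-prob of tgt[:j+1] with tgt[j] aligned to src[i]
    delta = [[0.0] * I for _ in tgt]
    back = [[0] * I for _ in tgt]
    for i in range(I):
        # uniform initial alignment distribution (simplifying assumption)
        delta[0][i] = -math.log(I) + logt(tgt[0], src[i])
    for j in range(1, len(tgt)):
        for i in range(I):
            score, k_best = max(
                (delta[j - 1][k] + logj(i - k), k) for k in range(I))
            delta[j][i] = score + logt(tgt[j], src[i])
            back[j][i] = k_best
    # follow back-pointers from the best final state
    a = [max(range(I), key=lambda i: delta[-1][i])]
    for j in range(len(tgt) - 1, 0, -1):
        a.append(back[j][a[-1]])
    return a[::-1]
```

The inner `max` over all previous positions makes decoding quadratic in source length per target word, which explains why the HMM decodes fewer sentences per second than IBM1.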

HMM With Alignment Type

Stage1.1:  150 sec (Python),   75 sec (Cython)
Stage1.2:  687 sec (Python),  503 sec (Cython)
Stage2.1:  200 sec (Python),  120 sec (Cython)
Stage2.2: 1202 sec (Python),  917 sec (Cython)
Total:    2239 sec (Python), 1615 sec (Cython)
Decode: 87/106 (Python/Cython) sentences per sec
Precision = 0.72647484167
Recall    = 0.619545802501
AER       = 0.331236945394
F-score   = 0.668763054606
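
Every block above reports Precision, Recall, AER, and F-score, and AER = 1 - F-score throughout, which is what the Och & Ney (2003) metrics reduce to when the sure and possible link sets coincide. A sketch of how these figures are computed (the `alignment_metrics` helper is hypothetical, not this repository's code):

```python
def alignment_metrics(predicted, sure, possible=None):
    """Alignment quality metrics in the style of Och & Ney (2003).

    predicted, sure, possible: sets of (src_index, tgt_index) links.
    possible is taken to contain sure; when no separate possible set
    is given, P = S and AER reduces to 1 - F-score, as above.
    """
    possible = sure if possible is None else possible | sure
    precision = len(predicted & possible) / len(predicted)
    recall = len(predicted & sure) / len(sure)
    aer = 1.0 - (len(predicted & sure) + len(predicted & possible)) \
              / (len(predicted) + len(sure))
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, aer, f_score
```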

2. Experiments with Toutanova (2002) models on Chinese/English datasets

The following experiments used the exact same dataset as above.


Toutanova1: POS Tags for translation probability

Stage1: 125 sec (Cython)
Stage2:  76 sec (Cython)
Stage3: 442 sec (Cython)
Total:  643 sec (Cython)
Decode: 194 sentences per sec
Precision = 0.668445331006
Recall    = 0.570043378413
AER       = 0.384664822742
F-score   = 0.615335177258

Toutanova2: POS Tags for translation probability

Stage1:  126 sec (Cython)
Stage2:  941 sec (Cython)
Total:  1067 sec (Cython)
Decode: 145 sentences per sec
Precision = 0.679543459175
Recall    = 0.510206685379
AER       = 0.417175753307
F-score   = 0.582824246693

Toutanova3: POS Tags for translation probability

Stage1: 127 sec (Cython)
Stage2: 393 sec (Cython)
Total:  520 sec (Cython)
Decode: 187 sentences per sec
Precision = 0.700871781504
Recall    = 0.514565790593
AER       = 0.406559990191
F-score   = 0.593440009809

NULL Extension in Translation Model

Stage1: 130 sec (Cython)
Stage2: 356 sec (Cython)
Total:  486 sec (Cython)
Decode: 167 sentences per sec
Precision = 0.732126891027
Recall    = 0.435293867483
AER       = 0.454026590567
F-score   = 0.545973409433