I want to compare the performance of several g2p systems, so I downloaded the CPP dataset and tried to reproduce the results shown in this repo. However, I got much lower accuracy.
For g2pM v0.1.2.5, I got 92.9% on the train set, 92.1% on the dev set, and 91.6% on the test set. Even when ignoring tone information, the accuracies are 96.6%, 96.1%, and 96.0% on the train, dev, and test sets.
For pypinyin v0.36.0, I got 79.2%, 78.7%, and 79.1% with tone, and 89.4%, 89.1%, and 89.3% without tone (train, dev, and test sets respectively).
To be clearer, this is what I did:
The full sentence was fed to each system to get the pinyin result.
The prediction was then extracted with `re.findall(r'▁ ([a-z0-9:]+) ▁', pinyin)[0]`.
Finally, the accuracy was computed from `np.array([i == j for i, j in zip(pred, gt)])`.
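For reference, here is a minimal, self-contained sketch of the evaluation steps above. It is not the exact script I ran. The helper names (`extract_target`, `strip_tone`, `accuracy`) and the two sample strings are made up for illustration, and it assumes "without tone" simply means dropping the digits from each syllable. It uses plain Python instead of NumPy so it runs without extra dependencies.

```python
import re

# Pattern quoted from the steps above: the target syllable sits
# between two "▁" markers in the system's pinyin output.
TARGET = re.compile(r'▁ ([a-z0-9:]+) ▁')


def extract_target(pinyin: str) -> str:
    """Return the first syllable wrapped in '▁ ... ▁' markers."""
    matches = TARGET.findall(pinyin)
    if not matches:
        raise ValueError(f"no marked syllable found in: {pinyin!r}")
    return matches[0]


def strip_tone(syllable: str) -> str:
    """Drop tone digits (assumption: tones are encoded as digits)."""
    return re.sub(r'\d', '', syllable)


def accuracy(pred, gt, with_tone=True):
    """Fraction of positions where prediction and ground truth agree."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    if not with_tone:
        pred = [strip_tone(p) for p in pred]
        gt = [strip_tone(g) for g in gt]
    return sum(p == g for p, g in zip(pred, gt)) / len(gt)


# Hypothetical system outputs and labels, for illustration only.
outputs = ["yin2 ▁ hang2 ▁ kai1 men2", "ji1 ▁ zhong4 ▁ mu4 biao1"]
gt = ["hang2", "zhong1"]

pred = [extract_target(o) for o in outputs]
print(accuracy(pred, gt))                   # accuracy with tone
print(accuracy(pred, gt, with_tone=False))  # accuracy without tone
```

One thing this makes explicit: taking `[0]` silently assumes every output contains exactly one marked syllable, so a system that drops or rewrites the `▁` markers would either raise or pick the wrong token.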
I'd like to know how you computed the accuracy values.
Attached are the predictions for the test set.
If there is any mistake in my computation, please point it out. Thanks,