Label length in prediction #18

Open
Venky-JING opened this issue May 10, 2020 · 1 comment

Comments

@Venky-JING

Hello!
Sorry to bother you, but I have a question about the label length when predicting on the test data.

The number of predicted tags for each sentence seems to equal the length of the longest sentence in its batch, but what I want is one predicted tag per token, matching the actual length of each sentence.

Here is a comparison of the real label length with the label lengths predicted by openkiwi and qebrain:
real:32 openkiwi:32 qebrain:32
real:27 openkiwi:27 qebrain:32
real:18 openkiwi:18 qebrain:32
real:22 openkiwi:22 qebrain:32
real:14 openkiwi:14 qebrain:32
real:14 openkiwi:14 qebrain:32
real:22 openkiwi:22 qebrain:32
real:22 openkiwi:22 qebrain:32
real:9 openkiwi:9 qebrain:32
real:11 openkiwi:11 qebrain:32
real:20 openkiwi:20 qebrain:32
real:25 openkiwi:25 qebrain:32
real:22 openkiwi:22 qebrain:32
real:13 openkiwi:13 qebrain:32
real:20 openkiwi:20 qebrain:32
real:22 openkiwi:22 qebrain:32
real:19 openkiwi:19 qebrain:58
real:16 openkiwi:16 qebrain:58
real:12 openkiwi:12 qebrain:58
real:12 openkiwi:12 qebrain:58
real:25 openkiwi:25 qebrain:58
real:32 openkiwi:32 qebrain:58
real:17 openkiwi:17 qebrain:58
real:30 openkiwi:30 qebrain:58
real:25 openkiwi:25 qebrain:58
real:26 openkiwi:26 qebrain:58
real:58 openkiwi:58 qebrain:58
real:18 openkiwi:18 qebrain:58
real:24 openkiwi:24 qebrain:58
real:19 openkiwi:19 qebrain:58
real:29 openkiwi:29 qebrain:58
real:13 openkiwi:13 qebrain:58
real:8 openkiwi:8 qebrain:43
real:15 openkiwi:15 qebrain:43
real:19 openkiwi:19 qebrain:43
real:18 openkiwi:18 qebrain:43
real:20 openkiwi:20 qebrain:43
real:43 openkiwi:43 qebrain:43
real:17 openkiwi:17 qebrain:43
real:18 openkiwi:18 qebrain:43
real:15 openkiwi:15 qebrain:43
real:14 openkiwi:14 qebrain:43
real:19 openkiwi:19 qebrain:43
real:28 openkiwi:28 qebrain:43
real:23 openkiwi:23 qebrain:43
real:16 openkiwi:16 qebrain:43
real:12 openkiwi:12 qebrain:43
real:14 openkiwi:14 qebrain:43

How can I deal with this problem? Thank you very much.

@lovecambi
Owner

Because the training is batched, short sentences have to be padded so that all sentences in the current batch have the same length. You need to do some post-processing to remove the padded positions.
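As a rough illustration of that post-processing, here is a minimal sketch in plain Python. It is not code from qebrain: the function name `strip_padding` and the example tags are hypothetical, and it assumes the padding is appended at the end of each sentence and that you can recover each sentence's real length yourself (for example by counting the tokens in the original test file).

```python
from typing import List, Sequence


def strip_padding(
    batch_preds: Sequence[Sequence[str]],
    lengths: Sequence[int],
) -> List[List[str]]:
    """Trim each padded prediction row to its sentence's real length.

    Assumes padding was appended at the END of each row (right-padding).
    """
    if len(batch_preds) != len(lengths):
        raise ValueError("need exactly one length per prediction row")
    trimmed = []
    for preds, n in zip(batch_preds, lengths):
        if n > len(preds):
            raise ValueError(f"length {n} exceeds {len(preds)} predictions")
        # Keep only the first n tags; the rest belong to padded positions.
        trimmed.append(list(preds[:n]))
    return trimmed


# Hypothetical batch: two sentences padded to the batch maximum of 4.
# The last two tags of the first row sit on padded positions.
padded = [["OK", "BAD", "OK", "OK"], ["OK", "OK", "BAD", "OK"]]
real_lengths = [2, 4]  # e.g. token counts of the original test sentences
print(strip_padding(padded, real_lengths))
# [['OK', 'BAD'], ['OK', 'OK', 'BAD', 'OK']]
```

If the model pads at the start of each sentence instead, the slice would need to be `preds[-n:]`, so it is worth checking which side is padded before relying on this.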
