Training on texts with different lengths #292

Magdiel3 · 2020-05-25T20:12:21Z

How should I handle variation in text length (the number of words on each line of the training file)? Is it fine to train with these differences, or should I normalize the text lengths first?

I am working on matching words to the text that best fits them (e.g. relating the word "electronics" to texts that mention or are about that topic). I am training with trainMode 0, using each text as the data and the name of its source as the label. Text lengths range from 1 to 700 words (median of 74 words, standard deviation of 96 words).
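For reference, length figures like the ones above can be reproduced with a rough sketch along these lines. It assumes one example per line and counts whitespace-separated tokens, so any label token on the line is counted too unless it is stripped first. The function name `length_stats` and the sample lines are made up for illustration.

```python
import statistics


def length_stats(lines):
    """Return (min, max, median, population std) of per-line word counts.

    Assumption: one training example per line, tokens split on whitespace.
    Blank lines are skipped.
    """
    counts = [len(line.split()) for line in lines if line.strip()]
    return (
        min(counts),
        max(counts),
        statistics.median(counts),
        statistics.pstdev(counts),
    )


# Hypothetical sample; in practice the lines would come from the training
# file, e.g. `with open(path, encoding="utf-8") as f: length_stats(f)`.
sample = ["electronics", "a short text about circuits", "one two three"]
print(length_stats(sample))
```

Running this on the real training file gives the spread to report when asking whether the length differences matter.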
