About statistics of News14 #1

kimwongyuda · 2023-06-01T02:35:31Z

Thank you for your nice work.

I computed the number of documents and the number of stories using the dataset from https://github.com/Priberam/news-clustering.

The number of documents matches the paper: 16136.
However, the number of stories differs from the paper: 733 vs 788.
I counted the unique values of the "cluster" key in the dataset.

Also, since the dev set and test set in the dataset are time-independent, when the same cluster value appears in both the dev set and the test set, I count it as two distinct stories.

But only 7 cluster values overlap between the two sets, so this does not explain the gap.
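For reference, the counting described above can be sketched roughly as follows. This is only a minimal sketch: it assumes each split has already been loaded into a list of dicts that carry a "cluster" key, and the function name `count_stories` and the toy records are hypothetical, not taken from the repository or the paper.

```python
from typing import Any, Mapping, Sequence


def count_stories(
    dev_docs: Sequence[Mapping[str, Any]],
    test_docs: Sequence[Mapping[str, Any]],
) -> dict:
    """Count documents and stories across two splits (hypothetical helper)."""
    # Unique "cluster" values within each split.
    dev_clusters = {doc["cluster"] for doc in dev_docs}
    test_clusters = {doc["cluster"] for doc in test_docs}

    return {
        "documents": len(dev_docs) + len(test_docs),
        # A cluster value present in both splits is counted twice here.
        "stories_split_aware": len(dev_clusters) + len(test_clusters),
        # A cluster value present in both splits is counted once here.
        "stories_pooled": len(dev_clusters | test_clusters),
        # Number of cluster values shared by the two splits.
        "overlap": len(dev_clusters & test_clusters),
    }


# Toy data for illustration only (not real dataset records).
dev = [{"cluster": 1}, {"cluster": 1}, {"cluster": 2}]
test = [{"cluster": 2}, {"cluster": 3}]
stats = count_stories(dev, test)
print(stats)
```

The split-aware count and the pooled count differ by exactly the overlap, so comparing the two shows how much of a discrepancy the overlapping cluster values can account for.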

Could you tell me how you counted the number of stories?
