Third version of Train Triples QID PID Format that mimics triples.train.full.tsv.gz #21

Open
seanmacavaney opened this issue Jan 21, 2021 · 2 comments


@seanmacavaney
Contributor

Per discussion here: 4695a71

The current version of qidpidtriples.train.full.2.tsv.gz has the same records as triples.train.full.tsv.gz, but they are in a different order.

It would be nice for these to be consistent so that those using these files as the training data sequence can control for the order of training in experiments.
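For anyone wanting to check this themselves, here is a minimal sketch of comparing two triple sequences for "same records, different order". It assumes each line is a tab-separated `qid`, positive `pid`, negative `pid` record; the helper names `read_id_triples` and `compare_order` are hypothetical and not part of any MS MARCO tooling. Since triples.train.full.tsv.gz holds text rather than ids, its rows would first have to be mapped to ids (not shown here) before a comparison like this applies.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]


def read_id_triples(lines: Iterable[str]) -> List[Triple]:
    """Parse lines assumed to be 'qid<TAB>pos_pid<TAB>neg_pid'."""
    triples: List[Triple] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        qid, pos_pid, neg_pid = line.split("\t")
        triples.append((qid, pos_pid, neg_pid))
    return triples


def compare_order(a: List[Triple], b: List[Triple]) -> Dict[str, object]:
    """Report whether two triple lists share records and whether order matches."""
    # Counter compares the records as multisets, so duplicates are respected.
    same_records = Counter(a) == Counter(b)
    same_order = a == b
    # Index of the first position where the two sequences disagree, if any.
    first_mismatch: Optional[int] = next(
        (i for i, (x, y) in enumerate(zip(a, b)) if x != y), None
    )
    if first_mismatch is None and len(a) != len(b):
        # One list is a strict prefix of the other.
        first_mismatch = min(len(a), len(b))
    return {
        "same_records": same_records,
        "same_order": same_order,
        "first_mismatch": first_mismatch,
    }


if __name__ == "__main__":
    old = read_id_triples(["1\t10\t20\n", "2\t11\t21\n"])
    new = read_id_triples(["2\t11\t21\n", "1\t10\t20\n"])
    print(compare_order(old, new))
```

On real files the lines could come from `gzip.open(path, "rt")`. Note that `Counter` holds every record in memory, so for files of this size a streaming or sorted comparison may be more practical.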

@seanmacavaney
Contributor Author

fwiw it appears that the version of the qid/pid triples file prior to 4695a71 did have the triples in the same order as triples.train.full.tsv.gz (but some records were missing, which is what the change was about).

@seanmacavaney
Contributor Author

I think there are compelling reasons to have the qidpidtriples file in the same order as the triples file. But I also understand that this may seem somewhat pedantic and may not be seen as a priority.

If I built this file for you, would you host it?
