Unicode in data #209

Open
timechess opened this issue Oct 3, 2024 · 1 comment

Comments

@timechess

I noticed that all the Unicode characters in the dataset are ASCII-escaped. It seems you didn't set ensure_ascii=False. I wonder whether this affects the performance of premise selection. After all, in practical applications the Unicode characters in the proof state are not ASCII-escaped.
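For illustration, here is a minimal sketch of the difference, assuming the dataset was written with Python's standard `json` module (the `state` string is a made-up example, not taken from the dataset):

```python
import json

# Hypothetical proof-state string containing non-ASCII symbols.
state = "∀ x ∈ s, f x ≤ g x"

# Default behavior (ensure_ascii=True): non-ASCII characters are
# written as \uXXXX escape sequences.
escaped = json.dumps({"state": state})

# With ensure_ascii=False the characters are written as-is.
readable = json.dumps({"state": state}, ensure_ascii=False)

print(escaped)
print(readable)

# Both forms decode to the same Python string, so the escaping only
# matters if the raw file text is consumed without JSON decoding.
assert json.loads(escaped) == json.loads(readable) == {"state": state}
```

If the files are always read back through a JSON parser, the two encodings should be equivalent; a difference would only be expected if the escaped text reaches the model directly.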

@yangky11
Member

yangky11 commented Oct 4, 2024

Thanks for noticing that. I'll try to re-run the experiments to see if the encoding makes a difference when I get a chance, though it's currently not our priority.
