Unicode in data #209

timechess · 2024-10-03T13:26:38Z

I noticed that all the Unicode characters in the dataset are ASCII-escaped. It seems you didn't set ensure_ascii=False. I wonder whether this affects the performance of premise selection. After all, in practical applications, the Unicode characters in the proof state are not escaped.
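For readers unfamiliar with the flag, here is a minimal sketch of the difference using Python's standard `json` module. The `state` dictionary and its goal string are made up for illustration and are not taken from the dataset.

```python
import json

# Hypothetical proof-state record; the goal text is illustrative only.
state = {"goal": "∀ x ∈ S, x ≤ y"}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(state)

# With ensure_ascii=False the characters are written out as-is.
raw = json.dumps(state, ensure_ascii=False)

print(escaped)  # {"goal": "\u2200 x \u2208 S, x \u2264 y"}
print(raw)      # {"goal": "∀ x ∈ S, x ≤ y"}

# Both forms decode to the same Python string once parsed.
assert json.loads(escaped) == json.loads(raw) == state
```

Since both forms decode to the same string, the escaping should only matter if the serialized text reaches the model without first being parsed as JSON. Which of the two happens in the training pipeline is not stated in this thread.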


yangky11 · 2024-10-04T15:26:39Z

Thanks for noticing that. I'll try to re-run the experiments to see if the encoding makes a difference when I get a chance, though it's currently not our priority.
