In mini_pubtabnet_examples.jsonl, the structure tokens contain '[' and ']', which differs from the original PubTabNet annotations. Could you share the Python script you used for this data processing?
This issue is the same as #30. UniTable preprocesses the original PubTabNet dataset, using <td>[]</td> to represent a non-empty cell and <td></td> to represent an empty cell. The processed format is similar to mini_pubtabnet_examples.jsonl. Once you are familiar with the data format, this preprocessing should not be difficult to implement.
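For anyone landing here, below is a minimal sketch of what such a preprocessing step might look like. It is not the maintainers' script. It assumes the PubTabNet annotation layout in which each record has `html.structure.tokens` and an `html.cells` list with one entry per cell in reading order, where an empty cell has an empty `tokens` list. The function name `mark_cells` is hypothetical, and the handling of spanning cells (merging `>` and `</td>` into `>[]</td>` or `></td>`) is an assumption that this thread does not confirm.

```python
def mark_cells(structure_tokens, cells):
    """Merge each cell's closing token pair and mark non-empty cells with '[]'.

    structure_tokens: PubTabNet-style structure tokens, e.g. ["<tr>", "<td>", "</td>", ...]
    cells: one dict per cell, in order; an empty cell is assumed to have "tokens": []
    """
    out = []
    cell_idx = 0
    i = 0
    n = len(structure_tokens)
    while i < n:
        tok = structure_tokens[i]
        nxt = structure_tokens[i + 1] if i + 1 < n else None
        if tok == "<td>" and nxt == "</td>":
            # Plain cell: "<td>", "</td>" becomes one merged token.
            filled = bool(cells[cell_idx]["tokens"])
            out.append("<td>[]</td>" if filled else "<td></td>")
            cell_idx += 1
            i += 2
        elif tok == ">" and nxt == "</td>":
            # Spanning cell: "<td", ' colspan="2"', ">", "</td>".
            # Assumption: only the trailing ">", "</td>" pair is merged.
            filled = bool(cells[cell_idx]["tokens"])
            out.append(">[]</td>" if filled else "></td>")
            cell_idx += 1
            i += 2
        else:
            # Everything else (<tr>, <thead>, "<td", span attributes) passes through.
            out.append(tok)
            i += 1
    return out


# Hand-made record in the assumed layout, not taken from the dataset.
sample = {
    "html": {
        "structure": {"tokens": ["<tr>", "<td>", "</td>", "<td>", "</td>", "</tr>"]},
        "cells": [{"tokens": ["a"], "bbox": [0, 0, 1, 1]}, {"tokens": []}],
    }
}
print(mark_cells(sample["html"]["structure"]["tokens"], sample["html"]["cells"]))
# ['<tr>', '<td>[]</td>', '<td></td>', '</tr>']
```

To process the full annotation file, you would read it line by line with `json.loads`, apply the function to each record, and write the result back out with `json.dumps`. Compare the output against mini_pubtabnet_examples.jsonl before training, since the exact token vocabulary expected by the model may differ from this sketch.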