unitable uses two different models to predict html tokens and bbox tokens separately. I want to know if you have tested a model that outputs html+bbox tokens at the same time, with the data structure as follows:
html only output: ["<td>", "</td>"]
bbox only output: ["bbox-1", "bbox-2", "bbox-3", "bbox-4"]
Hi @Sanster, thanks for the question! There are two main reasons why UniTable does not combine html+bbox right now:
1. UniTable v1.0 is based on a vanilla transformer, so a longer input length significantly increases memory consumption and reduces the training batch size.
2. The html branch and the bbox branch are independent of each other, so in real-world production we can run them in parallel; if they were merged into one output sequence, inference would have to run sequentially.
Btw, thanks for the PR. I will check that later in the fall. 😊
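To make the two reasons above concrete, here is a minimal sketch. It assumes two stand-in functions, `predict_html` and `predict_bbox`, which are hypothetical placeholders and not UniTable's API, and it uses the size of the self-attention score matrix as a rough proxy for memory. Real memory use depends on many other factors, so treat the numbers as illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor


def predict_html(image):
    """Hypothetical stand-in for the html branch (not UniTable's API)."""
    return ["<td>", "</td>", "<td></td>"]


def predict_bbox(image):
    """Hypothetical stand-in for the bbox branch (not UniTable's API)."""
    return ["bbox-1", "bbox-2", "bbox-3", "bbox-4"]


def infer_in_parallel(image):
    # The branches do not depend on each other, so both can be submitted at once.
    with ThreadPoolExecutor(max_workers=2) as pool:
        html_future = pool.submit(predict_html, image)
        bbox_future = pool.submit(predict_bbox, image)
        return html_future.result(), bbox_future.result()


def attention_entries(seq_len):
    # Self-attention scores form a seq_len x seq_len matrix (per head, per layer).
    return seq_len * seq_len


html_tokens, bbox_tokens = infer_in_parallel(image=None)

# Two separate sequences versus one merged sequence of the same tokens.
separate = attention_entries(len(html_tokens)) + attention_entries(len(bbox_tokens))
merged = attention_entries(len(html_tokens) + len(bbox_tokens))
print(separate, merged)
```

In a real deployment the two branches might run on separate devices or processes rather than threads; the thread pool here only illustrates that neither call has to wait for the other.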
@ShengYun-Peng, regarding point 2: if the number of detected bboxes and the number of detected [] are not aligned (for example, when a bbox is missing for a cell in the html branch), the function build_table_from_html_and_cell cannot match each cell with the correct content when building the html.
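The mismatch described above could be detected before the table is built. The sketch below is an assumption-laden illustration: `check_alignment` is a hypothetical helper, not part of UniTable, and it assumes that a `"<td>"` token opens a cell that expects one bbox while `"<td></td>"` marks an empty cell without one.

```python
def count_content_cells(html_tokens, cell_open="<td>"):
    # Assumption: only cells opened with "<td>" carry content and need a bbox.
    return sum(1 for token in html_tokens if token == cell_open)


def check_alignment(html_tokens, bboxes):
    """Return (is_aligned, n_cells, n_bboxes) for the two branch outputs."""
    n_cells = count_content_cells(html_tokens)
    n_bboxes = len(bboxes)
    return n_cells == n_bboxes, n_cells, n_bboxes


# Two content cells and one empty cell, but only one detected bbox.
html_tokens = ["<td>", "</td>", "<td>", "</td>", "<td></td>"]
bboxes = [(0, 0, 10, 10)]

aligned, n_cells, n_bboxes = check_alignment(html_tokens, bboxes)
if not aligned:
    print(f"mismatch: {n_cells} content cells vs {n_bboxes} bboxes")
```

A check like this only reports that the counts differ; it cannot tell which cell lost its bbox, which is the part a merged html+bbox sequence would make explicit.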
unitable uses two different models to predict html tokens and bbox tokens separately. I want to know if you have tested a model that outputs html+bbox tokens at the same time, with the data structure as follows:
html only output: ["<td>", "</td>"]
bbox only output: ["bbox-1", "bbox-2", "bbox-3", "bbox-4"]
html+bbox output: ["<td>", "bbox-1", "bbox-2", "bbox-3", "bbox-4", "</td>", "<td></td>"]
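As an illustration of the combined data structure above, the following sketch interleaves the two token lists and splits them back apart. `merge_tokens` and `split_tokens` are hypothetical helpers written for this example, and the rule that each `"<td>"` is followed by exactly four bbox tokens is an assumption taken from the sample sequence, not a confirmed UniTable convention.

```python
def merge_tokens(html_tokens, bbox_tokens, per_cell=4):
    """Insert `per_cell` bbox tokens right after every "<td>" token."""
    merged = []
    pos = 0
    for token in html_tokens:
        merged.append(token)
        if token == "<td>":
            group = bbox_tokens[pos:pos + per_cell]
            if len(group) != per_cell:
                raise ValueError("not enough bbox tokens for this cell")
            merged.extend(group)
            pos += per_cell
    if pos != len(bbox_tokens):
        raise ValueError("bbox tokens left over after the last cell")
    return merged


def split_tokens(merged_tokens):
    """Recover the separate html and bbox sequences from a merged one."""
    html = [t for t in merged_tokens if not t.startswith("bbox-")]
    bbox = [t for t in merged_tokens if t.startswith("bbox-")]
    return html, bbox


html_only = ["<td>", "</td>", "<td></td>"]
bbox_only = ["bbox-1", "bbox-2", "bbox-3", "bbox-4"]

merged = merge_tokens(html_only, bbox_only)
print(merged)
```

Because every bbox group sits inside its own cell, a missing or extra bbox raises an error at the exact cell where the counts diverge instead of shifting all later cells.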
The main points for doing this are:
I would love to know your thoughts on this, thank you.