Questions about joint training with html and bbox. #17

Sanster · 2024-06-19T09:48:04Z

UniTable uses two separate models to predict html tokens and bbox tokens. Have you tested a single model that outputs html and bbox tokens in the same sequence, with the data structured as follows:

html only output: ["<td>", "</td>"]
bbox only output: ["bbox-1", "bbox-2", "bbox-3", "bbox-4"]
html+bbox output: ["<td>", "bbox-1", "bbox-2", "bbox-3", "bbox-4", "</td>", "<td></td>"]
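The merged format above can be sketched as a small helper that builds the joint sequence from the two separate outputs. This is only an illustration of the proposed data structure: `interleave_html_and_bbox` is a hypothetical name, not part of UniTable, and it assumes every opening `<td>` token is followed by exactly one bbox while an empty `<td></td>` token carries none.

```python
from typing import List, Sequence


def interleave_html_and_bbox(
    html_tokens: Sequence[str],
    bboxes: Sequence[Sequence[str]],
    cell_open: str = "<td>",
) -> List[str]:
    """Insert one bbox (its coordinate tokens) after each opening cell tag.

    Hypothetical helper: assumes one bbox per `cell_open` token, in reading order.
    """
    merged: List[str] = []
    bbox_iter = iter(bboxes)
    for tok in html_tokens:
        merged.append(tok)
        if tok == cell_open:
            box = next(bbox_iter, None)
            if box is None:
                raise ValueError("fewer bboxes than opening cell tags")
            merged.extend(box)
    # Any bbox left over means the two sequences were not aligned.
    if next(bbox_iter, None) is not None:
        raise ValueError("more bboxes than opening cell tags")
    return merged


html_only = ["<td>", "</td>", "<td></td>"]
bbox_only = [["bbox-1", "bbox-2", "bbox-3", "bbox-4"]]
joint = interleave_html_and_bbox(html_only, bbox_only)
```

With these inputs, `joint` reproduces the html+bbox sequence shown above.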

The main reasons for doing this are:

1. Inference efficiency could be improved.
2. The correspondence between html tags and bboxes would hopefully be more accurate.

I would love to know your thoughts on this, thank you.

ShengYun-Peng · 2024-06-22T22:19:03Z

Hi @Sanster, thanks for the question! There are two main reasons that UniTable is not combining html+bbox right now:

1. UniTable v1.0 is based on a vanilla transformer, so a longer input sequence would significantly increase memory consumption and reduce the training batch size.
2. The html branch and the bbox branch are independent of each other. In real-world production we can therefore run them in parallel, whereas a single merged output sequence would have to be decoded sequentially.
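As a rough illustration of the second point, two independent branches can be dispatched concurrently. This is a minimal sketch, not UniTable's actual serving code: `run_branches_in_parallel`, `fake_predict_html`, and `fake_predict_bbox` are hypothetical stand-ins for the real models, and whether threads give a real speedup depends on the runtime and hardware.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Tuple


def run_branches_in_parallel(
    image: Any,
    predict_html: Callable[[Any], List[str]],
    predict_bbox: Callable[[Any], List[List[int]]],
) -> Tuple[List[str], List[List[int]]]:
    """Run the two independent branches concurrently and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        html_future = pool.submit(predict_html, image)
        bbox_future = pool.submit(predict_bbox, image)
        return html_future.result(), bbox_future.result()


# Hypothetical stand-ins for the html and bbox models.
def fake_predict_html(image: Any) -> List[str]:
    return ["<td>", "</td>"]


def fake_predict_bbox(image: Any) -> List[List[int]]:
    return [[0, 0, 10, 10]]


html_tokens, boxes = run_branches_in_parallel(
    "table.png", fake_predict_html, fake_predict_bbox
)
```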

Btw, thanks for the PR. I will check that later in the fall. 😊

boostarcher · 2024-06-29T06:47:44Z

@ShengYun-Peng, regarding point 2: when the number of detected bboxes and the number of detected [] are not aligned (for example, a bbox is missing for a cell in the html branch), the function build_table_from_html_and_cell cannot match the correct cell with its content when building the html.
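One way to surface this mismatch before the table is assembled is to compare the two counts up front. The sketch below is an assumption-laden illustration: `count_mismatch` is a hypothetical helper, not part of UniTable, and it assumes that cells expecting content are the html tokens containing the `[]` marker.

```python
from typing import Sequence


def count_mismatch(
    html_tokens: Sequence[str],
    bboxes: Sequence[Sequence[int]],
    marker: str = "[]",
) -> int:
    """Return (content slots in the html) minus (number of bboxes).

    0 means the branches agree; a positive value means bboxes are missing,
    a negative value means there are extra bboxes.
    """
    slots = sum(1 for tok in html_tokens if marker in tok)
    return slots - len(bboxes)


# Two cells expect content, but only one bbox was detected.
diff = count_mismatch(
    ["<td>[]</td>", "<td></td>", "<td>[]</td>"],
    [[0, 0, 10, 10]],
)
if diff != 0:
    print(f"html and bbox branches disagree by {diff} cell(s)")
```

A nonzero result only signals that the outputs disagree; it does not say which cell lost its bbox, so the pairing itself would still need a separate strategy.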
