Questions about joint training with html and bbox. #17

Open
Sanster opened this issue Jun 19, 2024 · 2 comments

Comments

@Sanster

Sanster commented Jun 19, 2024

UniTable uses two separate models to predict html tokens and bbox tokens. Have you tested a single model that outputs html and bbox tokens in one sequence, with a data structure like the following?

  • html only output: ["<td>", "</td>"]
  • bbox only output: ["bbox-1", "bbox-2", "bbox-3", "bbox-4"]
  • html+bbox output: ["<td>", "bbox-1", "bbox-2", "bbox-3", "bbox-4", "</td>", "<td></td>"]
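As a rough sketch of the merged format in the last bullet, the two separate outputs could be interleaved as below. The function name `interleave_html_bbox` is hypothetical, and the code assumes that a non-empty cell is the token pair `"<td>"`, `"</td>"`, that an empty cell is the single token `"<td></td>"`, and that bboxes arrive in the same order as the non-empty cells. None of this is taken from the UniTable code base.

```python
from typing import List, Sequence


def interleave_html_bbox(
    html_tokens: Sequence[str],
    cell_bboxes: Sequence[Sequence[str]],
) -> List[str]:
    """Insert each cell's bbox tokens just before its closing "</td>".

    Assumption: "</td>" only appears as the close of a non-empty cell,
    and cell_bboxes holds one group of bbox tokens per non-empty cell.
    """
    merged: List[str] = []
    next_box = 0
    for token in html_tokens:
        if token == "</td>":
            if next_box >= len(cell_bboxes):
                raise ValueError("more non-empty cells than bboxes")
            merged.extend(cell_bboxes[next_box])
            next_box += 1
        merged.append(token)
    if next_box != len(cell_bboxes):
        raise ValueError("more bboxes than non-empty cells")
    return merged


html = ["<td>", "</td>", "<td></td>"]
boxes = [["bbox-1", "bbox-2", "bbox-3", "bbox-4"]]
print(interleave_html_bbox(html, boxes))
```

With the inputs above, the result is the same sequence as the html+bbox example in the list. The explicit `ValueError` makes a count mismatch between the two outputs visible instead of silently dropping tokens.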

The main motivations for doing this are:

  1. The inference efficiency can be improved.
  2. Hopefully, the correspondence between html tags and bbox will be more accurate.

I would love to know your thoughts on this, thank you.

@ShengYun-Peng
Contributor

Hi @Sanster, thanks for the question! There are two main reasons UniTable does not combine html+bbox right now:

  1. UniTable v1.0 is based on a vanilla transformer, so a longer input sequence significantly increases memory consumption and reduces the training batch size.
  2. The html branch and the bbox branch are independent of each other, so in real-world production we can run them in parallel. Merging them into one output sequence would force the tokens to be decoded sequentially.
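To make point 1 concrete, here is a back-of-the-envelope sketch of how the sequence length, and with it the self-attention score matrix, grows when the two outputs are merged. It relies only on the general fact that full self-attention in a vanilla transformer scores every token against every other token, so the matrix has length squared entries per head and layer. The table size (300 structure tokens, 100 non-empty cells) is an invented illustration, not a measurement from UniTable, and the helper names are hypothetical.

```python
def sequence_lengths(structure_tokens: int, num_cells: int,
                     bbox_tokens_per_cell: int = 4):
    """Return (html_len, bbox_len, merged_len) for one table."""
    html_len = structure_tokens
    bbox_len = num_cells * bbox_tokens_per_cell
    return html_len, bbox_len, html_len + bbox_len


def attention_entries(seq_len: int) -> int:
    """Entries in one full self-attention score matrix (per head, per layer)."""
    return seq_len * seq_len


# Illustrative numbers only: 300 structure tokens, 100 non-empty cells.
html_len, bbox_len, merged_len = sequence_lengths(300, 100)
separate = attention_entries(html_len) + attention_entries(bbox_len)
merged = attention_entries(merged_len)
print(html_len, bbox_len, merged_len)  # 300 400 700
print(separate, merged)                # 250000 490000
```

Under these assumed numbers the merged sequence needs roughly twice as many attention scores as the two separate sequences combined, which is the kind of growth that pushes the training batch size down. The estimate ignores cross-attention to image features and any causal masking.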

Btw, thanks for the PR. I will check that later in the fall. 😊

@boostarcher

@ShengYun-Peng, regarding point 2: when the number of detected bboxes and the number of detected [] do not match, for example because a bbox has no counterpart in the html branch, the function build_table_from_html_and_cell cannot match each cell with its correct content when building the html.
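
The failure mode described here can be reproduced with a small stand-in. The sketch below is not the repository's build_table_from_html_and_cell; `fill_cells_naive` and `count_slots` are hypothetical helpers, and it assumes that any structure token containing `[]` is a slot to be filled, in order, with content derived from the bbox branch.

```python
from typing import List


def count_slots(structure: List[str]) -> int:
    """Number of structure tokens that expect cell content."""
    return sum("[]" in token for token in structure)


def fill_cells_naive(structure: List[str], contents: List[str]) -> List[str]:
    """Fill "[]" slots strictly by position, as long as content remains."""
    remaining = list(contents)
    out: List[str] = []
    for token in structure:
        if "[]" in token and remaining:
            out.append(token.replace("[]", remaining.pop(0)))
        else:
            out.append(token)
    return out


# The html branch predicts three cells, but the bbox branch missed the middle one.
structure = ["<tr>", "<td>[]</td>", "<td>[]</td>", "<td>[]</td>", "</tr>"]
contents = ["A", "C"]

print(count_slots(structure), len(contents))  # 3 2
print(fill_cells_naive(structure, contents))
# ['<tr>', '<td>A</td>', '<td>C</td>', '<td>[]</td>', '</tr>']
```

Because matching is purely positional, "C" lands in the second cell and the third stays unfilled. Comparing `count_slots(structure)` with `len(contents)` before filling at least detects the mismatch, although it cannot say which cell is the missing one.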
