a question about ViTSTR #5

Danee-wawawa · 2021-06-03T11:31:57Z

Hi, thank you for your work; it is very meaningful. I have a question about the algorithm design.
You first convert the input image into patches. If a character is cut across patches, or a single patch contains multiple characters, does that affect recognition?
Looking forward to your reply.

roatienza · 2021-06-03T13:10:18Z

The image is divided into non-overlapping patches. A patch may contain zero or more characters, or only parts of characters.
With position embeddings, the transformer is able to work out how the parts fit into the whole, so this has no impact.
Not yet tried, but worth experimenting with: overlapping patches and smaller patches, as done in DINO.
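
To make the patching idea concrete, here is a minimal pure-Python sketch. It is not the ViTSTR code: the helper name `split_into_patches`, the toy 4x8 image, and the patch and stride sizes are all assumptions chosen for illustration. Each patch is paired with a row-major index, which stands in for the position that a position embedding would encode; setting the stride below the patch size gives the overlapping variant mentioned above.

```python
def split_into_patches(image, patch, stride=None):
    """Split a 2D image (list of rows) into square patches.

    Hypothetical helper for illustration only.
    stride == patch -> non-overlapping patches
    stride <  patch -> overlapping patches
    Returns (position_index, patch) pairs in row-major order.
    """
    stride = patch if stride is None else stride
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            block = [row[left:left + patch] for row in image[top:top + patch]]
            # The index is what a position embedding would be looked up by.
            patches.append((len(patches), block))
    return patches


# Toy 4x8 "image" whose pixel value is its column index, so it is easy
# to see which columns each patch covers.
image = [[col for col in range(8)] for _ in range(4)]

non_overlapping = split_into_patches(image, patch=4)
overlapping = split_into_patches(image, patch=4, stride=2)

for index, block in non_overlapping:
    print("non-overlapping", index, "columns", block[0])
for index, block in overlapping:
    print("overlapping", index, "columns", block[0])
```

In the non-overlapping case, a character spanning columns 3 to 5 would be split between patch 0 and patch 1; the position indices are what would let a model relate the two halves. In the overlapping case, the middle patch covers columns 2 to 5 and would contain that character whole.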

Danee-wawawa · 2021-06-04T06:57:09Z

OK, thank you.
