Affine grid parameters to crop predicted Bounding Boxes without transformation #79

Open
AniketGurav opened this issue Sep 7, 2022 · 0 comments

I am having difficulty understanding the parameters of affine_grid.

The corresponding lines are 233 and 234 in train.py.

As I understand it, the following happens in the function process_boxes, which contains the lines above:
1. The localization part of the network has already predicted all the bounding boxes (BBs) of the scene text.
2. While iterating over the predicted BBs, an STN (Spatial Transformer Network) is used to crop just the specific text word from the entire image.
3. The cropped images are passed through the OCR.
4. The OCR loss is backpropagated.

The affine_grid call, which is part of the STN, takes the parameter theta (line 233 in train.py).

This theta is a 2×3 matrix where the last column is the center coordinate of the predicted crop, and the first two columns perform transformations such as rotation.

When the code above is used, I found that the cropped image gets distorted by the affine_grid transformation, and this may affect the OCR output.

What I want is just the cropped text image, without any transformation from the STN (affine_grid). I have tried the following values for the theta matrix:

[ 1 0 predX
0 1 predY ]

where predX and predY are the centres of the predicted bounding boxes.

Even with these values, the crops are sometimes unrecognizable or look significantly different.

So, in short, can you suggest parameters for theta such that it only crops the BB predicted by the network, without any transformation?
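
For reference, here is a minimal sketch of what a crop-only theta could look like. It assumes that affine_grid here is PyTorch's torch.nn.functional.affine_grid, which works in normalized coordinates where -1 and +1 are the image edges, and that the box is given by its centre and size in pixels. The helper name crop_theta and its arguments are hypothetical and do not come from train.py.

```python
def crop_theta(cx, cy, w, h, img_w, img_h):
    """Build a 2x3 theta for an axis-aligned crop (no rotation, no shear).

    Assumptions (not taken from train.py):
      - (cx, cy) is the box centre and (w, h) its size, all in pixels
      - the sampling grid uses normalized coordinates in [-1, 1],
        with -1 and +1 at the image edges (align_corners=False convention)
    """
    sx = w / img_w                 # box width as a fraction of the image width
    sy = h / img_h                 # box height as a fraction of the image height
    tx = 2.0 * cx / img_w - 1.0    # box centre x mapped from pixels to [-1, 1]
    ty = 2.0 * cy / img_h - 1.0    # box centre y mapped from pixels to [-1, 1]
    # Zero off-diagonal terms mean no rotation or shear, only scale + shift.
    return [[sx, 0.0, tx],
            [0.0, sy, ty]]


# A box that covers the whole image gives the identity transform.
full = crop_theta(cx=50, cy=25, w=100, h=50, img_w=100, img_h=50)

# A box covering the left half of the image.
left_half = crop_theta(cx=25, cy=25, w=50, h=50, img_w=100, img_h=50)
print(full)
print(left_half)
```

Under these assumptions, the nested list can be wrapped with torch.tensor(theta).unsqueeze(0) and passed to torch.nn.functional.affine_grid(theta, size, align_corners=False), followed by torch.nn.functional.grid_sample. Note that a scale of 1 on the diagonal samples a window as large as the whole image, and a translation given in pixels falls far outside [-1, 1], which would be consistent with the distorted crops described above.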
