feat: run OCR extraction on every new image #543
The parsed host list was ['"localhost', '127.0.0.1"']
data["created_at"] = int(time.time())

with gzip.open(ocr_json_path, "wt") as f:
    f.write(json.dumps(data))
So it stores the result in a jsonl.gz file next to the image?
If we re-run on the same image, will it override the existing file?
> So it stores the result in a jsonl.gz file next to the image?

Yes, exactly!
> If we re-run on the same image, will it override the existing file?

It depends on the value of override.
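To make the exchange above concrete, here is a minimal sketch of the write-next-to-the-image behaviour with an override flag. The names `ocr_path_for` and `save_ocr_result`, the `.json.gz` suffix, and the default `override=False` are assumptions for illustration, not the PR's actual code; only the `created_at` timestamp and the gzip write mirror the snippet under review.

```python
import gzip
import json
import time
from pathlib import Path
from typing import Optional


def ocr_path_for(image_path: Path) -> Path:
    """Return the OCR result path next to the image (naming is an assumption)."""
    return image_path.with_suffix(".json.gz")


def save_ocr_result(image_path: Path, data: dict, override: bool = False) -> Optional[Path]:
    """Write OCR data as gzipped JSON; skip if a result exists and override is False."""
    out_path = ocr_path_for(image_path)
    if out_path.exists() and not override:
        # An earlier result is kept untouched unless override is requested.
        return None
    payload = dict(data)
    payload["created_at"] = int(time.time())
    with gzip.open(out_path, "wt", encoding="utf-8") as f:
        f.write(json.dumps(payload))
    return out_path
```

With this shape, a second run on the same image returns `None` and leaves the file alone, while `override=True` rewrites it with a fresh `created_at`.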
Ah yes, good catch!
In this script, openfoodfacts-server/blob/main/scripts/run_ocr.py, you also run
"features": [
    {"type": "TEXT_DETECTION"},
    {"type": "LOGO_DETECTION"},
    {"type": "LABEL_DETECTION"},
    {"type": "SAFE_SEARCH_DETECTION"},
    {"type": "FACE_DETECTION"},
],
not only TEXT_DETECTION. Would this be of interest here as well?
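For reference, a rough sketch of how such a features list fits into a request body, assuming the target is the Google Cloud Vision REST `images:annotate` endpoint (the feature names above match its feature types). The helper name `build_annotate_request` is hypothetical, and the sketch only builds the JSON payload; it does not send anything.

```python
import base64
from typing import Iterable

# Feature types quoted in the comment above.
ALL_FEATURE_TYPES = (
    "TEXT_DETECTION",
    "LOGO_DETECTION",
    "LABEL_DETECTION",
    "SAFE_SEARCH_DETECTION",
    "FACE_DETECTION",
)


def build_annotate_request(
    image_bytes: bytes, feature_types: Iterable[str] = ALL_FEATURE_TYPES
) -> dict:
    """Build an images:annotate request body for one image (hypothetical helper)."""
    return {
        "requests": [
            {
                # The REST API expects the image content as base64 text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": t} for t in feature_types],
            }
        ]
    }
```

Passing `feature_types=["TEXT_DETECTION"]` gives the text-only variant, so switching between the two is a one-argument change.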
TODO: avoid running OCR extraction in the test runs, and delete the test image?
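One possible way to approach this TODO, sketched with the standard library only: inject the OCR callable so tests can pass a mock, and store the test image in a temporary directory that is removed afterwards. `run_ocr` and `handle_upload` are hypothetical stand-ins, not the project's functions; in the real code base the equivalent would presumably be patching the queued task call with `unittest.mock.patch`.

```python
import os
import tempfile
from unittest import mock


def run_ocr(image_path: str) -> None:
    """Stand-in for the real OCR call (hypothetical)."""
    raise RuntimeError("real OCR should not run in tests")


def handle_upload(image_bytes: bytes, directory: str, ocr=run_ocr) -> str:
    """Store the image in `directory`, trigger OCR, and return the stored path."""
    fd, path = tempfile.mkstemp(suffix=".webp", dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(image_bytes)
    ocr(path)
    return path


def test_upload_skips_real_ocr() -> None:
    fake_ocr = mock.Mock()
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = handle_upload(b"fake image", tmp_dir, ocr=fake_ocr)
        # The upload still triggers OCR, but only the mock is called.
        fake_ocr.assert_called_once_with(path)
        assert os.path.exists(path)
    # Leaving the context manager deletes the test image with its directory.
    assert not os.path.exists(path)
```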
@@ -75,6 +77,10 @@ def upload(self, request: Request) -> Response:
                status=status.HTTP_400_BAD_REQUEST,
            )
        file_path, mimetype, image_thumb_path = store_file(request.data.get("file"))
        async_task(
I think I'm going to refactor this bit and trigger it from the post_save signal instead, similar to what is done for locations (OSM) and products (OFF).
Done in #549.
Fixes #320