Merge branch 'main' into batch-job-spellcheck
jeremyarancio committed Sep 4, 2024
2 parents 7c92836 + 0ad0c56 commit cb49cd9
Showing 10 changed files with 113 additions and 68 deletions.
3 changes: 1 addition & 2 deletions .github/workflows/container-deploy.yml
@@ -43,8 +43,7 @@ jobs:
 # This is the OSM proxy server, that does have an ipv4 address
 echo "SSH_PROXY_HOST=45.147.209.254" >> $GITHUB_ENV
 echo "SSH_USERNAME=off" >> $GITHUB_ENV
-# We use 'raphael' user as we don't have an off user yet on the proxy machine
-echo "SSH_PROXY_USERNAME=raphael" >> $GITHUB_ENV
+echo "SSH_PROXY_USERNAME=off" >> $GITHUB_ENV
 echo "SSH_PROTOCOL=tcp" >> $GITHUB_ENV
 echo "SSH_HOST=10.3.0.200" >> $GITHUB_ENV
 echo "ROBOTOFF_INSTANCE=prod" >> $GITHUB_ENV
19 changes: 19 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,24 @@
 # Changelog

+## [1.50.4](https://github.com/openfoodfacts/robotoff/compare/v1.50.3...v1.50.4) (2024-09-03)
+
+
+### Bug Fixes
+
+* fix product weight detection bug ([fe7e758](https://github.com/openfoodfacts/robotoff/commit/fe7e75814bbbce42e6d6ee86e73102d58c5eff1b))
+* only run nutrition table detection for food type ([c019df9](https://github.com/openfoodfacts/robotoff/commit/c019df94b99c7260308b9aeb54d43b685668a57a))
+* remove Hacendado store ([e946444](https://github.com/openfoodfacts/robotoff/commit/e946444ee1dabd67b4269d18fafa7639fbc2bac9))
+* remove unused class ([22fae32](https://github.com/openfoodfacts/robotoff/commit/22fae32f798240bdfecaf04f0a7adfd08fa41d23))
+
+
+### Technical
+
+* add tmp volume for ES ([6225bcb](https://github.com/openfoodfacts/robotoff/commit/6225bcb02273aa72f40caa0d837676cd03ccd31f))
+* add volume for /tmp ([15130d2](https://github.com/openfoodfacts/robotoff/commit/15130d21747592f816360928f93f5a5e4db84569))
+* **deps:** bump sentry-sdk from 1.14.0 to 2.8.0 ([f991bf5](https://github.com/openfoodfacts/robotoff/commit/f991bf536300660613c97f43b294b50a7740f5f0))
+* improve robotoff documentation ([b9365a7](https://github.com/openfoodfacts/robotoff/commit/b9365a7cb9c91e25ca0eb2688c95b15506fac2c4))
+* make api depends on elasticsearch in docker-compose.yml ([bf99ca8](https://github.com/openfoodfacts/robotoff/commit/bf99ca8f75b1adc8b0fa2f6032cc1e68992220d9))
+
 ## [1.50.3](https://github.com/openfoodfacts/robotoff/compare/v1.50.2...v1.50.3) (2024-08-21)


1 change: 0 additions & 1 deletion data/ocr/store_regex.txt
@@ -67,7 +67,6 @@ Franprix
 Fred Meyer
 Froiz
 Giant Eagle
-Hacendado
 Haggen
 Harris Teeter
 Harrods
1 change: 1 addition & 0 deletions docker-compose.yml
@@ -83,6 +83,7 @@ services:
     depends_on:
       - postgres
       - redis
+      - elasticsearch

   update-listener:
     <<: *robotoff-base
86 changes: 49 additions & 37 deletions poetry.lock

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions pyproject.toml
@@ -38,7 +38,7 @@ ignore_missing_imports = true

 [tool.poetry]
 name = "robotoff"
-version = "1.50.3"
+version = "1.50.4"
 description = "Real-time and batch prediction service for Open Food Facts."
 authors = ["Open Food Facts Team"]
 license = "GNU Affero General Public License v3"
@@ -85,7 +85,7 @@ pandas = "~2.2.2"
 pyarrow = "~17.0.0"

 [tool.poetry.dependencies.sentry-sdk]
-version = "~1.14.0"
+version = ">=1.14,<2.9"
 extras = ["falcon"]

 [tool.poetry.dev-dependencies]
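
The sentry-sdk change widens the accepted range rather than pinning 2.8.0. As a rough illustration (not part of the commit), the sketch below compares the old Poetry tilde constraint `~1.14.0`, which is equivalent to `>=1.14.0,<1.15.0`, with the new `>=1.14,<2.9`. The helper names are hypothetical, and the parsing only handles plain numeric versions such as `2.8.0`, not pre-releases or local version tags.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse 'X.Y' or 'X.Y.Z' into a 3-tuple, padding missing parts with 0."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def allowed_by_old_constraint(version: str) -> bool:
    # "~1.14.0" in Poetry means >=1.14.0,<1.15.0
    return (1, 14, 0) <= parse_version(version) < (1, 15, 0)


def allowed_by_new_constraint(version: str) -> bool:
    # ">=1.14,<2.9" as written in the updated pyproject.toml
    return (1, 14, 0) <= parse_version(version) < (2, 9, 0)


if __name__ == "__main__":
    for candidate in ("1.14.0", "2.8.0", "2.9.0"):
        print(
            candidate,
            allowed_by_old_constraint(candidate),
            allowed_by_new_constraint(candidate),
        )
```

Under these assumptions, 2.8.0 (the version named in the changelog) is rejected by the old constraint and accepted by the new one, while 2.9.0 is rejected by both.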
4 changes: 0 additions & 4 deletions robotoff/insights/annotate.py
@@ -717,7 +717,3 @@ def annotate(
         auth=auth,
         is_vote=is_vote,
     )
-
-
-class InvalidInsight(Exception):
-    pass
20 changes: 18 additions & 2 deletions robotoff/prediction/ocr/product_weight.py
@@ -94,7 +94,23 @@ def is_valid_weight(weight_value: str) -> bool:
     return True


-def is_extreme_weight(normalized_value: float, unit: str) -> bool:
+def is_extreme_weight(
+    normalized_value: float, unit: str, count: int | None = None
+) -> bool:
+    """Return True if the weight is extreme, i.e is likely wrongly detected.
+
+    If considered extreme, a prediction won't be generated.
+
+    :param normalized_value: the normalized weight value
+    :param unit: the normalized weight unit
+    :param count: the number of items in the pack, if any
+    :return: True if the weight is extreme, False otherwise
+    """
+    if count is not None and int(count) > 20:
+        # More than 20 items in a pack is quite unlikely for
+        # a consumer product
+        return True
+
     if unit == "g":
         # weights above 10 kg
         return normalized_value >= 10000 or normalized_value <= 10
@@ -200,7 +216,7 @@ def process_multi_packaging(match) -> Optional[dict]:
     normalized_value, normalized_unit = normalize_weight(value, unit)

     # Check that the weight is not extreme
-    if is_extreme_weight(normalized_value, normalized_unit):
+    if is_extreme_weight(normalized_value, normalized_unit, count):
         return None

     text = f"{count} x {value} {unit}"
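
To make the effect of the new `count` argument concrete, here is a minimal standalone sketch of the check, restricted to the branches visible in this diff. The function name `is_extreme_weight_sketch` is hypothetical, and the handling of units other than `"g"` is elided in the hunk, so the sketch deliberately refuses them rather than guessing.

```python
from typing import Optional


def is_extreme_weight_sketch(
    normalized_value: float, unit: str, count: Optional[int] = None
) -> bool:
    """Mirror the visible part of the check: pack size first, then grams."""
    if count is not None and int(count) > 20:
        # More than 20 items in a pack is treated as a likely misdetection.
        return True

    if unit == "g":
        # Extreme if at or above 10 kg, or at or below 10 g.
        return normalized_value >= 10000 or normalized_value <= 10

    # Assumption: other units exist in the real function but are not shown
    # in the diff, so this sketch does not model them.
    raise NotImplementedError(f"unit {unit!r} is not covered by this sketch")


if __name__ == "__main__":
    print(is_extreme_weight_sketch(500.0, "g"))            # plain 500 g
    print(is_extreme_weight_sketch(500.0, "g", count=6))   # 6 x 500 g
    print(is_extreme_weight_sketch(500.0, "g", count=24))  # 24 x 500 g
```

With the pack-size check placed first, a multi-pack match such as 24 x 500 g is rejected even though 500 g on its own falls inside the accepted range.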
16 changes: 9 additions & 7 deletions robotoff/workers/tasks/import_image.py
@@ -148,13 +148,15 @@ def run_import_image_job(product_id: ProductIdentifier, image_url: str, ocr_url:
         image_url=image_url,
         ocr_url=ocr_url,
     )
-    enqueue_job(
-        run_nutrition_table_object_detection,
-        get_high_queue(product_id),
-        job_kwargs={"result_ttl": 0},
-        product_id=product_id,
-        image_url=image_url,
-    )
+
+    if product_id.server_type.is_food():
+        enqueue_job(
+            run_nutrition_table_object_detection,
+            get_high_queue(product_id),
+            job_kwargs={"result_ttl": 0},
+            product_id=product_id,
+            image_url=image_url,
+        )

     # Run UPC detection to detect if the image is dominated by a UPC (and thus
     # should not be a product selected image)
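
The gating introduced in this hunk can be sketched in isolation as follows. Everything here is a stand-in: the `ServerType` enum, its members, and the rule inside `is_food()` are assumptions made for illustration (the real definitions are not part of this diff), and `"upc_detection"` is a placeholder label for the step that the trailing comment describes.

```python
from enum import Enum
from typing import List


class ServerType(Enum):
    """Hypothetical stand-in for the project's server type enum."""

    off = "off"  # food products
    obf = "obf"  # non-food products

    def is_food(self) -> bool:
        # Assumption for this sketch: only `off` counts as food.
        return self is ServerType.off


def jobs_to_enqueue(server_type: ServerType) -> List[str]:
    """Return the names of the jobs this sketch would enqueue for an image."""
    jobs: List[str] = []
    if server_type.is_food():
        # Nutrition tables are only expected on food products.
        jobs.append("run_nutrition_table_object_detection")
    # The UPC step sits outside the `if` in the hunk, so it is kept unconditional.
    jobs.append("upc_detection")
    return jobs


if __name__ == "__main__":
    for server_type in ServerType:
        print(server_type.name, jobs_to_enqueue(server_type))
```

The point of the change is that non-food images no longer consume a high-priority queue slot for a detector whose output would not apply to them.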