Percentage estimation of ingredients #10704

laurenskz · 2024-08-15T05:43:55Z

Problem

The current estimation of ingredient percentages is a good step in the right direction, but there is a lot of room for improvement. For example: suppose we know that the second ingredient of a product is sugar and that the total carbs are 14 g per 100 g. Since sugar is almost pure carbohydrate, the second ingredient must be at most about 14 g/100 g, and so must every later ingredient. This in turn puts a lower bound on the first ingredient: with n ingredients it is at least 100% - (n - 1) × 14%, e.g. at least 86% for a two-ingredient product. This is currently overlooked and we make arbitrary assumptions about ratios.
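
To make the bound in this example concrete, here is a minimal sketch. It assumes that sugar is close to 100% carbohydrate, so sugar can be at most 14% of the product, and that ingredients are listed in descending order, so every later ingredient is also at most 14%. The function name and its parameters are hypothetical, not existing code.

```python
def first_ingredient_lower_bound(n_ingredients: int, second_max_pct: float) -> float:
    """Lower bound (in %) on the first ingredient, given an upper bound on the second.

    Assumes ingredients are listed in descending order, so ingredients
    2..n are each at most second_max_pct, and all percentages sum to 100.
    """
    if n_ingredients < 2:
        return 100.0  # a single ingredient is the whole product
    # Whatever ingredients 2..n cannot account for must come from the first one.
    remainder = 100.0 - second_max_pct * (n_ingredients - 1)
    # The first ingredient is the largest, so it is also at least an equal share.
    return max(remainder, 100.0 / n_ingredients)


# Sugar is second and total carbs are 14 g/100 g, so sugar is at most 14%.
print(first_ingredient_lower_bound(2, 14.0))  # 86.0
print(first_ingredient_lower_bound(4, 14.0))  # 58.0
```

With many ingredients the remainder term goes negative and only the equal-share bound is left, which is why combining all nutrients at once, as proposed below, is more informative than a single hand-derived bound.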

Proposed solution

A published paper uses all available information about mandatory nutrients, combined with linear optimization techniques: https://www.sciencedirect.com/science/article/pii/S0889157522001260.

The general idea is as follows. We have n ingredients and a set of known nutrients, indexed by k (mandatory labeling):
For each ingredient_j (1 <= j <= n):
  Fetch the nutrition info of ingredient_j (since it is a simple ingredient, there should be something in the db). Let nutrient_j_k be the value per 100 g of nutrient k for ingredient j.
  Then we declare:
  quantity_j as a double with range [0, 1]
  if j > 1, add the constraint: quantity_j <= quantity_{j-1}
  if the ingredient has a known percentage, add a constraint for that.
Now to the smart part:

Let total_nutrient_k be the value per 100 g of nutrient k for the whole product. Then for each k that is known we add the constraints:

quantity_1 * nutrient_1_k + quantity_2 * nutrient_2_k + ... + quantity_n * nutrient_n_k >= 0.99 * total_nutrient_k
quantity_1 * nutrient_1_k + quantity_2 * nutrient_2_k + ... + quantity_n * nutrient_n_k <= 1.01 * total_nutrient_k

Add the following constraints on the total:

sum_j quantity_j >= 0.99
sum_j quantity_j <= 1.01

Now we have a system of linear constraints that can easily be solved by an LP solver. The resulting quantities will sum to approximately one (within the 1% tolerance above), and the solution uses all available information.
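
As a sanity check on the constraints above, the following sketch tests whether a candidate set of quantities satisfies them. It is not a solver, only a checker. The function name, the dict-based data layout and the example nutrient values are all assumptions made for illustration. Bounds are treated as inclusive, and the known-percentage constraint is left out for brevity.

```python
def satisfies_constraints(quantities, nutrients, totals, tol=0.01):
    """Check a candidate recipe against the constraints described above.

    quantities: fraction of the product per ingredient, in label order.
    nutrients:  nutrients[j][k] = value per 100 g of nutrient k in ingredient j.
    totals:     totals[k] = value per 100 g of nutrient k in the whole product.
    """
    # Each quantity is a fraction of the product.
    if any(q < 0.0 or q > 1.0 for q in quantities):
        return False
    # Ingredients are listed in descending order of quantity.
    if any(later > earlier for earlier, later in zip(quantities, quantities[1:])):
        return False
    # Quantities sum to one, within the tolerance.
    if not (1.0 - tol) <= sum(quantities) <= (1.0 + tol):
        return False
    # Each known nutrient total is reproduced within the tolerance.
    for k, total in totals.items():
        mixed = sum(q * nutrients[j][k] for j, q in enumerate(quantities))
        if not (1.0 - tol) * total <= mixed <= (1.0 + tol) * total:
            return False
    return True


# Made-up values: a two-ingredient product where the second ingredient is sugar.
nutrients = [{"carbs": 5.0}, {"carbs": 100.0}]
totals = {"carbs": 14.0}
print(satisfies_constraints([0.905, 0.095], nutrients, totals))  # True
print(satisfies_constraints([0.5, 0.5], nutrients, totals))      # False
```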

Additional context

The authors note the following: A study with known ingredient compositions shows that estimates are within a 0.9% difference of products’ actual recipes. This would be a huge improvement.

Code pointers

This could very easily be implemented in Python using the OR-Tools library: https://developers.google.com/optimization/lp/lp_example#python_7
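
A minimal sketch of what this might look like with the OR-Tools `pywraplp` wrapper and its GLOP backend. The function name and data layout are assumptions for illustration. The known-percentage constraint is omitted, and the bounds are written as non-strict inequalities because LP solvers work with `<=` and `>=`. No objective is set, so the solver returns some feasible point. Which point it returns is not specified, and the approach in the paper may differ.

```python
def estimate_quantities(nutrients, totals, tol=0.01):
    """Estimate ingredient fractions with a linear program (sketch).

    nutrients: nutrients[j][k] = value per 100 g of nutrient k in ingredient j,
               with ingredients in label order.
    totals:    totals[k] = value per 100 g of nutrient k in the whole product.
    Returns a list of fractions, or None if no feasible recipe is found.
    """
    # Imported here so the sketch can be loaded without OR-Tools installed.
    from ortools.linear_solver import pywraplp

    solver = pywraplp.Solver.CreateSolver("GLOP")
    if solver is None:
        return None

    n = len(nutrients)
    # quantity_j is the fraction of the product made up by ingredient j.
    q = [solver.NumVar(0.0, 1.0, f"quantity_{j + 1}") for j in range(n)]

    # Ingredients are listed in descending order of quantity.
    for j in range(1, n):
        solver.Add(q[j] <= q[j - 1])

    # Each known nutrient total must be matched within the tolerance.
    for k, total in totals.items():
        mixed = solver.Sum([q[j] * nutrients[j][k] for j in range(n)])
        solver.Add(mixed >= (1.0 - tol) * total)
        solver.Add(mixed <= (1.0 + tol) * total)

    # Quantities sum to one, within the tolerance.
    solver.Add(solver.Sum(q) >= 1.0 - tol)
    solver.Add(solver.Sum(q) <= 1.0 + tol)

    status = solver.Solve()
    if status not in (pywraplp.Solver.OPTIMAL, pywraplp.Solver.FEASIBLE):
        return None
    return [var.solution_value() for var in q]
```

For example, calling `estimate_quantities([{"carbs": 5.0}, {"carbs": 100.0}], {"carbs": 14.0})` with these made-up nutrient values should return two fractions that satisfy the ordering, sum and carbohydrate constraints. A real implementation would likely add an objective to choose among the feasible recipes.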

Number of products impacted

All products composed of multiple ingredients

Time per product

More accuracy for end user

teolemon added this to 🍊 Open Food Facts Server issues Aug 15, 2024

github-project-automation bot moved this to To discuss and validate in 🍊 Open Food Facts Server issues Aug 15, 2024

teolemon assigned stephanegigandet Aug 15, 2024

teolemon added this to Recipe Estimator Prototype and Ingredient analysis Aug 15, 2024

github-project-automation bot moved this to Todo in Recipe Estimator Prototype Aug 15, 2024

github-project-automation bot moved this to To do in Ingredient analysis Aug 15, 2024

teolemon added the 🥗🔍 Ingredients analysis https://wiki.openfoodfacts.org/Ingredients_Extraction_and_Analysis label Aug 15, 2024
