ProductOpener::Packaging
Extract packaging data from a photo of packaging information or recycling instructions.
This function creates regular expressions that match all variations of packaging shapes, materials etc. that we want to recognize in packaging text.
This function parses a single phrase (e.g. "5 25cl transparent PET bottles") and returns a packaging object with properties like units, quantity, material, shape etc.
If the text is prefixed by a 2-letter language code followed by : (e.g. fr:), the language overrides the $text_language parameter (often set to the product language).
This is useful in particular for packaging tags fields added by Robotoff that are prefixed with the language.
It will also be useful when we taxonomize the packaging tags (not taxonomized as of 2022/03/04): existing packaging tags will be prefixed by the product language.
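The prefix override described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the module's Perl code; the helper name split_language_prefix and the exact pattern are assumptions made for illustration only.

```python
import re

# Hypothetical pattern: a 2-letter lowercase code followed by ":".
LANGUAGE_PREFIX = re.compile(r"^([a-z]{2}):\s*(.*)$", re.DOTALL)


def split_language_prefix(text, text_language):
    """Return (language, text); a prefix in the text overrides text_language."""
    stripped = text.strip()
    match = LANGUAGE_PREFIX.match(stripped)
    if match:
        # The prefix wins over the caller-supplied language.
        return match.group(1), match.group(2)
    return text_language, stripped
```

With this sketch, "fr:boite en carton" passed with a text language of "en" is treated as French, while an unprefixed phrase keeps the language given by the caller.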
Can be overridden if the text is prefixed with a language code (e.g. fr:boite en carton).
Packaging object (hash) reference with optional properties: recycling, material, shape
Given a text like "couvercle en métal", this function tries to guess the language of the text based on how well it matches the packaging taxonomies.
One use is to convert packaging tags for which we don't have a language to a version prefixed by the language.
Candidate languages are provided in an ordered list, and the function returns the one that matches the most properties (material, shape, recycling). In case of a tie, priority follows the order of the list.
- undef if no match was found
- or the language code of the best-matching language
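The selection rule (most matched properties wins, ties go to the earlier candidate) could look roughly like the Python sketch below. The tiny TOY_TAXONOMIES table and the substring matching are stand-ins invented for this example; the real function matches against the packaging taxonomies.

```python
# Hypothetical miniature taxonomies, for illustration only.
TOY_TAXONOMIES = {
    "fr": {
        "shape": ["couvercle", "bouteille", "pot"],
        "material": ["métal", "verre"],
        "recycling": ["à recycler"],
    },
    "en": {
        "shape": ["lid", "bottle", "pot"],
        "material": ["metal", "glass"],
        "recycling": ["recycle"],
    },
}


def count_matched_properties(text, language):
    """Count how many properties (shape, material, recycling) match the text."""
    lowered = text.lower()
    entries = TOY_TAXONOMIES.get(language, {})
    return sum(
        1 for terms in entries.values() if any(term in lowered for term in terms)
    )


def guess_language(text, candidate_languages):
    """Return the best-matching candidate, or None if nothing matches."""
    best_language, best_score = None, 0
    for language in candidate_languages:
        score = count_matched_properties(text, language)
        # Strict comparison: an earlier candidate keeps priority on a tie.
        if score > best_score:
            best_language, best_score = language, score
    return best_language
```

Here None plays the role of undef. The word "pot" is listed in both toy languages on purpose, so that swapping the candidate order changes the result.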
Check and taxonomize packaging component data (e.g. from the product WRITE API, or from the web edit form)
The API response object is used to return warnings and errors to the caller. If a warning or error is found (e.g. an unrecognized input), it is returned in the "warnings" or "errors" array of the response object.
A taxonomized packaging structure corresponding to the input packaging structure.
Use rules to add more properties or more precise properties to a packaging component. Some rules may depend on the product. (e.g. if the product category is "en:coffees", and the shape of the packaging component is "en:capsule", we assume the shape is "en:coffee-capsule")
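As an illustration of such a product-dependent rule, here is a small Python sketch of the coffee capsule example. The function name apply_rules and the assumption that categories are available in a categories_tags list are hypothetical; the actual rule set is not reproduced here.

```python
def apply_rules(component, product):
    """Return a copy of the component with more precise properties."""
    refined = dict(component)
    # Assumption: product categories are stored as a list of tags.
    categories = product.get("categories_tags", [])
    # Rule from the text: a capsule on a coffee product is a coffee capsule.
    if refined.get("shape") == "en:capsule" and "en:coffees" in categories:
        refined["shape"] = "en:coffee-capsule"
    return refined
```

Returning a copy rather than mutating the input is a choice made for this sketch only.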
This function adds the data for a packaging component to the packagings data structure, or if the packaging component data is compatible with an existing component of the packagings structure, the data is combined.
The API response object is used to return warnings and errors to the caller. If a warning or error is found (e.g. an unrecognized input), it is returned in the "warnings" or "errors" array of the response object.
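The add-or-combine behaviour might be pictured with the Python sketch below. The compatibility test used here (no conflicting value for shape, material or recycling) is a deliberately naive assumption for illustration; the source does not spell out the actual criteria, and warnings and errors are left out.

```python
# Properties compared when deciding whether two components can be combined.
MERGE_KEYS = ("shape", "material", "recycling")


def is_compatible(existing, new):
    """Assumed rule: compatible when no compared property has conflicting values."""
    return all(
        existing.get(key) is None
        or new.get(key) is None
        or existing[key] == new[key]
        for key in MERGE_KEYS
    )


def add_or_combine(packagings, component):
    """Combine the component into a compatible entry, or append it as new."""
    for existing in packagings:
        if is_compatible(existing, component):
            # Fill in only the properties the existing entry is missing.
            for key, value in component.items():
                if existing.get(key) is None:
                    existing[key] = value
            return packagings
    packagings.append(dict(component))
    return packagings
```

Under these assumptions, a component that adds a material to an existing "en:bottle" entry is merged into it, while a component with a different shape is appended as a separate entry.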
20221104:
- the number field was renamed to number_of_units
- the quantity field was renamed to quantity_per_unit
Rename old fields. This code can be removed once all products have been updated.
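The field migration amounts to a simple key rename on each packaging component. A minimal Python sketch follows; the function name migrate_old_fields is hypothetical, and keeping an already present new field over the old value is an assumption.

```python
# Old field name -> new field name, as listed in the 20221104 change.
RENAMED_FIELDS = {
    "number": "number_of_units",
    "quantity": "quantity_per_unit",
}


def migrate_old_fields(component):
    """Rename legacy fields in place and return the component."""
    for old_name, new_name in RENAMED_FIELDS.items():
        if old_name in component:
            value = component.pop(old_name)
            # Assumption: a value already stored under the new name wins.
            component.setdefault(new_name, value)
    return component
```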
Re-canonicalize the shape, material and recycling properties of packaging components. This is useful in particular if the corresponding taxonomies have changed.
Set packaging_(shapes|materials|recycling)_tags fields, with values from the packaging components of the product.
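A rough Python sketch of how the three tags fields could be derived from the components is given below. The function name, the removal of duplicates and the ordering by first appearance are assumptions; only the field names come from the description above.

```python
# Component property -> product tags field.
TAG_FIELDS = {
    "shape": "packaging_shapes_tags",
    "material": "packaging_materials_tags",
    "recycling": "packaging_recycling_tags",
}


def set_packaging_tags(product):
    """Fill the three tags fields from the product's packaging components."""
    for prop, field in TAG_FIELDS.items():
        values = []
        for component in product.get("packagings", []):
            value = component.get(prop)
            # Assumption: each value is listed once, in order of first appearance.
            if value and value not in values:
                values.append(value)
        product[field] = values
    return product
```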
Set some tags in the /misc/ facet so that we can track the products that have (or don't have) packaging data.
Return the parent material (glass, plastics, metal, paper or cardboard) of a material. Return unknown if the material does not match one of the parents, or if the material is not defined.
Aggregate the weights of each packaging component by parent material (glass, plastics, metal, paper or cardboard)
Compute stats for the parent materials of a product:
- % of the weight of a material over the weight of all packaging
- weight of packaging per 100g of product
Also compute the main parent material.
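The aggregation and the two statistics can be worked through in a short Python sketch. Several things here are assumptions for illustration: each component is taken to carry a precomputed parent_material and a weight in grams, the product quantity is given in grams, and the main parent material is taken to be the one with the largest total weight.

```python
def aggregate_by_parent_material(components):
    """Sum component weights (grams) per parent material."""
    totals = {}
    for component in components:
        parent = component.get("parent_material") or "unknown"
        weight = component.get("weight")
        if weight is not None:
            totals[parent] = totals.get(parent, 0) + weight
    return totals


def compute_stats(totals, product_quantity_g):
    """Return (per-material stats, main parent material)."""
    all_packaging = sum(totals.values())
    stats = {}
    for parent, weight in totals.items():
        stats[parent] = {
            "weight": weight,
            # Share of this material in the total packaging weight.
            "weight_percent": 100 * weight / all_packaging if all_packaging else None,
            # Packaging weight per 100 g of product.
            "weight_100g": 100 * weight / product_quantity_g if product_quantity_g else None,
        }
    # Assumption: the main parent material is the heaviest one.
    main = max(totals, key=totals.get) if totals else None
    return stats, main
```

For example, a 300 g glass bottle and two metal parts of 60 g and 40 g on a 500 g product give 400 g of packaging in total: glass accounts for 75% of it and 60 g per 100 g of product, metal for 25% and 20 g per 100 g, and glass is the main parent material.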
This function populates the packagings structure with data extracted from the packaging_text field. It is used only when there is no pre-existing data in the packagings structure.
This function analyzes all the packaging information available for the product:
- the existing packagings data structure
- the packaging_text entered by users or retrieved from the OCR of recycling instructions (e.g. "glass bottle to recycle, metal cap to discard")
- labels (e.g. FSC)
And combines them in an updated packagings data structure.
Note: as of 2022/11/29, the "packaging" tags field is not used as input.
Note: as of 2023/02/13, the "packaging_text" field is used as input only if there isn't an existing packagings data structure. This is to avoid double counting some packaging elements that may be referred to using different shapes (e.g. pot vs jar, or sleeve vs box etc.)