ProductOpener::Images

DESCRIPTION

ProductOpener::Images is used to:

- upload product images
- select and crop product images
- run OCR on images
- display product images

Product images on disk

Product images are stored in html/images/products/[product barcode split with slashes]/
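To make the "barcode split with slashes" layout concrete, here is a minimal sketch. The helper name barcode_to_path and the grouping rule (three 3-digit folders, then the remaining digits) are assumptions made for illustration; this documentation does not spell out the exact rule.

```perl
use strict;
use warnings;

# Hypothetical helper: the grouping used here (three 3-digit folders followed
# by the remaining digits) is an assumption, not taken from this documentation.
sub barcode_to_path {
    my ($code) = @_;
    if ($code =~ /^(\d{3})(\d{3})(\d{3})(\d+)$/) {
        return "$1/$2/$3/$4";
    }
    # In this sketch, shorter codes are kept as a single folder name.
    return $code;
}

# prints "html/images/products/301/762/042/2003/"
print "html/images/products/", barcode_to_path("3017620422003"), "/\n";
```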

SUPPORTED IMAGE TYPES

VALID IMAGE TYPES

Depending on the product type, different image types are allowed. For example, food, pet food and beauty products have an "ingredients" image type.

FUNCTIONS

get_image_type_and_image_lc_from_imagefield ($imagefield)

We used to identify selected images with a field called "imagefield", which was of the form [image type]_[language code]. In some very old product revisions (e.g. from 2012), values contained only the image type (e.g. "front").

This function splits the field name into its components, and is used to maintain backward compatibility.
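The split described above can be sketched as follows. This is an illustration only: split_imagefield is a hypothetical name, and returning undef as the language code for old image-type-only values is an assumption, not documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the split described above.
sub split_imagefield {
    my ($imagefield) = @_;
    # "front_fr" -> ("front", "fr")
    if ($imagefield =~ /^([a-z]+)_([a-z]+)$/) {
        return ($1, $2);
    }
    # Old values such as "front" carry no language code (assumed: undef).
    return ($imagefield, undef);
}

my ($image_type, $image_lc) = split_imagefield("front_fr");
print "$image_type / $image_lc\n";
```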

Arguments

$imagefield e.g. "front_fr"

Return values

display_select_crop ($object_ref, $image_type, $image_lc, $language, $request_ref)

This function is used in the product edit form to display the select cropper with the images that are already uploaded.

display_select_crop_init ($object_ref)

This function is used to generate the code to initialize the select cropper in the product edit form with the images that are already uploaded.

get_code_and_imagefield_from_file_name ( $l, $filename )

This function is used to guess if an image is the front of the product, its list of ingredients, or the nutrition facts table, based on the filename.

It is used in particular for bulk upload of photos sent by manufacturers. The file names have many different formats, but they very often include the barcode of the product, and sometimes an indication of what the image is.

Producers are advised to use the [front|ingredients|nutrition|packaging]_[language code] format, but in practice we receive many other names.

The %file_names_to_imagefield_regexps structure in the module source contains some patterns used to guess what the image shows.
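A rough sketch of this kind of guessing is shown below. The patterns, the function name guess_code_and_imagefield and the "8 to 14 digits" barcode rule are all hypothetical; the real patterns live in %file_names_to_imagefield_regexps and are not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical patterns for illustration only.
my @type_patterns = (
    [qr/front|face/i        => 'front'],
    [qr/ingredient/i        => 'ingredients'],
    [qr/nutrition/i         => 'nutrition'],
    [qr/packaging|recycl/i  => 'packaging'],
);

sub guess_code_and_imagefield {
    my ($lc, $filename) = @_;

    # Assumption: the barcode is the first run of 8 to 14 digits in the name.
    my ($code) = $filename =~ /(\d{8,14})/;

    # The recommended [type]_[language code] format wins when it is present.
    if ($filename =~ /(front|ingredients|nutrition|packaging)_([a-z]{2})\b/) {
        return ($code, "$1_$2");
    }

    # Otherwise guess the type from a keyword and use the default language.
    foreach my $pattern_ref (@type_patterns) {
        my ($regexp, $type) = @$pattern_ref;
        return ($code, $type . "_" . $lc) if $filename =~ $regexp;
    }

    # Nothing recognised: the caller has to decide what the image is.
    return ($code, undef);
}

my ($code, $imagefield) = guess_code_and_imagefield("fr", "3017620422003 ingredients.png");
print "$code $imagefield\n";
```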

generate_resized_images ($path, $filename, $image_source, $sizes_ref, @sizes)

For selected images, we resize to 100, 200, and 400 pixels maximum width or height.

$sizes_ref

A reference to a hash that will be filled with the sizes of the generated images.

@sizes

An array of sizes to generate. The sizes are the maximum width or height of the image.

Return values

The function returns the error code from ImageMagick if there was an error writing the image.
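The "maximum width or height" rule can be illustrated with plain arithmetic. This sketch does not call ImageMagick; fit_within is a hypothetical helper, and leaving smaller images unscaled is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: compute the dimensions of an image scaled so that its
# longest side is at most $max pixels, keeping the aspect ratio.
sub fit_within {
    my ($width, $height, $max) = @_;
    my $longest = $width > $height ? $width : $height;

    # Assumption: images that already fit are not enlarged.
    return ($width, $height) if $longest <= $max;

    my $scale = $max / $longest;
    return (int($width * $scale + 0.5), int($height * $scale + 0.5));
}

foreach my $max (100, 200, 400) {
    my ($w, $h) = fit_within(1600, 1200, $max);
    print "$max: ${w}x${h}\n";
}
```

For a 1600x1200 source image, the three sizes come out as 100x75, 200x150 and 400x300.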

process_image_upload ( $product_ref, $imagefield, $user_id, $time, $comment, $imgid_ref, $debug_string_ref )

Process an image uploaded to a product (from the web site, from the API, or from an import):

- Read the image
- Create a JPEG version
- Create resized versions
- Store the image in the product data

Arguments

Product ref $product_ref

Image field $imagefield

Indicates what the image is and its language, or indicates a path to the image file (for imports and when uploading an image with a barcode).

User id $user_id

Timestamp of the image $time

Comment $comment

Reference to an image id $imgid_ref

Debug string reference $debug_string_ref

Return values

- -2: imgupload field not set
- -3: we have already received an image with this file size
- -4: the image is too small
- -5: the image file cannot be read by ImageMagick
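A caller might turn these codes into messages along the following lines. The lookup table and describe_upload_result are hypothetical. Only the four codes listed above are covered, and nothing is assumed about the values returned on success.

```perl
use strict;
use warnings;

# Hypothetical lookup table built from the documented error codes.
my %upload_error = (
    -2 => 'imgupload field not set',
    -3 => 'we have already received an image with this file size',
    -4 => 'the image is too small',
    -5 => 'the image file cannot be read by ImageMagick',
);

sub describe_upload_result {
    my ($code) = @_;
    # Codes outside the documented list are reported without interpretation.
    return $upload_error{$code} // "no documented error (code $code)";
}

print describe_upload_result(-4), "\n";
```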

process_image_upload_using_filehandle ($product_ref, $filehandle, $user_id, $time, $comment, $imgid_ref, $debug_string_ref)

This function is called by:

- the process_image_upload() function above, when the image is uploaded with a CGI multipart form-data encoded field (product form and API v2 or earlier)
- APIProductImagesUpload.pm, for API v3

Arguments

Product ref $product_ref

File handle $filehandle to the image data

User id $user_id

Timestamp of the image $time

Comment $comment

Reference to an image id $imgid_ref

Debug string reference $debug_string_ref

Return values

- -2: imgupload field not set
- -3: we have already received an image with this file size
- -4: the image is too small
- -5: the image file cannot be read by ImageMagick

remove_images_by_prefix ( $product_ref, $prefix )

This function removes image files from a product by a given prefix (matching uploaded images or selected images). The image files are moved to a deleted directory.

Arguments

$product_ref

$prefix

For selected images, the prefix is the image type and language code + the product revision. e.g. "ingredients_en.5" or "nutrition_fr.6".
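The prefix matching can be sketched as a simple filter over file names. The file names and the rule that the prefix must be followed by a dot are assumptions for illustration. The sketch only selects files; it does not move anything to a deleted directory.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the file names that start with "$prefix.".
# Requiring the dot stops "ingredients_en.5" from matching revision 50.
sub files_matching_prefix {
    my ($prefix, @files) = @_;
    return grep { /^\Q$prefix\E\./ } @files;
}

my @files = (
    "ingredients_en.5.full.jpg",
    "ingredients_en.5.400.jpg",
    "ingredients_en.50.400.jpg",
    "nutrition_fr.6.full.jpg",
);

print "$_\n" for files_matching_prefix("ingredients_en.5", @files);
```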

delete_uploaded_image_and_associated_selected_images ( $product_ref, $imgid )

Note: this function does not save the corresponding product; the caller must save it. This makes it possible to delete multiple images and save the updated product only once.

process_image_move ( $user_id, $code, $imgids, $move_to, $ownerid )

$move_to

The product code to which the image is moved, or 'trash' if the image is deleted.

$ownerid

Return values

The function returns an error message if there was an error, or undef if the operation was successful.

same_image_generation_parameters ( $generation_1_ref, $generation_2_ref )

This function checks if the image generation parameters are the same for two images.

normalize_generation_ref ( $generation_ref )

This function normalizes the generation_ref so that we store only useful values.

- If the image is not rotated, we don't store the angle.
- If the image is not cropped, we don't store the coordinates.
- If the image is not normalized, we don't store the normalize value.
- If the image is not processed with white magic, we don't store the white magic value.
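The rules above can be sketched as follows. The key names (angle, x1, y1, x2, y2, normalize, white_magic) are assumed from the process_image_crop() parameters. The tests for "not cropped" and for switched-off options are guesses, and comparing two normalized hashes is only one possible way to check that generation parameters are the same.

```perl
use strict;
use warnings;

# Assumption: undef, '', '0' and 'false' all mean that an option is off.
sub is_on {
    my ($value) = @_;
    return (defined $value and $value ne '' and $value ne '0' and $value ne 'false') ? 1 : 0;
}

# Hypothetical sketch: return a new hash that keeps only the useful values.
sub normalize_generation {
    my ($generation_ref) = @_;
    my %normalized;

    # No rotation: the angle is not stored.
    $normalized{angle} = $generation_ref->{angle} if $generation_ref->{angle};

    # Assumption: a crop needs 4 defined coordinates that form a non-empty box.
    my @coords = map { $generation_ref->{$_} } qw(x1 y1 x2 y2);
    my $defined_count = grep { defined $_ } @coords;
    if ($defined_count == 4 and $coords[2] > $coords[0] and $coords[3] > $coords[1]) {
        @normalized{qw(x1 y1 x2 y2)} = @coords;
    }

    $normalized{normalize}   = 'true' if is_on($generation_ref->{normalize});
    $normalized{white_magic} = 'true' if is_on($generation_ref->{white_magic});

    return \%normalized;
}

# Two sets of parameters are treated as the same when their normalized forms match.
sub same_generation {
    my ($first_ref, $second_ref) = map { normalize_generation($_) } @_;
    return 0 unless scalar(keys %$first_ref) == scalar(keys %$second_ref);
    foreach my $key (keys %$first_ref) {
        return 0 unless exists $second_ref->{$key} and $first_ref->{$key} eq $second_ref->{$key};
    }
    return 1;
}

my $normalized_ref = normalize_generation({angle => 90, x1 => -1, y1 => -1, x2 => -1, y2 => -1});
print join(",", sort keys %$normalized_ref), "\n";
```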

Arguments

$generation_ref

Return values

process_image_crop ( $user_id, $product_ref, $image_type, $image_lc, $imgid, $angle, $normalize, $white_magic, $x1, $y1, $x2, $y2, $coordinates_image_size )

Select and possibly crop an uploaded image to represent the front, ingredients, nutrition or packaging image in a specific language.

Return values

get_image_url ($product_ref, $image_ref, $size)

Note: $image_ref in selected.images.[image type].[image code] does not contain the id field with the image type and language code (which are keys). It must be added to $image_ref before calling this function.

get_image_in_best_language ($product_ref, $image_type, $target_lc)

We return the image object in the best language available for the image type, in order of preference:

- $target_lc
- main language of the product
- English
- any other available language (if any), in alphabetical order

Arguments

- $product_ref: the product reference
- $image_type: the image type (front, ingredients, nutrition, packaging)
- $target_lc: the target language code
- $image_lc_ref: a reference to return the language code of the image (optional)

Return values

- the image reference in the best language available, with an added "id" field containing the image type and language code (e.g. "front_en")
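The order of preference can be sketched as follows. The function name, the hash of available images keyed by language code, and passing the product's main language as a separate argument are assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical sketch: $available_ref maps language codes to image objects
# for one image type; the function returns the preferred language code.
sub best_language_for_image {
    my ($available_ref, $target_lc, $main_lc) = @_;

    # Preferred languages first, then everything else in alphabetical order.
    my @preferred = grep { defined $_ } ($target_lc, $main_lc, 'en');
    foreach my $lc (@preferred, sort keys %$available_ref) {
        return $lc if exists $available_ref->{$lc};
    }

    # No image at all for this image type.
    return;
}

my %front_images = (de => {imgid => 3}, it => {imgid => 1});
my $lc = best_language_for_image(\%front_images, "fr", "it");
print "front_$lc\n" if defined $lc;
```

In this usage example there is no French image, so the main language of the product (Italian) is picked, and the resulting id would be "front_it".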

add_images_urls_to_product ($product_ref, $target_lc, $specific_image_type = undef)

If it exists, the image for the target language will be returned, otherwise we will return the image in the main language of the product.

$specific_image_type

Optional parameter to specify the type of image to add. Default is to add all types.

data_to_display_image ( $product_ref, $image_type, $target_lc )

The resulting data structure can be passed to a template to generate HTML or the JSON data for a knowledge panel.

Arguments

Product reference $product_ref

Image type $image_type: one of [front|ingredients|nutrition|packaging]

Language code $target_lc

Return values

- Reference to a data structure with needed data to display.
- undef if no image is available for the requested image type

display_image ( $product_ref, $image_type, $target_lc, $size )

Arguments

Product reference $product_ref

Image type $image_type: one of [front|ingredients|nutrition|packaging]

Language code $target_lc

Size $size: one of $thumb_size, $small_size, $display_size

Return values

select_ocr_engine ($requested_ocr_engine)

Select the OCR engine to use based on the requested OCR engine and the available engines.

If the requested OCR engine is not available, we return the first available one.

Arguments

$requested_ocr_engine

Return values

- 'google_cloud_vision' if Google Cloud Vision API key is available
- 'tesseract' if Tesseract OCR is available
- undef if no OCR engine is available
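The selection logic can be sketched as follows. The availability hash is a stand-in for however the module detects an API key or a local Tesseract install. Treating Google Cloud Vision as the "first" available engine follows the order of the list above and is an assumption.

```perl
use strict;
use warnings;

# Hypothetical sketch: $available_ref holds a true value for each engine
# that can be used on this installation.
sub pick_ocr_engine {
    my ($requested, $available_ref) = @_;

    # Available engines, in the assumed order of preference.
    my @engines = grep { $available_ref->{$_} } qw(google_cloud_vision tesseract);

    # Honour the request when that engine is available.
    return $requested if defined $requested and grep { $_ eq $requested } @engines;

    # Otherwise fall back to the first available engine (undef if none).
    return $engines[0];
}

my $engine = pick_ocr_engine("tesseract", {google_cloud_vision => 1});
print defined $engine ? "$engine\n" : "no OCR engine available\n";
```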

extract_text_from_image( $product_ref, $id, $field, $ocr_engine, $results_ref )

Perform OCR for a specific image (either a source image, or a selected image) and return the results.

OCR can be performed with a locally installed Tesseract, or through Google Cloud Vision.

In the case of Google Cloud Vision, we also store the results of the OCR as a JSON file (requested through HTTP by Robotoff).

Arguments

product reference $product_ref

id of the image $id

Either a number like 1, 2, etc. to perform the OCR on a source image (1.jpg, 2.jpg), or a field name in the form of [front|ingredients|nutrition|packaging]_[2-letter language code].

field name $field

Field to update in the product object. e.g. ingredients_text_from_image, nutrition_text_from_image, packaging_text_from_image

Requested OCR engine $requested_ocr_engine

Either "tesseract" or "google_cloud_vision". Note: if the requested OCR engine is not available, we will select the first available one.

Results reference $results_ref

send_image_to_cloud_vision ($image_path, $json_file, $features_ref, $gv_logs)

Arguments

$image_path - str path to image

$json_file - str path to the file where we will store OCR result as gzipped JSON

$features_ref - hash reference - the "features" parameter of Google Cloud Vision

This determines which detections will be performed. Remember that each feature has a cost.

@CLOUD_VISION_FEATURES_FULL and @CLOUD_VISION_FEATURES_TEXT are two constants you can use.