ProductOpener::Export - export products data in CSV format
ProductOpener::Export is used to export the data of all populated fields of products matching a given MongoDB search query in Open Food Facts CSV format (UTF-8 encoding, tab-separated).
use ProductOpener::Export qw/:all/;
export_csv({ filehandle => *STDOUT, query => { countries_tags => "en:france", labels_tags => "en:organic" } });
Only columns that are not completely empty will be included in the resulting CSV file. This avoids generating CSV files with thousands of empty columns (e.g. all possible nutrients, and all the language-specific fields like ingredients_text_[language code] for each of the hundreds of possible languages).
Fields that are computed from other fields, and are therefore not directly provided by users or producers, are not exported by default. They can be exported by passing a list of extra fields:
export_csv( { filehandle=>$fh, extra_fields=>[qw(nova_group nutrition_grade_fr)] });
It is also possible to restrict the set of fields to be exported:
export_csv( { filehandle=>$fh, fields=>[qw(code ingredients_text_en additives_tags)] });
This module is used in particular to export product data provided by manufacturers on the producers platform so that it can then be imported into the public database.
In the producers platform, the export_csv function is executed through a Minion worker.
It is also used in the scripts/export_csv_file.pl script.
The list of fields from Product::Opener::Config::options{import_export_fields_groups} and the list of nutrients from Product::Opener::Food::nutriments_tables are used to determine the fields that need to be exported.
The results of the query are scanned a first time to compute the list of non-empty columns.
The results of the query are scanned a second time to output the CSV file.
This two-phase approach avoids having to store all the product data in memory.
If the fields to export are specified with the fields parameter, the first phase is skipped.
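The two-phase scan described above can be sketched in plain Perl. This is only an illustration, not the module's code: the in-memory @products list stands in for the MongoDB query results, and @candidate_fields, non_empty_columns() and write_rows() are hypothetical names. The sketch also does no quoting or escaping of values.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the query results: a plain list of product hashes.
my @products = (
    { code => '001', product_name_en => 'Wheat beer', brands => '' },
    { code => '002', product_name_en => 'Oat drink',  labels => 'Organic' },
);

# Hypothetical list of candidate fields, in output order.
my @candidate_fields = qw(code product_name_en brands labels quantity);

# Phase 1: keep the candidate fields that are non-empty in at least one product.
sub non_empty_columns {
    my ($products_ref, $fields_ref) = @_;
    my %populated;
    foreach my $product (@{$products_ref}) {
        foreach my $field (@{$fields_ref}) {
            my $value = $product->{$field};
            $populated{$field} = 1 if defined $value and $value ne '';
        }
    }
    return grep { $populated{$_} } @{$fields_ref};
}

# Phase 2: write a header line, then one tab-separated row per product.
# Returns the number of rows written.
sub write_rows {
    my ($fh, $products_ref, $columns_ref) = @_;
    print {$fh} join("\t", @{$columns_ref}), "\n";
    my $count = 0;
    foreach my $product (@{$products_ref}) {
        print {$fh} join("\t", map { $product->{$_} // '' } @{$columns_ref}), "\n";
        $count++;
    }
    return $count;
}

# An explicit field list (like the fields parameter) skips phase 1.
my @requested_fields = ();
my @columns = @requested_fields
    ? @requested_fields
    : non_empty_columns(\@products, \@candidate_fields);

my $count = write_rows(\*STDOUT, \@products, \@columns);
```

Only one product is inspected at a time in each phase, so memory use depends on the number of columns rather than on the number of products.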
export_csv() outputs data in CSV format for products matching a query.
Only non-empty columns are included. By default, fields that are computed from other fields are not included, but they can be exported using the extra_fields parameter.
Arguments are passed through a single hash reference with the following keys:
The filehandle key specifies the file handle to write to. It can point to a file on disk, to STDOUT, etc.
The query key is a hash ref that specifies the query that will be passed to MongoDB. Each key-value pair will be used to filter products with matching field values.
export_csv( { filehandle=>$fh, query => { categories_tags => "en:beers", ingredients_tags => "en:wheat" }});
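To make the filtering semantics concrete, here is a rough in-memory approximation of how such a query selects products. The real filtering is done by MongoDB, not by this code. The matches_query() helper and the sample products are hypothetical, and the sketch only covers simple equality, where a scalar in the query matches either an equal scalar field or an array field containing that value, and all pairs must match.

```perl
use strict;
use warnings;

# Returns 1 if the product satisfies every key-value pair of the query.
# Array fields (such as *_tags) match when they contain the wanted value.
sub matches_query {
    my ($product, $query) = @_;
    foreach my $field (keys %{$query}) {
        my $wanted = $query->{$field};
        my $value  = $product->{$field};
        return 0 if not defined $value;
        my @values = ref $value eq 'ARRAY' ? @{$value} : ($value);
        return 0 unless grep { $_ eq $wanted } @values;
    }
    return 1;
}

# Hypothetical sample data.
my @products = (
    { code => '001', categories_tags => ['en:beverages', 'en:beers'], ingredients_tags => ['en:wheat', 'en:hops'] },
    { code => '002', categories_tags => ['en:beers'],                 ingredients_tags => ['en:barley'] },
    { code => '003', categories_tags => ['en:breads'],                ingredients_tags => ['en:wheat'] },
);

my $query = { categories_tags => "en:beers", ingredients_tags => "en:wheat" };

my @matching_codes = map { $_->{code} } grep { matches_query($_, $query) } @products;
print join(",", @matching_codes), "\n";
```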
The extra_fields key is an array ref that specifies a list of additional fields to export, including fields that are computed from other fields such as the NOVA group or the Nutri-Score nutritional grade.
Columns for the extra fields will be added after the columns for the populated fields from users and producers.
export_csv({ filehandle=>$fh, extra_fields => [qw(nova_group nutrition_grade_fr)] });
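The resulting column order can be illustrated with a short sketch. The ordered_columns() helper and the field lists are hypothetical, and skipping an extra field that is already among the populated columns is an assumption made for this illustration rather than documented behavior.

```perl
use strict;
use warnings;

# Hypothetical columns found by scanning user- and producer-provided fields.
my @populated_fields = qw(code product_name_en ingredients_text_en);

# Extra fields requested by the caller; 'code' repeats a populated field on purpose.
my @extra_fields = qw(nova_group code nutrition_grade_fr);

# Appends the extra fields after the populated ones,
# skipping names that are already present (assumed behavior).
sub ordered_columns {
    my ($populated_ref, $extra_ref) = @_;
    my %seen = map { $_ => 1 } @{$populated_ref};
    return (@{$populated_ref}, grep { !$seen{$_}++ } @{$extra_ref});
}

my @columns = ordered_columns(\@populated_fields, \@extra_fields);
print join("\t", @columns), "\n";
```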
The fields key is an array ref that specifies the exact list of fields to export. Only the specified fields will be exported.
export_csv({ filehandle=>$fh, fields => [qw(code ingredients_text_en additives_tags)] });
If defined and not null, local file paths are exported for the selected front, ingredients and nutrition images in all languages.
This option is used in particular for exporting from the producers platform and importing to the public database.
Obsolete products are in the products_obsolete collection.
Count of the exported documents.
Note: if the max_count parameter is passed