ProductOpener::Tags - multilingual tags taxonomies (hierarchies of tags)
ProductOpener::Tags
provides functions to build multilingual tags taxonomies from source files,
to use those taxonomies to canonicalize lists of tags,
and to display them in different languages.
use ProductOpener::Tags qw/:all/;
This defines which fields are lists of values. Taxonomized fields are added to this initial list by retrieve_tags_taxonomy.
Return the value of a property for the first tag of a list that has this property.
Return the value of an inherited property for the first tag of a list that has this property, and the corresponding matching tag.
Return the value of a property for the first tag of a list that has this property that matches the regexp.
Iterating from the most specific category, try to get a property for a tag by exploring the taxonomy (using parents).
The property value if found.
The matching category id if we found a property value.
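The lookup described above can be illustrated with a minimal sketch. It is not the module's implementation: the `%direct_parents` and `%properties` hashes and the `sketch_*` function names are hypothetical, and the sketch assumes the categories list is ordered from least to most specific.

```perl
use strict;
use warnings;

# Hypothetical toy taxonomy (not the real ProductOpener data structures).
my %direct_parents = (
    'en:whole-milks' => ['en:milks'],
    'en:milks'       => ['en:dairies'],
    'en:dairies'     => [],
    'en:beverages'   => [],
);
my %properties = (
    'en:dairies' => {'vegan:en' => 'no'},
);

# Walk up from one tag through its parents; return (value, matching tag id),
# or an empty list if no tag on the way up defines the property.
sub sketch_inherited_property {
    my ($tagid, $property) = @_;
    my @to_visit = ($tagid);
    my %seen;
    while (@to_visit) {
        my $current = shift @to_visit;
        next if $seen{$current}++;
        my $value = ($properties{$current} // {})->{$property};
        return ($value, $current) if defined $value;
        push @to_visit, @{$direct_parents{$current} // []};
    }
    return;
}

# Assumption: the list is ordered from least to most specific,
# so it is walked in reverse to start from the most specific category.
sub sketch_property_from_categories {
    my ($categories_ref, $property) = @_;
    foreach my $category (reverse @{$categories_ref}) {
        my ($value, $matching_tagid) = sketch_inherited_property($category, $property);
        return ($value, $matching_tagid) if defined $value;
    }
    return;
}

my ($value, $matching_tagid)
    = sketch_property_from_categories(['en:beverages', 'en:milks', 'en:whole-milks'], 'vegan:en');
print "$value (from $matching_tagid)\n";
```

Returning the matching tag id alongside the value lets the caller know which ancestor actually supplied the property.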
Try to get a set of properties for a tag by exploring the taxonomy (using parents).
This method takes into account properties that are explicitly defined as "undef": such a definition cuts the value only for the branch being explored, so a value may still be found through another parent branch.
Warning: the algorithm is a bit rough and may not work as you would expect on a DAG. It does not (currently) handle the exploration of nodes that are reached from multiple parents (in those cases you would expect to first explore the children from both branches). To make this work, the algorithm would have to explore the parents first and then decide the order, but this method is more eager in order to save time.
We may search for description:fr, but if fallback is ['xx', 'en'] and we find a description:xx or description:en property, we will use this value.
A ref to a hashmap where keys are property names and values are the values found. A property name that is not present was not found.
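The interaction between "undef" values, multiple parent branches and fallback languages can be sketched as follows. This is an illustration under stated assumptions, not the actual code: the toy data, the function name `sketch_inherited_properties` and the choice to key the result by the requested property name are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical toy taxonomy with two parent branches.
my %direct_parents = (
    'en:soy-yogurts'       => ['en:yogurts', 'en:plant-based-foods'],
    'en:yogurts'           => [],
    'en:plant-based-foods' => [],
);
my %properties = (
    'en:yogurts'           => {'vegan:en' => 'undef', 'description:en' => 'Fermented product'},
    'en:plant-based-foods' => {'vegan:en' => 'yes'},
);

# Eager depth-first search: a property found as "undef" is dropped for the
# current branch only; other parent branches can still provide a value.
sub sketch_inherited_properties {
    my ($tagid, $props_ref, $fallback_lcs_ref, $found_ref) = @_;
    $found_ref //= {};
    my $own_ref = $properties{$tagid} // {};
    my @still_wanted;
    foreach my $prop (@{$props_ref}) {
        next if exists $found_ref->{$prop};
        my ($name, $lc) = split /:/, $prop, 2;
        my @candidates = ($prop);
        if (defined $lc) {
            # Requested language first, then the fallback languages.
            @candidates = map { $name . ':' . $_ } ($lc, @{$fallback_lcs_ref});
        }
        my ($hit) = grep { defined $own_ref->{$_} } @candidates;
        if (not defined $hit) {
            push @still_wanted, $prop;
        }
        elsif ($own_ref->{$hit} ne 'undef') {
            $found_ref->{$prop} = $own_ref->{$hit};
        }
        # Otherwise the value is "undef": stop looking for it on this branch.
    }
    if (@still_wanted) {
        foreach my $parent (@{$direct_parents{$tagid} // []}) {
            sketch_inherited_properties($parent, \@still_wanted, $fallback_lcs_ref, $found_ref);
        }
    }
    return $found_ref;
}

my $found_ref = sketch_inherited_properties('en:soy-yogurts', ['vegan:en', 'description:fr'], ['xx', 'en']);
foreach my $prop (sort keys %{$found_ref}) {
    print "$prop => $found_ref->{$prop}\n";
}
```

In this toy data, `vegan:en` is cut on the yogurts branch but still resolved through the other parent, and `description:fr` is satisfied by the English fallback.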
Retrieve properties of a series of tags given in $tagids_ref and return them grouped by $prop_name, also fetching $props_ref and $inherited_props_ref.
A ref to a hashmap, where keys are the values of property $prop_name, and values are in turn hashmaps where keys are tag ids, and values are a hashmap of properties and their values.
Tags for which the property is undefined are grouped under the "undef" value.
For example, if we ask for quality tags grouped by fix_action, while also getting descriptions, the result looks like:

    {
        "add_nutrition_facts" => {
            "en:kcal-does-not-match-other-nutrients" => {
                "description:en" => "Kcal is not matching value computed from other nutriments"
            },
            "en:kcal-does-not-match-kj" => {
                "description:en" => "Kcal is not matching kJ value"
            },
        },
        "add_categories" => {
            "en:detected-category-baby-milk" => {
                "description:en" => "Detected category … may be missing baby milks"
            }
        }
    }
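A minimal sketch of this grouping, under assumptions: the `%properties` hash, the `fix_action:en` key, the third tag and the function name are hypothetical, and inherited properties are left out for brevity.

```perl
use strict;
use warnings;

# Hypothetical property store; the last entry is made up to show the "undef" group.
my %properties = (
    'en:kcal-does-not-match-other-nutrients' => {
        'fix_action:en'  => 'add_nutrition_facts',
        'description:en' => 'Kcal is not matching value computed from other nutriments',
    },
    'en:kcal-does-not-match-kj' => {
        'fix_action:en'  => 'add_nutrition_facts',
        'description:en' => 'Kcal is not matching kJ value',
    },
    'en:hypothetical-tag-without-fix-action' => {
        'description:en' => 'Hypothetical entry with no fix action',
    },
);

# Group tags by the value of $prop_name, keeping only the requested properties.
sub sketch_group_tags_by_property {
    my ($tagids_ref, $prop_name, $props_ref) = @_;
    my %grouped;
    foreach my $tagid (@{$tagids_ref}) {
        my $own_ref = $properties{$tagid} // {};
        my $group = $own_ref->{$prop_name} // 'undef';
        my %wanted;
        foreach my $prop (@{$props_ref}) {
            $wanted{$prop} = $own_ref->{$prop} if defined $own_ref->{$prop};
        }
        $grouped{$group}{$tagid} = \%wanted;
    }
    return \%grouped;
}

my $grouped_ref
    = sketch_group_tags_by_property([sort keys %properties], 'fix_action:en', ['description:en']);
foreach my $group (sort keys %{$grouped_ref}) {
    print "$group: ", join(', ', sort keys %{$grouped_ref->{$group}}), "\n";
}
```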
Remove stopwords (that are specific to each category) from the start or end of a string that has not been normalized. This function differs from remove_stopwords(), which works on normalized tags instead of strings and also removes stopwords in the middle.
The type of the tag (e.g. categories, labels, allergens)
The language the string is in.
The string to remove stopwords from.
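As an illustration of stripping stopwords only at the edges of a raw string, here is a small sketch. The stopword list, the `%stopwords` layout and the function name are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical stopwords, keyed by tag type and language.
my %stopwords = (
    labels => {en => ['the', 'certified', 'label']},
);

# Remove stopwords repeatedly from the start and from the end of the string;
# stopwords in the middle are left untouched.
sub sketch_remove_stopwords_from_start_or_end {
    my ($tagtype, $lc, $string) = @_;
    my @words = @{($stopwords{$tagtype} // {})->{$lc} // []};
    return $string if not @words;
    my $alternation = join '|', map { quotemeta($_) } @words;
    1 while $string =~ s/^(?:$alternation)\s+//i;
    1 while $string =~ s/\s+(?:$alternation)$//i;
    return $string;
}

print sketch_remove_stopwords_from_start_or_end('labels', 'en', 'The Organic Farming label'), "\n";
```

Requiring whitespace next to the stopword keeps words that merely start with a stopword (such as "theory") intact.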
Remove stopwords (that are specific to each category) from a normalized tag.
The type of the tag (e.g. categories, labels, allergens)
The language the tagid is in.
Lowercased, unaccented depending on language, non-alphanumeric chars turned to dash.
Sanitize a taxonomy line before processing
Search for "current tag" (tag at start of line) for a given tag
An optional prefix to display errors if we had to use stopwords / plurals.
If empty, no warning will be displayed.
Build taxonomy from the taxonomy file
Taxonomy will be stored in global hash maps under the entry $tagtype
Like "categories", "ingredients"
Build all taxonomies, including the test taxonomy
Generate an extract of the taxonomy for a specific set of tags.
Options:
- fields: comma separated list of fields (e.g. "name,description,vegan:en,inherited:vegetarian:en"). Properties can be requested with their name (e.g. "description") or name + a specific language (e.g. "vegan:en"). Only properties directly defined for the entry are returned. To include inherited properties from parents, prefix the property with "inherited:" (e.g. "inherited:vegan:en").
- include_parents: include entries for all direct parents of the requested tags
- include_children: include entries for all direct children of the requested tags
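To make the syntax of the fields option concrete, the following sketch splits such a string into requests. The function name and the returned structure are hypothetical, and the sketch treats every field the same way, whereas the real function may handle fields like "name" specially.

```perl
use strict;
use warnings;

# Split a "fields" option into one request per field:
# property name, optional language code, and whether inheritance was asked for.
sub sketch_parse_fields {
    my ($fields) = @_;
    my @requests;
    foreach my $field (split /\s*,\s*/, $fields) {
        my $spec = $field;
        my $inherited = ($spec =~ s/^inherited://) ? 1 : 0;
        my ($name, $lc) = split /:/, $spec, 2;
        push @requests, {name => $name, lc => $lc, inherited => $inherited};
    }
    return \@requests;
}

foreach my $request (@{sketch_parse_fields('name,description,vegan:en,inherited:vegetarian:en')}) {
    printf "%s lc=%s inherited=%d\n", $request->{name}, $request->{lc} // '(any)', $request->{inherited};
}
```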
Languages for which we want to extract names, synonyms, properties.
Initialize all taxonomies. This function is called when the Tags.pm module is loaded, in order to load all available taxonomies, as most scripts / modules that load Tags.pm expect taxonomies to be loaded.
It is also called by the lib/startup_apache.pl startup script with $die_if_some_taxonomies_cannot_be_loaded set to 1.
If set to 1, the function will die if some taxonomies cannot be loaded.
Returns a link to the canonicalized tag
Can be "-" to indicate that the tag is a negative tag.
If an image is associated with a tag, return its relative URL, otherwise return undef.
The desired language for the image. If an image is not available in the target language, it can be returned in English or in the tag language.
The type of the tag (e.g. categories, labels, allergens)
Generates a comma separated list of tags in the target language, with links and images.
The type of the tag (e.g. categories, labels, allergens)
Reference to a list of tags. (usually the *_tags field corresponding to the tag type)
Generates a comma separated (with a space after the comma) list of tags in the target language.
The type of the tag (e.g. categories, labels, allergens)
Reference to a list of tags. (usually the *_tags field corresponding to the tag type)
The tags are expected to be in their canonical format.
Canonicalize a string to check if it matches an entry in a taxonomy, and die otherwise.
This function is used during initialization, to check that some initialization data has matching entries in taxonomies.
The language of the string.
The type of the tag (e.g. categories, labels, allergens)
The string that we want to match to a tag.
A reference to a variable that will be assigned 1 if we found a matching taxonomy entry, or 0 otherwise.
If the string could be matched to an existing taxonomy entry, the canonical id for the entry is returned.
Otherwise, the function dies.
Canonicalize a string to check if it matches an entry in a taxonomy.
The language of the string.
The type of the tag (e.g. categories, labels, allergens)
The string that we want to match to a tag.
A reference to a variable that will be assigned 1 if we found a matching taxonomy entry, or 0 otherwise.
If the string could be matched to an existing taxonomy entry, the canonical id for the entry is returned.
Otherwise, we return the string prefixed with the language code (e.g. en:An unknown entry)
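The two behaviours described above (return a language-prefixed string for unknown entries, or die) can be sketched as follows. The `%synonyms` lookup table, the simplified normalization and the `sketch_*` names are assumptions for illustration and do not reflect the module's internals.

```perl
use strict;
use warnings;

# Hypothetical lookup table: tag type => language => normalized synonym => canonical id.
my %synonyms = (
    labels => {
        en => {'organic' => 'en:organic', 'organically-grown' => 'en:organic'},
    },
);

# Simplified normalization: lowercase, non-alphanumeric characters turned to dashes.
sub sketch_normalize {
    my ($string) = @_;
    my $tagid = lc $string;
    $tagid =~ s/[^a-z0-9]+/-/g;
    $tagid =~ s/^-+|-+$//g;
    return $tagid;
}

# Return the canonical id if the string matches an entry,
# otherwise the original string prefixed with the language code.
sub sketch_canonicalize_tag {
    my ($tag_lc, $tagtype, $tag, $exists_in_taxonomy_ref) = @_;
    my $tagid = sketch_normalize($tag);
    my $canonical = (($synonyms{$tagtype} // {})->{$tag_lc} // {})->{$tagid};
    ${$exists_in_taxonomy_ref} = defined $canonical ? 1 : 0;
    return $canonical // "${tag_lc}:$tag";
}

# Strict variant for initialization data: die if there is no matching entry.
sub sketch_canonicalize_tag_or_die {
    my ($tag_lc, $tagtype, $tag, $exists_in_taxonomy_ref) = @_;
    my $canonical = sketch_canonicalize_tag($tag_lc, $tagtype, $tag, $exists_in_taxonomy_ref);
    die "$tag ($tag_lc) does not match any $tagtype entry\n" if not ${$exists_in_taxonomy_ref};
    return $canonical;
}

my $exists;
print sketch_canonicalize_tag('en', 'labels', 'Organically grown', \$exists), " ($exists)\n";
print sketch_canonicalize_tag('en', 'labels', 'An unknown entry', \$exists), " ($exists)\n";
```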
Return all entries in a taxonomy.
- undef if the taxonomy does not exist or is not loaded
- or a list of all tags
Return all synonyms (including extended synonyms) in a specific language for a taxonomy entry.
- undef if the taxonomy does not exist or is not loaded, or if the tag does not exist
- or a list of all synonyms
Return the name of a tag for displaying it to the user. This function builds a cache of the resulting names, in order to reduce execution time. The cache is an ever-growing hash of input parameters. This function should only be used in batch scripts, and not in code called from the Apache mod_perl processes.
The tag translation if it exists in target language, otherwise, the tag id.
Return the name of a tag for displaying it to the user
The tag translation if it exists in target language, otherwise, the tag id.
A version of display_taxonomy_tag that removes the language prefix, if any.
The tag translation if it exists in target language, otherwise, the tag in its primary language
Return a relative link to a tag page.
Can be "-" to indicate that the tag is a negative tag.
Create regular expressions that will match entries of a taxonomy.
The type of the tag (e.g. categories, labels, allergens)
Either "unique_regexp" to get one single regexp for all entries of one language.
Or "list_of_regexps" to get a list of regexps (1 per entry) for each language. For each entry, we return an array with the entry id and the regexp for that entry, e.g. ['en:coffee',"coffee|coffees"]
A reference to a hash to enable options to indicate how to match:
- add_simple_plurals: in some languages, like French, we will allow an extra "s" at the end of entries
- add_simple_singulars: same with removing the "s" at the end of entries
- match_space_with_dash: spaces or dashes in entries will match either a space or a dash (e.g. "South America" will match "South-America")
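A rough sketch of how these options could shape the regexp for one entry is shown below. The function name and the exact way each option rewrites the pattern are assumptions; the real function may build its patterns differently.

```perl
use strict;
use warnings;

# Build [entry id, regexp source] for one entry from its synonyms.
sub sketch_entry_regexp {
    my ($tagid, $synonyms_ref, $options_ref) = @_;
    my @alternatives;
    foreach my $synonym (@{$synonyms_ref}) {
        my $pattern = quotemeta $synonym;
        if ($options_ref->{match_space_with_dash}) {
            # quotemeta escaped spaces and dashes; let each match either character.
            $pattern =~ s/\\[ \-]/[ -]/g;
        }
        if ($options_ref->{add_simple_plurals} and $synonym !~ /s$/) {
            $pattern .= 's?';    # allow an extra trailing "s"
        }
        elsif ($options_ref->{add_simple_singulars} and $synonym =~ /s$/) {
            $pattern =~ s/s$/s?/;    # make the trailing "s" optional
        }
        push @alternatives, $pattern;
    }
    return [$tagid, join('|', @alternatives)];
}

my $entry_ref = sketch_entry_regexp('en:south-america', ['South America'], {match_space_with_dash => 1});
my $regexp = $entry_ref->[1];
print "$entry_ref->[0]: $regexp\n";
print "matches South-America\n" if 'South-America' =~ /^(?:$regexp)$/;
```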
Comparison function for canonical tags entries in a taxonomy.
To be used as a sort function in a sort() call.
Each tag is converted to a string, by priority:
1 - the tag name in the target language
2 - the tag name in the xx language
3 - the tag id
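The priority order above can be sketched as a sort key with fallbacks. The `%names` hash and the function names are hypothetical, and the sketch compares the raw strings, whereas the real function may normalize them first.

```perl
use strict;
use warnings;

# Hypothetical translations: tag id => language => name.
my %names = (
    'en:apples'    => {en => 'Apples', fr => 'Pommes'},
    'en:e330'      => {xx => 'E330'},
    'en:zucchinis' => {en => 'Zucchinis'},
);

# Sort key: name in the target language, else name in "xx", else the tag id.
sub sketch_sort_key {
    my ($target_lc, $tagid) = @_;
    my $names_ref = $names{$tagid} // {};
    return $names_ref->{$target_lc} // $names_ref->{xx} // $tagid;
}

sub sketch_cmp_tags {
    my ($target_lc, $left, $right) = @_;
    return sketch_sort_key($target_lc, $left) cmp sketch_sort_key($target_lc, $right);
}

my @sorted = sort { sketch_cmp_tags('fr', $a, $b) } keys %names;
print join(', ', @sorted), "\n";
```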
The type of the tag (e.g. categories, labels, allergens)
Fetch knowledge content as HTML about additives, categories, etc.
This content is used in knowledge panels.
Content is stored as HTML files in `${lang_dir}/${target_lc}/knowledge_panels/${tagtype}`. We first check the existence of a file specific to the country specified by `${target_cc}`, with a fallback on `world` otherwise. This is useful to have a more specific description for some countries compared to the `world` base content.
The type of the tag (e.g. categories, labels, allergens)
The tag we want to match, with language prefix (ex: `en:e255`).
The user language as a 2-letter code (fr, it, ...)
The user country as a 2-letter code (fr, it, ch) or `world`
If content exists for the tag type, tag value, language code and country code, return the HTML text. Otherwise, return undef.
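The country-then-`world` fallback can be illustrated with the sketch below. Only the directory layout comes from the description above; the file naming scheme (tag id with ":" replaced by "_", followed by the country code), the function name and the demo data are assumptions.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Look for a country-specific HTML file first, then fall back to "world".
sub sketch_get_knowledge_content {
    my ($lang_dir, $tagtype, $tagid, $target_lc, $target_cc) = @_;
    (my $file_id = $tagid) =~ s/:/_/g;    # hypothetical file naming
    my $dir = "$lang_dir/$target_lc/knowledge_panels/$tagtype";
    foreach my $cc ($target_cc, 'world') {
        my $path = "$dir/${file_id}_$cc.html";
        next if not -e $path;
        open(my $fh, '<:encoding(UTF-8)', $path) or next;
        my $html = do { local $/; <$fh> };
        close $fh;
        return $html;
    }
    return;
}

# Demo in a temporary directory: only the "world" file exists.
my $lang_dir = tempdir(CLEANUP => 1);
my $dir = "$lang_dir/fr/knowledge_panels/additives";
make_path($dir);
open(my $out, '>:encoding(UTF-8)', "$dir/en_e255_world.html") or die "Cannot write: $!";
print {$out} "<p>Contenu de base</p>";
close $out;

my $html = sketch_get_knowledge_content($lang_dir, 'additives', 'en:e255', 'fr', 'ch');
print "$html\n";
```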