Explain configuration file#

The idea of Search-a-licious is that you provide all your configurations details in a central file, and all the rest works (at least for main scenarios).

The configuration file is a YAML file.

One configuration, multiple datasets#

A Search-a-licious instance only have one configuration file, but is capable of serving multiple datasets

It provides a section for each index you want to create (corresponding to a dataset).

If you have more than one dataset, one must be declared the default (see default_index)

Main sections#

For each indexe the main sections are:

index: some configuration of the Elasticsearch index
fields: the fields you want to put in the index, their type and other configurations
taxonomy: definitions of taxonomies that are used by this index
redis_stream_name and document_fetcher: if you use continuous updates, you will need to define one
preprocessor and result_processor are two fields enabling to handle specificities of your dataset.
scripts: to use sort by script (see How to use scripts)

Index configuration#

Search-a-licious is really based upon Elasticsearch,

This section provides some important fields to control the way it is used.

id_field_name is particularly important as it must contain a field that uniquely identifies each items. If you don't have such field, you might use preprocessor to compute one. It is important to have such an id to be able to use continuous updates.

last_modified_field_name is also important for continuous updates to decide where to start the event stream processing.

Fields#

This is one of the most important section.

It specifies what will be stored in your index, which fields will be searchable, and how.

You have to plan in advance how you configure this.

Think well about:

fields you want to search and how you want to search them
which information you need to display in search results
what you need to sort on
which facets you want to display
which charts you need to build

Changing this section will probably involve a full re-indexing of all your items.

Some typical configurations for fields:

A tags field that as values that are searched as an exact value (aka keyword), eg. a tag:

tags:
    type: keyword

An ingredients field that is used for full text search when no field is specified:

ingredients:
    type: text
    full_text_search: true

A field product_name that is used for full text search, but with multilingual support:

product_name:
    full_text_search: true
    type: text_lang

A scans_n field is an integer field:

scans_n:
    type: integer

A specific_warnings field that is used for full text search, but only when you specify the field:

specific_warnings:
    type: text

A field brands_tags that needs to be split in multiple values (according to split_separator option):

brands_tags:
    type: keyword
    split: true

A field labels_tags, that is used for exact match but with support of a taxonomy, and that can be used for faceting, and bar graph generation:

labels_tags:
    type: keyword
    taxonomy_name: label
    bucket_agg: true

Read more in the reference documentation.

Document fetcher, pre-processors and post-processors#

It is not always straight forward to index an item.

Search-a-licious offers a way for you to customize some critical operations using Python code.

preprocessor adapts you document before being indexed
whereas result_processor adapts each result returned by a search, keep it lightweight !
document_fetcher is only used for continuous updates to fetch documents using an API

Read more in the reference documentation.

Scripts#

You can also add scripts for sorting documents. See How to use scripts.