docs: how to use query language (#224)

A small first doc on query language…
openfoodfacts · Jul 25, 2024 · 5505542 · 5505542
1 parent 763c7d3
commit 5505542

Showing 2 changed files with 84 additions and 43 deletions.
diff --git a/README.md b/README.md
@@ -149,49 +149,7 @@ You should also import taxonomies:
 
 ### Using sort script
 
-In your index configuration, you can add scripts, used for personalized sorting.
-
-For example:
-```yaml
-    scripts:
-      personal_score:
-        # see https://www.elastic.co/guide/en/elasticsearch/painless/8.14/index.html
-        lang: painless
-        # the script source, here a trivial example
-        source: |-
-          doc[params["preferred_field"]].size > 0 ? doc[params["preferred_field"]].value : (doc[params["secondary_field"]].size > 0 ? doc[params["secondary_field"]].value : 0)
-        # gives an example of parameters
-        params:
-          preferred_field: "field1"
-          secondary_field: "field2"
-        # more non editable parameters, can be easier than to declare constants in the script
-        static_params:
-          param1 : "foo"
-```
-
-You then have to import this script in your elasticsearch instance, by running:
-
-```bash
-docker compose run --rm api python -m app sync-scripts
-```
-
-You can now use it with the POST API:
-```bash
-curl -X POST http://127.0.0.1:8000/search \
-  -H "Content-type: application/json" \
-  -d '{"q": "", "sort_by": "personal_score", "sort_params": {"preferred_field": "nova_group", "secondary_field": "last_modified_t"}}
-```
-
-Or you can now use it inside a the sort web-component:
-```html
-  <searchalicious-sort auto-refresh>
-    <searchalicious-sort-script script="personal_score" parameters='{"preferred_field": "nova_group", "secondary_field": "last_modified_t"}}'>
-      Personal preferences
-    </searchalicious-sort-script>
-  </searchalicious-sort>
-```
-even better the parameters might be retrieved for local storage.
-
+See [How to use scripts](./docs/users/how-to-use-scripts.md)
 
 ## Thank you to our sponsors !
 

diff --git a/docs/users/explain-query-language.md b/docs/users/explain-query-language.md
@@ -0,0 +1,83 @@
+# Explain Query Language
+
+Search-a-licious aims to provide a powerful yet easy-to-use API,
+built on a well-proven language: the Lucene Query Language.
+
+While Elasticsearch provides a way to use this language in queries,
+it has some important limitations, such as the lack of support for nested and object fields.
+
+Thanks to the [luqum library](https://github.com/jurismarches/luqum),
+Search-a-licious is able to use Lucene Query Language in a broader way.
+
+Search-a-licious also uses luqum to introspect the query
+and transform it, adding features that correspond to your configuration
+and leveraging taxonomies and other peculiarities.
+
+This makes it possible, for example, to take synonyms into account,
+or to add the languages to query on the fly,
+without making the query too complex at the API level.
+
+## Query syntax
+
+The query syntax is quite simple. You can:
+
+* query for a single word in the default text fields (those having the `full_text_search` property in your configuration)
+  by simply putting the word in your query:
+  ```
+  chocolate
+  ```
+  Entries with text `chocolate`
+* or match an exact phrase by using quotes:
+  ```
+  "dark chocolate"
+  ```
+  Entries with text `dark chocolate`
+* you can also match a word or phrase in a specific field by using the field name, followed by a colon:
+  ```
+  labels:organic
+  ```
+  Entries with labels containing `organic`
+* you can have more than one term, the query will try to match all terms:
+  ```
+  "dark chocolate" labels:organic
+  ```
+  Entries with text `dark chocolate` and labels containing `organic`
+* you can combine queries with `AND`, `OR` and `NOT` operators, and use parentheses to group them:
+  ```
+  "dark chocolate" AND (labels:organic OR labels:vegan) AND NOT nutriscore:(e OR d)
+  ```
+  Entries with text `dark chocolate`, labels containing `organic` or `vegan`, and a Nutri-Score that is neither `e` nor `d`
+* you can query a sub field by using "." or ":":
+  ```
+  nutrients.sugar_100g:[10 TO 15]
+  ```
+  equivalent to:
+  ```
+  nutrients:sugar_100g:[10 TO 15]
+  ```
+  Entries with sugar between 10 and 15 grams per 100g
+* in ranges, you can use `*` for unbounded values:
+  ```
+  nutrients.sugar_100g:[* TO 20] AND nutrients.proteins_100g:[2 TO *]
+  ```
+  Entries with at most 20 g of sugar and at least 2 g of proteins per 100 g
+* match field existence with `*`:
+  ```
+  nutriscore:*
+  ```
+  Entries with a computed Nutri-Score
+
+## Different types of fields
+
+When you created your configuration, you defined different field types.
+This matters because the matching possibilities are not the same for each type.
+
+In particular for text entries, there are two types of fields:
+* keyword fields, that are used for exact matching
+* full text fields, where part of the text can be matched, with complex matching possibilities
+
+There are also numeric and date fields, which can be used for range matching or in computations.
+
+## Full text queries
+
+**FIXME** add more on how we transform queries
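
As a rough illustration of the query syntax documented in this commit, here is a minimal Python sketch that composes a query string from the constructs listed under "Query syntax" and wraps it in a request for the POST API shown in the README curl example. The helper names (`phrase`, `field`, `number_range`, `all_of`, `any_of`, `build_search_request`) are hypothetical and not part of Search-a-licious. The local URL is taken from the README example, and the payload only sets `q`, which is an assumption: a real deployment may expect other fields as well.

```python
import json
from typing import Optional, Union
from urllib.request import Request

Number = Union[int, float]

# Assumption: the API runs locally, as in the README curl example.
SEARCH_URL = "http://127.0.0.1:8000/search"


def phrase(text: str) -> str:
    """Quote a phrase so it is matched exactly, escaping backslashes and inner quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def field(name: str, value: str) -> str:
    """Match a value in a specific field, e.g. labels:organic."""
    return f"{name}:{value}"


def number_range(name: str, low: Optional[Number] = None, high: Optional[Number] = None) -> str:
    """Build a range clause; a missing bound becomes `*` (unbounded)."""
    lower = "*" if low is None else low
    upper = "*" if high is None else high
    return f"{name}:[{lower} TO {upper}]"


def all_of(*clauses: str) -> str:
    """Combine clauses with AND."""
    return " AND ".join(clauses)


def any_of(*clauses: str) -> str:
    """Combine clauses with OR, grouped in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


def build_search_request(query: str) -> Request:
    """Prepare (but do not send) a POST request carrying the query as JSON."""
    body = json.dumps({"q": query}).encode("utf-8")
    return Request(
        SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


query = all_of(
    phrase("dark chocolate"),
    any_of(field("labels", "organic"), field("labels", "vegan")),
    number_range("nutrients.sugar_100g", high=20),
)
request = build_search_request(query)
print(query)
```

The request is only built, not sent. Passing it to `urllib.request.urlopen` would perform the call against a running instance.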