Merge remote-tracking branch 'upstream/main' into deprecate_range_params
mayya-sharipova committed Sep 24, 2024
2 parents eafb33c + a3806cd commit 5c6cfb0
Showing 117 changed files with 3,987 additions and 499 deletions.
4 changes: 0 additions & 4 deletions .github/CODEOWNERS
@@ -70,7 +70,3 @@ server/src/main/java/org/elasticsearch/threadpool @elastic/es-core-infra
# Security
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/authz/privilege @elastic/es-security
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/authz/store/ReservedRolesStore.java @elastic/es-security

# Analytical engine
x-pack/plugin/esql @elastic/es-analytical-engine
x-pack/plugin/esql-core @elastic/es-analytical-engine
6 changes: 6 additions & 0 deletions docs/changelog/112972.yaml
@@ -0,0 +1,6 @@
pr: 112972
summary: "ILM: Add `total_shards_per_node` setting to searchable snapshot"
area: ILM+SLM
type: enhancement
issues:
- 112261
5 changes: 5 additions & 0 deletions docs/changelog/113013.yaml
@@ -0,0 +1,5 @@
pr: 113013
summary: Account for `DelayedBucket` before reduction
area: Aggregations
type: enhancement
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/113158.yaml
@@ -0,0 +1,5 @@
pr: 113158
summary: Adds a new Inference API for streaming responses back to the user.
area: Machine Learning
type: enhancement
issues: []
6 changes: 6 additions & 0 deletions docs/changelog/113183.yaml
@@ -0,0 +1,6 @@
pr: 113183
summary: "ESQL: TOP support for strings"
area: ES|QL
type: feature
issues:
- 109849
6 changes: 6 additions & 0 deletions docs/changelog/113373.yaml
@@ -0,0 +1,6 @@
pr: 113373
summary: Implement `parseBytesRef` for `TimeSeriesRoutingHashFieldType`
area: TSDB
type: bug
issues:
- 112399
5 changes: 5 additions & 0 deletions docs/changelog/113385.yaml
@@ -0,0 +1,5 @@
pr: 113385
summary: Small performance improvement in h3 library
area: Geo
type: enhancement
issues: []
4 changes: 2 additions & 2 deletions docs/plugins/analysis-icu.asciidoc
@@ -380,7 +380,7 @@ GET /my-index-000001/_search <3>
--------------------------

<1> The `name` field uses the `standard` analyzer, and so support full text queries.
<1> The `name` field uses the `standard` analyzer, and so supports full text queries.
<2> The `name.sort` field is an `icu_collation_keyword` field that will preserve the name as
a single token doc values, and applies the German ``phonebook'' order.
<3> An example query which searches the `name` field and sorts on the `name.sort` field.
@@ -467,7 +467,7 @@ differences.
`case_first`::

Possible values: `lower` or `upper`. Useful to control which case is sorted
first when case is not ignored for strength `tertiary`. The default depends on
first when the case is not ignored for strength `tertiary`. The default depends on
the collation.

`numeric`::
4 changes: 2 additions & 2 deletions docs/plugins/analysis-kuromoji.asciidoc
@@ -86,7 +86,7 @@ The `kuromoji_iteration_mark` normalizes Japanese horizontal iteration marks

`normalize_kanji`::

Indicates whether kanji iteration marks should be normalize. Defaults to `true`.
Indicates whether kanji iteration marks should be normalized. Defaults to `true`.

`normalize_kana`::

@@ -194,7 +194,7 @@ PUT kuromoji_sample
+
--
Additional expert user parameters `nbest_cost` and `nbest_examples` can be used
to include additional tokens that most likely according to the statistical model.
to include additional tokens that are most likely according to the statistical model.
If both parameters are used, the largest number of both is applied.

`nbest_cost`::
2 changes: 1 addition & 1 deletion docs/plugins/analysis-nori.asciidoc
@@ -452,7 +452,7 @@ Which responds with:
The `nori_number` token filter normalizes Korean numbers
to regular Arabic decimal numbers in half-width characters.

Korean numbers are often written using a combination of Hangul and Arabic numbers with various kinds punctuation.
Korean numbers are often written using a combination of Hangul and Arabic numbers with various kinds of punctuation.
For example, 3.2천 means 3200.
This filter does this kind of normalization and allows a search for 3200 to match 3.2천 in text,
but can also be used to make range facets based on the normalized numbers and so on.
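
A minimal sketch of how this normalization could be tried out with the analyze API, assuming the `analysis-nori` plugin is installed. The inline tokenizer definition and the sample text are illustrative choices, not part of this change:

[source,console]
--------------------------------------------------
GET _analyze
{
  "tokenizer": {
    "type": "nori_tokenizer",
    "discard_punctuation": "false" <1>
  },
  "filter": [ "nori_number" ],
  "text": "3.2천"
}
--------------------------------------------------

<1> Assumption: punctuation is kept by the tokenizer so that the decimal point is still present when the `nori_number` filter runs.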
50 changes: 25 additions & 25 deletions docs/plugins/development/creating-stable-plugins.asciidoc
@@ -1,24 +1,24 @@
[[creating-stable-plugins]]
=== Creating text analysis plugins with the stable plugin API

Text analysis plugins provide {es} with custom {ref}/analysis.html[Lucene
analyzers, token filters, character filters, and tokenizers].
Text analysis plugins provide {es} with custom {ref}/analysis.html[Lucene
analyzers, token filters, character filters, and tokenizers].

[discrete]
==== The stable plugin API

Text analysis plugins can be developed against the stable plugin API. This API
consists of the following dependencies:

* `plugin-api` - an API used by plugin developers to implement custom {es}
* `plugin-api` - an API used by plugin developers to implement custom {es}
plugins.
* `plugin-analysis-api` - an API used by plugin developers to implement analysis
plugins and integrate them into {es}.
* `lucene-analysis-common` - a dependency of `plugin-analysis-api` that contains
core Lucene analysis interfaces like `Tokenizer`, `Analyzer`, and `TokenStream`.

For new versions of {es} within the same major version, plugins built against
this API do not need to be recompiled. Future versions of the API will be
this API does not need to be recompiled. Future versions of the API will be
backwards compatible and plugins are binary compatible with future versions of
{es}. In other words, once you have a working artifact, you can re-use it when
you upgrade {es} to a new bugfix or minor version.
@@ -48,9 +48,9 @@ require code changes.

Stable plugins are ZIP files composed of JAR files and two metadata files:

* `stable-plugin-descriptor.properties` - a Java properties file that describes
* `stable-plugin-descriptor.properties` - a Java properties file that describes
the plugin. Refer to <<plugin-descriptor-file-{plugin-type}>>.
* `named_components.json` - a JSON file mapping interfaces to key-value pairs
* `named_components.json` - a JSON file mapping interfaces to key-value pairs
of component names and implementation classes.

Note that only JAR files at the root of the plugin are added to the classpath
@@ -65,7 +65,7 @@ you use this plugin. However, you don't need Gradle to create plugins.

The {es} Github repository contains
{es-repo}tree/main/plugins/examples/stable-analysis[an example analysis plugin].
The example `build.gradle` build script provides a good starting point for
The example `build.gradle` build script provides a good starting point for
developing your own plugin.

[discrete]
@@ -77,52 +77,52 @@ Plugins are written in Java, so you need to install a Java Development Kit
[discrete]
===== Step by step

. Create a directory for your project.
. Create a directory for your project.
. Copy the example `build.gradle` build script to your project directory. Note
that this build script uses the `elasticsearch.stable-esplugin` gradle plugin to
build your plugin.
. Edit the `build.gradle` build script:
** Add a definition for the `pluginApiVersion` and matching `luceneVersion`
variables to the top of the file. You can find these versions in the
`build-tools-internal/version.properties` file in the {es-repo}[Elasticsearch
** Add a definition for the `pluginApiVersion` and matching `luceneVersion`
variables to the top of the file. You can find these versions in the
`build-tools-internal/version.properties` file in the {es-repo}[Elasticsearch
Github repository].
** Edit the `name` and `description` in the `esplugin` section of the build
script. This will create the plugin descriptor file. If you're not using the
`elasticsearch.stable-esplugin` gradle plugin, refer to
** Edit the `name` and `description` in the `esplugin` section of the build
script. This will create the plugin descriptor file. If you're not using the
`elasticsearch.stable-esplugin` gradle plugin, refer to
<<plugin-descriptor-file-{plugin-type}>> to create the file manually.
** Add module information.
** Ensure you have declared the following compile-time dependencies. These
dependencies are compile-time only because {es} will provide these libraries at
** Ensure you have declared the following compile-time dependencies. These
dependencies are compile-time only because {es} will provide these libraries at
runtime.
*** `org.elasticsearch.plugin:elasticsearch-plugin-api`
*** `org.elasticsearch.plugin:elasticsearch-plugin-analysis-api`
*** `org.apache.lucene:lucene-analysis-common`
** For unit testing, ensure these dependencies have also been added to the
** For unit testing, ensure these dependencies have also been added to the
`build.gradle` script as `testImplementation` dependencies.
. Implement an interface from the analysis plugin API, annotating it with
. Implement an interface from the analysis plugin API, annotating it with
`NamedComponent`. Refer to <<example-text-analysis-plugin>> for an example.
. You should now be able to assemble a plugin ZIP file by running:
+
[source,sh]
----
gradle bundlePlugin
----
The resulting plugin ZIP file is written to the `build/distributions`
The resulting plugin ZIP file is written to the `build/distributions`
directory.

[discrete]
===== YAML REST tests

The Gradle `elasticsearch.yaml-rest-test` plugin enables testing of your
plugin using the {es-repo}blob/main/rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/README.asciidoc[{es} yamlRestTest framework].
The Gradle `elasticsearch.yaml-rest-test` plugin enables testing of your
plugin using the {es-repo}blob/main/rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/README.asciidoc[{es} yamlRestTest framework].
These tests use a YAML-formatted domain language to issue REST requests against
an internal {es} cluster that has your plugin installed, and to check the
results of those requests. The structure of a YAML REST test directory is as
an internal {es} cluster that has your plugin installed, and to check the
results of those requests. The structure of a YAML REST test directory is as
follows:

* A test suite class, defined under `src/yamlRestTest/java`. This class should
* A test suite class, defined under `src/yamlRestTest/java`. This class should
extend `ESClientYamlSuiteTestCase`.
* The YAML tests themselves should be defined under
* The YAML tests themselves should be defined under
`src/yamlRestTest/resources/test/`.

[[plugin-descriptor-file-stable]]
2 changes: 1 addition & 1 deletion docs/plugins/discovery-azure-classic.asciidoc
@@ -148,7 +148,7 @@ Before starting, you need to have:
--

You should follow http://azure.microsoft.com/en-us/documentation/articles/linux-use-ssh-key/[this guide] to learn
how to create or use existing SSH keys. If you have already did it, you can skip the following.
how to create or use existing SSH keys. If you have already done it, you can skip the following.

Here is a description on how to generate SSH keys using `openssl`:

2 changes: 1 addition & 1 deletion docs/plugins/discovery-gce.asciidoc
@@ -478,7 +478,7 @@ discovery:
seed_providers: gce
--------------------------------------------------

Replaces `project_id` and `zone` with your settings.
Replace `project_id` and `zone` with your settings.

To run test:

4 changes: 2 additions & 2 deletions docs/plugins/integrations.asciidoc
@@ -91,7 +91,7 @@ Integrations are not plugins, but are external tools or modules that make it eas
Elasticsearch Grails plugin.

* https://hibernate.org/search/[Hibernate Search]
Integration with Hibernate ORM, from the Hibernate team. Automatic synchronization of write operations, yet exposes full Elasticsearch capabilities for queries. Can return either Elasticsearch native or re-map queries back into managed entities loaded within transaction from the reference database.
Integration with Hibernate ORM, from the Hibernate team. Automatic synchronization of write operations, yet exposes full Elasticsearch capabilities for queries. Can return either Elasticsearch native or re-map queries back into managed entities loaded within transactions from the reference database.

* https://github.com/spring-projects/spring-data-elasticsearch[Spring Data Elasticsearch]:
Spring Data implementation for Elasticsearch
@@ -104,7 +104,7 @@ Integrations are not plugins, but are external tools or modules that make it eas

* https://pulsar.apache.org/docs/en/io-elasticsearch[Apache Pulsar]:
The Elasticsearch Sink Connector is used to pull messages from Pulsar topics
and persist the messages to a index.
and persist the messages to an index.

* https://micronaut-projects.github.io/micronaut-elasticsearch/latest/guide/index.html[Micronaut Elasticsearch Integration]:
Integration of Micronaut with Elasticsearch
2 changes: 1 addition & 1 deletion docs/plugins/mapper-annotated-text.asciidoc
@@ -143,7 +143,7 @@ broader positional queries e.g. finding mentions of a `Guitarist` near to `strat

WARNING: Any use of `=` signs in annotation values eg `[Prince](person=Prince)` will
cause the document to be rejected with a parse failure. In future we hope to have a use for
the equals signs so wil actively reject documents that contain this today.
the equals signs so will actively reject documents that contain this today.

[[annotated-text-synthetic-source]]
===== Synthetic `_source`
4 changes: 2 additions & 2 deletions docs/plugins/store-smb.asciidoc
@@ -10,7 +10,7 @@ include::install_remove.asciidoc[]
==== Working around a bug in Windows SMB and Java on windows

When using a shared file system based on the SMB protocol (like Azure File Service) to store indices, the way Lucene
open index segment files is with a write only flag. This is the _correct_ way to open the files, as they will only be
opens index segment files is with a write only flag. This is the _correct_ way to open the files, as they will only be
used for writes and allows different FS implementations to optimize for it. Sadly, in windows with SMB, this disables
the cache manager, causing writes to be slow. This has been described in
https://issues.apache.org/jira/browse/LUCENE-6176[LUCENE-6176], but it affects each and every Java program out there!.
@@ -44,7 +44,7 @@ This can be configured for all indices by adding this to the `elasticsearch.yml`
index.store.type: smb_nio_fs
----

Note that setting will be applied for newly created indices.
Note that settings will be applied for newly created indices.

It can also be set on a per-index basis at index creation time:

48 changes: 48 additions & 0 deletions docs/reference/esql/functions/kibana/definition/top.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions docs/reference/esql/functions/types/top.asciidoc

5 changes: 4 additions & 1 deletion docs/reference/ilm/actions/ilm-searchable-snapshot.asciidoc
@@ -19,7 +19,7 @@ index>> prefixed with `partial-` to the frozen tier. In other phases, the action

In the frozen tier, the action will ignore the setting
<<total-shards-per-node,`index.routing.allocation.total_shards_per_node`>>, if it was present in the original index,
to account for the difference in the number of nodes between the frozen and the other tiers.
to account for the difference in the number of nodes between the frozen and the other tiers. To set <<total-shards-per-node,`index.routing.allocation.total_shards_per_node`>> for searchable snapshots, set the `total_shards_per_node` option in the frozen phase's `searchable_snapshot` action within the ILM policy.


WARNING: Don't include the `searchable_snapshot` action in both the hot and cold
@@ -74,6 +74,9 @@ will be performed on the hot nodes. If using a `searchable_snapshot` action in t
force merge will be performed on whatever tier the index is *prior* to the `cold` phase (either
`hot` or `warm`).

`total_shards_per_node`::
The maximum number of shards (replicas and primaries) that will be allocated to a single node for the searchable snapshot index. Defaults to unbounded.
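
A minimal sketch of how this option might appear in an ILM policy. The policy name `my_policy`, the repository name `backing_repo`, the `min_age` value, and the limit of `1` are hypothetical and only for illustration:

[source,console]
--------------------------------------------------
PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "frozen": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "backing_repo", <1>
            "total_shards_per_node": 1 <2>
          }
        }
      }
    }
  }
}
--------------------------------------------------

<1> Hypothetical repository name; replace it with a snapshot repository registered in your cluster.
<2> Illustrative limit; omitting the option leaves the number of shards per node unbounded.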

[[ilm-searchable-snapshot-ex]]
==== Examples
////
30 changes: 30 additions & 0 deletions docs/reference/mapping/params/ignore-above.asciidoc
@@ -57,3 +57,33 @@ NOTE: The value for `ignore_above` is the _character count_, but Lucene counts
bytes. If you use UTF-8 text with many non-ASCII characters, you may want to
set the limit to `32766 / 4 = 8191` since UTF-8 characters may occupy at most
4 bytes.

[[index-mapping-ignore-above]]
=== `index.mapping.ignore_above`

The `ignore_above` setting, typically used at the field level, can also be applied at the index level using
`index.mapping.ignore_above`. This setting lets you define a maximum string length for all applicable fields across
the index, including `keyword`, `wildcard`, and keyword values in `flattened` fields. Any values that exceed this
limit will be ignored during indexing and won’t be stored.

This index-wide setting ensures a consistent approach to managing excessively long values. It works the same as the
field-level setting—if a string’s length goes over the specified limit, that string won’t be indexed or stored.
When dealing with arrays, each element is evaluated separately, and only the elements that exceed the limit are ignored.

[source,console]
--------------------------------------------------
PUT my-index-000001
{
  "settings": {
    "index.mapping.ignore_above": 256
  }
}
--------------------------------------------------

In this example, all applicable fields in `my-index-000001` will ignore any strings longer than 256 characters.

TIP: You can override this index-wide setting for specific fields by specifying a custom `ignore_above` value in the
field mapping.

NOTE: Just like the field-level `ignore_above`, this setting only affects indexing and storage. The original values
are still available in the `_source` field if `_source` is enabled, which is the default behavior in Elasticsearch.
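
As a sketch of the override described in the tip above, the request below combines the index-wide setting with a field-level `ignore_above`. The index name `my-index-000002`, the field names, and the limits are hypothetical:

[source,console]
--------------------------------------------------
PUT my-index-000002
{
  "settings": {
    "index.mapping.ignore_above": 256 <1>
  },
  "mappings": {
    "properties": {
      "tag": {
        "type": "keyword" <2>
      },
      "message": {
        "type": "keyword",
        "ignore_above": 512 <3>
      }
    }
  }
}
--------------------------------------------------

<1> Index-wide limit for applicable fields that do not set their own value.
<2> No field-level value, so this hypothetical field would fall back to the index-wide limit of 256.
<3> Field-level value that takes precedence over the index-wide limit for this hypothetical field.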