
Conversation

donoghuc
Contributor

@donoghuc donoghuc commented Oct 1, 2025

The custom `docker-setup.sh` script here is not materially different from the shared script in the `.ci` repository. The only difference was pulling and starting the elasticsearch image. This should not affect CI materially, and the consistency gain from using the same `docker-setup.sh` file everywhere is preferable to maintaining a slightly different script here.

@donoghuc donoghuc force-pushed the backport-ci-script-removal branch from 136d87a to 7684d55 on October 1, 2025 21:51
@donoghuc donoghuc closed this Oct 7, 2025
@donoghuc donoghuc reopened this Oct 7, 2025
@donoghuc
Contributor Author

Here is an interesting case...

The tests for 11.x are failing

context "after a bulk insert that generates errors" do
let(:bulk) { [
LogStash::Event.new("message" => "sample message here"),
LogStash::Event.new("message" => { "message" => "sample nested message here" }),
]}
it "increases bulk request with error metric" do
expect(bulk_request_metrics).to receive(:increment).with(:with_errors).once
expect(bulk_request_metrics).to_not receive(:increment).with(:successes)
subject.multi_receive(bulk)
end
it "increases number of successful and non retryable documents" do
expect(document_level_metrics).to receive(:increment).with(:dlq_routed).once
expect(document_level_metrics).to receive(:increment).with(:successes).once
subject.multi_receive(bulk)
end
because BOTH events report 201 success.

Looking at the log messages, we see this is likely due to how ES is configured with respect to data streams:

main

elasticsearch-1  | {"@timestamp":"2025-10-10T22:43:58,653][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated
elasticsearch-1  | {"@timestamp":"2025-10-10T22:43:58,704][INFO ][logstash.outputs.elasticsearch] Connected to ES instance
elasticsearch-1  | {"@timestamp":"2025-10-10T22:43:58,705][INFO ][logstash.outputs.elasticsearch] Elasticsearch version determined (9.2.0-SNAPSHOT)
elasticsearch-1  | {"@timestamp":"2025-10-10T22:43:58,718][INFO ][logstash.outputs.elasticsearch] Not eligible for data streams because config contains one or more settings that are not compatible with data streams: {"index"=>"custom_index_5644"}
elasticsearch-1  | {"@timestamp":"2025-10-10T22:43:58,720][INFO ][logstash.outputs.elasticsearch] Data streams auto configuration (`data_stream => auto` or unset) resolved to `false`

11.x

logstash-1       | [2025-10-10T22:54:40,928][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance
logstash-1       | [2025-10-10T22:54:40,930][INFO ][logstash.outputs.elasticsearch] Elasticsearch version determined (9.2.0-SNAPSHOT)
logstash-1       | [2025-10-10T22:54:40,940][INFO ][logstash.outputs.elasticsearch] Data streams auto configuration (`data_stream => auto` or unset) resolved to `true`
elasticsearch-1  | {"@timestamp":"2025-10-10T22:54:41.083Z", "log.level": "INFO", "message":"creating index [.ds-logs-generic-default-2025.10.10-000001]
elasticsearch-1  | {"@timestamp":"2025-10-10T22:54:41.087Z", "log.level": "INFO", "message":"adding data stream [logs-generic-default]

The test fails because no 400 is returned for the second event; instead we see:

elasticsearch-1  | {"@timestamp":"2025-10-10T22:54:41.222Z", "log.level": "INFO", "message":"Error while parsing document for index [.ds-logs-generic-default-2025.10.10-000001]: [1:184] failed to parse field [message] of type [match_only_text] in document with id 'AZnQVfkBuBfqAVVk0iNS'. Preview of field's value: '{message=sample nested message here}'", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[34f03d73d199][write][T#2]","log.logger":"org.elasticsearch.index.mapper.DocumentMapper"

elasticsearch-1  | {"@timestamp":"2025-10-10T22:54:41.233Z", "log.level": "INFO", "message":"creating index [.fs-logs-generic-default-2025.10.10-000002] in project [default], cause [rollover_failure_store], templates [provided in request], shards [1]/[1]"

elasticsearch-1  | {"@timestamp":"2025-10-10T22:54:41.235Z", "log.level": "INFO", "message":"rolling over data stream [logs-generic-default] to index [.fs-logs-generic-default-2025.10.10-000002] because it was marked for lazy rollover"

Is this an ES configuration issue? How should we update the 11.x plugin to account for this? Do we need to change the test? The test matrix?

@mashhurs
Contributor

> Is this an ES configuration issue? How should we update the 11.x plugin to account for this? Do we need to change the test? The test matrix?

Long story short, we missed backporting this change - https://github.com/logstash-plugins/logstash-output-elasticsearch/pull/1220/files#diff-4ac1f3b518168b2e30c23d2c20e6eb3c4fa19c45a560bf481bca5641d41e968a

If we apply the following change, CI will be happy, as I have validated locally.

diff --git a/spec/integration/outputs/metrics_spec.rb b/spec/integration/outputs/metrics_spec.rb
index 4fa31c1..08bd85e 100644
--- a/spec/integration/outputs/metrics_spec.rb
+++ b/spec/integration/outputs/metrics_spec.rb
@@ -5,7 +5,10 @@ describe "metrics", :integration => true do
     require "logstash/outputs/elasticsearch"
     settings = {
       "manage_template" => false,
-      "hosts" => "#{get_host_port()}"
+      "hosts" => "#{get_host_port()}",
+      # write data to a random non templated index to ensure the bulk partially fails
+      # don't use streams like "logs-*" as those have failure stores enabled, causing the bulk to succeed instead
+      "index" => "custom_index_#{rand(10000)}"
     }
     plugin = LogStash::Outputs::ElasticSearch.new(settings)
   end
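
Why this works, for the record: with an explicit `index` setting the output is not eligible for data streams, so `data_stream => auto` resolves to `false` (exactly the main-branch log lines above), documents land in a plain non-templated index with no failure store, and the malformed second event fails with a 400 as the spec expects. A minimal sketch of the two configurations, assuming the plugin's test environment and a local ES at 127.0.0.1:9200 (the spec itself uses `get_host_port()`):

require "logstash/outputs/elasticsearch"

# Before the fix: no "index" in settings, so data_stream auto-resolution
# yields true and writes target the logs-generic-default data stream,
# whose failure store absorbs the mapping error.
LogStash::Outputs::ElasticSearch.new("hosts" => "127.0.0.1:9200")

# After the fix: an explicit index is incompatible with data streams,
# so auto-resolution yields false and the bulk partially fails as the
# metrics spec expects.
LogStash::Outputs::ElasticSearch.new(
  "manage_template" => false,
  "hosts" => "127.0.0.1:9200",
  "index" => "custom_index_#{rand(10000)}"
)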

@donoghuc donoghuc requested a review from mashhurs October 13, 2025 22:12
Contributor

@mashhurs mashhurs left a comment

LGTM!

@mashhurs mashhurs merged commit 94ec542 into logstash-plugins:11.x Oct 14, 2025
3 checks passed