Make the river more resistant to bulk import failures #570

Open
PeterBackman opened this issue Nov 12, 2015 · 1 comment

@PeterBackman commented Nov 12, 2015

Hi,
in our system we can occasionally get documents that cannot be imported into ES. We do not have full control of the input, so we sometimes hit the 32k term limit in Lucene, which prevents the document from being inserted into ES. The bulk import then fails and the river is stopped.

Locally I made a patch so that afterBulk(long executionId, BulkRequest request, BulkResponse response) logs an error and continues without stopping the river. It seems to work fine.
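Roughly, the change looks like the sketch below (assuming the river feeds ES through a BulkProcessor; the class and logger names here are just illustrative, not the actual river code):

```java
import org.elasticsearch.action.bulk.BulkItemResponse;
import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.common.logging.ESLogger;
import org.elasticsearch.common.logging.ESLoggerFactory;

// Sketch: a BulkProcessor.Listener that logs failed bulk items and keeps the
// river running instead of stopping it. Names are illustrative.
public class LenientBulkListener implements BulkProcessor.Listener {

    private static final ESLogger logger = ESLoggerFactory.getLogger("river.bulk");

    @Override
    public void beforeBulk(long executionId, BulkRequest request) {
        // nothing to do before the bulk is sent
    }

    @Override
    public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
        if (response.hasFailures()) {
            // Log each rejected document (e.g. one whose term exceeds
            // Lucene's 32k limit) and carry on with the next bulk.
            for (BulkItemResponse item : response) {
                if (item.isFailed()) {
                    logger.error("bulk [{}]: failed to index [{}/{}/{}]: {}",
                            executionId, item.getIndex(), item.getType(),
                            item.getId(), item.getFailureMessage());
                }
            }
        }
    }

    @Override
    public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
        // The whole request failed (e.g. a connection problem): log and continue as well.
        logger.error("bulk [{}] failed", failure, executionId);
    }
}
```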

Is there a reason the river must be stopped, or would the above change be of interest?

@ankon (Contributor) commented Jan 4, 2016

I guess the main question here is whether someone actually reads the logs :)

In our case stopping the river is preferable, because we can monitor that easily and then call in a human to investigate the failure. For us there should never be an ignorable import failure. Other situations might differ, though, so even if the change is never applied upstream (to avoid accidental foot-shooting), it could still be useful for people who can tolerate some data loss.
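A hypothetical way to serve both cases would be to gate the lenient behaviour behind an explicit river setting, so the default stays "stop on failure". The setting name below is invented, not an existing option:

```java
import org.elasticsearch.common.settings.Settings;

// Hypothetical sketch: read an opt-in flag from the river settings.
// "ignore_bulk_failures" is an invented name; by default the river would
// still stop on a failed bulk, so nobody loses data by accident.
public class BulkFailurePolicy {

    private final boolean ignoreFailures;

    public BulkFailurePolicy(Settings riverSettings) {
        this.ignoreFailures = riverSettings.getAsBoolean("ignore_bulk_failures", false);
    }

    /** True if the river should log the failure and keep running. */
    public boolean shouldContinueAfterFailure() {
        return ignoreFailures;
    }
}
```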
