Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle exceptions gracefully when delete non-existent resources during integ test resource clean up #1154

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

weijia-aws
Copy link

@weijia-aws weijia-aws commented Jan 30, 2025

Description

When integ test runs, the test will create model, index, pipeline, etc., then perform some assertions, finally clean up the resources.

public void processor_integrationTest() {
    try {
        // load model
        // create pipeline/index
        // do tests/asserts
    } finally {
        // cleanup pipeline, index, model
    }
}

If somehow resource creation fail and receive an exception, the test will enter the finally block and try to delete the resources. Since these resources don't exist, exception will be thrown when try to delete them, see issue 1091. I was able to reproduce the exception by following #1093 (comment). So in this case, we should gracefully handle such NOT FOUND exceptions

Related Issues

Resolves #1098

Related issue: #1091 and #1093

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@heemin32
Copy link
Collaborator

Can we expand the scope of this PR to shift the responsibility for resource cleanup from individual tests to the framework? This way, each test wouldn’t need to handle resource cleanup individually.

@vibrantvarun
Copy link
Member

vibrantvarun commented Jan 31, 2025

Can we expand the scope of this PR to shift the responsibility for resource cleanup from individual tests to the framework? This way, each test wouldn’t need to handle resource cleanup individually.

@heemin32 I would suggest not to do that. As Integ tests are meant to be run in isolation mode. With resource cleanup at framework level, it will create issues where test will start sharing resources with each other and can cause critical issues to be skipped at integ test level because the resource might be created by some other method and used by some other method.

@vibrantvarun
Copy link
Member

LGTM

@minalsha
Copy link
Collaborator

#1154 (comment)

+1 to Varun's comment here.

@heemin32
Copy link
Collaborator

heemin32 commented Jan 31, 2025

@heemin32 I would suggest not to do that. As Integ tests are meant to be run in isolation mode. With resource cleanup at framework level, it will create issues where test will start sharing resources with each other and can cause critical issues to be skipped at integ test level because the resource might be created by some other method and used by some other method.

Have you seen any such issue in k-nn repo where the resource clean up is handling in framework level? The resource sharing issue can happen when clean up is happening in individual test level and developer missed to clean up the resource in a test.

// for this case we can call undeploy api one more time
deleteModel(modelId);
try {
if (ingestPipeline != null) {
Copy link

@q-andy q-andy Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the discussion below to move cleanup to an @after block, but I want to point out that the root cause is that we're calling this method incorrectly, our integ tests pass in static constant Strings directly into ingestPipeline/searchPipeline/indexName regardless of whether or not the resources were created like here: https://github.com/opensearch-project/neural-search/blob/main/src/test/java/org/opensearch/neuralsearch/processor/TextEmbeddingProcessorIT.java#L76

These blocks are meant to catch cases where we delete resources that don't exist but the conditions are never triggered because we don't have any path to pass in null, instead always passing the constant. Not saying we need to change that right now, but pointing out with an @after block we still don't address the root cause here, just move the exception where we can't see it?

@vibrantvarun vibrantvarun self-requested a review January 31, 2025 22:59
* @return get pipeline response as a map object
*/
@SneakyThrows(ParseException.class)
protected Map<String, Object> retrievePipelines(final String pipelineType, final String pipelineName) throws IOException {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, I'm curious if there's any increased latency for the integ tests by adding extra get requests per test? Is the runtime for all the integ tests impacted?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latency should be minimal and it is generally okay to have some slowness for testing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I ran the integ tests with and without this commit. Total runtime is:

Previous: 8m3.59s
Now: 8m2.34s

The latency is really minimal

List<Map<String, Object>> indices = retrieveIndices();
for (Map<String, Object> index : indices) {
final String indexName = (String) index.get("index");
if (!shouldDeleteIndex(indexName)) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will rename this method to shouldNotDeleteIndex

) throws IOException {
if (ingestPipeline != null) {
deletePipeline(ingestPipeline);
protected void wipeOfTestResources() throws IOException {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this method entirely so that people don't call this explicitly?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

// It's possible that test fails during resources (index, model, pipeline, etc.) creation, when clean up
// these resources after test run, the delete methods will throw ResponseException with 404 Not Found code
// In this case, we just need to ignore this exception, for other exceptions, continue throwing
if (RestStatus.NOT_FOUND != RestStatus.fromCode(e.getResponse().getStatusLine().getStatusCode())) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we don't need to consume not found exception silently.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need it, at least for pipelines. If there's no pipeline exists, and we make a get pipelines call, the API will throw us this exception. There's an issue opened as well in core: opensearch-project/OpenSearch#15917

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, do we make get call for non existing pipeline?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There're some test cases that don't create ingest pipeline, such as this one, there're some test cases that don't create search pipeline, such as this. Since we want to run this clean up method after every test, we need to make a get call for every resource

Copy link

codecov bot commented Feb 5, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.72%. Comparing base (e8ed3a4) to head (b9dd581).

Additional details and impacted files
@@             Coverage Diff             @@
##               main    #1154     +/-   ##
===========================================
  Coverage     81.72%   81.72%             
+ Complexity     2494     1247   -1247     
===========================================
  Files           186       93     -93     
  Lines          8426     4213   -4213     
  Branches       1428      714    -714     
===========================================
- Hits           6886     3443   -3443     
+ Misses         1000      500    -500     
+ Partials        540      270    -270     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

*/
@SneakyThrows(ParseException.class)
protected Map<String, Object> retrievePipelines(final String pipelineType, final String pipelineName) throws IOException {
Request request = new Request("GET", "/" + pipelineType + "/pipeline/" + (StringUtils.isEmpty(pipelineName) ? "" : pipelineName));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Request request = new Request("GET", "/" + pipelineType + "/pipeline/" + (StringUtils.isEmpty(pipelineName) ? "" : pipelineName));
String endpoint = String.format("/%s/pipeline/%s", pipelineType,
Optional.ofNullable(pipelineName).orElse(""));
Request request = new Request("GET", endpoint);

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, this is more readable for handling null values, but it's not able to handle strings with empty space. However I think it is fine as this is only used in Integ tests. Will adopt the suggestion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Infrastructure] Wipe integration test resources and surface exceptions gracefully
6 participants