Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the correct type to widen the sort fields when merging top field docs #16860

Open
bugmakerrrrrr opened this issue Dec 16, 2024 · 2 comments · May be fixed by #16881
Open

Use the correct type to widen the sort fields when merging top field docs #16860

bugmakerrrrrr opened this issue Dec 16, 2024 · 2 comments · May be fixed by #16881
Labels
bug Something isn't working Search Search query, autocomplete ...etc

Comments

@bugmakerrrrrr
Copy link
Contributor

Describe the bug

To enable numeric sort optimization support for all numeric types, we use double to widen the sort fields in #6424. However, when the sort value exceeds the MAX_SAFE_INTEGER (2^53 – 1), it may result in incorrect sorting.

Related component

Search

To Reproduce

public class FieldSortIT extends ParameterizedDynamicSettingsOpenSearchIntegTestCase {
    public void testSortMixedNumericField() throws Exception {
            internalCluster().ensureAtLeastNumDataNodes(3);
            index("long", Long.MAX_VALUE);
            index("integer", Integer.MAX_VALUE);
            SearchResponse searchResponse = client().prepareSearch("long", "integer")
                .setQuery(matchAllQuery())
                .setSize(10)
                .addSort(SortBuilders.fieldSort("field").order(SortOrder.ASC).sortMode(SortMode.MAX))
                .get();
            assertNoFailures(searchResponse);
            long[] sortValues = new long[10];
            for (int i = 0; i < 10; i++) {
                sortValues[i] = ((Number) searchResponse.getHits().getAt(i).getSortValues()[0]).longValue();
            }
            for (int i = 1; i < 10; i++) {
                assertThat(Arrays.toString(sortValues), sortValues[i - 1], lessThan(sortValues[i]));
            }
        }
    
        private void index(String type, long end) throws Exception {
            assertAcked(
                prepareCreate(type).setMapping(
                    XContentFactory.jsonBuilder()
                        .startObject()
                        .startObject("properties")
                        .startObject("field")
                        .field("type", type)
                        .endObject()
                        .endObject()
                        .endObject()
                ).setSettings(Settings.builder().put("index.number_of_shards", 3).put("index.number_of_replicas", 0))
            );
            ensureGreen(type);
            for (int i = 0; i < 5; i++) {
                client().prepareIndex(type).setId(Integer.toString(i)).setSource("{\"field\" : " + (end - i) + " }", XContentType.JSON).get();
            }
            client().admin().indices().prepareRefresh(type).get();
        }
}

Expected behavior

When the sort values are all integer values, we can use the Long type to widen the sort fields.

To be discussed: 1. When sort values contain both integer values and float values, should we throw an exception or use double to widen sort fields? 2. Should we check the compatibility between sort types when merging top docs?

Additional Details

No response

@bugmakerrrrrr bugmakerrrrrr added bug Something isn't working untriaged labels Dec 16, 2024
@github-actions github-actions bot added the Search Search query, autocomplete ...etc label Dec 16, 2024
@reta
Copy link
Collaborator

reta commented Dec 16, 2024

@bugmakerrrrrr I believe this is listed as the limitation of this data type and is explicitly documented [1]

[1] https://opensearch.org/docs/latest/field-types/supported-field-types/unsigned-long/#limitations

@bugmakerrrrrr
Copy link
Contributor Author

@bugmakerrrrrr I believe this is listed as the limitation of this data type and is explicitly documented [1]

[1] https://opensearch.org/docs/latest/field-types/supported-field-types/unsigned-long/#limitations

@reta Thanks for the update. Actually, this is not only related to unsigned_long. The example I gave above only contains long and integer. Also, as I explained in this comment, the results can be wrong even when sorting only for unsigned_long fields in a single index.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Search Search query, autocomplete ...etc
Projects
Status: 🆕 New
Development

Successfully merging a pull request may close this issue.

3 participants