[PROPOSAL] Neural Search field type #803
@asfoorial from the OpenSearch side we do this today via a combination of an ingestion processor and a vector field. As there are multiple use cases for semantic search, including multi-model search, this would be an interesting field to have. But is there any specific reason you are looking for the field as compared to what is present currently? My main motive here is to understand the advantages of a new field vs. what is currently present in OpenSearch.
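For reference, the current flow described above roughly looks like the sketch below; the pipeline name, model ID, dimension, and field names are placeholders, not values from this thread:

```json
# Ingest pipeline that generates embeddings with the text_embedding processor
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
  "description": "Generate embeddings for passage_text at ingest time",
  "processors": [
    {
      "text_embedding": {
        "model_id": "<model_id>",
        "field_map": {
          "passage_text": "passage_embedding"
        }
      }
    }
  ]
}

# Index with an explicit knn_vector field wired to the pipeline
PUT /my-nlp-index
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "nlp-ingest-pipeline"
  },
  "mappings": {
    "properties": {
      "passage_text": { "type": "text" },
      "passage_embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "method": { "name": "hnsw", "engine": "lucene" }
      }
    }
  }
}
```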
The main reason is simplifying the process and keeping the focus on the business. In fact, Elasticsearch had the same reason when they introduced the field. Another reason is the alignment of new features across multiple OpenSearch projects. I have noticed over the past few releases that we get new features in ml-commons and k-NN, but it takes a while until we see their benefits reflected in neural-search. If they become one component (a neural-search field), that would more or less guarantee that any new feature in ml-commons or k-NN must be reflected in the neural-search field type before release.
@asfoorial thanks for providing the details. I want to know a little bit more about which features added in ML/k-NN don't make it into Neural. Maybe there is something missing. But I really like the idea of having a field that can encapsulate the processor information.
One place where having the field will be useful is nested fields. Putting this information in the processor is very painful and not intuitive.
@minalsha please take a look at this and add your thoughts.
I think this is a good idea, as it simplifies the use of neural search significantly. By defining a neural field, all other processes, such as the neural search pipeline, neural ingestion pipeline, k-NN index creation, chunking, and more, will be handled behind the scenes.
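To make the idea concrete, a purely hypothetical mapping for the proposed field type might look something like this; the `neural` type and its parameters do not exist today and are only illustrative:

```json
# Hypothetical: a single field that owns embedding generation, k-NN setup, and chunking
PUT /my-nlp-index
{
  "mappings": {
    "properties": {
      "passage_text": {
        "type": "neural",
        "model_id": "<model_id>",
        "dimension": 768,
        "method": { "name": "hnsw", "engine": "lucene" }
      }
    }
  }
}
```

With a mapping like this, the user would not need to declare a separate ingest pipeline or a companion `knn_vector` field.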
@heemin32 any reason for closing this GitHub issue?
@navneet1v I think it was closed automatically when I added it to the Neural Search roadmap. Reopened it.
This will also reduce the number of inference requests when multiple fields have to be embedded.
I'll work on this item.
@bzhangam, is there room for consideration to include a minor feature? (See: opensearch-project/k-NN#2356) Either
or
@YeonghyeonKO, the |
@heemin32 |
[Catch All Triage - 1, 2, 3, 4]
I was having a similar idea earlier when I heard about a use case that wants to rewrite a match query to a neural search query. Think about this:
This can simplify the neural search experience. But again, we will have to consider how we handle different model input and output formats. Of course we can use pre- and post-processing functions through connectors. But what if we could do it more easily?
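For illustration only, here is a sketch of the kind of rewrite being described, using today's neural query syntax; the index, field names, and model ID are assumptions:

```json
# What the user writes: a plain lexical match query
GET /my-nlp-index/_search
{
  "query": {
    "match": { "passage_text": "wild west" }
  }
}

# What it could be rewritten to against the embedding field
GET /my-nlp-index/_search
{
  "query": {
    "neural": {
      "passage_embedding": {
        "query_text": "wild west",
        "model_id": "<model_id>",
        "k": 10
      }
    }
  }
}
```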
I think if we run inference in the mapper, then each time we will only be able to handle one neural field. In the case where we need to run inference for multiple docs and each doc has multiple neural fields, we will invoke the inference API multiple times rather than batching them into one API call. This can be a performance concern, especially when we need to use a remote model to do the inference work.
I think technically this improvement doesn't have to be done with this new proposal. Even for our existing ingest process, we should be able to improve it by overriding this function to pass all the docs to the inferenceProcessor rather than passing one doc at a time.
My thoughts: doing one inference call per neural field at a time will certainly decrease performance compared to batching all the fields and doing inference at once. But at the same time, having a field that does the inference solves multiple issues and provides a good cloud-native architecture, since the system is free from ingest pipelines, which require separate node roles and compute power. To get around the performance problem, it would be a good idea to parallelize the running of mappers. This would provide a good boost and solve other problems too. This idea has been discussed multiple times in core; I can try to find references for it, but I have discussed it with a couple of core maintainers. cc: @msfroh
I agree with @navneet1v here. cc: @bzhangam
I think today we already support a feature for that: https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/batch-ingestion/#step-5-create-an-ingest-pipeline
I think the consideration here is more that with this new proposal we should not lose the ability to batch the inference calls.
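As a rough example of how batching surfaces in the existing bulk/ingest path, something like the following is possible; the `batch_size` parameter is my assumption about the batching knob and may not match the exact feature linked above:

```json
# Assumption: batch_size lets the ingest processors (e.g. text_embedding) receive
# several documents per inference call instead of one at a time
POST /_bulk?pipeline=nlp-ingest-pipeline&batch_size=16
{ "index": { "_index": "my-nlp-index", "_id": "1" } }
{ "passage_text": "first passage" }
{ "index": { "_index": "my-nlp-index", "_id": "2" } }
{ "passage_text": "second passage" }
```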
I agree that letting the field mapper do the inference work can simplify things, and using parallelism can address the latency concern. But we would still need to make multiple inference calls, and some ML models may bill us by the number of API calls, which can be a big concern.
Some thoughts: if the concern is making a lot of calls to the ML model, then users can still use the current ingest pipeline and not use this new field. Another option I can think of is putting up a queue that batches all the embedding-generation requests and then uses the ML Commons predict API. That way you still get the benefits of the mapping field without adding built-in extra pipelines. See if this can help.
I was thinking of a custom codec for neural search so that batch inference could happen during flush. However, handling it inside the mapper might be much simpler to implement. If batch inference is required, we could explore the queuing option @navneet1v mentioned.
Can we mimic this Elasticsearch feature in OpenSearch? https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text
I know that a lot has been done recently in OpenSearch projects to make things headache-free. I think a neural-search field type in OpenSearch would be an interesting addition. However, it should account for synonyms to avoid any fine-tuning headaches.
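For reference, the Elasticsearch feature linked above is the `semantic_text` field type; a minimal mapping from their documentation looks roughly like this (the inference endpoint name is a placeholder):

```json
# Elasticsearch semantic_text field: inference configuration lives on the field itself
PUT /semantic-index
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-elser-endpoint"
      }
    }
  }
}
```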