Use More Embedding Models in Similarity Search for Semantic Cache #106

@rootfs

Description

Is your feature request related to a problem? Please describe.
Currently the semantic cache uses the all-MiniLM-L12-v2 embedding model. This model supports a maximum sequence length of 512 tokens, which works for short prompts.

For long prompts, other embedding models need to be explored.

Describe the solution you'd like
Document the limitation, or support embedding models with a longer maximum sequence length.
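One possible shape for the second option is to route each prompt to the first configured model whose maximum sequence length can hold it. The sketch below is purely illustrative: the long-context model name is a placeholder, and the whitespace token count is a stand-in for the model's real tokenizer.

```python
# Hypothetical sketch: pick an embedding model by prompt length.
# "hypothetical-long-context-model" is a placeholder, not a real model.

EMBEDDING_MODELS = [
    # (model name, max sequence length in tokens)
    ("all-MiniLM-L12-v2", 512),                 # current default in the semantic cache
    ("hypothetical-long-context-model", 8192),  # placeholder alternative
]

def approx_token_count(prompt: str) -> int:
    """Rough estimate; a real implementation would use the model's tokenizer."""
    return len(prompt.split())

def select_model(prompt: str) -> str:
    """Return the first model whose max seq length can hold the whole prompt."""
    tokens = approx_token_count(prompt)
    for name, max_seq in EMBEDDING_MODELS:
        if tokens <= max_seq:
            return name
    # Nothing fits: the prompt will be truncated; use the largest model.
    return EMBEDDING_MODELS[-1][0]

print(select_model("What is the capital of France?"))  # all-MiniLM-L12-v2
print(select_model("word " * 2000))                    # hypothetical-long-context-model
```

Documenting the 512-token truncation would still be useful even with routing, since any prompt longer than the largest configured model gets silently truncated.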

Describe alternatives you've considered

Additional context
#59

Metadata

Assignees

Labels

help wanted (Extra attention is needed), priority/P1 (Important / Should-Have)

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests
