Vector Similarity Search with Redis OM Spring (Spring Boot)

This guide demonstrates how to build a Vector Similarity Search (VSS) system using Spring Boot and Redis OM Spring. The example allows movies to be searched by their synopses based on semantic similarity rather than keyword matching.

Prerequisites

  • Java 21
  • Maven for dependency management
  • Docker and Docker Compose (for running Redis)
  • OpenAI API key for text embeddings

Repository

The repository for this demo can be found here

Project Structure Overview

The project demonstrates vector similarity search using Redis 8's built-in capabilities together with Redis OM Spring. It is laid out as follows:

roms-vss-movies/
├── src/main/java/dev/raphaeldelio/redis8demovectorsimilaritysearch/
│   ├── controller/
│   │   └── SearchController.java       # REST endpoints for search
│   ├── domain/
│   │   └── Movie.java                  # Entity with vector annotations
│   ├── repository/
│   │   └── MovieRepository.java        # Redis repository interface
│   ├── service/
│   │   ├── MovieService.java           # Service for data loading
│   │   └── SearchService.java          # Service for vector search
│   └── RomsVectorSimilaritySearchMovies.java  # Main application
└── src/main/resources/
    ├── application.properties          # Application configuration
    └── movies.json                     # Sample dataset

Getting Started

  1. Start the Redis instance:

    docker-compose up -d redis-vector-search

  2. Build and run the application:

    mvn spring-boot:run

Dependencies

This application uses the following key dependencies:

<!-- Redis OM Spring for Redis object mapping and vector search -->
<dependency>
    <groupId>com.redis.om.spring</groupId>
    <artifactId>redis-om-spring</artifactId>
    <version>0.9.11</version>
</dependency>

<!-- Spring AI for embeddings -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai</artifactId>
    <version>1.0.0-M6</version>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-transformers</artifactId>
    <version>1.0.0-M6</version>
</dependency>

Implementation Details

1. Define the Movie entity

Redis OM Spring provides two annotations that make it easy to vectorize data and perform vector similarity search from within Spring Boot.

  • @Vectorize: Automatically generates vector embeddings from the text field
  • @Indexed: Enables vector indexing on the field for efficient search

The core of the implementation is the Movie class with Redis vector indexing annotations:

@RedisHash // This annotation is used by Redis OM Spring to store the entity as a hash in Redis
public class Movie {

    @Id // The title serves as the ID here; Redis OM Spring generates a ULID only when no ID is set
    private String title;

    @Indexed(sortable = true) // This annotation enables indexing on the field for filtering and sorting
    private int year;

    @Indexed
    private List<String> cast;

    @Indexed
    private List<String> genres;

    private String href;

    // This annotation automatically generates vector embeddings from the text
    @Vectorize(
            destination = "embeddedExtract", // The field where the embedding will be stored
            embeddingType = EmbeddingType.SENTENCE, // Type of embedding to generate (Sentence, Image, Face, or Word)
            provider = EmbeddingProvider.OPENAI, // The provider for generating embeddings (OpenAI, Transformers, VertexAI, etc.)
            openAiEmbeddingModel = OpenAiApi.EmbeddingModel.TEXT_EMBEDDING_3_LARGE // The specific OpenAI model to use for embeddings
    )
    private String extract;

    // This defines the vector field that will store the embeddings
    // The indexed annotation enables vector search on this field
    @Indexed(
            schemaFieldType = SchemaFieldType.VECTOR, // Defines the field type as a vector
            algorithm = VectorField.VectorAlgorithm.FLAT, // The algorithm used for vector search (FLAT or HNSW)
            type = VectorType.FLOAT32,
            dimension = 3072, // The dimension of the vector (must match the embedding model)
            distanceMetric = DistanceMetric.COSINE, // The distance metric used for similarity search (Cosine or Euclidean)
            initialCapacity = 10
    )
    private byte[] embeddedExtract;

    private String thumbnail;
    private int thumbnailWidth;
    private int thumbnailHeight;

    // Getters and setters...
}
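
The embeddedExtract field is declared as byte[] rather than float[] because the vector is stored as a binary blob. The following is a minimal sketch of that packing, not Redis OM Spring's actual code: the class and method names are hypothetical, and little-endian byte order is an assumption. With dimension = 3072 and 4 bytes per FLOAT32 value, each embedding occupies 12,288 bytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only.
final class VectorBytes {

    // Packs each float as 4 bytes (FLOAT32), assuming little-endian order.
    static byte[] toBytes(float[] vector) {
        ByteBuffer buffer = ByteBuffer
                .allocate(vector.length * Float.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (float value : vector) {
            buffer.putFloat(value);
        }
        return buffer.array();
    }

    // Reverses toBytes: reads 4 bytes at a time back into floats.
    static float[] toFloats(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        float[] vector = new float[bytes.length / Float.BYTES];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = buffer.getFloat();
        }
        return vector;
    }
}
```

This is why the dimension attribute must match the embedding model: a vector of a different length would produce a blob of a different size than the index expects.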

2. Repository Interface

A simple repository interface extends RedisEnhancedRepository. This will be used to load the data into Redis using the saveAll() method:

public interface MovieRepository extends RedisEnhancedRepository<Movie, String> {
}

This provides basic CRUD operations for Movie entities, with the first generic parameter being the entity type and the second being the ID type.

3. Search Service

The search service uses two beans provided by Redis OM Spring:

  • EntityStream: For creating a stream of entities to perform searches. The Entity Stream should not be confused with the Java Streams API. It generates a Redis command that is sent to Redis, so that the searching, filtering, and sorting are performed efficiently on the Redis side.
  • Embedder: Used for generating the embedding for the query sent by the user. The embedding is generated following the configuration of the @Vectorize annotation defined in the Movie class.

The search functionality is implemented in the SearchService:

@Service
public class SearchService {

    private static final Logger logger = LoggerFactory.getLogger(SearchService.class);
    private final EntityStream entityStream;
    private final Embedder embedder;

    public SearchService(EntityStream entityStream, Embedder embedder) {
        this.entityStream = entityStream;
        this.embedder = embedder;
    }

    public List<Pair<Movie, Double>> search(
            String query,
            Integer yearMin,
            Integer yearMax,
            List<String> cast,
            List<String> genres,
            Integer numberOfNearestNeighbors) {
        logger.info("Received text: {}", query);
        logger.info("Received yearMin: {} yearMax: {}", yearMin, yearMax);
        logger.info("Received cast: {}", cast);
        logger.info("Received genres: {}", genres);

        if (numberOfNearestNeighbors == null) numberOfNearestNeighbors = 3;
        if (yearMin == null) yearMin = 1900;
        if (yearMax == null) yearMax = 2100;

        // Convert query text to vector embedding
        byte[] embeddedQuery = embedder.getTextEmbeddingsAsBytes(List.of(query), Movie$.EXTRACT).getFirst();

        // Perform vector search with additional filters
        SearchStream<Movie> stream = entityStream.of(Movie.class);
        return stream
                // KNN search for nearest vectors
                .filter(Movie$.EMBEDDED_EXTRACT.knn(numberOfNearestNeighbors, embeddedQuery))
                // Additional metadata filters (hybrid search)
                .filter(Movie$.YEAR.between(yearMin, yearMax))
                .filter(Movie$.CAST.eq(cast))
                .filter(Movie$.GENRES.eq(genres))
                // Sort by similarity score
                .sorted(Movie$._EMBEDDED_EXTRACT_SCORE)
                // Return both the movie and its similarity score
                .map(Fields.of(Movie$._THIS, Movie$._EMBEDDED_EXTRACT_SCORE))
                .collect(Collectors.toList());
    }
}

Key features of the search service:

  • Uses EntityStream to create a search stream for Movie entities
  • Converts the text query into a vector embedding
  • Uses K-nearest neighbors (KNN) search to find similar vectors
  • Applies additional filters for hybrid search (combining vector and traditional search)
  • Returns pairs of movies and their similarity scores
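
To make the KNN and hybrid-filter steps concrete, here is a plain-Java sketch of what an exhaustive (FLAT-style) search with a COSINE metric computes conceptually. It is not how Redis or Redis OM Spring implement it: the class, record, and method names are hypothetical, and it assumes the year filter is applied before the neighbors are ranked and that the score is a cosine distance (1 minus cosine similarity).

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory illustration of a filtered nearest-neighbor search.
final class BruteForceKnn {

    // Stand-in for a stored movie: title, year, and its embedding.
    record Item(String title, int year, float[] vector) {}

    // A match and its cosine distance to the query (lower means more similar).
    record Scored(Item item, double distance) {}

    // Cosine distance = 1 - cosine similarity. Assumes non-zero vectors of equal length.
    static double cosineDistance(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Keeps items within the year range, then returns the k closest to the query.
    static List<Scored> search(List<Item> items, float[] query, int k, int yearMin, int yearMax) {
        return items.stream()
                .filter(item -> item.year() >= yearMin && item.year() <= yearMax)
                .map(item -> new Scored(item, cosineDistance(item.vector(), query)))
                .sorted(Comparator.comparingDouble(Scored::distance))
                .limit(k)
                .toList();
    }
}
```

Under this definition, a score close to 0 indicates a close match, which is why sorting by the score in ascending order puts the most similar movies first.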

4. Movie Service for Data Loading

The MovieService handles loading movie data into Redis. It reads a JSON file containing movie data and saves the movies into Redis. Loading may take one or two minutes for the 36 thousand movies in the file because an embedding is generated for each saved movie: the @Vectorize annotation generates the embedding for the extract field when the movie is saved into Redis. Note that the code below only saves movies released after 1980 that are not already in Redis:

@Service
public class MovieService {

    private static final Logger log = LoggerFactory.getLogger(MovieService.class);
    private final ObjectMapper objectMapper;
    private final ResourceLoader resourceLoader;
    private final MovieRepository movieRepository;

    public MovieService(ObjectMapper objectMapper, ResourceLoader resourceLoader, MovieRepository movieRepository) {
        this.objectMapper = objectMapper;
        this.resourceLoader = resourceLoader;
        this.movieRepository = movieRepository;
    }

    public void loadAndSaveMovies(String filePath) throws Exception {
        Resource resource = resourceLoader.getResource("classpath:" + filePath);
        try (InputStream is = resource.getInputStream()) {
            List<Movie> movies = objectMapper.readValue(is, new TypeReference<>() {});
            List<Movie> unprocessedMovies = movies.stream()
                    .filter(movie -> !movieRepository.existsById(movie.getTitle()) &&
                            movie.getYear() > 1980
                    ).toList();
            long startMillis = System.currentTimeMillis();
            movieRepository.saveAll(unprocessedMovies);
            long elapsedMillis = System.currentTimeMillis() - startMillis;
            // Log the number of movies actually saved, not the number read from the file
            log.info("Saved {} movies in {} ms", unprocessedMovies.size(), elapsedMillis);
        }
    }

    public boolean isDataLoaded() {
        return movieRepository.count() > 0;
    }
}
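
The selection logic in loadAndSaveMovies can be tried in isolation. The sketch below mirrors the filter with plain collections; the class and record names are hypothetical, and a Set of known titles stands in for the existsById lookups against Redis.

```java
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone version of the filter used before saveAll().
final class MovieFilter {

    // Only the fields the filter looks at.
    record Candidate(String title, int year) {}

    // Keeps movies that are not stored yet and were released after 1980.
    static List<Candidate> selectUnprocessed(List<Candidate> candidates, Set<String> existingIds) {
        return candidates.stream()
                .filter(c -> !existingIds.contains(c.title()) && c.year() > 1980)
                .toList();
    }
}
```

Two details follow from the filter: a movie from 1980 itself is excluded, and because the title doubles as the ID, two films that share a title map to the same key.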

5. Search Controller

The REST controller exposes the search endpoint:

@RestController
public class SearchController {

    private final SearchService searchService;

    public SearchController(SearchService searchService) {
        this.searchService = searchService;
    }

    @GetMapping("/search")
    public Map<String, Object> search(
            @RequestParam(required = false) String text,
            @RequestParam(required = false) Integer yearMin,
            @RequestParam(required = false) Integer yearMax,
            @RequestParam(required = false) List<String> cast,
            @RequestParam(required = false) List<String> genres,
            @RequestParam(required = false) Integer numberOfNearestNeighbors
    ) {
        List<Pair<Movie, Double>> matchedMovies = searchService.search(
                text,
                yearMin,
                yearMax,
                cast,
                genres,
                numberOfNearestNeighbors
        );
        return Map.of(
                "matchedMovies", matchedMovies,
                "count", matchedMovies.size()
        );
    }
}

6. Application Bootstrap

The main application class initializes Redis OM Spring and loads data:

@SpringBootApplication
@EnableRedisEnhancedRepositories(basePackages = {"dev.raphaeldelio.redis8demo*"})
public class Redis8DemoVectorSimilaritySearchApplication {

    public static void main(String[] args) {
        SpringApplication.run(Redis8DemoVectorSimilaritySearchApplication.class, args);
    }

    @Bean
    CommandLineRunner loadData(MovieService movieService) {
        return args -> {
            if (movieService.isDataLoaded()) {
                System.out.println("Data already loaded. Skipping data load.");
                return;
            }
            movieService.loadAndSaveMovies("movies.json");
        };
    }
}

The @EnableRedisEnhancedRepositories annotation activates Redis OM Spring's repository support.

Example API Requests

You can make requests to the search endpoint:

GET http://localhost:8082/search?text=A movie about a young boy who goes to a wizardry school

GET http://localhost:8082/search?numberOfNearestNeighbors=1&yearMin=1970&yearMax=1990&text=A movie about a kid and a scientist who go back in time

GET http://localhost:8082/search?cast=Dee Wallace,Henry Thomas&text=A boy who becomes friends with an alien
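
The requests above contain spaces, which some HTTP clients will not accept unencoded. As a sketch, the query string can be built with the JDK's URLEncoder; the SearchRequestBuilder class is hypothetical, and the sketch assumes the server decodes "+" in query parameters as a space, which is the usual behavior for servlet containers.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper that builds an encoded /search URI.
final class SearchRequestBuilder {

    // Encodes each key and value, then joins the pairs with '&'.
    static URI buildSearchUri(String baseUrl, Map<String, String> params) {
        String query = params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return URI.create(baseUrl + "/search?" + query);
    }
}
```

Passing an insertion-ordered map (for example a LinkedHashMap) keeps the parameters in the order they were added. The resulting URI can be handed to java.net.http.HttpRequest.newBuilder(uri) to send the request.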

Sample Response

{
  "count": 1,
  "matchedMovies": [
    {
      "first": {
        "title": "Back to the Future",
        "year": 1985,
        "cast": [
          "Michael J. Fox",
          "Christopher Lloyd"
        ],
        "genres": [
          "Science Fiction"
        ],
        "extract": "Back to the Future is a 1985 American science fiction film directed by Robert Zemeckis and written by Zemeckis, and Bob Gale. It stars Michael J. Fox, Christopher Lloyd, Lea Thompson, Crispin Glover, and Thomas F. Wilson. Set in 1985, it follows Marty McFly (Fox), a teenager accidentally sent back to 1955 in a time-traveling DeLorean automobile built by his eccentric scientist friend Emmett \"Doc\" Brown (Lloyd), where he inadvertently prevents his future parents from falling in love – threatening his own existence – and is forced to reconcile them and somehow get back to the future.",
        "thumbnail": "https://upload.wikimedia.org/wikipedia/en/d/d2/Back_to_the_Future.jpg"
      },
      "second": 0.463297247887
    }
  ]
}