Content-based music recommendation using Spotify's Web API and CNN feature extraction.

rtjfarrimond/spotify-recommender

Spotify Recommender

This application provides the services needed to perform content-based music recommendation using features extracted from the 30-second track previews available via the Spotify Web API.

Dependencies

Components

Playlist crawler:

  • Crawls Spotify playlists, gets each track's 30-second preview URL, and uploads the audio to S3.
  • Uses the Spotify web API.
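
As a rough illustration of the crawl step, the sketch below pulls preview URLs out of one page of playlist items. It assumes the response shape the Spotify Web API documents for playlist items (an `items` list whose entries carry a `track` object with `id` and `preview_url`). The function names, the sample data, and the S3 key scheme are hypothetical and are not taken from this repository.

```python
from typing import Dict


def preview_urls(playlist_page: dict) -> Dict[str, str]:
    """Map track ID -> preview URL for one page of playlist items."""
    urls = {}
    for item in playlist_page.get("items", []):
        track = item.get("track") or {}  # the track object can be null
        track_id = track.get("id")
        url = track.get("preview_url")  # null when Spotify offers no preview
        if track_id and url:
            urls[track_id] = url
    return urls


def s3_key(track_id: str) -> str:
    """Hypothetical key scheme for the uploaded preview audio."""
    return f"audio/{track_id}.mp3"


# Sample data in the assumed response shape (IDs and URLs are made up).
page = {
    "items": [
        {"track": {"id": "abc", "preview_url": "https://example.com/abc.mp3"}},
        {"track": {"id": "def", "preview_url": None}},
        {"track": None},
    ]
}

found = preview_urls(page)
keys = [s3_key(track_id) for track_id in found]
```

Tracks without a preview are skipped rather than treated as errors, since the preview URL is optional in the API response.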

Feature extractor:

  • Uses Keunwoo Choi's CNN for feature extraction.
  • Triggered by the S3 event emitted when an audio file is uploaded.
    • Pulls the audio file down, extracts features, and stores them in the database.
    • Deletes the audio when done.
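
The trigger described above can be sketched as a small function that reads the bucket and key out of an S3 event notification. The record layout (`Records`, `eventName`, `s3.bucket.name`, `s3.object.key`) follows the S3 notification format. The function name, bucket name, and sample event are assumptions for illustration, not this project's code.

```python
from typing import List, Tuple
from urllib.parse import unquote_plus


def uploaded_objects(event: dict) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs for every object-created record."""
    objects = []
    for record in event.get("Records", []):
        # Ignore anything that is not an upload, e.g. ObjectRemoved events.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        s3 = record["s3"]
        # Object keys arrive URL-encoded, with spaces encoded as '+'.
        key = unquote_plus(s3["object"]["key"])
        objects.append((s3["bucket"]["name"], key))
    return objects


# Hypothetical event with one upload and one deletion.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "audio-bucket"},
                "object": {"key": "audio/my+track.mp3"},
            },
        },
        {
            "eventName": "ObjectRemoved:Delete",
            "s3": {
                "bucket": {"name": "audio-bucket"},
                "object": {"key": "audio/old.mp3"},
            },
        },
    ]
}

uploads = uploaded_objects(sample_event)
```

Each returned pair identifies one audio file to download, process, and then delete.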

Database:

  • For storage and retrieval of unprocessed features extracted from audio files.
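  • The AnnoyIndex item attribute is computed as a uuid1, bit shifted 114 bits to the right. Because a UUID is 128 bits long, this leaves a 14-bit value (0 to 16383), so the Python int maps well within the C 32-bit length limit. The surviving bits are the 14 most significant bits of the UUID, which are generated from the time at which the UUID is created. Note that 14 bits cannot guarantee uniqueness across a large collection, so collisions are possible. See the documentation for more details.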
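
The shift can be checked with the standard library alone. This is a minimal sketch; the function name `annoy_item_id` is hypothetical.

```python
import uuid


def annoy_item_id(u: uuid.UUID) -> int:
    """Keep only the 14 most significant bits of the 128-bit UUID."""
    return u.int >> 114


u = uuid.uuid1()
item_id = annoy_item_id(u)

# 128 - 114 = 14 bits remain, so the value always lies in [0, 16383].
fits_in_14_bits = 0 <= item_id < 2 ** 14

# The top 32 bits of a uuid1 are its time_low field, so the result is
# the upper 14 bits of that field.
from_time_low = item_id == (u.time_low >> 18)
```

Because the result depends only on part of the timestamp, two UUIDs generated close together in time can map to the same value.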

API:

  • Provides a GET endpoint that serves recommendations via query by example.
  • Uses ANNOY to service queries.
  • Takes a single parameter, a Spotify track ID.
  • Subscribes to events that notify when the ANNOY space has been updated.
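
To make query by example concrete, here is a sketch that ranks tracks by cosine similarity to the query track's feature vector. It deliberately uses an exhaustive search in plain Python as a stand-in for ANNOY, which answers the same kind of query approximately and much faster on large collections. The track IDs, the vectors, and the `recommend` function are all made up for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(track_id: str, features: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the k tracks most similar to the query track, excluding itself."""
    query = features[track_id]
    others = [t for t in features if t != track_id]
    others.sort(key=lambda t: cosine(query, features[t]), reverse=True)
    return others[:k]


# Toy two-dimensional features; real CNN features have many more dimensions.
features = {
    "track_a": [1.0, 0.0],
    "track_b": [0.9, 0.1],
    "track_c": [0.0, 1.0],
    "track_d": [0.5, 0.5],
}

similar = recommend("track_a", features)
```

The endpoint's single parameter plays the role of `track_id` here: the query is an existing track, and the response is the list of its nearest neighbours in feature space.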

ANNOY service

This service is the custodian of the ANNOY space. It is responsible for:

  • Initialising the ANNOY space and storing it in S3.
  • Updating the ANNOY space following writes to the database.
    • Possibly implemented by subscribing to a DynamoDB event stream.
  • Publishing events to let subscribers know when the ANNOY space has been updated.
    • Possibly implemented by an event triggered via upload to the S3 bucket.
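
If the DynamoDB stream option were chosen, the update step might start by collecting newly inserted items from a stream event, roughly as sketched below. The record layout (`eventName`, `dynamodb.NewImage`, typed attribute values such as `{"N": "42"}`) follows the DynamoDB Streams format and requires a stream view type that includes new images. The `Features` attribute name, the function name, and the sample event are assumptions, since the table schema is not described here.

```python
from typing import List, Tuple


def new_items(event: dict) -> List[Tuple[int, List[float]]]:
    """Return (AnnoyIndex, feature vector) pairs for inserted items."""
    items = []
    for record in event.get("Records", []):
        # Only newly written items need to be added to the ANNOY space.
        if record.get("eventName") != "INSERT":
            continue
        image = record["dynamodb"]["NewImage"]
        # DynamoDB serialises numbers as strings inside {"N": ...} wrappers.
        index = int(image["AnnoyIndex"]["N"])
        # "Features" is a hypothetical attribute holding a list of numbers.
        vector = [float(x["N"]) for x in image["Features"]["L"]]
        items.append((index, vector))
    return items


# Hypothetical stream event with one insert and one removal.
stream_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "NewImage": {
                    "AnnoyIndex": {"N": "42"},
                    "Features": {"L": [{"N": "0.1"}, {"N": "0.2"}]},
                }
            },
        },
        {"eventName": "REMOVE", "dynamodb": {"Keys": {}}},
    ]
}

inserted = new_items(stream_event)
```

The resulting pairs are what a rebuild of the ANNOY space would consume before the updated index is uploaded to S3.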

Infrastructure:

  • An S3 bucket to temporarily store audio from which to extract features.
  • An AWS managed database instance in which to store extracted features.

Getting started

  1. The services are configured through .env files, which you must create in the config/ directory. That directory also contains templates for the necessary .env files.

  2. To build the project, run:

     make build-all
    
  3. To run the playlist crawler, run:

     make crawl
    
  4. To run the feature extractor, run:

     make extract
    
