🎧 Visualize and understand my Spotify data.
🚧 Work-in-progress repository
Clone this repository.
Install uv.
Create the virtualenv and install the dependencies with uv
cd sploty/
uv sync
This project uses environment variables such as SPOTIFY_CLIENT_ID, which you should add to the .env file.
The expected variables are listed in the sample.env file. Copy it and fill in the values:
cp sample.env .env
Sploty requires a Spotify developer account. See the Spotify documentation to set one up.
Retrieve the client ID and client secret, then complete the .env file:
SPOTIFY_CLIENT_ID="YOUR SPOTIFY CLIENT ID"
SPOTIFY_CLIENT_SECRET="YOUR SPOTIFY CLIENT SECRET"
SPOTIFY_AUTH_URL="https://accounts.spotify.com/api/token"
SPOTIFY_BASE_URL="https://api.spotify.com/v1/"
The timeout and sleep duration can be configured with the Sploty arguments.
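To illustrate how these variables fit together, here is a minimal sketch of a token request using Spotify's client credentials flow. It assumes this is the flow Sploty relies on (the token URL suggests it, but the source does not say so), and build_token_request and fetch_token are hypothetical helpers, not functions from the project. Note that the variables must be present in the process environment; the sketch does not read the .env file itself.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(client_id: str, client_secret: str, auth_url: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request."""
    # The client ID and secret are sent as "id:secret", base64-encoded,
    # in a Basic authorization header.
    credentials = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    return urllib.request.Request(
        auth_url,
        data=body,
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def fetch_token(request: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the access token from the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["access_token"]
```

A call such as fetch_token(build_token_request(os.environ["SPOTIFY_CLIENT_ID"], os.environ["SPOTIFY_CLIENT_SECRET"], os.environ["SPOTIFY_AUTH_URL"])) is a quick way to check that the credentials in the .env file are valid.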
The final part (to_elastic.py) requires Elasticsearch. Have a look at docker-elk to configure it locally.
Retrieve the host, username and password, then complete the .env file:
ELASTIC_HOSTS=["YOUR ELASTIC HOST"]
ELASTIC_USER="YOUR ELASTIC USERNAME"
ELASTIC_PASS="YOUR ELASTIC PASSWORD"
The timeout and index name can be configured with the Sploty arguments.
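ELASTIC_HOSTS is written as a list, unlike the other variables. The sketch below shows one way such a value could be read, assuming it is decoded as JSON. How Sploty actually parses it is not stated here, and read_elastic_settings is a hypothetical helper.

```python
import json
from collections.abc import Mapping


def read_elastic_settings(env: Mapping[str, str]) -> dict:
    """Collect the Elasticsearch settings from an environment-like mapping."""
    # Assumption: ELASTIC_HOSTS holds a JSON list of strings,
    # as the sample value suggests.
    hosts = json.loads(env["ELASTIC_HOSTS"])
    if not isinstance(hosts, list) or not all(isinstance(host, str) for host in hosts):
        raise ValueError("ELASTIC_HOSTS must be a JSON list of strings")
    return {
        "hosts": hosts,
        "user": env["ELASTIC_USER"],
        "password": env["ELASTIC_PASS"],
    }
```

Passing os.environ as env reads the real settings, while a plain dict with placeholder values is enough to test the parsing.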
- Request your Spotify data from your Spotify account
- Select "Extended streaming history"
- Click on "Request data"
- Wait for Spotify to prepare the export (about 30 days)
- Open the email from Spotify and download the files
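Once the files are downloaded, a quick sanity check can confirm that the export is readable before running the app. The sketch below assumes that each .json file in the export folder contains a JSON array of stream records. load_streams is a hypothetical helper, not part of Sploty.

```python
import json
from pathlib import Path


def load_streams(folder: Path) -> list[dict]:
    """Load every .json file in the folder and merge the stream records."""
    streams: list[dict] = []
    for path in sorted(folder.glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            records = json.load(handle)
        # Assumption: each export file holds a JSON array of stream objects.
        # Files with any other top-level structure are skipped.
        if isinstance(records, list):
            streams.extend(records)
    return streams
```

For example, len(load_streams(Path("your/path/to/the/extended_streaming_history_folder/"))) gives the total number of streams found in the export.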
Run the app
uv run python sploty/app.py \
--resources-path your/path/to/the/extended_streaming_history_folder/ \
--db-path your/path/to/a/folder/to/save/tracks/data \
--index-name your-index-name
You can also shorten the command by using uv run sploty instead of uv run python sploty/app.py (thanks to the [project.scripts] entry added to the pyproject.toml file).
uv run sploty \
--resources-path your/path/to/the/extended_streaming_history_folder/ \
--db-path your/path/to/a/folder/to/save/tracks/data \
--index-name your-index-name
The app will:
- Concatenate all stream files with sploty/concat.py
- Filter out already enriched streams with sploty/filter.py
- Enrich the streams with Spotify metadata using sploty/enrich.py
  - The Spotify API is used at this stage, don't forget to configure it
- Enrich the streams with Spotify audio features using sploty/audio_features.py
  - The Spotify API is used at this stage, don't forget to configure it
  - A JSON database (TinyDB) is used at this stage to reduce Spotify API calls by storing track data
- Add additional metrics with sploty/metrics.py
- Index them into Elasticsearch with sploty/to_elastic.py
  - Elasticsearch is used at this stage, don't forget to configure it
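Two ideas in the steps above, skipping streams that were already enriched and caching track data to avoid repeated API calls, can be sketched as follows. This is not Sploty's implementation: it uses a plain JSON file as a stand-in for TinyDB, and the ts and spotify_track_uri field names, filter_new_streams and TrackCache are assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Callable


def filter_new_streams(streams: list[dict], enriched_keys: set[tuple[str, str]]) -> list[dict]:
    """Keep only streams whose (timestamp, track URI) pair was not enriched before."""
    return [
        stream
        for stream in streams
        if (stream.get("ts"), stream.get("spotify_track_uri")) not in enriched_keys
    ]


class TrackCache:
    """Small JSON-file cache keyed by track URI (a stand-in for TinyDB)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Reload previously stored tracks so they survive between runs.
        self.data: dict[str, dict] = (
            json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
        )

    def get_or_fetch(self, uri: str, fetch: Callable[[str], dict]) -> dict:
        # Only call the (expensive) fetch function on a cache miss,
        # then persist the result for later runs.
        if uri not in self.data:
            self.data[uri] = fetch(uri)
            self.path.write_text(json.dumps(self.data), encoding="utf-8")
        return self.data[uri]
```

With this shape, a track listened to a thousand times triggers a single fetch, and a second run over the same history triggers none.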
Use the --help option:
uv run python sploty/app.py --help
By default, the sploty_enriched_history file in the resources folder is used, but you can choose another one with the --previous-enriched-streaming-history-path option:
uv run python sploty/app.py … --previous-enriched-streaming-history-path your/path/to/another/sploty_enriched_history.csv
Use the --no-<the part> options to skip parts of the pipeline:
uv run python sploty/app.py … --no-concat --no-filter --no-enrich --no-feature --no-metric --no-elastic
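For readers curious how such paired flags can work, here is a minimal sketch using argparse from the standard library. It is only an illustration: the source does not say which CLI library Sploty uses, and STAGES, build_parser and enabled_stages are hypothetical names. Only the flag names come from the command above.

```python
import argparse

# Stage names taken from the --no-<the part> flags shown above.
STAGES = ["concat", "filter", "enrich", "feature", "metric", "elastic"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="sploty-sketch")
    for stage in STAGES:
        # BooleanOptionalAction (Python 3.9+) registers both --<stage> and
        # --no-<stage>; every stage is enabled unless explicitly disabled.
        parser.add_argument(f"--{stage}", action=argparse.BooleanOptionalAction, default=True)
    return parser


def enabled_stages(argv: list[str]) -> list[str]:
    """Return the stages that remain enabled for the given arguments."""
    args = build_parser().parse_args(argv)
    return [stage for stage in STAGES if getattr(args, stage)]
```

Here enabled_stages(["--no-concat", "--no-elastic"]) keeps the four middle stages, which mirrors running only part of the pipeline.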
Use the --chunk-size option (default: 100):
uv run python sploty/app.py … --chunk-size 101
Open Kibana (http://localhost:5601 with docker-elk) and create a dashboard to query your index.
🚧 This part is not yet in the repository