be-true-to-thine-fox
Explore your Firefox browsing history trends using Metabase, an open source analytics tool.
Disclaimer: this was made in a rush and sat on ice for a while, so please PR any suggestions and fixes for glaring oversights.
Background reading on the Firefox ERD can be found in the rather sparse Firefox docs (the ERD is no longer valid)
Clone this repo
git clone https://github.com/mattarderne/firefox_explore.git
cd firefox_explore
Find where your Firefox places.sqlite file lives (see the Profile Folder entry on the about:support page) and copy it here.
cp ~/Library/Application\ Support/Firefox/Profiles/y4pw28fm.default/places.sqlite .
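If the profile folder name is not known in advance, the lookup can be scripted. The following is a minimal sketch, assuming the macOS profiles root used in the command above. `find_places_files` and `copy_newest` are hypothetical helper names and not part of this repo.

```python
import shutil
from pathlib import Path

# Assumed macOS location; on other platforms check about:support instead.
DEFAULT_PROFILES_ROOT = (
    Path.home() / "Library" / "Application Support" / "Firefox" / "Profiles"
)


def find_places_files(profiles_root=DEFAULT_PROFILES_ROOT):
    """Return every places.sqlite one level below profiles_root, newest first."""
    root = Path(profiles_root)
    if not root.is_dir():
        return []
    candidates = [p for p in root.glob("*/places.sqlite") if p.is_file()]
    return sorted(candidates, key=lambda p: p.stat().st_mtime, reverse=True)


def copy_newest(profiles_root=DEFAULT_PROFILES_ROOT, dest_dir="."):
    """Copy the most recently modified places.sqlite into dest_dir."""
    found = find_places_files(profiles_root)
    if not found:
        raise FileNotFoundError(f"no places.sqlite under {profiles_root}")
    target = Path(dest_dir) / "places.sqlite"
    shutil.copy2(found[0], target)
    return target
```

Calling `copy_newest()` from inside the cloned repo would then stand in for the manual `cp`. Picking the newest file is only a guess at which profile is the active one.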
The command below creates a Docker container and mounts this repo (with the Metabase backend sqlite database and the Firefox places.sqlite) into it
docker run -d -p 3000:3000 \
-v $PWD:/metabase-data \
-e "MB_DB_FILE=/metabase-data/metabase.db" \
--name metabase_ff metabase/metabase
Run the command below; 2-3 minutes later Metabase will be running at http://localhost:3000
docker start metabase_ff
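Rather than guessing when the 2-3 minutes are up, the start-up can be polled. This is a rough sketch that assumes Metabase answers on its `/api/health` endpoint once it is ready. `metabase_is_up` and `wait_until_ready` are hypothetical names, and the attempt count and delay are arbitrary.

```python
import time
import urllib.error
import urllib.request


def metabase_is_up(url="http://localhost:3000/api/health", timeout=2.0):
    """Return True when the (assumed) health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status while still booting.
        return False


def wait_until_ready(probe=metabase_is_up, attempts=60, delay=5.0, sleep=time.sleep):
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries.

    Returns the 1-based attempt number that succeeded, or None if none did.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

`wait_until_ready()` returns as soon as the probe succeeds, so it can be run right after `docker start metabase_ff` before opening the browser.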
Login details are:
[email protected]
admin11
- Browsing Overview, use as a starting point
- Base SQL table, use this as a base for new queries
The dashboards sometimes need to be refreshed after their first run if any questions don't load
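For new queries outside Metabase, the same data can be read straight from the copied file. The sketch below assumes the standard `moz_places` and `moz_historyvisits` tables in places.sqlite. `top_hosts` is a hypothetical helper and is not the query behind the Base SQL table.

```python
import sqlite3
from urllib.parse import urlsplit


def top_hosts(db_path, limit=20):
    """Count visits per hostname and return the `limit` most visited hosts."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            """
            SELECT p.url, COUNT(v.id) AS visits
            FROM moz_places AS p
            JOIN moz_historyvisits AS v ON v.place_id = p.id
            GROUP BY p.id
            """
        ).fetchall()
    finally:
        conn.close()

    # Fold individual pages into their hostname.
    counts = {}
    for url, visits in rows:
        host = urlsplit(url).hostname or "(no host)"
        counts[host] = counts.get(host, 0) + visits

    # Most visited first; ties broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:limit]
```

Grouping is done on the full hostname here; reducing hosts to a top-level domain, as the repo's SQL appears to do, would need an extra step.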
This allows you to define a "bad list" of sites that you'd like to mark specifically (or use as a way to insert custom data into the sqlite db)
- Modify the site_list in procrastinate.py with the domains you want to include; use this question to get your top 20 list.
- Run python procrastinate.py to copy the lists to the places.sqlite
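The second step can be pictured roughly as follows. This is a sketch and not the contents of procrastinate.py: the table and column name `procrastinate` are taken from the join mentioned in the to-do list further down, while `write_site_list` and the sample domains are assumptions.

```python
import sqlite3

# Example values only; replace with your own domains.
site_list = ["news.ycombinator.com", "reddit.com", "twitter.com"]


def write_site_list(db_path, domains):
    """Replace the contents of the procrastinate table with `domains`."""
    # Normalise and de-duplicate so the join key is consistent.
    unique = sorted({d.strip().lower() for d in domains if d.strip()})
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS procrastinate "
                "(procrastinate TEXT PRIMARY KEY)"
            )
            conn.execute("DELETE FROM procrastinate")
            conn.executemany(
                "INSERT INTO procrastinate (procrastinate) VALUES (?)",
                [(d,) for d in unique],
            )
    finally:
        conn.close()
    return unique
```

Running `write_site_list("places.sqlite", site_list)` on the copied file (never on the live profile) rewrites the list each time, so editing `site_list` and re-running is enough to update it.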
Contributions are welcome. The following are necessary at some stage, but other useful things include creating useful visualisations (kinda tricky to integrate as the code lives in the metabase.db) and cleaning up the all_site_visits.sql file.
- figure out your most clicked HN titles, look at the words in the title
- Name it something clever - true-to-thine-fox
- Fix the Docker to use docker-compose
- Mount the places.sqlite directly in place rather than copying it to the repo
- Find a better way of doing procrastinate_base
- think about doing an integer join on the all_site_visits table on the LEFT JOIN procrastinate on cleanup.top_level_domain = procrastinate.procrastinate join
- Fix the dates in the Base SQL table
- python, add some deep learning correlation graphs, jupyter notebooks, streamlit
- ai on docker
- streamlit docker
- find some other sqlite data sources and add them
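For the date fix listed above, one likely starting point is that Firefox stores `visit_date` in `moz_historyvisits` as microseconds since the Unix epoch. Whether that is the actual problem in the Base SQL table is an assumption. A small sketch of the conversion, with `visit_date_to_datetime` as a hypothetical helper:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def visit_date_to_datetime(visit_date_us):
    """Convert a microseconds-since-epoch visit_date to an aware UTC datetime."""
    return EPOCH + timedelta(microseconds=visit_date_us)


# The same conversion as a SQLite expression, for use inside a query.
# Integer division drops the sub-second part.
SQL_VISIT_TIME = "datetime(visit_date / 1000000, 'unixepoch')"
```

Both forms give UTC; converting to local time is left to the query or to Metabase's own timezone settings.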