0 parents
commit 2057bb3
Showing 9 changed files with 22,185 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
name: Download OSM data and extract postcode data | ||
on: | ||
  workflow_dispatch: | ||
  schedule: | ||
    - cron: '0 1 1 * *' # 01:00 UTC on the 1st of every month | ||
jobs: | ||
  download: | ||
    runs-on: ubuntu-latest | ||
    steps: | ||
      - name: Install osmium | ||
        run: | | ||
          sudo apt-get update | ||
          sudo apt-get install -y osmium-tool | ||
      - uses: actions/checkout@v4 | ||
      - uses: actions/setup-python@v5 | ||
        with: | ||
          python-version: '3.12' | ||
          cache: 'pip' | ||
      - run: pip install -r requirements.txt | ||
      - name: Download OSM data | ||
        run: | | ||
          mkdir -p ./data | ||
          wget http://download.openstreetmap.fr/extracts/europe/sweden-latest.osm.pbf -O ./data/sweden-latest.osm.pbf | ||
      - name: Filter OSM data | ||
        run: | | ||
          osmium tags-filter ./data/sweden-latest.osm.pbf nwr/addr:postcode -o ./data/extract.osm.pbf | ||
      - name: Convert the data | ||
        run: python convert_osm.py | ||
      - name: Commit the data | ||
        uses: nick-fields/retry@v2 | ||
        with: | ||
          timeout_seconds: 10 | ||
          max_attempts: 5 | ||
          command: | | ||
            git config --global user.name 'Pierre Mesure (Github Actions)' | ||
            git config --global user.email '[email protected]' | ||
            git config --global rebase.autoStash true | ||
            git pull --rebase | ||
            git add ./data/osm_codes.csv | ||
            git commit -am "Update the data" | ||
            git push |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
name: Download Postnord data | ||
on: | ||
  workflow_dispatch: | ||
  schedule: | ||
    - cron: '0 1 1 * *' # 01:00 UTC on the 1st of every month | ||
jobs: | ||
  download: | ||
    runs-on: ubuntu-latest | ||
    steps: | ||
      - uses: actions/checkout@v4 | ||
      - uses: actions/setup-python@v5 | ||
        with: | ||
          python-version: '3.12' | ||
          cache: 'pip' | ||
      - run: pip install -r requirements.txt | ||
      - name: Fetch the data | ||
        run: python download_postnord.py | ||
      - name: Commit the data | ||
        uses: nick-fields/retry@v2 | ||
        with: | ||
          timeout_seconds: 10 | ||
          max_attempts: 5 | ||
          command: | | ||
            git config --global user.name 'Pierre Mesure (Github Actions)' | ||
            git config --global user.email '[email protected]' | ||
            git config --global rebase.autoStash true | ||
            git pull --rebase | ||
            git add . | ||
            git commit -am "Update the data" | ||
            git push |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
data/*.pbf | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
# pOSMkod | ||
|
||
pOSMkod is a set of scripts to download Swedish postal code (*postkod*) data from OpenStreetMap (OSM, hence the name). | ||
|
||
## What does it do? | ||
|
||
A [first script](./download_postnord.py) downloads an updated list of all Swedish postal codes from one of Postnord's private APIs. This script runs every month via a [GitHub Action](.github/workflows/download_postnord.yml) and the data is pushed to the file [postnord_codes.csv](data/postnord_codes.csv). | ||
|
||
[Another GitHub Action](.github/workflows/download_osm.yml) downloads a dump of all Swedish data on OpenStreetMap and extracts all the objects that carry the `addr:postcode` tag. It runs every month in the same way and the data is saved to the file [osm_codes.csv](data/osm_codes.csv). | ||
|
||
These two lists are compared in order to see what proportion of the postal codes are present on OSM. | ||
|
||
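
The comparison described above could be sketched as follows. This is a minimal illustration and not the repository's own code: `read_codes` and `coverage` are hypothetical helpers, and the column name `postcodes` is only known for `osm_codes.csv` (it is the name written by `convert_osm.py`); the layout of `postnord_codes.csv` is not shown here, so its column name is an assumption.

```python
import csv
from pathlib import Path


def read_codes(path, column):
    """Read one CSV column and return its non-empty values as a set of strings."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return {
            row[column].replace(" ", "")  # "111 20" and "11120" count as the same code
            for row in csv.DictReader(handle)
            if row.get(column)  # skip rows where the column is empty or absent
        }


def coverage(reference_codes, osm_codes):
    """Return (share of reference codes present in OSM, sorted list of missing codes)."""
    reference = set(reference_codes)
    if not reference:
        return 0.0, []
    found = reference & set(osm_codes)
    missing = sorted(reference - found)
    return len(found) / len(reference), missing


# Small in-memory demonstration with made-up codes.
share, missing = coverage({"11120", "11121"}, {"11120", "99999"})
print(f"{share:.0%} of the reference codes are on OSM; missing: {missing}")
```

With the repository's files, this could be called as `coverage(read_codes('./data/postnord_codes.csv', 'postcodes'), read_codes('./data/osm_codes.csv', 'postcodes'))`, provided the Postnord file really uses that column name.
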
In the future, another script will attempt to draw boundaries for each postal code based on the objects associated with it. | ||
|
||
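
The README does not say how these boundaries will be drawn. As one possible starting point, the sketch below groups object coordinates by postal code and wraps each group in a convex hull (Andrew's monotone chain algorithm). The function names and the `(postcode, lon, lat)` input shape are assumptions made for illustration, and a convex hull is a technique chosen here, not the author's stated method.

```python
from collections import defaultdict


def convex_hull(points):
    """Return the convex hull of 2D points in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b makes a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


def hulls_by_postcode(objects):
    """Map each postcode to the hull of its (lon, lat) points.

    `objects` is an iterable of (postcode, lon, lat) tuples (hypothetical shape).
    """
    grouped = defaultdict(list)
    for postcode, lon, lat in objects:
        grouped[postcode].append((lon, lat))
    return {postcode: convex_hull(pts) for postcode, pts in grouped.items()}
```

This treats longitude and latitude as planar coordinates, which is only an approximation, and hulls of neighbouring postal codes can overlap, so a real implementation would likely need a projection and a way to resolve overlaps.
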
## Prerequisites | ||
|
||
This code is designed to run on GitHub Actions, but it can also be run locally. | ||
You will need a recent version of Python (at least 3.9). You can install its dependencies using: | ||
|
||
```bash | ||
pip install -r requirements.txt | ||
``` | ||
|
||
In order to extract objects from an OSM data dump, you will need osmium, which can be installed on Debian-based Linux distributions with: | ||
|
||
```bash | ||
apt install osmium-tool | ||
``` | ||
|
||
## How to use | ||
|
||
The first script can then be run using: | ||
|
||
```bash | ||
python download_postnord.py | ||
``` | ||
|
||
To download OSM data, you can simply use wget: | ||
|
||
```bash | ||
wget https://download.geofabrik.de/europe/sweden-latest.osm.pbf -O ./data/sweden-latest.osm.pbf | ||
``` | ||
|
||
When that is done, you can use [osmium](https://osmcode.org/osmium-tool/) to extract only the objects we need: | ||
|
||
```bash | ||
osmium tags-filter ./data/sweden-latest.osm.pbf nwr/addr:postcode -o ./data/extract.osm.pbf | ||
``` | ||
|
||
Finally, the second Python script extracts the postal codes from the filtered .pbf file and saves them as a CSV table: | ||
|
||
```bash | ||
python convert_osm.py | ||
``` | ||
|
||
## License | ||
|
||
This code is licensed under AGPLv3. | ||
|
||
The postal code data extracted from OpenStreetMap is licensed under [ODbL](https://www.openstreetmap.org/copyright). The license for Postnord's data is unclear. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
from pyrosm import OSM | ||
import polars as pl | ||
print('Loading OSM file...') | ||
osm = OSM(filepath='./data/extract.osm.pbf') | ||
|
||
print('Loading objects...') | ||
buildings = osm.get_buildings() | ||
|
||
extract = buildings[[ | ||
    'name', | ||
    'addr:street', | ||
    'addr:housenumber', | ||
    'addr:postcode', | ||
    'addr:city', | ||
    'osm_type', | ||
    'geometry' | ||
]] | ||
|
||
print('Creating a list of unique postcodes') | ||
osm_codes = extract['addr:postcode'].str.replace(' ', '').dropna().unique()  # strip spaces, drop missing values | ||
osm_codes = sorted(int(code) for code in osm_codes)  # int() raises ValueError on non-numeric values | ||
osm_codes = [code for code in osm_codes if 10000 < code < 98700]  # keep codes inside the exclusive bounds | ||
print('Saving unique postcodes as CSV...') | ||
pl.DataFrame({ 'postcodes': osm_codes }).write_csv('./data/osm_codes.csv') |
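
The script above calls `int()` on every tag value, which raises `ValueError` as soon as one `addr:postcode` holds anything other than digits and spaces. A more tolerant variant could look like the sketch below. `normalise_postcodes` is a hypothetical helper, not part of the commit, and it assumes that a valid code is exactly five ASCII digits once spaces are removed; the default bounds are the ones used in the script.

```python
import re

# Assumption: a valid code is exactly five ASCII digits after removing spaces.
FIVE_DIGITS = re.compile(r"[0-9]{5}")


def normalise_postcodes(raw_values, lower=10000, upper=98700):
    """Return the sorted unique integer postcodes found in raw_values.

    Values that are missing, malformed or outside the exclusive
    (lower, upper) range are skipped instead of raising an error.
    """
    codes = set()
    for value in raw_values:
        if not isinstance(value, str):
            continue  # skips None and NaN produced by missing tags
        compact = value.replace(" ", "")
        if FIVE_DIGITS.fullmatch(compact):
            code = int(compact)
            if lower < code < upper:
                codes.add(code)
    return sorted(codes)
```

In the script, this could be applied as `normalise_postcodes(extract['addr:postcode'])`, at the cost of silently dropping malformed values rather than surfacing them.
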