A Python toolkit for processing Japanese addresses
japanese-address-parser-py is a Python package for parsing Japanese addresses. It splits an address string into structured fields: prefecture, city, town, and the rest of the address.
pip install japanese-address-parser-py
from japanese_address_parser_py import Parser
address_list = [
"埼玉県さいたま市浦和区高砂3-15-1",
"千葉県千葉市中央区市場町1-1",
"東京都新宿区西新宿2-8-1",
"神奈川県横浜市中区日本大通1"
]
parser = Parser()
for address in address_list:
    parse_result = parser.parse(address)
    print(parse_result.address)
{'prefecture': '埼玉県', 'town': '高砂三丁目', 'rest': '15-1', 'city': 'さいたま市浦和区'}
{'rest': '1-1', 'town': '市場町', 'prefecture': '千葉県', 'city': '千葉市中央区'}
{'prefecture': '東京都', 'rest': '8-1', 'town': '西新宿二丁目', 'city': '新宿区'}
{'town': '日本大通', 'city': '横浜市中区', 'prefecture': '神奈川県', 'rest': '1'}
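Because each result is a plain dictionary with the keys `prefecture`, `city`, `town`, and `rest`, a batch of results can be written out as tabular data. The following is a minimal sketch, not part of the library: `addresses_to_csv` and `FIELDS` are hypothetical names, and the sample dictionaries are copied from the output above so that the snippet runs without the package installed.

```python
import csv
import io

# Column order for the CSV; the keys match the parsed output shown above.
FIELDS = ["prefecture", "city", "town", "rest"]


def addresses_to_csv(parsed):
    """Serialize a list of parsed-address dicts into a CSV string (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in parsed:
        # Use an empty string for any key that is missing from a result.
        writer.writerow({key: row.get(key, "") for key in FIELDS})
    return buffer.getvalue()


# Sample data copied from the output above.
sample = [
    {'prefecture': '埼玉県', 'town': '高砂三丁目', 'rest': '15-1', 'city': 'さいたま市浦和区'},
    {'rest': '1-1', 'town': '市場町', 'prefecture': '千葉県', 'city': '千葉市中央区'},
]
print(addresses_to_csv(sample))
```

In real use, the list passed to `addresses_to_csv` would presumably be built from `parser.parse(address).address` for each input address.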
from japanese_address_parser_py import Parser
parser = Parser()
address = "神奈川県横浜市中区本町6丁目50-10"
parse_result = parser.parse(address)
print(parse_result.address["prefecture"])
print(parse_result.address["city"])
print(parse_result.address["town"])
print(parse_result.address["rest"])
神奈川県
横浜市中区
本町六丁目
50-10
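Note that the town in the output is normalized (`6丁目` becomes `六丁目`). If a single normalized string is needed, the four fields can be joined back together in order. This is a small sketch under the assumption that the result dictionary has the shape shown above; `format_address` is a hypothetical helper, not a function provided by the package.

```python
# Field order used to rebuild a full address string.
FIELD_ORDER = ["prefecture", "city", "town", "rest"]


def format_address(parts):
    """Join the parsed fields into one string (hypothetical helper)."""
    # Missing keys are treated as empty so a partial result still formats.
    return "".join(parts.get(key, "") for key in FIELD_ORDER)


# Values copied from the output above.
parts = {
    "prefecture": "神奈川県",
    "city": "横浜市中区",
    "town": "本町六丁目",
    "rest": "50-10",
}
print(format_address(parts))
```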
This library is written in Rust. You need to set up a Rust development environment to build this library.
You also need to install maturin, which this library uses to generate Python bindings.
# Install maturin
cargo install --locked maturin
# Clone repository
git clone https://github.com/YuukiToriyama/japanese-address-parser.git
# Build python module
cd japanese-address-parser/python
maturin build --release --out dist --find-interpreter
# Install the built library
python3 -m venv .venv
# Activate the virtual environment so that pip3 installs into it
source .venv/bin/activate
pip3 install dist/japanese_address_parser_py-[version]-cp37-abi3-[arch].whl
This software is maintained by YuukiToriyama. If you have any questions, please create a new issue.
The source code is hosted on GitHub at: https://github.com/YuukiToriyama/japanese-address-parser
This software was inspired by @geolonia/normalize-japanese-addresses.
In addition, the parsing process uses Geolonia 住所データ (Geolonia address data), which is provided by 株式会社Geolonia (Geolonia Inc.).
This crate is distributed under the terms of the MIT license.