address-standardizer

An address parser and standardizer in C++

The goal is to be able to take an address string and parse it into tokens classify and standardize the tokens using an appropriate lexicon, then map the tokens to specific output fields based on rules.

The purpose of this code is to be used as part of a geocoding or record matching system and is not useful for standardizing addresses for postal correctness or delivery point solutions.

Dependencies

boost/regex.hpp
boost/regex/icu.hpp
boost/lexical_cast.hpp
boost/algorithm/string.hpp
boost/algorithm/string/join.hpp
boost/algorithm/string/split.hpp
boost/algorithm/string/classification.hpp
boost/serialization/*
unicode/* (from icu packages)
PostgreSQL server development for 9.2, 9.3, 9.4, 9.5, and/or 9.6

For Ubuntu/Debian

sudo apt-get install libicu-dev libicu52 libboost1.55-dev libboost-serialization1.55-dev libboost-serialization1.55.0 libboost-regex1.55-dev libboost-regex1.55.0

Build

See file src/README.address\standardizer

Getting Started

See script tools/getting-started.sh

Funding Needed

There are a lot of pieces to this project and funding will help it along. There is a lot of code to write, test and document. Beyond that we need to develop lexicons and rules addressing standards for every country that one might want to support.

Please contact me if you are interested in this project and/or you can provide some amount of funding. stephenwoodbridge37 (at) gmail (dot) com.

Name		Name	Last commit message	Last commit date
Latest commit History 174 Commits
data		data
docs		docs
src		src
tools		tools
.gitignore		.gitignore
DOCUMENTATION.md		DOCUMENTATION.md
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

address-standardizer

Dependencies

For Ubuntu/Debian

Build

Getting Started

Funding Needed

About

Releases

Packages

Languages

License

woodbri/address-standardizer

Folders and files

Latest commit

History

Repository files navigation

address-standardizer

Dependencies

For Ubuntu/Debian

Build

Getting Started

Funding Needed

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages