Deepparse: An Extendable, and Fine-Tunable State-Of-The-Art Library for Parsing Multinational Street Addresses

Published: 09 Oct 2023, Last Modified: 13 Oct 2023, Venue: NLP-OSS 2023
Keywords: address parsing, deep learning, natural language processing
TL;DR: Deepparse is an open-source address parsing solution, released under the LGPL-3.0 licence, that parses multinational addresses using state-of-the-art deep learning algorithms and supports fine-tuning with new data to generate a custom address parser.
Abstract: Segmenting an address into meaningful components, also known as address parsing, is an essential step in many applications, from record linkage to geocoding and package delivery. Consequently, a lot of work has been dedicated to developing accurate address parsing techniques, with machine learning and neural network methods leading the state-of-the-art scoreboard. However, most of the work on address parsing has been confined to academic endeavours, with little availability of free and easy-to-use open-source solutions. This paper presents Deepparse, an open-source, extendable, fine-tunable Python address parsing solution, released under the LGPL-3.0 licence, that parses multinational addresses using state-of-the-art deep learning algorithms and has been evaluated on over 60 countries. It can parse addresses written in any language and following any address standard. The pre-trained model achieves an average parsing accuracy of 99% on the countries used for training, with no pre-processing or post-processing needed. Moreover, the library supports fine-tuning with new data to generate a custom address parser.
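To make "segmenting an address into meaningful components" concrete, the following is a minimal sketch of the output side of address parsing, viewed as token tagging. It does not call Deepparse or any model. The helper `group_components`, the sample address, the tag names, and the hand-assigned tags are all assumptions chosen for illustration, not the library's API or its predictions.

```python
def group_components(tokens, tags):
    """Group tokens by tag, joining tokens that share a tag with spaces.

    Hypothetical helper for illustration only; not part of Deepparse.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    grouped = {}
    for token, tag in zip(tokens, tags):
        # dicts preserve insertion order, so components keep address order
        grouped.setdefault(tag, []).append(token)
    return {tag: " ".join(words) for tag, words in grouped.items()}


# Sample address and tags written by hand (assumed, not model output).
address = "350 rue des Lilas Ouest Québec Québec G1L 1B6"
tokens = address.split()
tags = [
    "StreetNumber",
    "StreetName", "StreetName", "StreetName",
    "Orientation",
    "Municipality",
    "Province",
    "PostalCode", "PostalCode",
]

parsed = group_components(tokens, tags)
for tag, value in parsed.items():
    print(f"{tag}: {value}")
```

In a real parser the per-token tags would come from the trained model rather than being written by hand; the grouping step only shows what the resulting components look like.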
Submission Number: 6