Abstract: Sound correspondence patterns play a crucial role in linguistic reconstruction. Linguists
use them to prove language relationship, to reconstruct proto-forms, and for classical
phylogenetic reconstruction based on shared innovations. Cognate words which fail to
conform to expected patterns can further point to various kinds of exceptions in sound
change, such as analogy or assimilation of frequent words. Here we present an automatic
method for the inference of sound correspondence patterns across multiple languages based
on a network approach. The core idea is to represent all columns in aligned cognate sets as
nodes in a network with edges representing the degree of compatibility between the nodes.
The task of inferring all compatible correspondence sets can then be handled as the well-known minimum clique cover problem in graph theory, which essentially seeks to split the graph into the smallest number of cliques such that each node belongs to exactly one clique. The resulting partitions represent all correspondence patterns which can be
inferred for a given dataset. By excluding those patterns which occur in only a few cognate
sets, the core of regularly recurring sound correspondences can be inferred. Based on this
idea, the paper presents a method for automatic correspondence pattern recognition,
implemented as part of a Python library that supplements the paper. To illustrate the
usefulness of the method, various tests are presented, and concrete examples of the output
of the method are provided. In addition to the source code, the study is supplemented by
a short interactive tutorial that illustrates how to use the new method and how to inspect
its results.
Keywords: sound correspondence patterns, network analysis, graph coloring, historical linguistics, sequence comparison
TL;DR: The paper describes a new algorithm by which sound correspondence patterns for multiple languages can be inferred.
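
The core idea of the abstract can be illustrated with a minimal sketch. It is not the implementation from the accompanying library: the function names (`compatible`, `greedy_clique_cover`), the toy data, and the compatibility rule are assumptions made for illustration. Here, two alignment columns count as compatible if they never disagree in a language where both have a sound and share at least one such language. Because minimum clique cover is NP-hard, the sketch uses a simple greedy heuristic, which is not guaranteed to find the smallest number of cliques.

```python
from typing import List, Optional, Sequence

# One alignment column: one sound per language, None where a reflex is missing.
Column = Sequence[Optional[str]]


def compatible(a: Column, b: Column) -> bool:
    """Assumed rule: no conflicting sounds, and at least one shared language."""
    shared = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # missing data never causes a conflict
        if x != y:
            return False
        shared += 1
    return shared > 0


def greedy_clique_cover(columns: Sequence[Column]) -> List[List[int]]:
    """Greedily partition column indices into groups of pairwise compatible columns."""
    # Process the most complete columns first (fewest missing values).
    order = sorted(range(len(columns)),
                   key=lambda i: sum(s is None for s in columns[i]))
    cliques: List[List[int]] = []
    for i in order:
        for clique in cliques:
            # A column joins a clique only if it is compatible with every member.
            if all(compatible(columns[j], columns[i]) for j in clique):
                clique.append(i)
                break
        else:
            cliques.append([i])  # no existing clique fits: start a new one
    return cliques


# Toy data: five alignment columns across four hypothetical languages.
columns = [
    ["p", "p", "f", "p"],
    ["p", None, "f", "p"],
    ["t", "t", "t", "t"],
    [None, "p", "f", None],
    ["t", "t", None, "t"],
]

groups = greedy_clique_cover(columns)

# Keep only patterns that recur in at least two columns (threshold is arbitrary).
regular = [g for g in groups if len(g) >= 2]
```

Each group in `groups` corresponds to one candidate correspondence pattern, and filtering by group size mirrors the step of excluding patterns that occur in only a few cognate sets.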