Reflections of Linguistic History in Quantitative Phonotactics [article]

Jayden L. Macklin-Cordes, Erich R. Round
2016 Figshare  
Citation: Macklin-Cordes, J. L. & E. R. Round, 2016. Reflections of linguistic history in quantitative phonotactics. Paper presented at the Australian Linguistic Society Annual Conference, Monash University, Caulfield, Australia. 7 December 2016. doi: https://dx.doi.org/10.6084/m9.figshare.4299365

Abstract: Advanced quantitative methods are at the cutting edge of historical linguistics; however, these methods ideally require many hundreds of data points per language. To generate reliable inferences at ever greater time depths, there is a need for typological datasets which are not only broader in coverage but also contain a deeper store of information. We explore one avenue by extracting large numbers of high-definition phonotactic 'traits' per language. We show that these traits contain phylogenetic signal, thus demonstrating an important path towards high-powered methods of the near future.

Methodology: Languages may be compared in terms of which two-segment sequences they permit. Moreover, such biphones possess distinct lexical frequencies, which can also be compared. We examined whether such data contain information about family-tree structure, i.e., phylogenetic signal. Two standard statistics are used: D [1] tests coarse-grained biphone 'permissibility' data, and K [2] tests higher-definition transition probabilities. We examined two subgroups of the Australian Pama-Nyungan family: 10 languages of Ngumpin-Yapa [3] and 7 of Yolngu [4], represented by phonemically standardised lexicons from the CHIRILA database [5]. Phylogenetic signal is calculated with reference to phylogenies from C. Bowern (updated from [6]). Australian languages present a tough challenge, since phonotactically they are notoriously uniform [7–9]. Moreover, Ngumpin-Yapa has some of the world's highest borrowing rates [10–11]. Thus we hypothesized that the coarse-grained D test would fail. The key question is whether the high-definition K test succeeds.

Results: D [...]
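The two kinds of biphone trait named under Methodology can be illustrated with a minimal sketch. It assumes each lexicon is a list of words already split into phonemic segments, ignores word boundaries, and takes a transition probability to be the count of a biphone divided by the count of all biphones that start with the same segment. The function name `biphone_traits` and the toy lexicon are hypothetical, and the exact definitions used in the paper may differ.

```python
from collections import Counter

def biphone_traits(lexicon):
    """Return (permitted, transition) for one language.

    lexicon: iterable of words, each a sequence of segment strings.
    permitted: set of biphones attested at least once (coarse-grained data).
    transition: dict mapping biphone (a, b) to an estimate of P(b | a)
                (an assumed definition, not confirmed by the source).
    """
    biphone_counts = Counter()
    first_counts = Counter()
    for word in lexicon:
        # Pair each segment with its successor inside the word.
        for a, b in zip(word, word[1:]):
            biphone_counts[(a, b)] += 1
            first_counts[a] += 1
    permitted = set(biphone_counts)
    transition = {
        (a, b): n / first_counts[a] for (a, b), n in biphone_counts.items()
    }
    return permitted, transition

# Invented toy data, not drawn from CHIRILA.
toy_lexicon = [
    ["m", "a", "l", "a"],
    ["m", "a", "n", "a"],
    ["l", "a", "m", "a"],
]
permitted, transition = biphone_traits(toy_lexicon)
```

Each permitted biphone would then supply one binary trait per language for a D-style test, and each transition probability one continuous trait for a K-style test; the signal statistics themselves are not computed here.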
doi:10.6084/m9.figshare.4299365.v1 fatcat:lacpzdehcffkhoqa4vtnm7dghe