Learning complex segments. Manuscript, New York University.
Languages differ in the status of sequences such as [mb, kp, ts]: they can pattern as complex segments or as clusters of simple consonants. We ask what evidence learners use to figure out which representations their languages motivate. We present an implemented computational model that starts with simple consonants only, and builds more complex representations by tracking statistical distributions of consonant sequences. We demonstrate that this strategy is successful in a wide range of cases, both in languages that supply clear phonotactic arguments for complex segments and in languages where the evidence is less clear. We then turn to the typological parallels between complex segments and consonant clusters: both tend to be limited in size and composition. We suggest that our approach allows the parallels to be reconciled. Finally, we compare our model with alternatives: learning complex segments from phonotactics and from phonetics.
You may view the case studies and try out the learner on the project website.