The Mathematics of Language Universals

Arto Anttila and research group

Our project studies models that have been proposed to account for universal similarities across languages. Many linguists argue that such similarities are not just historical accidents, but can be explained in terms of our shared cognitive mechanisms. Our goal is to establish a new approach to linguistic universals that derives candidate universals from categorical and stochastic linguistic models through mathematical analysis of the structure of these models, and validates the resulting predictions on empirical linguistic data.

The approach brings together mathematical, computational, theoretical, and empirical linguistics and promises to shed new light on the contentious question of the place of language in human cognition. The research was carried out at Stanford, in Paris, in Skype sessions, and at conferences. The collaboration has so far produced two conference presentations (New York, Salt Lake City) and two co-authored papers (one published in 2018, one submitted). Perhaps most importantly, the project has produced software for the automatic extraction and visualization of language universals from formal linguistic models.

The core idea of the project, using formal grammatical theories as heuristic deductive principles for the discovery of language universals, was stated in a 2006 manuscript by Arto Anttila (Stanford Linguistics) and Curtis Andrus (Stanford Mathematics). Despite its appeal, the idea remained largely unexplored for the following ten years. The main reason seems to have been the algorithmic difficulty of deducing implicational universals from a constraint-based phonological model when the representational space has a realistically high number of dimensions. When Anttila and Giorgio Magri met at a workshop at Rutgers University, organized by Alan Prince in May 2015, they started discussing the problem. In 2016, a more regular long-distance exchange ensued, which led to pilot results. Anttila and Magri immediately realized the potential of the approach, but also the limitations of long-distance interaction by email and Skype alone. The grant from the France-Stanford Center made it possible to consolidate the collaboration into a large-scale project.

Perhaps the most exciting new development is that we are now able to visualize implicational universals for two numerical models, Harmonic Grammar and Maximum Entropy Grammar, for the first time in the history of these models. This builds on the mathematical work carried out in the earlier stages of the project. It is always exciting to see one's theory take a concrete visual form, but this is not just a matter of aesthetic pleasure: visualization is a great help in understanding the content of one's theory and is indispensable in comparing alternative theories to one another.
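For readers unfamiliar with the two models, the following is a minimal sketch of what an implicational universal amounts to in each of them. It is not the project's software. The tableaux, the constraint labels (M, F1, F2), and all function names are hypothetical, and random sampling of weights is only a heuristic check: it can refute an implication but cannot prove one, unlike the mathematical analysis described above. The sketch assumes that in Harmonic Grammar a candidate wins when its weighted violation sum is strictly smallest, that in Maximum Entropy Grammar a candidate's probability is proportional to the exponential of the negated sum, and that an implication from one mapping to another holds when the second mapping wins (or is at least as probable) under every nonnegative weighting for which the first one wins (or has a given probability).

```python
import math
import random

# Hypothetical toy tableaux: each candidate maps to its violation counts
# for three made-up constraints (M, F1, F2). Not data from the project.
TABLEAUX = {
    "A": {"changed": (0, 1, 1), "faithful": (1, 0, 0)},
    "B": {"changed": (0, 1, 0), "faithful": (1, 0, 0)},
}

def penalty(violations, weights):
    """Weighted sum of constraint violations (lower is better)."""
    return sum(w * v for w, v in zip(weights, violations))

def hg_wins(inp, out, weights):
    """Harmonic Grammar: `out` wins if its penalty is strictly smallest."""
    cands = TABLEAUX[inp]
    best = penalty(cands[out], weights)
    return all(best < penalty(v, weights)
               for c, v in cands.items() if c != out)

def maxent_prob(inp, out, weights):
    """MaxEnt: probability proportional to exp(-penalty)."""
    cands = TABLEAUX[inp]
    scores = {c: math.exp(-penalty(v, weights)) for c, v in cands.items()}
    return scores[out] / sum(scores.values())

def sample_counterexample(antecedent, consequent, trials=10000, seed=0):
    """Search random nonnegative weights for a violation of the implication
    antecedent -> consequent in either model. Returns a weight vector that
    breaks the implication, or None if the sample contains none."""
    rng = random.Random(seed)
    for _ in range(trials):
        w = [rng.uniform(0.0, 5.0) for _ in range(3)]
        hg_fail = hg_wins(*antecedent, w) and not hg_wins(*consequent, w)
        # Small tolerance guards against floating-point rounding.
        me_fail = (maxent_prob(*consequent, w)
                   < maxent_prob(*antecedent, w) - 1e-12)
        if hg_fail or me_fail:
            return w
    return None

forward = sample_counterexample(("A", "changed"), ("B", "changed"))
reverse = sample_counterexample(("B", "changed"), ("A", "changed"))
print("A->changed implies B->changed, counterexample:", forward)
print("B->changed implies A->changed, counterexample:", reverse)
```

With these toy numbers, sampling finds no counterexample to the implication from the /A/ mapping to the /B/ mapping, while the reverse direction is refuted by a sampled weight vector. The asymmetry arises because the change in /A/ incurs every violation that the change in /B/ does, plus one more.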

So far, the project has resulted in two papers:

  • Anttila, Arto and Giorgio Magri. 2018. Does MaxEnt Overgenerate? Implicational Universals in Maximum Entropy Grammar. In Gallagher, Gillian, Maria Gouskova, and Sora Yin (eds.), Proceedings of the 2017 Annual Meeting on Phonology. Washington, DC: Linguistic Society of America.
  • Magri, Giorgio, Scott Borgeson, and Arto Anttila. Submitted. Equiprobable mappings in weighted constraint grammars. SCiL 2019 (The Society for Computation in Linguistics), New York City, January 3-6, 2019.

We are currently combining the results of these separate papers into a monograph tentatively titled "Implicational Universals and Phonological Theory."


 

Academic Year
2017-2018
Area of Study