Statistical learning with and without a lexicon

Project Leaders:

Past Post-Doctoral Fellows:

  • Clay Beckner (Clay is now at the University of Warwick) 
  • Yoon Mi Oh (Yoon Mi is now at Ajou University)
  • Simon Todd (Simon is now at UC Santa Barbara) 

Funding Agency: Royal Society of New Zealand - Marsden Fund

  • $767,000
  • October 2016 - September 2020

Native speakers of a language display a vast amount of statistical knowledge. For example, they know where different sounds tend to occur in their language, and how likely particular sounds are to occur together. This knowledge is believed to be drawn from the speaker's vocabulary - their lexicon. However, speakers of a language also possess knowledge about the statistical properties of sounds in running speech, which they use to segment the speech stream into words. The relationship between knowledge of lexical statistics (generated from the lexicon) and pre-lexical statistics (generated from running speech) is not understood. What is the nature of the learning that takes place when you do, or do not, have a lexicon?
New Zealand provides a unique testing-ground for this question. Many New Zealanders have regular exposure to Māori, but do not know many Māori words. This enables us to study pre-lexical statistical learning in considerable depth. We will document the statistical properties of Māori sound structure. Then, using our established experimental architecture, which presents experiments in the form of computer games, we will investigate what knowledge of these properties non-Māori-speaking New Zealanders actually have.