This is a collaboration between researchers in linguistics, mathematics and statistical physics from the Universities of Cambridge and Portsmouth.

The team will develop a web application to automate the collection and analysis of a large corpus of spoken language, paired with social and geographical speaker information.

Because it pairs raw acoustic data with spatial and social data, the dataset can be used to model language learning and evolution, and to understand how language changes take effect across the age and social spectra.

The systematic collection of speech data with meaningful demographic information is a recent and exciting opportunity that could also yield useful insights and applications in other fields, such as machine learning.
