Vowel normalisation in latent space for sociolinguistics

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

To study variation in vowel sounds between sociolinguistic groups, the sounds must be normalised to minimise variation caused by physical factors. The Lobanov method, for example, standardises formant distributions by speaker. Since formants are often difficult to measure and offer only a partial description of a sound, a robust and reproducible normalisation method based on the whole spectrum would be useful. One candidate is speaker-level standardisation in the latent space of a variational auto-encoder trained on a large sample of vowel spectra. We show that whole-spectrum transformations induced by latent normalisation shift formants similarly to direct formant normalisation. We also show that formant-based normalisation procedures can be used to induce whole-spectrum transformations via the latent space.
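
As an illustration of the speaker-level standardisation mentioned in the abstract, the following is a minimal sketch of Lobanov-style normalisation: each formant is z-scored within each speaker. The function name `standardise_by_speaker`, the toy formant values, and the choice of the sample standard deviation (`ddof=1`) are assumptions made for this example, not details taken from the paper. In principle the same per-speaker z-scoring could be applied to latent vectors produced by a variational auto-encoder instead of formants, but the encoder itself is not shown here.

```python
import numpy as np


def standardise_by_speaker(features, speaker_ids):
    """Z-score every feature column separately within each speaker.

    features    : array of shape (n_tokens, n_features), e.g. F1/F2 in Hz
    speaker_ids : array of shape (n_tokens,), one speaker label per token
    """
    features = np.asarray(features, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    out = np.empty_like(features)
    for spk in np.unique(speaker_ids):
        mask = speaker_ids == spk
        mu = features[mask].mean(axis=0)
        # Assumption: sample standard deviation; a population estimate
        # (ddof=0) would be an equally plausible choice.
        sigma = features[mask].std(axis=0, ddof=1)
        out[mask] = (features[mask] - mu) / sigma
    return out


# Hypothetical toy data: three vowel tokens (F1, F2 in Hz) for each of two speakers.
formants = np.array([
    [300.0, 2300.0], [700.0, 1200.0], [350.0, 900.0],   # speaker A
    [360.0, 2700.0], [850.0, 1400.0], [420.0, 1000.0],  # speaker B
])
speakers = np.array(["A", "A", "A", "B", "B", "B"])

normalised = standardise_by_speaker(formants, speakers)
```

After this step every speaker's formants have zero mean and unit standard deviation, so differences in overall formant range between speakers are removed while the relative positions of each speaker's vowels are kept.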
Original language: English
Title of host publication: Proceedings of Interspeech 2023
Publisher: International Speech Communication Association
Pages: 3547-3551
DOIs
Publication status: Published - 20 Aug 2023
Event: Interspeech 2023 - Dublin, Ireland
Duration: 20 Aug 2023 to 24 Aug 2023

Conference

Conference: Interspeech 2023
Country/Territory: Ireland
City: Dublin
Period: 20/08/23 to 24/08/23

Keywords

  • normalisation
  • formants
  • vowels
  • dialects
  • sociolinguistics
