Abstract
This paper tackles a practical problem that arises when building an i-vector speaker recognition system with limited resources: the lack of development data containing multiple recordings per speaker. When only one recording is available for each speaker in the development set, phonetic variability can be synthesised simply by dividing the recordings, provided they are of sufficient length. For channel variability, we pass each recording through a Gaussian channel to produce a second set of recordings, referred to here as Gaussian version recordings. The proposed method for channel variability synthesis yields a total relative improvement in EER of 5%.
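The abstract does not give implementation details, so the following is only a minimal sketch of the two synthesis ideas it describes, under stated assumptions: splitting is taken as cutting one long recording into equal-length segments, and the "Gaussian channel" is modelled as additive Gaussian noise at a chosen SNR (the paper's actual channel model may differ). All function names and parameters here are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for a real recording: a 4-second signal at 16 kHz.
rng = np.random.default_rng(0)
recording = rng.standard_normal(16000 * 4)

def split_recording(x, n_segments=2):
    """Phonetic-variability synthesis (assumed form): divide one
    sufficiently long recording into equal-length segments, each
    treated as a separate session of the same speaker."""
    seg_len = len(x) // n_segments
    return [x[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]

def gaussian_channel(x, snr_db=20.0, rng=None):
    """Channel-variability synthesis (assumed form): pass the signal
    through an additive Gaussian channel, producing a 'Gaussian
    version' of the recording at the requested SNR."""
    rng = rng or np.random.default_rng()
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=len(x))
    return x + noise

segments = split_recording(recording, n_segments=2)
gaussian_version = gaussian_channel(recording, snr_db=20.0, rng=rng)
```

In this sketch each speaker with a single development recording ends up with multiple synthetic sessions (the segments plus the Gaussian version), which is the kind of multi-session structure i-vector back-end training normally requires.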
Original language | English |
---|---|
Title of host publication | IET 3rd International Conference on Intelligent Signal Processing (ISP 2017) |
Place of Publication | London, UK |
Publisher | IET Conference Publications |
Pages | 1-6 |
Number of pages | 6 |
ISBN (Electronic) | 978-1-78561-708-9 |
ISBN (Print) | 978-1-78561-707-2 |
DOIs | |
Publication status | Published - 21 May 2018 |
Event | IET 3rd International Conference on Intelligent Signal Processing (ISP 2017) - Savoy Place, IET Headquarters, London, United Kingdom. Duration: 4 Dec 2017 → 5 Dec 2017. Conference number: 3. http://digital-library.theiet.org/content/conferences/cp731 |
Conference
Conference | IET 3rd International Conference on Intelligent Signal Processing (ISP 2017) |
---|---|
Abbreviated title | IET ISP 2017 |
Country/Territory | United Kingdom |
City | London |
Period | 4/12/17 → 5/12/17 |
Internet address |
Keywords
- multi-condition training
- i-vector
- session variability