MODELING PSEUDO-SPEAKER WITH UNCERTAINTY FOR SPEAKER ANONYMIZATION

Abstract: This paper proposes to exploit the uncertainty estimate of the speaker attributes to anonymize speech utterances. The pseudo-speaker is assumed to follow a Gaussian distribution whose covariance measures the uncertainty of representing the speaker with the mean vector. Given the utterances of a selected set of cohort speakers, the pseudo-speaker distribution is estimated by minimizing its divergence from the posterior speaker distributions estimated from those utterances. A pseudo-speaker vector is then generated via sampling to provide the pseudo-speaker representation for each utterance. Experiments on the datasets provided by the VoicePrivacy Challenge (VPC) demonstrate that the proposed pseudo-speaker vector is effective for de-identification when applied in anonymization.
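
As a rough illustration of the estimate-then-sample idea in the abstract, the sketch below fits a Gaussian to a set of per-utterance posterior Gaussians and draws one pseudo-speaker vector from it. Several things here are assumptions, not details taken from the paper: the divergence is taken to be the sum of KL(posterior || pseudo-speaker), whose minimizer is the moment-matched Gaussian; all covariances are diagonal; and the function names `fit_pseudo_speaker` and `sample_pseudo_speaker` and the toy posterior values are hypothetical.

```python
import random


def fit_pseudo_speaker(means, variances):
    """Fit a diagonal Gaussian to diagonal Gaussian posteriors (assumed setup).

    Minimizing sum_i KL(q_i || p) over p = N(m, diag(s)), with
    q_i = N(mu_i, diag(v_i)), gives the moment-matched solution:
        m = mean_i(mu_i)
        s = mean_i(v_i + (mu_i - m)^2)
    """
    n = len(means)
    dim = len(means[0])
    m = [sum(mu[d] for mu in means) / n for d in range(dim)]
    s = [
        sum(v[d] + (mu[d] - m[d]) ** 2 for mu, v in zip(means, variances)) / n
        for d in range(dim)
    ]
    return m, s


def sample_pseudo_speaker(m, s, rng):
    """Draw one pseudo-speaker vector from N(m, diag(s))."""
    return [rng.gauss(m_d, s_d ** 0.5) for m_d, s_d in zip(m, s)]


# Toy posteriors for two cohort utterances in a 2-dimensional speaker space
# (made-up numbers, for illustration only).
post_means = [[0.0, 2.0], [2.0, 4.0]]
post_vars = [[1.0, 0.5], [1.0, 0.5]]

mean, var = fit_pseudo_speaker(post_means, post_vars)
rng = random.Random(0)  # fixed seed so the draw is repeatable
vec = sample_pseudo_speaker(mean, var, rng)
```

In this toy case the fitted variance exceeds each posterior's own variance, because it also absorbs the spread of the posterior means across the cohort; that is the sense in which the covariance reflects uncertainty about representing the speaker by the mean vector alone.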



The samples below cover four speakers, two from VCTK and two from LibriSpeech. For each speaker, an original recording is given for reference, followed by the anonymized speech utterances generated with the baseline and with our proposed method. As configured by the VoicePrivacy Challenge, the speech utterances of an original speaker are anonymized with different cohort speakers when they appear in enrollment and in trial. The anonymized enrollment and trial utterances are therefore presented separately for each speaker.
Anonymization samples:
Speaker | Recording | Baseline (Enrollment) | Baseline (Trial) | Pseudo-speaker vector (Enrollment) | Pseudo-speaker vector (Trial)
VCTK-p234 | p234_001_mic2 | p234_089_mic2 | p234_005_mic2 | p234_089_mic2 | p234_005_mic2
VCTK-p234 | p234_002_mic2 | p234_152_mic2 | p234_024_mic2 | p234_152_mic2 | p234_024_mic2
VCTK-p237 | p237_001_mic2 | p237_175_mic2 | p237_004_mic2 | p237_175_mic2 | p237_004_mic2
VCTK-p237 | p237_002_mic2 | p237_262_mic2 | p237_007_mic2 | p237_262_mic2 | p237_007_mic2
LibriSpeech-260 | 260-123286-0000 | 260-123286-0009 | 260-123288-0007 | 260-123286-0009 | 260-123288-0007
LibriSpeech-260 | 260-123286-0001 | 260-123286-0017 | 260-123288-0012 | 260-123286-0017 | 260-123288-0012
LibriSpeech-121 | 121-121726-0002 | 121-121726-0000 | 121-123852-0002 | 121-121726-0000 | 121-123852-0002
LibriSpeech-121 | 121-121726-0003 | 121-121726-0001 | 121-123852-0003 | 121-121726-0001 | 121-123852-0003