MODELING PSEUDO-SPEAKER WITH UNCERTAINTY FOR SPEAKER ANONYMIZATION
Abstract: This paper proposes exploiting the uncertainty estimates of speaker attributes to anonymize speech utterances. The pseudo-speaker is assumed to follow a Gaussian distribution whose covariance measures the uncertainty of representing the speaker by the mean vector. Given the utterances of a selected set of cohort speakers, the pseudo-speaker distribution is estimated by minimizing its divergence from the posterior speaker distributions estimated from those utterances. A pseudo-speaker vector is then sampled from this distribution to serve as the pseudo-speaker representation for each utterance. Experiments on the datasets provided by the VoicePrivacy Challenge (VPC) demonstrate that the proposed pseudo-speaker vector is effective for de-identification when used in anonymization.
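The estimation and sampling steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state which divergence is minimized, so the sketch assumes the sum of KL divergences KL(q_i || p) from each cohort posterior q_i = N(m_i, S_i) to the pseudo-speaker p = N(mu, Sigma), for which the minimizer has a closed form (moment matching). The function names `fit_pseudo_speaker` and `sample_pseudo_speaker`, the full-covariance parameterization, and the toy inputs are all hypothetical.

```python
import numpy as np


def fit_pseudo_speaker(means, covs):
    """Fit N(mu, Sigma) minimizing sum_i KL(N(m_i, S_i) || N(mu, Sigma)).

    Assumption: this KL direction is chosen for illustration only; it
    yields the moment-matching solution computed below.
    means: (n, d) posterior means, covs: (n, d, d) posterior covariances.
    """
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mu = means.mean(axis=0)
    centered = means - mu
    # Spread of the posterior means around the pseudo-speaker mean.
    between = np.einsum("ni,nj->ij", centered, centered) / len(means)
    # Average posterior uncertainty plus the spread of the means.
    sigma = covs.mean(axis=0) + between
    return mu, sigma


def sample_pseudo_speaker(mu, sigma, rng):
    """Draw one pseudo-speaker vector for a single utterance."""
    return rng.multivariate_normal(mu, sigma)


# Toy cohort: two utterance-level posteriors in a 2-dimensional space.
cohort_means = [[0.0, 0.0], [2.0, 0.0]]
cohort_covs = [np.eye(2), np.eye(2)]
mu, sigma = fit_pseudo_speaker(cohort_means, cohort_covs)
pseudo_vector = sample_pseudo_speaker(mu, sigma, np.random.default_rng(0))
```

Under this assumption the covariance combines the average per-utterance uncertainty with the spread of the cohort means, so a more diverse or less certain cohort produces more varied pseudo-speaker vectors.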
Anonymization samples:
Samples from four speakers are presented below, two from VCTK and two from LibriSpeech. For each speaker, a recording is given for reference, followed by the anonymized speech utterances generated with the baseline and with our proposed method. As configured by the VoicePrivacy Challenge, the utterances of an original speaker are anonymized with different cohort speakers in enrollment and in trial. Accordingly, the anonymized enrollment and trial utterances are presented separately for each speaker.