This page contains the following speech samples:

  1. Source (Normal): Natural speech (source speaker, male, normal)
  2. Source (Falsetto): Natural speech (source speaker, male, falsetto)
  3. Target: Natural speech (target speaker, female)
  4. Converted (Normal-MSE): Converted speech (Source (Normal) to Target, trained with MSE)
  5. Converted (Normal-GAN): Converted speech (Source (Normal) to Target, trained with GAN)
  6. Converted (Falsetto-MSE): Converted speech (Source (Falsetto) to Target, trained with MSE)
  7. Converted (Falsetto-GAN): Converted speech (Source (Falsetto) to Target, trained with GAN)

(1) Speech feature conversion + WORLD vocoder

No. | Source (Normal) | Source (Falsetto) | Target | Converted (Normal-MSE) | Converted (Normal-GAN) | Converted (Falsetto-MSE) | Converted (Falsetto-GAN)
1
2
3
4
5

(2) Spectral differential filtering

No. | Source (Normal) | Source (Falsetto) | Target | Converted (Normal-MSE) | Converted (Normal-GAN) | Converted (Falsetto-MSE) | Converted (Falsetto-GAN)
1
2
3
4
5