Adversarial DNN-Based Voice Conversion
This page contains the following speech samples:
Natural (source) : Natural speech of source speaker.
Natural (target) : Natural speech of target speaker.
MGE-FF : Minimum generation error training with feed-forward neural networks.
MGE-HW : Minimum generation error training with highway networks.
ADV-FF : Adversarial training with feed-forward neural networks.
ADV-HW : Adversarial training with highway networks.
References:
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari,
"Evaluation of DNN-Based Voice Conversion Deceiving Anti-spoofing Verification,"
IEICE Technical Report, SP2016-69, vol. 116, no. 414, Jan. 2017. (in Japanese, Student Poster Award) (PDF, Poster)
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari,
"Adversarial DNN-based voice conversion based on spectral differentials using highway networks,"
Proc. ASJ, Spring meeting, 1-6-14, pp. 235-236, Mar. 2017. (in Japanese) (PDF, Slide)
Samples
Each of the five utterances (1 to 5) is provided in six versions: Natural (source), Natural (target), MGE-FF, MGE-HW, ADV-FF, and ADV-HW.