This page contains the following speech samples:
- Source (Normal): Natural speech (source speaker, male, normal)
- Source (Falsetto): Natural speech (source speaker, male, falsetto)
- Target: Natural speech (target speaker, female)
- Converted (Normal-MSE): Converted speech (Source (Normal) to Target, trained with MSE)
- Converted (Normal-GAN): Converted speech (Source (Normal) to Target, trained with GAN)
- Converted (Falsetto-MSE): Converted speech (Source (Falsetto) to Target, trained with MSE)
- Converted (Falsetto-GAN): Converted speech (Source (Falsetto) to Target, trained with GAN)
(1) Speech feature conversion + WORLD vocoder
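| | Source (Normal) | Source (Falsetto) | Target | Converted (Normal-MSE) | Converted (Normal-GAN) | Converted (Falsetto-MSE) | Converted (Falsetto-GAN) |
|---|---|---|---|---|---|---|---|
| 1 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |
| 2 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |
| 3 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |
| 4 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |
| 5 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |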
(2) Spectral differential filtering
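| | Source (Normal) | Source (Falsetto) | Target | Converted (Normal-MSE) | Converted (Normal-GAN) | Converted (Falsetto-MSE) | Converted (Falsetto-GAN) |
|---|---|---|---|---|---|---|---|
| 1 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |
| 2 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |
| 3 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |
| 4 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |
| 5 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |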