This page contains the following speech samples:
- Source (Normal): Natural speech (source speaker, male, normal)
- Source (Falsetto): Natural speech (source speaker, male, falsetto)
- Target: Natural speech (target speaker, female)
- Converted (Normal-MSE): Converted speech (Source (Normal) to Target, trained with MSE)
- Converted (Normal-GAN): Converted speech (Source (Normal) to Target, trained with GAN)
- Converted (Falsetto-MSE): Converted speech (Source (Falsetto) to Target, trained with MSE)
- Converted (Falsetto-GAN): Converted speech (Source (Falsetto) to Target, trained with GAN)
(1) Speech feature conversion + WORLD vocoder
| Sample | Source (Normal) | Source (Falsetto) | Target | Converted (Normal-MSE) | Converted (Normal-GAN) | Converted (Falsetto-MSE) | Converted (Falsetto-GAN) |
|---|---|---|---|---|---|---|---|
| 1 | | | | | | | |
| 2 | | | | | | | |
| 3 | | | | | | | |
| 4 | | | | | | | |
| 5 | | | | | | | |
(2) Spectral differential filtering
| Sample | Source (Normal) | Source (Falsetto) | Target | Converted (Normal-MSE) | Converted (Normal-GAN) | Converted (Falsetto-MSE) | Converted (Falsetto-GAN) |
|---|---|---|---|---|---|---|---|
| 1 | | | | | | | |
| 2 | | | | | | | |
| 3 | | | | | | | |
| 4 | | | | | | | |
| 5 | | | | | | | |