Yuki Saito



I'm a Ph.D. student in System #1 Lab. at the University of Tokyo.
I'm also a Research Fellow (DC1) of the Japan Society for the Promotion of Science (JSPS).
My research interests include speech synthesis, voice conversion, machine learning, and machine intelligence.

My CV is available [here].

Email: yuuki_saito {at} ipc.i.u-tokyo.ac.jp


Demonstrations:


Publications:

Journal Papers

  1. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Statistical parametric speech synthesis incorporating generative adversarial networks," IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 26, No. 1, pp. 84--96, Jan. 2018. (The 34th TELECOM System Technology Award for Students from TAF, IEEE Signal Processing Society Japan Student Journal Paper Award), (IEEE Xplore)
  2. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Voice conversion using input-to-output highway networks," IEICE Transactions on Information and Systems, Vol. E100-D, No. 8, pp. 1925--1928, Aug. 2017. (J-STAGE)

International Conferences (Peer-Reviewed)

  1. Hiroki Tamaru, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, and Hiroshi Saruwatari, "Generative moment matching network-based random modulation post-filter for DNN-based singing voice synthesis and neural double-tracking," Proc. ICASSP, pp. 7070--7074, Brighton, United Kingdom, May 2019. (accepted) (arXiv preprint, Demo)
  2. Masakazu Une, Yuki Saito, Shinnosuke Takamichi, Daichi Kitamura, Ryoichi Miyazaki, and Hiroshi Saruwatari, "Generative approach using the noise generation models for DNN-based speech synthesis trained from noisy speech," Proc. APSIPA ASC, pp. 99--103, Hawaii, U.S.A., Nov. 2018. (Invited Special Session), (PDF, Slide)
  3. Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, Daichi Kitamura, and Hiroshi Saruwatari, "Phase reconstruction from amplitude spectrograms based on von-Mises-distribution deep neural network," Proc. IWAENC, pp. 286--290, Tokyo, Japan, Sep. 2018. (PDF, Poster)
  4. Yuki Saito, Yusuke Ijima, Kyosuke Nishida, and Shinnosuke Takamichi, "Non-parallel voice conversion using variational autoencoders conditioned by phonetic posteriorgrams and d-vectors," Proc. ICASSP, pp. 5274--5278, Alberta, Canada, Apr. 2018. (Grants for Researchers Attending International Conferences from NEC C&C, Outstanding Paper Award for Young C&C Researchers) (PDF, Poster)
  5. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Text-to-speech synthesis using STFT spectra based on low-/multi-resolution generative adversarial networks," Proc. ICASSP, pp. 5299--5303, Alberta, Canada, Apr. 2018. (PDF, Poster)
  6. Hiroyuki Miyoshi, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Voice conversion using sequence-to-sequence learning of context posterior probabilities," Proc. INTERSPEECH, pp. 1268--1272, Stockholm, Sweden, Aug. 2017. (arXiv preprint, Slide, Speech samples)
  7. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Training algorithm to deceive anti-spoofing verification for DNN-based speech synthesis," Proc. ICASSP, pp. 4900--4904, New Orleans, U.S.A., Mar. 2017. (Spoken Language Processing Student Grant of ICASSP 2017), (PDF, Slide)
  8. Yuki Saito and Hiroshi Tenmoto, "Construction of highly interpretable classification rule based on linear SVM," Proc. ISTS, Taipei, Taiwan, Nov. 2014.

Technical Reports

  1. Hiroki Tamaru, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, and Hiroshi Saruwatari, "Generative moment matching network-based random modulation post-filter for singing voices synthesized using DNNs and its application to neural double-tracking," IPSJ SIG Technical Report, 2018-SLP-125, No. 1, pp. 1--6, Dec. 2018. (in Japanese) (PDF, Slide)
  2. Satoshi Mizoguchi, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Evaluation of DNN-based low-musical-noise speech enhancement using kurtosis matching," IEICE Technical Report, EA2018-xx, Vol. xxx, No. xxx, pp. xx--xx, Nov. 2018. (in Japanese) (PDF, Poster)
  3. Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, Daichi Kitamura, and Hiroshi Saruwatari, "Phase reconstruction from amplitude spectra based on von Mises distribution DNN," IPSJ SIG Technical Report, 2018-SLP-122, No. 1, pp. 1--6, June 2018. (in Japanese, Presentation Award) (PDF, Poster)
  4. Yuki Saito, Yusuke Ijima, Kyosuke Nishida, and Shinnosuke Takamichi, "Non-parallel and many-to-many voice conversion using variational autoencoder conditioned by phonetic posteriorgrams and d-vectors," IEICE Technical Report, SP2017-88, Vol. 117, No. 517, pp. 21--26, Mar. 2018. (in Japanese, 2017 IEICE ISS Young Researcher's Award in Speech Field) (PDF, Slide)
  5. Masakazu Une, Yuki Saito, Shinnosuke Takamichi, Daichi Kitamura, Ryoichi Miyazaki, and Hiroshi Saruwatari, "Generative adversarial training of the noise generation model for speech synthesis using speech in noise," IPSJ SIG Technical Report, 2017-SLP-118, No. 1, pp. 1--6, Oct. 2017. (in Japanese) (PDF, Slide)
  6. Hiroyuki Miyoshi, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Voice conversion using sequence-to-sequence learning of context posterior probabilities and evaluation of the dual learning," IEICE Technical Report, SP2017-16, Vol. 117, No. 160, pp. 9--14, Jul. 2017. (in Japanese) (PDF, Slide)
  7. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Training algorithm to deceive anti-spoofing verification for DNN-based text-to-speech synthesis," IPSJ SIG Technical Report, 2017-SLP-115, No. 1, pp. 1--6, Feb. 2017. (in Japanese) (PDF, Slide)
  8. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Evaluation of DNN-based voice conversion deceiving anti-spoofing verification," IEICE Technical Report, SP2016-69, Vol. 116, No. 414, pp. 29--34, Jan. 2017. (in Japanese, Student Poster Award) (PDF, Poster)

Domestic Conferences

  1. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "DNN-based speaker embedding considering subjective inter-speaker similarity towards DNN-based speech synthesis," Proc. ASJ, Spring meeting, 3-10-7, pp. xxx--xxx, Mar. 2019. (in Japanese) (PDF, Slide)
  2. Taiki Nakamura, Yuki Saito, Kyosuke Nishida, Yusuke Ijima, and Shinnosuke Takamichi, "Evaluation of VAE-based non-parallel and many-to-many voice conversion conditioned by phonetic posteriorgrams and d-vectors in terms of training data and dimensionality of d-vectors," Proc. ASJ, Spring meeting, 2-P-30, pp. xxx--xxx, Mar. 2019. (in Japanese) (PDF, Poster)
  3. Hiroki Tamaru, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, and Hiroshi Saruwatari, "Generative moment matching network-based random modulation post-filter for singing voices and its application to double-tracking," Proc. ASJ, Spring meeting, 2-10-5, pp. xxx--xxx, Mar. 2019. (in Japanese) (PDF)
  4. Satoshi Mizoguchi, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Low-musical-noise DNN-based speech enhancement applied to noise with various kurtosis," Proc. ASJ, Spring meeting, 1-6-6, pp. xxx--xxx, Mar. 2019. (in Japanese) (PDF)
  5. Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, Daichi Kitamura, and Hiroshi Saruwatari, "Phase reconstruction from amplitude spectrograms based on directional-statistics DNNs," Proc. ASJ, Autumn meeting, 2-4-2, pp. 1127--1130, Sep. 2018. (in Japanese) (PDF, Slide)
  6. Satoshi Mizoguchi, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Low-musical-noise speech enhancement based on DNNs and kurtosis matching," Proc. ASJ, Autumn meeting, 2-1-7, pp. 177--180, Sep. 2018. (in Japanese, Student Presentation Award) (PDF, Slide)
  7. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Adversarial DNN-based speech synthesis using multi-frequency resolution STFT spectra," Proc. ASJ, Spring meeting, 3-8-14, pp. 259--262, Mar. 2018. (in Japanese) (PDF, Slide)
  8. Masakazu Une, Yuki Saito, Shinnosuke Takamichi, Daichi Kitamura, Ryoichi Miyazaki, and Hiroshi Saruwatari, "Generative approach using the noise generation models for DNN-based speech synthesis trained from noisy speech," Proc. ASJ, Spring meeting, 3-8-8, pp. 243--244, Mar. 2018. (in Japanese) (PDF, Slide)
  9. Shinnosuke Takamichi, Tomoki Koriyama, Yuki Saito, and Hiroshi Saruwatari, "Evaluation of inter-utterance variation in speech synthesis based on moment-matching networks," Proc. ASJ, Autumn meeting, 1-8-9, pp. 195--196, Sep. 2017. (in Japanese) (PDF, Slide)
  10. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Experimental investigation of divergences in adversarial DNN-based speech synthesis," Proc. ASJ, Autumn meeting, 1-8-7, pp. 189--192, Sep. 2017. (in Japanese) (PDF, Slide)
  11. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "F0 contour and duration generation for adversarial DNN-based speech synthesis," Proc. ASJ, Spring meeting, 2-6-6, pp. 257--258, Mar. 2017. (in Japanese) (PDF, Slide)
  12. Hiroyuki Miyoshi, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Voice conversion using sequence-to-sequence learning of context posterior probabilities," Proc. ASJ, Spring meeting, 1-6-15, pp. 237--238, Mar. 2017. (in Japanese) (PDF, Slide)
  13. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Adversarial DNN-based voice conversion based on spectral differentials using highway networks," Proc. ASJ, Spring meeting, 1-6-14, pp. 235--236, Mar. 2017. (in Japanese, IEEE Signal Processing Society Tokyo Joint Chapter Student Award) (PDF, Slide)
  14. Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Training algorithm considering anti-spoofing verification for DNN-based speech synthesis," Proc. ASJ, Autumn meeting, 3-5-1, pp. 149--150, Sep. 2016. (in Japanese, Student Presentation Award) (PDF, Slide)

Dissertations

  1. Yuki Saito (supervisor: Prof. Hiroshi Saruwatari), "High-quality statistical parametric speech synthesis using generative adversarial networks," M.S. Thesis, Graduate School of Information Science and Technology, the University of Tokyo, 2018.


Competitive Funds:

  1. Grant-in-Aid for JSPS Fellows, the Japan Society for the Promotion of Science (JSPS), May 2018.
  2. Grants for Researchers Attending International Conferences from NEC C&C, Apr. 2018.

Awards:

  1. Outstanding Paper Award for Young C&C Researchers, Jan. 2019.
  2. The 12th IEEE Signal Processing Society Japan Student Journal Paper Award, Nov. 2018.
  3. 2017 IEICE ISS Young Researcher's Award in Speech Field, Aug. 2018.
  4. Partial Exemption from Repayment of Scholarship Loan for Students with Outstanding Results, Japan Student Services Organization (JASSO), May 2018.
  5. The 34th TELECOM System Technology Award for Students from TAF, Mar. 2018.
  6. The 1st IEEE Signal Processing Society Tokyo Joint Chapter Student Award, Nov. 2017.
  7. Spoken Language Processing Student Grant of ICASSP, Mar. 2017.
  8. 2017 IEICE ISS Student Poster Award, Jan. 2017.
  9. The 14th Best Student Presentation Award of ASJ, Mar. 2017.
  10. Graduation Research Award, Advanced Course of Electronic and Information Systems Engineering, National Institute of Technology, Kushiro College, Feb. 2016.
  11. Dean's Award, Department of Information Engineering, National Institute of Technology, Kushiro College, Mar. 2014.

Co-author's Awards:

  1. The 18th Best Student Presentation Award of ASJ, Mar. 2019. (Awardee: Satoshi Mizoguchi)
  2. IPSJ SIG-MUS/SLP Presentation Award, June 2018. (Awardee: Shinnosuke Takamichi)

Professional Activities:

  1. Paper Reviews for IEEE ICASSP (from 2018)
  2. Paper Reviews for IEEE Signal Processing Letters (from 2018)

Research and Work Experience:

  1. Short-term researcher at DeNA Co., Ltd., Japan, Sep. 2018--Mar. 2019. (Mentor: Kentaro Tachibana)
  2. Research Fellow (DC1) of the Japan Society for the Promotion of Science, Japan, Apr. 2018--Mar. 2021.
  3. Short-term researcher at NTT Media Intelligence Laboratories, NTT Corporation, Japan, Aug. 2017--Oct. 2017. (Mentor: Yusuke Ijima)
  4. Short-term researcher at NTT Communication Science Laboratories, NTT Corporation, Japan, Aug. 2016--Sep. 2016. (Mentor: Hirokazu Kameoka)

Volunteer Work:

  1. Acoustical Society of Japan (ASJ) Students and Young Researchers Forum, Organizing member (from Mar. 2017) and Vice President (from Apr. 2019 to Mar. 2021)


Education:

Misc.: