-
Kazuki Yamauchi, Wataru Nakata, Yuki Saito, and Hiroshi Saruwatari,
"Decoding strategy with perceptual rating prediction for language model-based text-to-speech synthesis,"
Proc. NeurIPS Audio Imagination Workshop, pp. xxxx--xxxx, Vancouver, Canada, Dec. 2024. (ACCEPTED)
-
Wataru Nakata, Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"NecoBERT: Self-supervised learning model trained by masked language modeling on rich acoustic features derived from neural audio codec,"
Proc. APSIPA ASC, pp. xxxx--xxxx, Macau, China, Dec. 2024. (ACCEPTED)
-
Yuto Ishikawa, Osamu Take, Tomohiko Nakamura, Norihiro Takamune, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Real-time noise estimation for Lombard-effect speech synthesis in human--avatar dialogue systems,"
Proc. APSIPA ASC, pp. xxxx--xxxx, Macau, China, Dec. 2024. (ACCEPTED)
-
Kaito Baba, Wataru Nakata, Yuki Saito, and Hiroshi Saruwatari,
"The T05 system for The VoiceMOS Challenge 2024: Transfer learning from deep image classifier to naturalness MOS prediction of high-quality synthetic speech,"
Proc. SLT, pp. xxxx--xxxx, Macau, China, Dec. 2024. (ACCEPTED)
-
Kazuki Yamauchi, Yuki Saito, and Hiroshi Saruwatari,
"Cross-dialect text-to-speech in pitch-accent language incorporating multi-dialect phoneme-level BERT,"
Proc. SLT, pp. xxxx--xxxx, Macau, China, Dec. 2024. (ACCEPTED)
-
Dong Yang, Tomoki Koriyama, and Yuki Saito,
"Frame-wise breath detection with self-training: An exploration of enhancing breath naturalness in text-to-speech,"
Proc. INTERSPEECH, pp. 4928--4932, Kos, Greece, Sep. 2024. (PDF, Poster) (Shortlisted for the ISCA Best Student Paper Award 2024)
-
Takuto Igarashi, Yuki Saito, Kentaro Seki, Shinnosuke Takamichi, Ryuichi Yamamoto, Kentaro Tachibana, and Hiroshi Saruwatari,
"Noise-robust voice conversion by conditional denoising training using latent variables of recording quality and environment,"
Proc. INTERSPEECH, pp. 2750--2754, Kos, Greece, Sep. 2024. (PDF, Poster)
-
Yuki Saito, Takuto Igarashi, Kentaro Seki, Shinnosuke Takamichi, Ryuichi Yamamoto, Kentaro Tachibana, and Hiroshi Saruwatari,
"SRC4VC: Smartphone-recorded corpus for voice conversion benchmark,"
Proc. INTERSPEECH, pp. 1825--1829, Kos, Greece, Sep. 2024. (PDF, Poster)
-
Kentaro Seki, Shinnosuke Takamichi, Norihiro Takamune, Yuki Saito, Kanami Imamura, and Hiroshi Saruwatari,
"Spatial voice conversion: Voice conversion preserving spatial information and non-target signals,"
Proc. INTERSPEECH, pp. 177--181, Kos, Greece, Sep. 2024. (PDF, Slide)
-
Kazuki Yamauchi, Yusuke Ijima, and Yuki Saito,
"StyleCap: Automatic speaking-style captioning from speech based on speech and language self-supervised learning models,"
Proc. ICASSP, 5 pages, Seoul, South Korea, Apr. 2024. (PDF, Poster)
-
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Wataru Nakata, Detai Xin, and Hiroshi Saruwatari,
"Coco-Nut: Corpus of Japanese utterances and voice characteristics description for prompt-based control,"
Proc. ASRU, pp. 781--788, Taipei, Taiwan, Dec. 2023. (PDF, Project page, Poster)
-
Ryunosuke Hirai, Yuki Saito, and Hiroshi Saruwatari,
"Federated learning for human-in-the-loop many-to-many voice conversion,"
Proc. The 12th ISCA SSW, 6 pages, Grenoble, France, Aug. 2023. (OpenReview)
-
Yuki Saito, Eiji Iimori, Shinnosuke Takamichi, Kentaro Tachibana, and Hiroshi Saruwatari,
"CALLS: Japanese empathetic dialogue speech corpus of complaint handling and attentive listening in customer center,"
Proc. INTERSPEECH, pp. 5561--5565, Dublin, Ireland, Aug. 2023. (Demo, Poster) (Travel Grant Award for INTERSPEECH2023)
-
Yota Ueda, Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, and Hiroshi Saruwatari,
"HumanDiffusion: diffusion model using perceptual gradients,"
Proc. INTERSPEECH, pp. 4264--4268, Dublin, Ireland, Aug. 2023. (Poster)
-
Yuki Saito, Shinnosuke Takamichi, Eiji Iimori, Kentaro Tachibana, and Hiroshi Saruwatari,
"ChatGPT-EDSS: empathetic dialogue speech synthesis trained from ChatGPT-derived context word embeddings,"
Proc. INTERSPEECH, pp. 3048--3052, Dublin, Ireland, Aug. 2023. (Demo, Slide) (Travel Grant Award for INTERSPEECH2023)
-
Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, and Hiroshi Saruwatari,
"Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech,"
Proc. ICASSP, 5 pages, Rhodes, Greece, Jun. 2023. (Demo)
-
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Detai Xin, and Hiroshi Saruwatari,
"Mid-attribute speaker generation using optimal-transport-based interpolation of Gaussian mixture models,"
Proc. ICASSP, 5 pages, Rhodes, Greece, Jun. 2023. (Demo)
-
Kazuki Fujii, Yuki Saito, and Hiroshi Saruwatari,
"Adaptive end-to-end text-to-speech synthesis based on error correction feedback from humans,"
Proc. APSIPA ASC, pp. 1699--1674, Chiang Mai, Thailand, Nov. 2022. (PDF, Slide)
-
Yusuke Nakai, Yuki Saito, Kenta Udagawa, and Hiroshi Saruwatari,
"Multi-task adversarial training algorithm for multi-speaker neural text-to-speech,"
Proc. APSIPA ASC, pp. 744--749, Chiang Mai, Thailand, Nov. 2022. (PDF, Slide)
-
Yuki Saito, Yuto Nishimura, Shinnosuke Takamichi, Kentaro Tachibana, and Hiroshi Saruwatari,
"STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent,"
Proc. INTERSPEECH, pp. 5155--5159, Incheon, South Korea, Sep. 2022. (PDF, Speech samples, Poster)
-
Wataru Nakata, Tomoki Koriyama, Shinnosuke Takamichi, Yuki Saito, Yusuke Ijima, Ryo Masumura, and Hiroshi Saruwatari,
"Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis,"
Proc. INTERSPEECH, pp. 4551--4555, Incheon, South Korea, Sep. 2022. (PDF, Speech samples, Poster)
-
Yuto Nishimura, Yuki Saito, Shinnosuke Takamichi, Kentaro Tachibana, and Hiroshi Saruwatari,
"Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History,"
Proc. INTERSPEECH, pp. 3373--3377, Incheon, South Korea, Sep. 2022. (Google Travel Grants for Students in East Asia) (PDF, Speech samples, Slide)
-
Kenta Udagawa, Yuki Saito, and Hiroshi Saruwatari,
"Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS,"
Proc. INTERSPEECH, pp. 2968--2972, Incheon, South Korea, Sep. 2022. (PDF, Speech samples, Poster)
-
Xuan Luo, Shinnosuke Takamichi, Tomoki Koriyama, Yuki Saito, and Hiroshi Saruwatari,
"Emotion-controllable speech synthesis using emotion soft labels and fine-grained prosody factors,"
Proc. APSIPA ASC, pp. 794--799, Tokyo, Japan, Dec. 2021. (PDF, Speech samples)
-
Detai Xin, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, and Hiroshi Saruwatari,
"Cross-lingual speaker adaptation using domain adaptation and speaker consistency loss for text-to-speech synthesis,"
Proc. INTERSPEECH, pp. 1614--1618, Brno, Czech Republic, Sep. 2021. (PDF)
-
Yota Ueda, Kazuki Fujii, Yuki Saito, Shinnosuke Takamichi, Yukino Baba, and Hiroshi Saruwatari,
"HumanACGAN: conditional generative adversarial network with human-based auxiliary classifier and its evaluation in phoneme perception,"
Proc. ICASSP, pp. 6468--6472, Toronto, Canada, Jun. 2021. (PDF, arXiv preprint, Poster)
-
Yuki Yamashita, Tomoki Koriyama, Yuki Saito, Shinnosuke Takamichi, Yusuke Ijima, Ryo Masumura, and Hiroshi Saruwatari,
"Investigating effective additional contextual factors in DNN-based spontaneous speech synthesis,"
Proc. INTERSPEECH, pp. 3201--3205, Shanghai, China, Oct. 2020. (PDF)
-
Detai Xin, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, and Hiroshi Saruwatari,
"Cross-lingual text-to-speech synthesis via domain adaptation and perceptual similarity regression in speaker space,"
Proc. INTERSPEECH, pp. 2947--2951, Shanghai, China, Oct. 2020. (PDF, Speech samples)
-
Shunsuke Goto, Kotaro Ohnishi, Yuki Saito, Kentaro Tachibana, and Koichiro Mori,
"Face2Speech: towards multi-speaker text-to-speech synthesis using an embedding vector predicted from a face image,"
Proc. INTERSPEECH, pp. 1321--1325, Shanghai, China, Oct. 2020. (PDF, Demo)
-
Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Real-time, full-band, online DNN-based voice conversion system using a single CPU,"
Proc. INTERSPEECH, pp. 1021--1022, Shanghai, China, Oct. 2020. (PDF, Video)
-
Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"SMASH corpus: a spontaneous speech corpus recording third-person audio commentaries on gameplay,"
Proc. LREC, pp. 6573--6579, Marseille, France, May 2020. (PDF)
-
Yuki Yamashita, Tomoki Koriyama, Yuki Saito, Shinnosuke Takamichi, Yusuke Ijima, Ryo Masumura, and Hiroshi Saruwatari,
"DNN-based speech synthesis using abundant tags of spontaneous speech corpus,"
Proc. LREC, pp. 6440--6445, Marseille, France, May 2020. (PDF)
-
Kazuki Fujii, Yuki Saito, Shinnosuke Takamichi, Yukino Baba, and Hiroshi Saruwatari,
"HumanGAN: generative adversarial network with human-based discriminator and its evaluation in speech perception modeling,"
Proc. ICASSP, pp. 6239--6243, Barcelona, Spain, May 2020. (Main contribution paper for FujiSankei Business i Awards, Main contribution paper for National Institute of Technology Student Award) (PDF, arXiv preprint, Video)
-
Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Lifter training and sub-band modeling for computationally efficient and high-quality voice conversion using spectral differentials,"
Proc. ICASSP, pp. 7784--7788, Barcelona, Spain, May 2020. (PDF, arXiv preprint, Video)
-
Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"DNN-based speaker embedding using subjective inter-speaker similarity for multi-speaker modeling in speech synthesis,"
Proc. The 10th ISCA SSW, pp. 51--56, Vienna, Austria, Sep. 2019. (PDF, arXiv preprint, Poster)
-
Taiki Nakamura, Yuki Saito, Shinnosuke Takamichi, Yusuke Ijima, and Hiroshi Saruwatari,
"V2S attack: building DNN-based voice conversion from automatic speaker verification,"
Proc. The 10th ISCA SSW, pp. 161--165, Vienna, Austria, Sep. 2019. (PDF, arXiv preprint, Poster)
-
Hiroki Tamaru, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, and Hiroshi Saruwatari,
"Generative moment matching network-based random modulation post-filter for DNN-based singing voice synthesis and neural double-tracking,"
Proc. ICASSP, pp. 7070--7074, Brighton, United Kingdom, May 2019. (PDF, arXiv preprint, Poster, Demo)
-
Masakazu Une, Yuki Saito, Shinnosuke Takamichi, Daichi Kitamura, Ryoichi Miyazaki, and Hiroshi Saruwatari,
"Generative approach using the noise generation models for DNN-based speech synthesis trained from noisy speech,"
Proc. APSIPA ASC, pp. 99--103, Hawaii, U.S.A., Nov. 2018. (Invited Special Session) (PDF, Slide)
-
Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, Daichi Kitamura, and Hiroshi Saruwatari,
"Phase reconstruction from amplitude spectrograms based on von-Mises-distribution deep neural network,"
Proc. IWAENC, pp. 286--290, Tokyo, Japan, Sep. 2018. (PDF, Poster)
-
Yuki Saito, Yusuke Ijima, Kyosuke Nishida, and Shinnosuke Takamichi,
"Non-parallel voice conversion using variational autoencoders conditioned by phonetic posteriorgrams and d-vectors,"
Proc. ICASSP, pp. 5274--5278, Alberta, Canada, Apr. 2018. (Grants for Researchers Attending International Conferences from NEC C&C, Outstanding Paper Award for Young C&C Researchers) (PDF, Poster)
-
Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Text-to-speech synthesis using STFT spectra based on low-/multi-resolution generative adversarial networks,"
Proc. ICASSP, pp. 5299--5303, Alberta, Canada, Apr. 2018. (PDF, Poster)
-
Hiroyuki Miyoshi, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Voice conversion using sequence-to-sequence learning of context posterior probabilities,"
Proc. INTERSPEECH, pp. 1268--1272, Stockholm, Sweden, Aug. 2017. (PDF, Slide, Speech samples)
-
Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Training algorithm to deceive anti-spoofing verification for DNN-based speech synthesis,"
Proc. ICASSP, pp. 4900--4904, New Orleans, U.S.A., Mar. 2017. (Spoken Language Processing Student Grant of ICASSP 2017) (PDF, Slide)
-
Yuki Saito and Hiroshi Tenmoto,
"Construction of highly interpretable classification rule based on linear SVM,"
Proc. ISTS, Taipei, Taiwan, Nov. 2014.
-
Taisei Takano, Yuki Okamoto, and Yuki Saito,
"Performance analysis on CLAP-Score for text-to-audio evaluation,"
YANS2024, Sep. 2024. (Poster) (YANS2024 IVRy Award)
-
Kazuki Yamauchi, Wataru Nakata, Yuki Saito, and Hiroshi Saruwatari,
"Decoding strategy with subjective speech quality prediction for discrete-token-based text-to-speech,"
IPSJ SIG Technical Report, 2024-SLP-152, No. 14, pp. 1--6, Jun. 2024. (in Japanese, PDF, Poster) (2024 Otogaku Symposium Best Presentation Award)
-
Wataru Nakata*, Kazuki Yamauchi*, Dong Yang, Hiroaki Hyodo, and Yuki Saito,
"UTDUSS: UTokyo-SaruLab System for Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge,"
Technical Report for Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge, 5 pages, Mar. 2024. (arXiv, *: equal contribution) (Ranked 1st in TTS (Acoustic+Vocoder) track, Leaderboard)
-
Kazuki Yamauchi, Yuki Saito, and Hiroshi Saruwatari,
"Multi-dialect text-to-speech using VQVAE-derived interpretable accent latent variables,"
SP2023-80, Vol. 123, No. 403, pp. 220--225, Jun. 2024. (in Japanese, Student Poster Award) (PDF, Poster)
-
Yuki Oda, Kazuki Yamauchi, Yuki Saito, and Hiroshi Saruwatari,
"Dialect adaptation of Japanese end-to-end text-to-speech based on crowdsourced dialect accent labels,"
SPEASIP Workshop Short Oral Presentation, Vol. 123, No. 403, Jun. 2024. (in Japanese)
-
Yuki Saito, Takuto Igarashi, Kentaro Seki, Shinnosuke Takamichi, Ryuichi Yamamoto, Kentaro Tachibana, and Hiroshi Saruwatari,
"SRC4VC: Smartphone-recorded corpus for benchmarking multi-speaker voice conversion models,"
SPEASIP Workshop Short Oral Presentation, Vol. 123, No. 403, Jun. 2024. (in Japanese)
-
Takuto Igarashi, Yuki Saito, Kentaro Seki, Shinnosuke Takamichi, Ryuichi Yamamoto, Kentaro Tachibana, and Hiroshi Saruwatari,
"Noise-robust voice conversion using denoising training using recording quality and environment as conditional features,"
SP2023-45, Vol. 123, No. 403, pp. 13--18, Mar. 2024. (in Japanese, PDF)
-
Miyu Okamoto, Kentaro Seki, Shinnosuke Takamichi, Yuki Saito, and Takayuki Itoh,
"ImTTS: Multi-speaker text-to-speech system with visualization of impression estimation,"
NICOGRAPH 2023, 2 pages, P-9, Dec. 2023. (in Japanese, Peer Reviewed)
-
Yuki Saito, Shinnosuke Takamichi, Eiji Iimori, Kentaro Tachibana, and Hiroshi Saruwatari,
"ChatGPT-EDSS: Acoustic modeling for empathetic dialogue speech synthesis using ChatGPT-derived context word embeddings,"
IPSJ SIG Technical Report, 2023-SLP-147, No. 6, pp. 1--6, Jun. 2023. (in Japanese, PDF, Poster) (2023 Otogaku Symposium Best Presentation Award)
-
Junichi Kumada, Yuki Saito, Shinnosuke Takamichi, Aya Watanabe, Naoko Tanji, Mizuki Nagano, Yusuke Ijima, and Hiroshi Saruwatari,
"Analysis and evaluation towards sleep-inducing voice synthesis,"
IPSJ SIG Technical Report, 2023-SLP-147, No. 5, pp. 1--5, Jun. 2023. (in Japanese, PDF, Poster)
-
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, and Hiroshi Saruwatari,
"In-the-wild sentence data collection method towards voice characteristic control by free-form text script,"
IEICE Technical Report, NLC2022-29, Vol. 122, No. 449, pp. 55--60, Mar. 2023. (in Japanese, PDF)
-
Yuki Saito, Eiji Iimori, Shinnosuke Takamichi, Kentaro Tachibana, and Hiroshi Saruwatari,
"Corpus construction towards multi-domain empathetic dialogue speech synthesis,"
SPEASIP Workshop Short Oral Presentation, Vol. 122, No. 389, Mar. 2023. (in Japanese, Slide)
-
Ryunosuke Hirai, Yuki Saito, and Hiroshi Saruwatari,
"Fed-StarGANv2-VC: many-to-many voice conversion based on federated learning,"
IPSJ SIG Technical Report, 2023-SLP-146, No. 11, pp. 1--6, Mar. 2023. (in Japanese, PDF, Slide) (2023 IPSJ SIG-SLP Best Student Paper Award (Fairy Devices Award))
-
Yuki Saito and Hiroshi Sato,
"Report on Participation in Interspeech2022,"
IPSJ SIG Technical Report, 2022-SLP-144, No. 14, p. 1, Nov. 2022. (in Japanese, Slide)
-
Yuto Nishimura, Yuki Saito, Shinnosuke Takamichi, Kentaro Tachibana, and Hiroshi Saruwatari,
"Empathetic dialogue synthesis considering textual and prosodic information of dialogue history,"
IPSJ SIG Technical Report, 2022-SLP-140, No. 16, pp. 1--6, Mar. 2022. (in Japanese, PDF, Speech samples, Slide)
-
Yusuke Nakai, Kenta Udagawa, Yuki Saito, and Hiroshi Saruwatari,
"Training algorithm for multi-speaker TTS considering adversarial regularizer,"
IEICE Technical Report, SP2021-57, Vol. 121, No. 385, pp. 50--55, Mar. 2022. (in Japanese, PDF, Speech samples, Slide)
-
Wataru Nakata, Tomoki Koriyama, Shinnosuke Takamichi, Yuki Saito, Yusuke Ijima, Ryo Masumura, and Hiroshi Saruwatari,
"Multi-speaker audiobook speech synthesis using discrete character acting styles acquired by VQVAE,"
IEICE Technical Report, SP2021-47, Vol. 121, No. 282, pp. 42--47, Dec. 2021. (in Japanese, PDF, Slide, Speech samples)
-
Kazuki Fujii, Yuki Saito, and Hiroshi Saruwatari,
"Japanese non-autoregressive end-to-end text-to-speech synthesis conditioned by prosodic information,"
IPSJ SIG Technical Report, 2021-SLP-138, No. 16, pp. 1--6, Oct. 2021. (in Japanese, PDF, Slide)
-
Kenta Udagawa, Yuki Saito, and Hiroshi Saruwatari,
"Speech synthesis adaptation based on human speech perception feedback,"
IEICE Technical Report, SP2021-33, Vol. 121, No. 202, pp. 46--51, Oct. 2021. (in Japanese, PDF, Slide, Speech samples)
-
Masaki Kurata, Shinnosuke Takamichi, Takaaki Saeki, Riku Arakawa, Yuki Saito, Keita Higuchi, and Hiroshi Saruwatari,
"A method for obtaining speaking characteristics based on real-time DNN-based voice conversion feedback,"
IPSJ SIG Technical Report, 2021-SLP-136, No. 31, pp. 1--6, Mar. 2021. (in Japanese, PDF, Slide)
-
Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Active learning for DNN-based speaker embedding considering subjective inter-speaker similarity,"
IPSJ SIG Technical Report, 2021-SLP-136, No. 30, pp. 1--6, Mar. 2021. (in Japanese, 2021 IPSJ SIG-SLP Best Student Paper Award (Yahoo! JAPAN Award)) (PDF, Slide)
-
Kazuki Fujii, Yuki Saito, Shinnosuke Takamichi, Yukino Baba, and Hiroshi Saruwatari,
"HumanGAN: generative adversarial networks based on human perception evaluation and its application in speech naturalness modeling,"
IEICE Technical Report, SP2020-06, Vol. 120, No. 57, pp. 15--20, Jun. 2020. (in Japanese, Student Poster Award) (PDF)
-
Satoshi Naitou, Yuki Saito, Shinnosuke Takamichi, Yasuyuki Saito, and Hiroshi Saruwatari,
"Automatic estimation of breath position for singing VOCALOID song,"
IPSJ SIG Technical Report, 2020-MUS-127, No. 33, pp. 1--6, Jun. 2020. (in Japanese, PDF)
-
Yuki Yamashita, Tomoki Koriyama, Yuki Saito, Shinnosuke Takamichi, Yusuke Ijima, Ryo Masumura, and Hiroshi Saruwatari,
"The effectiveness of additional context in DNN-based spontaneous speech synthesis,"
IEICE Technical Report, SP2019-61, Vol. 119, No. 441, pp. 65--70, Mar. 2020. (in Japanese, PDF)
-
Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Lifter training and sub-band modeling for DNN-based voice conversion using spectral differentials,"
IPSJ SIG Technical Report, 2020-SLP-131, No. 2, pp. 1--6, Feb. 2020. (in Japanese, PDF, Slide)
-
Kazuki Fujii, Yuki Saito, Shinnosuke Takamichi, Yukino Baba, and Hiroshi Saruwatari,
"HumanGAN: generative adversarial networks trained with human perception evaluation,"
Information-based Induction Sciences (IBIS) Workshop 2019, 2-037, Nov. 2019. (in Japanese, Poster)
-
Taiki Nakamura, Yuki Saito, Shinnosuke Takamichi, Yusuke Ijima, and Hiroshi Saruwatari,
"Speaker V2S attack: statistical voice conversion built from speaker verification and its evaluation on speaker spoofing attack,"
Computer Security Symposium (CSS) 2019, 2E1-2, pp. 697--703, Oct. 2019. (in Japanese, PDF, Slide)
-
Shinnosuke Takamichi, Kentaro Mitsui, Yuki Saito, Tomoki Koriyama, Naoko Tanji, and Hiroshi Saruwatari,
"JVS corpus: online available Japanese versatile speech corpus,"
IPSJ SIG Technical Report, 2019-SLP-129, No. 1, pp. 1--6, Oct. 2019. (in Japanese, PDF, Slide)
-
Hiroki Tamaru, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, and Hiroshi Saruwatari,
"Generative moment matching network-based random modulation post-filter for singing voices synthesized using DNNs and its application to neural double-tracking,"
IPSJ SIG Technical Report, 2018-SLP-125, No. 1, pp. 1--6, Dec. 2018. (in Japanese, PDF, Slide)
-
Satoshi Mizoguchi, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Evaluation of DNN-based low-musical-noise speech enhancement using kurtosis matching,"
IEICE Technical Report, EA2018-66, Vol. 118, No. 312, pp. 19--24, Nov. 2018. (in Japanese, PDF, Poster)
-
Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, Daichi Kitamura, and Hiroshi Saruwatari,
"Phase reconstruction from amplitude spectra based on von Mises distribution DNN,"
IPSJ SIG Technical Report, 2018-SLP-122, No. 1, pp. 1--6, Jun. 2018. (in Japanese, 2018 Otogaku Symposium Best Presentation Award, IPSJ Yamashita SIG Research Award) (PDF, Poster)
-
Yuki Saito, Yusuke Ijima, Kyosuke Nishida, and Shinnosuke Takamichi,
"Non-parallel and many-to-many voice conversion using variational autoencoder conditioned by phonetic posteriorgrams and d-vectors,"
IEICE Technical Report, SP2017-88, Vol. 117, No. 517, pp. 21--26, Mar. 2018. (in Japanese, 2017 IEICE ISS Young Researcher's Award in Speech Field) (PDF, Slide)
-
Masakazu Une, Yuki Saito, Shinnosuke Takamichi, Daichi Kitamura, Ryoichi Miyazaki, and Hiroshi Saruwatari,
"Generative adversarial training of the noise generation model for speech synthesis using speech in noise,"
IPSJ SIG Technical Report, 2017-SLP-118, No. 1, pp. 1--6, Oct. 2017. (in Japanese, PDF, Slide)
-
Hiroyuki Miyoshi, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Voice conversion using sequence-to-sequence learning of context posterior probabilities and evaluation of the dual learning,"
IEICE Technical Report, SP2017-16, Vol. 117, No. 160, pp. 9--14, Jul. 2017. (in Japanese, PDF, Slide)
-
Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Training algorithm to deceive anti-spoofing verification for DNN-based text-to-speech synthesis,"
IPSJ SIG Technical Report, 2017-SLP-115, No. 1, pp. 1--6, Feb. 2017. (in Japanese, PDF, Slide)
-
Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari,
"Evaluation of DNN-based voice conversion deceiving anti-spoofing verification,"
IEICE Technical Report, SP2016-69, Vol. 116, No. 414, pp. 29--34, Jan. 2017. (in Japanese, Student Poster Award) (PDF, Poster)
-
Google-initiated Research Grant, 30,000 USD, Nov. 2023--Oct. 2024. (Representative: Yuki Saito)
-
Japan Science and Technology Agency, ACT-X, 4,500,000 JPY, Oct. 2023--Mar. 2026. (Representative: Yuki Saito)
-
Travel Grant Award for INTERSPEECH2023, 750 EUR, Aug. 2023.
-
Research Grant (S) from Tateisi Science and Technology Foundation, 30,000,000 JPY, Apr. 2023--Mar. 2026. (Representative: Hiroshi Saruwatari)
-
Grant-in-Aid for Young Scientists, Japan Society for the Promotion of Science (JSPS), 3,600,000 JPY, Apr. 2022--Mar. 2025. (Representative: Yuki Saito)
-
Research Grant (A) from Tateisi Science and Technology Foundation, 2,200,000 JPY, Apr. 2022--Mar. 2023. (Representative: Yuki Saito)
-
Grant-in-Aid for Research Activity Start-up, Japan Society for the Promotion of Science (JSPS), 2,400,000 JPY, Sep. 2021--Mar. 2023. (Representative: Yuki Saito)
-
KIOXIA Incentive Research, 1,000,000 JPY, Jun. 2021--Mar. 2022. (Representative: Yuki Saito)
-
Grant-in-Aid for JSPS Fellows, Japan Society for the Promotion of Science (JSPS), 2,500,000 JPY, May 2018--Mar. 2021. (Representative: Yuki Saito)
-
Grants for Researchers Attending International Conferences from NEC C&C, 250,000 JPY, Apr. 2018.
-
Winners of The INTERSPEECH2024 Discrete Speech Challenge (TTS Track), Sep. 2024.
-
2024 IPSJ Yamashita SIG Research Award, Jul. 2024.
-
The 40th Inoue Research Award for Young Scientists, Feb. 2024.
-
Travel Grant Award for INTERSPEECH2023, Aug. 2023.
-
2023 Otogaku Symposium Best Presentation Award, Jun. 2023.
-
The 22nd Funai Information Technology Award for Young Researchers, May 2023.
-
2021 IEICE Journal Paper Award, Jun. 2022.
-
2021 IPSJ SIG-SLP Best Student Paper Award (Yahoo! JAPAN Award), Mar. 2022.
-
2020 IEEE SPS Young Author Best Paper Award, Jun. 2021.
-
Dean's Award, Graduate School of Information Science and Technology, The University of Tokyo, Mar. 2021.
-
The 49th Awaya Prize Young Researcher Award of ASJ, Mar. 2021.
-
Outstanding Paper Award for Young C&C Researchers, Jan. 2019.
-
The 12th IEEE Signal Processing Society Japan Student Journal Paper Award, Nov. 2018.
-
2017 IEICE ISS Young Researcher's Award in Speech Field, Aug. 2018.
-
Partial Exemption from Repayment of Scholarship Loan for Students with Outstanding Results, Japan Student Services Organization (JASSO), May 2018.
-
The 34th TELECOM System Technology Award for Students from TAF, Mar. 2018.
-
The 1st IEEE Signal Processing Society Tokyo Joint Chapter Student Award, Nov. 2017.
-
Spoken Language Processing Student Grant of ICASSP, Mar. 2017.
-
2017 IEICE ISS Student Poster Award, Jan. 2017.
-
The 14th Best Student Presentation Award of ASJ, Mar. 2017.
-
Graduation Research Award, Advanced Course of Electronic and Information Systems Engineering, National Institute of Technology, Kushiro College, Feb. 2016.
-
Dean's Award, Department of Information Engineering, National Institute of Technology, Kushiro College, Mar. 2014.
-
YANS2024 IVRy Award, Sep. 2024. (Awardee: Taisei Takano)
-
The 28th Best Student Presentation Award of ASJ, Sep. 2024. (Awardee: Kazuki Yamauchi)
-
Shortlisted for the ISCA Best Student Paper Award 2024, Aug. 2024. (Awardee: Dong Yang)
-
2024 Otogaku Symposium Best Presentation Award, Jun. 2024. (Awardee: Kazuki Yamauchi)
-
2024 IEICE ISS Student Poster Award, Mar. 2024. (Awardee: Kazuki Yamauchi)
-
2023 IPSJ SIG-SLP Best Student Paper Award (Fairy Devices Award), Mar. 2024. (Awardee: Ryunosuke Hirai)
-
The 27th Best Student Presentation Award of ASJ, Mar. 2024. (Awardee: Aya Watanabe)
-
Google Travel Grants for Students in East Asia, Jul. 2022. (Awardee: Yuto Nishimura)
-
National Institute of Technology Student Award, Mar. 2021. (Awardee: Kazuki Fujii)
-
IPSJ SIG-MUS/SLP Student Poster Award, Jun. 2020. (Awardee: Kazuki Fujii)
-
FujiSankei Business i Awards, Jun. 2020. (Awardee: Kazuki Fujii)
-
IPSJ Yamashita SIG Research Award, Mar. 2020. (Awardee: Shinnosuke Takamichi)
-
The 3rd IEEE Signal Processing Society Tokyo Joint Chapter Student Award, Dec. 2019. (Awardee: Hiroki Tamaru)
-
The 18th Best Student Presentation Award of ASJ, Mar. 2019. (Awardee: Satoshi Mizoguchi)
-
2018 Otogaku Symposium Best Presentation Award, Jun. 2018. (Awardee: Shinnosuke Takamichi)
-
Paper Reviews for Acoustical Science and Technology (from 2024)
-
Paper Reviews for Computer Speech and Language (from 2023)
-
Paper Reviews for Journal of the Audio Engineering Society (from 2022)
-
Paper Reviews for IEICE Transactions on Information and Systems (from 2022)
-
Paper Reviews for Journal of Information Processing (from 2022)
-
Paper Reviews for APSIPA Transactions on Signal and Information Processing (from 2021)
-
Paper Reviews for EURASIP Journal on Audio Speech and Music Processing (from 2021)
-
Paper Reviews for INTERSPEECH (from 2021)
-
Paper Reviews for IEEE Access (from 2021)
-
Paper Reviews for IEEE/ACM Transactions on Audio, Speech, and Language Processing (from 2020)
-
Paper Reviews for IEEE MLSP (from 2019)
-
Paper Reviews for IEEE Signal Processing Letters (from 2018)
-
Paper Reviews for IEEE ICASSP (from 2018)
-
Lecturer of The University of Tokyo, Japan. Apr. 1, 2024--XX. (Lab. page)
-
Assistant Professor of The University of Tokyo, Japan. Apr. 1, 2023--Mar. 31, 2024. (Lab. page)
-
Project Research Associate of The University of Tokyo, Japan. Apr. 1, 2021--Mar. 31, 2023. ("Research and Development on Acoustic Information Processing and Voice Conversion," Moonshot Research & Development Program of Japan Science and Technology Agency, Representative: Hiroshi Saruwatari) (Project)
-
Research assistant of The University of Tokyo, Japan. Apr. 1, 2019--Mar. 31, 2021. ("Stress-free, real-time, and full-band voice conversion based on perceptual models," executed under the Commissioned Research of MIC SCOPE 182103104, Representative: Shinnosuke Takamichi) (Project)
-
Short-time researcher in DeNA Co., Ltd., Japan, Oct. 1, 2018--Mar. 31, 2019 & June 1, 2019--Mar. 31, 2020. (Mentor: Kentaro Tachibana)
-
Research fellow (DC1) of Japan Society for the Promotion of Science, Japan, Apr. 1, 2018--Mar. 31, 2021. ("Active speech synthesis based on listener perceptual modeling," JSPS KAKENHI 18J22090, Representative: Yuki Saito) (KAKEN) (Project)
-
Short-time researcher in NTT Media Intelligence Laboratories, NTT Corporation, Japan, Aug. 30, 2017--Oct. 31, 2017. (Mentor: Yusuke Ijima)
-
Short-time researcher in NTT Communication Science Laboratories, NTT Corporation, Japan, Aug. 8, 2016--Sep. 9, 2016. (Mentor: Hirokazu Kameoka)
-
Kentaro Tachibana, Yuki Saito, and Kei Akuzawa, "SPEECH PROCESSING APPARATUS AND SPEECH PROCESSING PROGRAM," JP2020190605, Filed on May 21.
-
Shinnosuke Takamichi, Yuki Saito, Takaaki Saeki, and Hiroshi Saruwatari, "VOICE CONVERSION DEVICE, VOICE CONVERSION METHOD, AND VOICE CONVERSION PROGRAM," JP2021032940, Filed on Aug. 19, 2019.
-
Shinnosuke Takamichi, Yuki Saito, Takaaki Saeki, and Hiroshi Saruwatari, "VOICE CONVERSION DEVICE, VOICE CONVERSION METHOD, AND VOICE CONVERSION PROGRAM," PCT/JP2020/031122, Filed on Aug. 18, 2020.
-
Shinnosuke Takamichi, Yuki Saito, Takaaki Saeki, and Hiroshi Saruwatari, "VOICE CONVERSION DEVICE, VOICE CONVERSION METHOD, AND VOICE CONVERSION PROGRAM," PCT/JP2021/004367, Filed on Feb. 5, 2021.
-
"Applied Acoustics," Department of Mathematical Engineering and Information Physics, The University of Tokyo, Japan. (FY2024 Instructor)
-
07. "Speech Production" (Slide)
-
08. "Speech Perception" (Slide)
-
09. "Automatic Speech Recognition System" (Slide)
-
10. "Text-To-Speech Synthesis System" (Slide)
-
11. "Voice Conversion System" (Slide)
-
12. "Speaker Recognition/Verification System" (Slide)
-
"Academics Frontier Lecture (Introduction to Cybernetics --Advanced Information Science Connecting Physics, People, and Society--): Signal Processing Technologies for Sound Analysis and Synthesis," College of Arts and Sciences (Junior Division), The University of Tokyo, Japan. (FY2024 Instructor) (Slide)
-
"Information System Laboratory III: Signal Processing and Machine Learning," Department of Mathematical Engineering and Information Physics, The University of Tokyo, Japan. (FY2024 Instructor)
-
"Information System Laboratory: Project Practice," Department of Mathematical Engineering and Information Physics, The University of Tokyo, Japan. (FY2016--2017 TA, FY2023 Instructor)
-
"Mathematical Engineering and Information Physics: Digital Signal Processing and Acoustic Systems," Department of Mathematical Engineering and Information Physics, The University of Tokyo, Japan. (FY2023 Instructor)
-
"Advanced Signal Processing," Graduate School of Information Science and Technology, The University of Tokyo, Japan. (Guest Presenter)
-
"Applied Gaussian Process and Machine Learning," Graduate School of Information Science and Technology, The University of Tokyo, Japan. (FY2021 Guest Presenter) (Slide)
-
Ph.D. degree in Information Science and Technology.
Mar. 2021,
Dept. of Information Physics and Computing,
Graduate School of Information Science and Technology,
The University of Tokyo, Japan.
(Adviser: Professor Hiroshi Saruwatari)
-
M.S. degree in Information Science and Technology.
Mar. 2018,
Dept. of Creative Informatics,
Graduate School of Information Science and Technology,
The University of Tokyo, Japan.
(Adviser: Professor Hiroshi Saruwatari)
-
B.S. degree in Engineering.
Mar. 2016,
Advanced Course of Electronic and Information Systems Engineering,
National Institute of Technology, Kushiro College, Japan.
(Adviser: Assistant Professor Hiroshi Tenmoto)
-
A.S. degree in Engineering.
Mar. 2014,
Dept. of Information Engineering,
National Institute of Technology, Kushiro College, Japan.
(Adviser: Assistant Professor Hiroshi Tenmoto)