Selected publications

2021

  1. Pretraining Techniques for Sequence-to-Sequence Voice Conversion Huang, Wen-Chin, Hayashi, Tomoki, Wu, Yi-Chiao, Kameoka, Hirokazu, and Toda, Tomoki IEEE/ACM Transactions on Audio, Speech, and Language Processing 2021 [arXiv] [Demo]
  2. Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations Huang, Wen-Chin, Hayashi, Tomoki, Wu, Yi-Chiao, and Toda, Tomoki In Proc. ICASSP 2021 [arXiv] [Demo]
  3. BERT-ASR
    Speech Recognition by Simply Fine-tuning BERT Huang, Wen-Chin, Wu, Chia-Hua, Luo, Shang-Bao, Chen, Kuan-Yu, Wang, Hsin-Min, and Toda, Tomoki In Proc. ICASSP 2021 [arXiv]
  4. DVC-VTN-VAE
    A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion Huang, Wen-Chin, Kobayashi, Kazuhiro, Peng, Yu-Huai, Liu, C.-F., Tsao, Yu, Wang, Hsin-Min, and Toda, Tomoki In Proc. Interspeech 2021 [arXiv] [Demo]
  5. On Prosody Modeling for ASR+TTS based Voice Conversion Huang, Wen-Chin, Hayashi, Tomoki, Li, X., Watanabe, Shinji, and Toda, Tomoki In Proc. ASRU 2021 [arXiv] [Demo]

2020

  1. VCC2020
    Voice Conversion Challenge 2020 – Intra-lingual semi-parallel and cross-lingual voice conversion – Yi, Zhao, Huang, Wen-Chin, Tian, Xiaohai, Yamagishi, Junichi, Das, Rohan Kumar, Kinnunen, Tomi, Ling, Zhen-Hua, and Toda, Tomoki In Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 [arXiv]
  2. Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions Das, Rohan Kumar, Kinnunen, Tomi, Huang, Wen-Chin, Ling, Zhen-Hua, Yamagishi, Junichi, Yi, Zhao, Tian, Xiaohai, and Toda, Tomoki In Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 [arXiv]
  3. ASR+TTS
    The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS Huang, Wen-Chin, Hayashi, Tomoki, Watanabe, Shinji, and Toda, Tomoki In Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 [arXiv]
  4. The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders Huang, Wen-Chin, Lumban Tobing, Patrick, Wu, Yi-Chiao, Kobayashi, Kazuhiro, and Toda, Tomoki In Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 [arXiv]
  5. VTN
    Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining Huang, Wen-Chin, Hayashi, Tomoki, Wu, Yi-Chiao, Kameoka, Hirokazu, and Toda, Tomoki In Proc. Interspeech 2020 [arXiv] [Demo] [Code]
  6. CDVAE-CLS-GAN
    Unsupervised Representation Disentanglement Using Cross Domain Features and Adversarial Learning in Variational Autoencoder Based Voice Conversion Huang, Wen-Chin, Luo, Hao, Hwang, Hsin-Te, Lo, Chen-Chou, Peng, Yu-Huai, Tsao, Yu, and Wang, Hsin-Min IEEE Transactions on Emerging Topics in Computational Intelligence 2020 [arXiv] [Demo] [Code]

2019

  1. Generalization of Spectrum Differential based Direct Waveform Modification for Voice Conversion Huang, Wen-Chin, Wu, Yi-Chiao, Kobayashi, Kazuhiro, Peng, Yu-Huai, Hwang, Hsin-Te, Lumban Tobing, Patrick, Toda, Tomoki, Tsao, Yu, and Wang, Hsin-Min In Proc. 10th ISCA Speech Synthesis Workshop 2019 [arXiv] [Demo]
  2. F0-FCN-CDVAE
    Investigation of F0 Conditioning and Fully Convolutional Networks in Variational Autoencoder Based Voice Conversion Huang, Wen-Chin, Wu, Yi-Chiao, Lo, Chen-Chou, Lumban Tobing, Patrick, Hayashi, Tomoki, Kobayashi, Kazuhiro, Toda, Tomoki, Tsao, Yu, and Wang, Hsin-Min In Proc. Interspeech 2019 [arXiv] [Demo]
  3. Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion Huang, Wen-Chin, Wu, Yi-Chiao, Hwang, Hsin-Te, Lumban Tobing, Patrick, Hayashi, Tomoki, Kobayashi, Kazuhiro, Toda, Tomoki, Tsao, Yu, and Wang, Hsin-Min In Proc. 27th European Signal Processing Conference (EUSIPCO) 2019 [arXiv] [Demo]

2018

  1. CDVAE
    Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders Huang, Wen-Chin, Hwang, Hsin-Te, Peng, Yu-Huai, Tsao, Yu, and Wang, Hsin-Min In Proc. The 11th International Symposium on Chinese Spoken Language Processing (ISCSLP) 2018 [arXiv] [Demo] [Code]
  2. WaveNet Vocoder and its Applications in Voice Conversion Huang, Wen-Chin, Lo, Chen-Chou, Hwang, Hsin-Te, Tsao, Yu, and Wang, Hsin-Min In Proc. The 30th ROCLING Conference on Computational Linguistics and Speech Processing (ROCLING) 2018 [Paper]