Publications

Publications by category in reverse chronological order. Generated by jekyll-scholar.

My full publication list can be found on my Google Scholar page or my CV.

2024

  1. A review on subjective and objective evaluation of synthetic speech
    E. Cooper, Wen-Chin Huang, Y. Tsao, H.-M. Wang, T. Toda, and J. Yamagishi
    Acoustical Science and Technology, 2024
  2. VMC’24
    The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
    In Proc. SLT, 2024

2023

  1. A Holistic Cascade System, Benchmark, and Human Evaluation Protocol for Expressive Speech-to-Speech Translation
    Wen-Chin Huang, Benjamin Peloquin, Justine Kao, Changhan Wang, Hongyu Gong, Elizabeth Salesky, Yossi Adi, Ann Lee, and Peng-Jen Chen
    In Proc. ICASSP, 2023
  2. Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion
    Wen-Chin Huang and Tomoki Toda
    In Proc. APSIPA ASC, 2023
  3. SVCC2023
    The Singing Voice Conversion Challenge 2023
    Wen-Chin Huang, Lester Phillip Violeta, Songxiang Liu, Jiatong Shi, Yusuke Yasuda, and Tomoki Toda
    In Proc. ASRU, 2023
  4. VMC’23
    The VoiceMOS Challenge 2023: Zero-Shot Subjective Speech Quality Prediction for Multiple Domains
    E. Cooper, Wen-Chin Huang, Y. Tsao, H.-M. Wang, T. Toda, and J. Yamagishi
    In Proc. ASRU, 2023

2022

  1. S3PRL-VC
    S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations
    Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Hung-Yi Lee, Shinji Watanabe, and Tomoki Toda
    In Proc. ICASSP, 2022
  2. N2D VC
    Towards Identity Preserving Normal to Dysarthric Voice Conversion
    Wen-Chin Huang, Bence Mark Halpern, Lester Phillip Violeta, Odette Scharenborg, and Tomoki Toda
    In Proc. ICASSP, 2022
  3. LDNet
    LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech
    Wen-Chin Huang, E. Cooper, J. Yamagishi, and T. Toda
    In Proc. ICASSP, 2022
  4. A Comparative Study of Self-Supervised Speech Representation Based Voice Conversion
    Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, and Tomoki Toda
    IEEE Journal of Selected Topics in Signal Processing, 2022
  5. End-to-End Binaural Speech Synthesis
    Wen-Chin Huang, Dejan Markovic, Alexander Richard, Israel Dejene Gebru, and Anjali Menon
    In Proc. Interspeech, 2022
  6. VMC’22
    The VoiceMOS Challenge 2022
    Wen-Chin Huang, E. Cooper, Y. Tsao, H.-M. Wang, J. Yamagishi, and T. Toda
    In Proc. Interspeech, 2022

2021

  1. Pretraining Techniques for Sequence-to-Sequence Voice Conversion
    Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, and Tomoki Toda
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021
  2. Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations
    Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, and Tomoki Toda
    In Proc. ICASSP, 2021
  3. BERT-ASR
    Speech Recognition by Simply Fine-tuning BERT
    Wen-Chin Huang, Chia-Hua Wu, Shang-Bao Luo, Kuan-Yu Chen, Hsin-Min Wang, and Tomoki Toda
    In Proc. ICASSP, 2021
  4. DVC-VTN-VAE
    A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion
    Wen-Chin Huang, K. Kobayashi, Y.-H. Peng, C.-F. Liu, Y. Tsao, H.-M. Wang, and T. Toda
    In Proc. Interspeech, 2021
  5. On Prosody Modeling for ASR+TTS based Voice Conversion
    Wen-Chin Huang, T. Hayashi, X. Li, S. Watanabe, and T. Toda
    In Proc. ASRU, 2021
  6. Time Alignment using Lip Images for Frame-based Electrolaryngeal Voice Conversion
    Y.-S. Liou, Wen-Chin Huang, M.-C. Yen, S.-W. Tsai, Y.-H. Peng, T. Toda, Y. Tsao, and H.-M. Wang
    In Proc. APSIPA ASC, 2021

2020

  1. VCC2020
    Voice Conversion Challenge 2020 – Intra-lingual semi-parallel and cross-lingual voice conversion –
    Zhao Yi, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhen-Hua Ling, and Tomoki Toda
    In Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
  2. Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions
    Rohan Kumar Das, Tomi Kinnunen, Wen-Chin Huang, Zhen-Hua Ling, Junichi Yamagishi, Zhao Yi, Xiaohai Tian, and Tomoki Toda
    In Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
  3. ASR+TTS
    The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS
    Wen-Chin Huang, Tomoki Hayashi, Shinji Watanabe, and Tomoki Toda
    In Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
  4. The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders
    Wen-Chin Huang, Patrick Lumban Tobing, Yi-Chiao Wu, Kazuhiro Kobayashi, and Tomoki Toda
    In Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
  5. VTN
    Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining
    Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, and Tomoki Toda
    In Proc. Interspeech, Aug 2020
  6. CDVAE-CLS-GAN
    Unsupervised Representation Disentanglement Using Cross Domain Features and Adversarial Learning in Variational Autoencoder Based Voice Conversion
    Wen-Chin Huang, Hao Luo, Hsin-Te Hwang, Chen-Chou Lo, Yu-Huai Peng, Yu Tsao, and Hsin-Min Wang
    IEEE Transactions on Emerging Topics in Computational Intelligence, Aug 2020

2019

  1. Generalization of Spectrum Differential based Direct Waveform Modification for Voice Conversion
    Wen-Chin Huang, Yi-Chiao Wu, Kazuhiro Kobayashi, Yu-Huai Peng, Hsin-Te Hwang, Patrick Lumban Tobing, Tomoki Toda, Yu Tsao, and Hsin-Min Wang
    In Proc. 10th ISCA Speech Synthesis Workshop, Sep 2019
  2. F0-FCN-CDVAE
    Investigation of F0 Conditioning and Fully Convolutional Networks in Variational Autoencoder Based Voice Conversion
    Wen-Chin Huang, Yi-Chiao Wu, Chen-Chou Lo, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, and Hsin-Min Wang
    In Proc. Interspeech, Sep 2019
  3. Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion
    Wen-Chin Huang, Yi-Chiao Wu, Hsin-Te Hwang, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, and Hsin-Min Wang
    In Proc. 27th European Signal Processing Conference (EUSIPCO), Sep 2019

2018

  1. CDVAE
    Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders
    Wen-Chin Huang, Hsin-Te Hwang, Yu-Huai Peng, Yu Tsao, and Hsin-Min Wang
    In Proc. The 11th International Symposium on Chinese Spoken Language Processing (ISCSLP), Nov 2018
  2. WaveNet Vocoder and its Applications in Voice Conversion
    Wen-Chin Huang, Chen-Chou Lo, Hsin-Te Hwang, Yu Tsao, and Hsin-Min Wang
    In Proc. The 30th ROCLING Conference on Computational Linguistics and Speech Processing (ROCLING), Oct 2018