VC

Demo pages for voice conversion (VC) works.

Author: Wen-Chin Huang (HP)

Publications

(Interspeech 2019) Investigation of F0 conditioning and Fully Convolutional Networks in Variational Autoencoder based Voice Conversion
(SSW10 (2019)) Generalization of Spectrum Differential based Direct Waveform Modification for Voice Conversion
(TETCI (2019)) Unsupervised Representation Disentanglement using Cross Domain Features and Adversarial Learning in Variational Autoencoder based Voice Conversion
(Interspeech 2020) Voice Transformer Network: Sequence-to-Sequence Voice Conversion using Transformer with Text-to-Speech Pretraining
(IEEE/ACM TASLP (2021)) Pretraining Techniques for Sequence-to-Sequence Voice Conversion
(ICASSP 2021) Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations
(Interspeech 2021) A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion
(ASRU 2021) On Prosody Modeling for ASR+TTS based Voice Conversion
(APSIPA 2021) Time Alignment using Lip Images for Frame-based Electrolaryngeal Voice Conversion
(ICASSP 2022) S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations
(ICASSP 2022) Towards Identity Preserving Normal to Dysarthric Voice Conversion
(APSIPA 2023) Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion
(ASJ 2024 Spring) AAS-VC: On the Generalization Ability of Automatic Alignment Search based Non-autoregressive Sequence-to-sequence Voice Conversion
(Signal Processing Letters) Multi-speaker Text-to-speech Training with Speaker Anonymized Data
(ASJ 2024 Autumn) Simulated electrolaryngeal speech corpus
(Submitted to ICASSP 2025) Investigating Factors Related to the Naturalness of Synthesized Unison Singing

Page design adapted from Google's Tacotron demo page.
Icons made by Freepik from www.flaticon.com are licensed under CC 3.0 BY.