Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

Last update: Dec 30, 2022

Overview

Learning the Beauty in Songs: Neural Singing Voice Beautifier

Jinglin Liu, Chengxi Li, Yi Ren, Zhiying Zhu, Zhou Zhao

Zhejiang University

ACL 2022 Main conference

Project Page

🚧 ⛏️ 🛠️ 👷

This repository is the official PyTorch implementation of our ACL 2022 paper. We have released the code for the SADTW algorithm described in the paper. The full code and data are expected to be released by the time of the ACL 2022 conference (before June 2022). Please star the repository and stay tuned!

|--modules
    |--voice_conversion
        |--dtw
            |--enhance_sadtw.py  (Our algorithm)
|--tasks
    |--singing
        |--pitch_alignment_task.py  (Usage example)

🚀 News:

Feb.24, 2022: Our new work, NeuralSVB, was accepted by ACL-2022. Demo Page.
Dec.01, 2021: Our recent work DiffSinger was accepted by AAAI-2022.
Sep.29, 2021: Our recent work PortaSpeech was accepted by NeurIPS-2021.
May.06, 2021: We submitted DiffSinger to arXiv.

Abstract

We are interested in a novel task, singing voice beautifying (SVB). Given the singing voice of an amateur singer, SVB aims to improve the intonation and vocal tone of the voice, while keeping the content and vocal timbre. Current automatic pitch correction techniques are immature, and most of them are restricted to intonation but ignore the overall aesthetic quality. Hence, we introduce Neural Singing Voice Beautifier (NSVB), the first generative model to solve the SVB task, which adopts a conditional variational autoencoder as the backbone and learns the latent representations of vocal tone. In NSVB, we propose a novel time-warping approach for pitch correction: Shape-Aware Dynamic Time Warping (SADTW), which ameliorates the robustness of existing time-warping approaches, to synchronize the amateur recording with the template pitch curve. Furthermore, we propose a latent-mapping algorithm in the latent space to convert the amateur vocal tone to the professional one. Extensive experiments on both Chinese and English songs demonstrate the effectiveness of our methods in terms of both objective and subjective metrics.
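To make the time-warping idea in the abstract more concrete, here is a minimal sketch of classic dynamic time warping (DTW) applied to two pitch curves. It is not the SADTW algorithm from the paper: it uses a plain point-wise absolute difference as the frame distance, whereas SADTW is described as shape-aware. The function names `dtw_align` and `warp_template` and the toy pitch values are assumptions made purely for illustration; the authors' implementation is in `enhance_sadtw.py`.

```python
import numpy as np


def dtw_align(amateur_f0, template_f0):
    """Classic DTW between two 1-D pitch curves (illustrative, not SADTW).

    Returns the accumulated cost and the warping path as a list of
    (amateur_index, template_index) pairs.
    """
    a = np.asarray(amateur_f0, dtype=float)
    t = np.asarray(template_f0, dtype=float)
    n, m = len(a), len(t)

    # cost[i, j] is the minimal accumulated cost of aligning a[:i] with t[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - t[j - 1])  # point-wise distance (assumption)
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])

    # Backtrack from the end of both curves to the start.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (cost[i - 1, j - 1], i - 1, j - 1),  # diagonal step
            (cost[i - 1, j], i - 1, j),          # advance amateur only
            (cost[i, j - 1], i, j - 1),          # advance template only
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n, m], path


def warp_template(template_f0, path, n_frames):
    """Map the template pitch onto the amateur timeline using the DTW path.

    When several template frames match one amateur frame, they are averaged.
    """
    t = np.asarray(template_f0, dtype=float)
    sums = np.zeros(n_frames)
    counts = np.zeros(n_frames)
    for i, j in path:
        sums[i] += t[j]
        counts[i] += 1
    return sums / counts


# Toy example: the amateur holds the third note one frame longer.
amateur = [1.0, 2.0, 3.0, 3.0, 4.0]
template = [1.0, 2.0, 3.0, 4.0]
total_cost, path = dtw_align(amateur, template)
warped = warp_template(template, path, len(amateur))
```

The warped template has one value per amateur frame, so it can serve as a time-synchronized target pitch curve. A shape-aware variant would replace the point-wise distance `d` with one computed on local curve shape.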

Issues

Before raising an issue, please check our README and the existing issues for possible solutions.
We will try to address your problem promptly, but we cannot guarantee a satisfactory solution.
Please be friendly.

Comments

Problem with proper data loading

Hi, I'd like to run your model myself, but I cannot find a proper way to load the dataset from the .mp3 files you provided. Could you share the dataloader you used, or give some hints on how to process the .mp3 files into a valid dataset that can be used with your usage examples? I'd be very grateful!

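opened by pstryczke 9

About NSVB

After listening to the demo, I have some questions. 1. If this is used in practice to beautify singing, the pitch curve of the original singer is required at inference time, right? 2. Although the test samples are not in the training set, GT Professional and GT Amateur were recorded by the same person. At inference time, GT Professional cannot come from the user themselves, so has generalization in that setting been tested?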

opened by suzhenghang 0

Request for datasets and source code

This work is outstanding and we are interested in it. Are there any plans to make the dataset and the associated pretrained models public in the near future? Thank you.

opened by hertz-pj 0

Releases(pre-release)

pre-release(May 27, 2022)

Owner

Jinglin Liu
