Unofficial PyTorch Implementation of Multi-Singer

Last update: Dec 28, 2022

Overview

Multi-Singer

Unofficial PyTorch Implementation of Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus.

Requirements

See requirements in requirement.txt:

linux
python 3.6
pytorch 1.0+
librosa
json, tqdm, logging
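
The list above can be checked before running any of the scripts. The following is a minimal sketch, not part of the repository: the helper names `missing_modules` and `python_at_least` are hypothetical, and `torch` is assumed to be the import name for PyTorch.

```python
import importlib.util
import sys
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_at_least(major: int, minor: int) -> bool:
    """True if the running interpreter is at least the given version."""
    return sys.version_info >= (major, minor)


if __name__ == "__main__":
    # Names taken from the requirements list; json and logging ship with Python.
    required = ["torch", "librosa", "tqdm", "json", "logging"]
    print("Python >= 3.6:", python_at_least(3, 6))
    print("Missing modules:", missing_modules(required))
```

An empty "Missing modules" list means every listed package can be imported. It does not check package versions.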

TODO

1026: upload code
1024: implement multi-singer & perceptual loss
1023: implement singer encoder

Getting started

Apply recipe to your own dataset

Put your wav files in the data directory
Edit the configuration in config/config.yaml
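
Before preprocessing, it can help to confirm that the data directory contains the audio you expect. The sketch below uses only the standard library. The helpers `find_wavs` and `wav_info` are hypothetical names, not part of this repository, and `wav_info` only reads uncompressed PCM wav files.

```python
import wave
from pathlib import Path
from typing import List, Tuple


def find_wavs(root: str) -> List[Path]:
    """Recursively collect files with a .wav extension (any case) under root."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".wav"
    )


def wav_info(path: Path) -> Tuple[int, float]:
    """Return (sample rate in Hz, duration in seconds) of a PCM wav file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnframes() / float(rate)


if __name__ == "__main__":
    # data/wavs is the input folder used by the preprocessing command.
    for wav_path in find_wavs("data/wavs"):
        print(wav_path, wav_info(wav_path))
```

Files whose sample rate differs from the one set in config/config.yaml are worth resampling or excluding before feature extraction.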

1. Pretrain

Pretrain the Singer Embedding Extractor using the repository here, then set 'enc_model_fpath' in config/config.yaml to the path of the pretrained encoder model.

Note: Use the same parameters as those in 'encoder/params_data' and 'encoder/params_model'.

2. Preprocess

Extract mel-spectrogram

python preprocess.py -i data/wavs -o data/feature -c config/config.yaml

-i your audio folder

-o output acoustic feature folder

-c config file
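
For readers unfamiliar with the feature being extracted, the sketch below shows how mel bands are spaced. It is not the code in preprocess.py. It uses the HTK-style mel formula, which is one common definition (librosa uses a different scale by default), and the values of 80 bands and a 12000 Hz upper limit are illustrative assumptions. The settings actually used are those in config/config.yaml.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """HTK-style mel scale: mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels: int, fmin: float, fmax: float) -> List[float]:
    """Centre frequencies (Hz) of n_mels bands spaced evenly on the mel scale."""
    low, high = hz_to_mel(fmin), hz_to_mel(fmax)
    step = (high - low) / (n_mels + 1)
    return [mel_to_hz(low + step * (i + 1)) for i in range(n_mels)]


if __name__ == "__main__":
    # 80 bands up to 12 kHz are assumed values, chosen only for illustration.
    centers = mel_center_frequencies(80, 0.0, 12000.0)
    print([round(c) for c in centers[:5]], "...", round(centers[-1]))
```

The bands are narrow at low frequencies and widen toward high frequencies, which is why a mel-spectrogram is a compact conditioning feature for a vocoder.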

3. Train

Training conditioned on mel-spectrogram

python train.py -i data/feature -o checkpoints/ --config config/config.yaml

-i acoustic feature folder

-o directory to save checkpoints

--config config file

4. Inference

python inference.py -i data/feature -o outputs/ -c checkpoints/*.pkl -g config/config.yaml

-i acoustic feature folder

-o directory to save generated speech

-c checkpoint file

-g config file
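
The shell expands checkpoints/*.pkl to every matching file, so if several checkpoints exist, more than one path follows -c. How inference.py handles that case is not documented here. One way to avoid the ambiguity is to pick a single checkpoint explicitly. The sketch below selects the most recently modified file and builds the same command with the flags listed above. The helpers `latest_checkpoint` and `inference_command` are hypothetical and not part of the repository.

```python
import glob
import os
import subprocess
from typing import List


def latest_checkpoint(pattern: str) -> str:
    """Return the most recently modified file matching a glob pattern."""
    matches = glob.glob(pattern)
    if not matches:
        raise FileNotFoundError(f"no checkpoint matches {pattern!r}")
    return max(matches, key=os.path.getmtime)


def inference_command(features: str, out_dir: str,
                      checkpoint: str, config: str) -> List[str]:
    """Build the inference command line using the flags documented above."""
    return ["python", "inference.py", "-i", features, "-o", out_dir,
            "-c", checkpoint, "-g", config]


if __name__ == "__main__":
    ckpt = latest_checkpoint("checkpoints/*.pkl")
    cmd = inference_command("data/feature", "outputs/", ckpt, "config/config.yaml")
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Selecting by modification time assumes the newest file is the checkpoint you want. Pass a specific path instead if you are comparing checkpoints.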

5. Singing Voice Synthesis

For Singing Voice Synthesis:

Use a modified FastSpeech for mel-spectrogram synthesis.
Use the synthesized mel-spectrogram in Multi-Singer for waveform synthesis.
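
The two stages above can be pictured as a simple composition. The sketch below is only a schematic: `synthesize_singing`, `acoustic_model` and `vocoder` are hypothetical placeholders standing in for the modified FastSpeech and for Multi-Singer, and plain nested lists stand in for tensors.

```python
from typing import Callable, List

Mel = List[List[float]]   # frames x mel bins
Waveform = List[float]    # audio samples


def synthesize_singing(score: object,
                       acoustic_model: Callable[[object], Mel],
                       vocoder: Callable[[Mel], Waveform]) -> Waveform:
    """Stage 1 maps a score to a mel-spectrogram; stage 2 maps it to audio."""
    mel = acoustic_model(score)
    # Reject empty or ragged output before handing it to the vocoder.
    if not mel or any(len(frame) != len(mel[0]) for frame in mel):
        raise ValueError("acoustic model must return a non-empty, rectangular mel-spectrogram")
    return vocoder(mel)


if __name__ == "__main__":
    # Dummy stand-ins: 3 frames of 4 bins, and 2 output samples per frame.
    fake_acoustic = lambda score: [[0.0] * 4 for _ in range(3)]
    fake_vocoder = lambda mel: [0.0] * (2 * len(mel))
    print(len(synthesize_singing("la la la", fake_acoustic, fake_vocoder)))
```

In practice the mel-spectrogram produced by the acoustic model has to match the feature settings the vocoder was trained with (number of mel bins, hop size, sample rate).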

Acknowledgements

Citation

Please cite this repository using the "Cite this repository" option in the About section (top right of the main page).

Question

Feel free to contact me at [email protected]

Owner

SunMail-hub
