Jax/Flax implementation of Variational-DiffWave.

Last update: Dec 16, 2022

Overview

jax-variational-diffwave

Jax/Flax implementation of Variational-DiffWave. (Zhifeng Kong et al., 2020, Diederik P. Kingma et al., 2021.)

DiffWave with Continuous-time Variational Diffusion Models.
DiffWave: A Versatile Diffusion Model for Audio Synthesis, Zhifeng Kong et al., 2020. [arXiv:2009.09761]
Variational Diffusion Models, Diederik P. Kingma et al., 2021. [arXiv:2107.00630]

Requirements

Tested in a Python 3.7.9 conda environment. Dependencies are listed in requirements.txt.

Usage

To train the model, run train.py.
Checkpoints are written to TrainConfig.ckpt and TensorBoard summaries to TrainConfig.log.

python train.py --data-dir /datasets/ljspeech --from-raw
tensorboard --logdir ./log/

To resume training from a previous checkpoint, pass --load-epoch together with the saved config.

python train.py --load-epoch 10 --config ./ckpt/l1.json
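As a rough illustration of how an epoch number could map to a checkpoint file, the sketch below composes a path in the style of the './l1/l1_99.ckpt' path used in the sample script of this README. The helper name checkpoint_path and the '<name>/<name>_<epoch>.ckpt' layout are assumptions inferred from that single path, not something the repository is confirmed to implement.

```python
import posixpath


def checkpoint_path(ckpt_dir: str, name: str, epoch: int) -> str:
    """Build '<ckpt_dir>/<name>/<name>_<epoch>.ckpt'.

    Hypothetical helper: the layout is guessed from './l1/l1_99.ckpt'.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return posixpath.join(ckpt_dir, name, f"{name}_{epoch}.ckpt")


# Path that '--load-epoch 10' with the 'l1' config would point at
# under the assumed layout.
resume_from = checkpoint_path("./ckpt", "l1", 10)
print(resume_from)
```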

[WIP] To synthesize the test set, run synth.py.

python synth.py

[WIP] Pretrained checkpoints are released on the releases page.

To use a pretrained model, download the files and unzip them.
Check out the git repository at the matching commit tag. The following is a sample script.

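import json

import jax

# Config and VLBDiffWaveApp are defined in this repository.
with open('l1.json') as f:
    config = Config.load(json.load(f))

diffwave = VLBDiffWaveApp(config.model)
diffwave.restore('./l1/l1_99.ckpt')

# mel: mel-spectrogram batch of shape [B, T, mel]
# Sample audio with 50 diffusion steps and a fixed PRNG key.
audio, _ = diffwave(mel, timesteps=50, key=jax.random.PRNGKey(0))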
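The timesteps argument sets how many discrete steps the sampler takes through a model that is defined in continuous time. In Variational Diffusion Models (Kingma et al., 2021), sampling with T steps walks from t = 1 down to t = 0 using pairs t = i/T and s = (i - 1)/T. The sketch below builds that grid in plain Python. The function name time_grid is hypothetical, and whether VLBDiffWaveApp uses exactly this uniform discretization is an assumption.

```python
from typing import List, Tuple


def time_grid(timesteps: int) -> List[Tuple[float, float]]:
    """Return (t, s) pairs with s < t, ordered from t=1 down to s=0.

    Hypothetical helper illustrating a uniform grid on [0, 1].
    """
    if timesteps < 1:
        raise ValueError("timesteps must be positive")
    return [
        (i / timesteps, (i - 1) / timesteps)
        for i in range(timesteps, 0, -1)
    ]


grid = time_grid(50)
print(len(grid))  # 50
print(grid[0])    # (1.0, 0.98)
print(grid[-1])   # (0.02, 0.0)
```

Each pair corresponds to one denoising update from noise level t to the less noisy level s. A larger timesteps value gives a finer grid at the cost of more network evaluations.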

Owner

YoungJoong Kim
