Unofficial implement with paper SpeakerGAN: Speaker identification with conditional generative adversarial network

Overview

Introduction

This repository is about paper SpeakerGAN , and is unofficially implemented by Mingming Huang ([email protected]), Tiezheng Wang ([email protected]) and thanks for advice from TongFeng.

SpeakerGAN paper

SpeakerGAN: Speaker identification with conditional generative adversarial network, by Liyang Chen , Yifeng Liu , Wendong Xiao , Yingxue Wang ,Haiyong Xie.

Usage

For train / test / generate:

python speakergan.py

You may need to change the path of wav vad preprocessed files.

Our results

acc: 94.27% with random sampled testset. 

acc: 93.21% with fixed start sampled testset.

using model file: model/49_D.pkl

acc: 98.44% on training classification accuracy with real samples.

There is about 4% gap on testset lower compared to paper result. We can't find out the reason. We want your help !

Alt accuracy
Alt loss_d_loss_g
Alt learning_rate

Details of paper

The following are details about this paper.

================ input ==================

  1. feature: fbank, 8000hz, 25ms frame, 10ms overlap. shape:(160,64)

  2. dataset: librispeech-100 train-clean-100 POI:251

  3. data preprocess: vad、mean and variance normalization, shuffled.

  4. 60% train. 40% test.

================ model architecture ==================

  1. dataflow: data -> feature extraction -> G & D

  2. model architecture:

    G: gated CNN, encoder-decoder, Huber loss + adversarial loss

    D: ResnetBlocks, template average pooling, FC, softmax, crossentropy loss + adversarial loss

  3. G: shuffler layer, GLU

  4. D: ReLU

================ training ==================

  1. lr: 0-9, 0.0005 | 9-49, 0.0002

  2. L(d): λ1 λ2 = 1

  3. batch_size: 64

  4. D_train steps / G_train steps = 4

  5. Ladv Loss: Label smoothing, 1 -> 0.7 ~ 1.0, 0 -> 0 ~ 0.3

======== not sure or differences with paper ========

  1. weights,bias initialize function, use: xavier_uniform and zeros

  2. pytorch huber_loss: + 0.5 to be same with paper. but no implement here.

  3. for shorter wav, paper: padded with zero. we: padded with feature again.

  4. gated cnn architecture.

  5. we use webrtcvad mode(3) for vad preprocess.

Image Fusion Transformer

Image-Fusion-Transformer Platform Python 3.7 Pytorch =1.0 Training Dataset MS-COCO 2014 (T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ram

Vibashan VS 68 Dec 23, 2022
🛠️ SLAMcore SLAM Utilities

slamcore_utils Description This repo contains the slamcore-setup-dataset script. It can be used for installing a sample dataset for offline testing an

SLAMcore 7 Aug 04, 2022
An open source python library for automated feature engineering

"One of the holy grails of machine learning is to automate more and more of the feature engineering process." ― Pedro Domingos, A Few Useful Things to

alteryx 6.4k Jan 03, 2023
Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts

Face mask detection Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts in order to detect face masks in static im

Vaibhav Shukla 1 Oct 27, 2021
Machine Learning toolbox for Humans

Reproducible Experiment Platform (REP) REP is ipython-based environment for conducting data-driven research in a consistent and reproducible way. Main

Yandex 662 Nov 20, 2022
Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation This is the official repository for our paper Neural Reprojection Error

Hugo Germain 78 Dec 01, 2022
😇A pyTorch implementation of the DeepMoji model: state-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm etc

------ Update September 2018 ------ It's been a year since TorchMoji and DeepMoji were released. We're trying to understand how it's being used such t

Hugging Face 865 Dec 24, 2022
An experimental technique for efficiently exploring neural architectures.

SMASH: One-Shot Model Architecture Search through HyperNetworks An experimental technique for efficiently exploring neural architectures. This reposit

Andy Brock 478 Aug 04, 2022
The official implementation of paper Siamese Transformer Pyramid Networks for Real-Time UAV Tracking, accepted by WACV22

SiamTPN Introduction This is the official implementation of the SiamTPN (WACV2022). The tracker intergrates pyramid feature network and transformer in

Robotics and Intelligent Systems Control @ NYUAD 29 Jan 08, 2023
Colab notebook and additional materials for Python-driven analysis of redlining data in Philadelphia

RedliningExploration The Google Colaboratory file contained in this repository contains work inspired by a project on educational inequality in the Ph

Benjamin Warren 1 Jan 20, 2022
Official PyTorch Implementation of GAN-Supervised Dense Visual Alignment

GAN-Supervised Dense Visual Alignment — Official PyTorch Implementation Paper | Project Page | Video This repo contains training, evaluation and visua

944 Jan 07, 2023
Tensorflow implementation for Self-supervised Graph Learning for Recommendation

If the compilation is successful, the evaluator of cpp implementation will be called automatically. Otherwise, the evaluator of python implementation will be called.

152 Jan 07, 2023
Graph Robustness Benchmark: A scalable, unified, modular, and reproducible benchmark for evaluating the adversarial robustness of Graph Machine Learning.

Homepage | Paper | Datasets | Leaderboard | Documentation Graph Robustness Benchmark (GRB) provides scalable, unified, modular, and reproducible evalu

THUDM 66 Dec 22, 2022
A Python library for working with arbitrary-dimension hypercomplex numbers following the Cayley-Dickson construction of algebras.

Hypercomplex A Python library for working with quaternions, octonions, sedenions, and beyond following the Cayley-Dickson construction of hypercomplex

7 Nov 04, 2022
Ray tracing of a Schwarzschild black hole written entirely in TensorFlow.

TensorGeodesic Ray tracing of a Schwarzschild black hole written entirely in TensorFlow. Dependencies: Python 3 TensorFlow 2.x numpy matplotlib About

5 Jan 15, 2022
The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding.

SuperGen The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding. Requirements Before running, you

Yu Meng 38 Dec 12, 2022
RDA: Robust Domain Adaptation via Fourier Adversarial Attacking

RDA: Robust Domain Adaptation via Fourier Adversarial Attacking Updates 08/2021: check out our domain adaptation for video segmentation paper Domain A

17 Nov 30, 2022
Code for the IJCAI 2021 paper "Structure Guided Lane Detection"

SGNet Project for the IJCAI 2021 paper "Structure Guided Lane Detection" Abstract Recently, lane detection has made great progress with the rapid deve

Jinming Su 27 Dec 08, 2022
Dynamic Token Normalization Improves Vision Transformers

Dynamic Token Normalization Improves Vision Transformers This is the PyTorch implementation of the paper Dynamic Token Normalization Improves Vision T

Wenqi Shao 20 Oct 09, 2022
Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021) This repository is the official PyTorc

Jingyun Liang 139 Dec 29, 2022