wenet-kws

Production First and Production Ready End-to-End Keyword Spotting Toolkit.

The goal of this toolkit is to...

Small-footprint keyword spotting (KWS), or more specifically wake-up word (WuW) detection, is a typical and important module in Internet of Things (IoT) devices. It gives users a hands-free way to control IoT devices. A WuW detection system usually runs locally and persistently on the device, so it needs low power consumption, few model parameters, and low computational complexity. It must also detect the predefined keyword in a streaming fashion, i.e., with low latency.
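
To make the streaming requirement concrete, here is a minimal sketch of frame-by-frame detection. It is not the wenet-kws API: `stream_detect`, `score_frame`, `window` and `threshold` are hypothetical names chosen for illustration, and the moving-average smoothing is just one common way to stabilize per-frame scores. The sketch assumes some small model already maps each audio frame to a keyword probability.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def stream_detect(
    frames: Iterable[Frame],
    score_frame: Callable[[Frame], float],  # hypothetical per-frame keyword scorer
    window: int = 30,                       # assumed smoothing window, in frames
    threshold: float = 0.8,                 # assumed detection threshold
) -> Iterator[int]:
    """Yield the index of each frame at which the keyword is detected.

    Frames are consumed one at a time, so memory use is bounded by
    `window` and a decision is available as soon as a frame arrives.
    """
    recent = deque(maxlen=window)  # last `window` per-frame scores
    armed = True                   # avoid firing repeatedly on one utterance
    for index, frame in enumerate(frames):
        recent.append(score_frame(frame))
        smoothed = sum(recent) / len(recent)  # moving average of scores
        if smoothed >= threshold:
            if armed:
                armed = False
                yield index
        else:
            # Re-arm once the smoothed score drops below the threshold.
            armed = True


if __name__ == "__main__":
    # Toy input: the "frames" are already keyword probabilities.
    toy_scores = [0.0] * 5 + [1.0] * 5 + [0.0] * 5
    hits = list(stream_detect(toy_scores, lambda p: p, window=3, threshold=0.6))
    print(hits)
```

Because the generator only keeps the last `window` scores, it can run indefinitely on a live audio stream. In a real system the window length and threshold would be tuned on held-out data to balance false alarms against missed detections.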

Typical Scenario

We are going to support the following typical wake-up word applications:

Single wake-up word
Multiple wake-up words
Customizable wake-up word
Personalized wake-up word, i.e., wake-up word detection combined with voiceprint verification

Dataset

We plan to support a variety of open-source wake-up word datasets, including but not limited to:

All well-trained models on these datasets will be made publicly available.

Runtime

We plan to support a variety of hardware and platforms, including:

Web browser
x86
Android
Raspberry Pi
