Wenet STT Python

Overview

Beta Software

A simple Python library for speech recognition with WeNet models. It is distributed as binary wheels and has few direct dependencies.

Requirements:

  • Python 3.7+ (64-bit)
  • Platform: Windows/Linux/macOS
  • Python package requirements: cffi, numpy
  • Wenet Model (must be "runtime" format)
    • Several are available ready-to-go on this project's releases page and below.
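The interpreter and package requirements above can be checked before installing a model. The following is a minimal sketch using only the standard library; the function name `check_requirements` is hypothetical and not part of wenet_stt.

```python
import importlib.util
import sys


def check_requirements(packages=("cffi", "numpy")):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration; not part of wenet_stt.
    """
    problems = []
    if sys.version_info < (3, 7):
        problems.append("Python 3.7 or newer is required")
    # On a 64-bit interpreter, sys.maxsize exceeds 2**32.
    if sys.maxsize <= 2**32:
        problems.append("a 64-bit Python interpreter is required")
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append("missing Python package: " + name)
    return problems
```

Calling `check_requirements()` and printing each returned string gives a quick report; it does not verify that a WeNet model is present or in "runtime" format.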

Features:

  • Synchronous decoding of single utterance
  • Streaming decoding, using separate thread
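To illustrate the general idea behind streaming decoding on a separate thread, here is a generic sketch of the pattern. It does not use the wenet_stt streaming API, which is not documented in this README: `decode_chunk` is a hypothetical callable taking `(chunk, finalize)` and returning a partial transcript, and `run_streaming` is an illustrative name.

```python
import queue
import threading


def run_streaming(chunks, decode_chunk):
    """Feed audio chunks to decode_chunk on a worker thread and collect its results.

    decode_chunk is a hypothetical callable (chunk: bytes, finalize: bool) -> str;
    it stands in for whatever streaming decoder is actually used.
    """
    chunks = list(chunks)
    if not chunks:
        return []
    audio_queue = queue.Queue()
    results = []

    def worker():
        # Consume chunks in order until the one marked as final has been decoded.
        while True:
            chunk, finalize = audio_queue.get()
            results.append(decode_chunk(chunk, finalize))
            if finalize:
                break

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    for index, chunk in enumerate(chunks):
        # Only the last chunk is flagged as final.
        audio_queue.put((chunk, index == len(chunks) - 1))
    thread.join()
    return results
```

The queue decouples audio capture from decoding, so the producer is never blocked by a slow decode step.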

Models:

Model                                      Download Size
gigaspeech_20210728_u2pp_conformer         549 MB
gigaspeech_20210811_conformer_bidecoder    540 MB

Usage

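import wave

from wenet_stt import WenetSTTModel

# 'model_dir' is the directory containing a runtime-format WeNet model
model = WenetSTTModel(WenetSTTModel.build_config('model_dir'))

# Read the raw sample bytes of the test recording
with wave.open('tests/test.wav', 'rb') as wav_file:
    wav_samples = wav_file.readframes(wav_file.getnframes())

# decode() returns the transcript as a string
assert model.decode(wav_samples).lower() == 'it depends on the context'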
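The example above passes the raw frames straight to `decode`. A small wrapper can check the audio format first. The sketch below assumes the model expects 16 kHz, mono, 16-bit PCM; this is an assumption that this README does not state, so adjust `expected_rate` to match your model. The helper name `read_pcm16_mono` is hypothetical.

```python
import wave


def read_pcm16_mono(path, expected_rate=16000):
    """Read raw PCM bytes from a WAV file after checking the assumed format.

    The 16 kHz / mono / 16-bit expectation is an assumption, not documented here.
    """
    with wave.open(path, "rb") as wav_file:
        if wav_file.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav_file.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav_file.getframerate() != expected_rate:
            raise ValueError("expected a sample rate of %d Hz" % expected_rate)
        return wav_file.readframes(wav_file.getnframes())
```

The returned bytes can then be passed to `model.decode(...)` exactly as `wav_samples` is in the example above.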

The package also includes a simple command-line interface for recognizing WAV files:

$ python -m wenet_stt decode model test.wav
IT DEPENDS ON THE CONTEXT
$ python -m wenet_stt decode model test.wav test.wav
IT DEPENDS ON THE CONTEXT
IT DEPENDS ON THE CONTEXT
$ python -m wenet_stt -h
usage: python -m wenet_stt [-h] {decode} ...

positional arguments:
  {decode}    sub-command
    decode    decode one or more WAV files

optional arguments:
  -h, --help  show this help message and exit
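The `decode` sub-command shown above can also be driven from a script. This is a minimal sketch using `subprocess`; the helper names are hypothetical, and it assumes, based on the sample output above, that the CLI prints one transcript line per input file on standard output.

```python
import subprocess
import sys


def build_decode_command(model_dir, wav_paths):
    """Build the argument list for `python -m wenet_stt decode MODEL WAV...`."""
    wav_paths = list(wav_paths)
    if not wav_paths:
        raise ValueError("at least one WAV file is required")
    # sys.executable ensures the same interpreter (and environment) is used.
    return [sys.executable, "-m", "wenet_stt", "decode", model_dir] + wav_paths


def decode_files(model_dir, wav_paths):
    """Run the CLI and return its non-empty output lines (assumed: one per file)."""
    completed = subprocess.run(
        build_decode_command(model_dir, wav_paths),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits with an error
    )
    return [line for line in completed.stdout.splitlines() if line.strip()]
```

For example, `decode_files('model', ['test.wav', 'test.wav'])` would be expected to return two transcript lines if the output format matches the session shown above.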

Installation/Building

Recommended installation via binary wheel from pip (requires a recent version of pip):

python -m pip install wenet_stt

For details on building from source, see the GitHub Actions build workflow.

Author

David Zurow

License

This project is licensed under the GNU Affero General Public License v3 (AGPL-3.0-or-later). See the LICENSE file for details. If this license is problematic for you, please contact me.

Acknowledgments

  • Contains and uses code from WeNet, licensed under the Apache-2.0 License, and other transitive dependencies (see source).
Comments
  • library dependency failures

    When running decode, I get a library linking issue:

      $ python -m wenet_stt decode model test.wav

      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/runpy.py", line 194, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet_stt/__main__.py", line 46, in <module>
        main()
      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet_stt/__main__.py", line 24, in main
        wenet_stt = WenetSTTModel(WenetSTTModel.build_config(args.model_dir))
      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet_stt/wrapper.py", line 71, in __init__
        super().__init__()
      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet_stt/wrapper.py", line 35, in __init__
        self.init_ffi()
      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet_stt/wrapper.py", line 39, in init_ffi
        cls._lib = _ffi.init_once(cls._init_ffi, cls.__name__ + '._init_ffi')
      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/cffi/api.py", line 749, in init_once
        result = func()
      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet_stt/wrapper.py", line 48, in _init_ffi
        return _ffi.dlopen(_library_binary_path)
      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/cffi/api.py", line 150, in dlopen
        lib, function_cache = _make_ffi_library(self, name, flags)
      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/cffi/api.py", line 832, in _make_ffi_library
        backendlib = _load_backend_lib(backend, libname, flags)
      File "/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/cffi/api.py", line 827, in _load_backend_lib
        raise OSError(msg)
    OSError: cannot load library '/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet_stt/libwenet_stt_lib.dylib': dlopen(/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet_stt/libwenet_stt_lib.dylib, 0x0002): Library not loaded: @rpath/libtorch.dylib
      Referenced from: /Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet_stt/libwenet_stt_lib.dylib
      Reason: tried:
        '/private/var/folders/w_/vt72cbr92797v0q4r91wk8380000gn/T/pip-req-build-tp3um_02/native/wenet/runtime/server/x86/fc_base/openfst-subbuild/openfst-populate-prefix/lib/libtorch.dylib' (no such file),
        '/private/var/folders/w_/vt72cbr92797v0q4r91wk8380000gn/T/pip-req-build-tp3um_02/native/wenet/runtime/server/x86/fc_base/libtorch-src/lib/libtorch.dylib' (no such file),
        '/private/var/folders/w_/vt72cbr92797v0q4r91wk8380000gn/T/pip-req-build-tp3um_02/native/wenet/runtime/server/x86/fc_base/openfst-subbuild/openfst-populate-prefix/lib/libtorch.dylib' (no such file),
        '/private/var/folders/w_/vt72cbr92797v0q4r91wk8380000gn/T/pip-req-build-tp3um_02/native/wenet/runtime/server/x86/fc_base/libtorch-src/lib/libtorch.dylib' (no such file),
        '/Users/myuser/opt/miniconda3/envs/wenet/lib/libtorch.dylib' (no such file),
        '/Users/myuser/opt/miniconda3/envs/wenet/bin/../lib/libtorch.dylib' (no such file),
        '/usr/local/lib/libtorch.dylib' (no such file),
        '/usr/lib/libtorch.dylib' (no such file).
      Additionally, ctypes.util.find_library() did not manage to locate a library called '/Users/myuser/opt/miniconda3/envs/wenet/lib/python3.8/site-packages/wenet_stt/libwenet_stt_lib.dylib'

    opened by eschmidbauer
  • Issues with LM (TLG-rescoring)

    I'm trying to use CTC WFST search for rescoring with a compiled TLG graph, following this tutorial: https://wenet-e2e.github.io/wenet/lm.html. I pass these parameters to the decoder:

      config = {
          "model_path": f"wenet/{model_name}/final.zip",
          "dict_path": f"wenet/{model_name}/words.txt",
          "rescoring_weight": 1.0,
          "blank_skip_thresh": 0.98,
          "beam": 15.0,
          "lattice_beam": 7.5,
          "min_active": 10,
          "max_active": 7000,
          "ctc_weight": 0.5,
          "reverse_weight": 0.0,
          "chunk_size": -1,
          "fst_path": f"wenet/examples/aishell/s0/data/lang_test/TLG.fst"
      }

    However, I'm getting this error:

      ERROR: FstImpl::ReadHeader: FST not of type vector, found qq: wenet/examples/aishell/s0/data/lang_test/TLG.fst
      F1102 22:28:04.138978 26002 wenet_stt_lib.cpp:160] Check failed: fst != nullptr
      *** Check failure stack trace: ***
      @ 0x7f81d6cfb38d google::LogMessage::Fail()
      @ 0x7f81d6cfd604 google::LogMessage::SendToLog()
      @ 0x7f81d6cfaec0 google::LogMessage::Flush()
      @ 0x7f81d6cfdd89 google::LogMessageFatal::~LogMessageFatal()
      @ 0x7f81e83701b5 InitDecodeResourceFromSimpleJson()
      @ 0x7f81e8380ebc WenetSTTModel::WenetSTTModel()
      @ 0x7f81e83719bb wenet_stt__construct
      @ 0x7f82021b7dec ffi_call_unix64
      @ 0x7f82021b6f55 ffi_call
      @ 0x7f82023d9e56 cdata_call
      @ 0x5da58b _PyObject_FastCallKeywords
      @ 0x54bc71 (unknown)
      @ 0x552d2d _PyEval_EvalFrameDefault
      @ 0x54cb89 _PyEval_EvalCodeWithName
      @ 0x5dac6e _PyFunction_FastCallDict
      @ 0x590713 (unknown)
      @ 0x5da1c9 _PyObject_FastCallKeywords
      @ 0x552fb7 _PyEval_EvalFrameDefault
      @ 0x54c522 _PyEval_EvalCodeWithName
      @ 0x54e933 PyEval_EvalCode
      @ 0x6305a2 (unknown)
      @ 0x630657 PyRun_FileExFlags
      @ 0x6312cf PyRun_SimpleFileExFlags
      @ 0x654232 (unknown)
      @ 0x65458e _Py_UnixMain
      @ 0x7f820422fb97 __libc_start_main
      @ 0x5e0cca _start
      @ (nil) (unknown)
      Aborted

    The same TLG-graph works fine when I'm using the default WeNet decoder. Ubuntu 18.04.

    opened by tonko22
Owner
David Zurow
david.zurow at gmail