Gammatone-based spectrograms, using gammatone filterbanks or Fourier transform weightings.

Related tags

Audiogammatone
Overview

Gammatone Filterbank Toolkit

Utilities for analysing sound using perceptual models of human hearing.

Jason Heeris, 2013

Summary

This is a port of Malcolm Slaney's and Dan Ellis' gammatone filterbank MATLAB code, detailed below, to Python 2 and 3 using Numpy and Scipy. It analyses signals by running them through banks of gammatone filters, similar to Fourier-based spectrogram analysis.

Gammatone-based spectrogram of Für Elise

Installation

You can install directly from this git repository using:

pip install git+https://github.com/detly/gammatone.git

...or you can clone the git repository however you prefer, and do:

pip install .

...or:

python setup.py install

...from the cloned tree.

Dependencies

  • numpy
  • scipy
  • nose
  • mock
  • matplotlib

Using the Code

See the API documentation. For a demonstration, find a .wav file (for example, Für Elise) and run:

python -m gammatone FurElise.wav -d 10

...to see a gammatone-gram of the first ten seconds of the track. If you've installed via pip or setup.py install, you should also be able to just run:

gammatone FurElise.wav -d 10

Basis

This project is based on research into how humans perceive audio, originally published by Malcolm Slaney:

Malcolm Slaney (1998) "Auditory Toolbox Version 2", Technical Report #1998-010, Interval Research Corporation, 1998.

Slaney's report describes a way of modelling how the human ear perceives, emphasises and separates different frequencies of sound. A series of gammatone filters are constructed whose width increases with increasing centre frequency, and this bank of filters is applied to a time-domain signal. The result of this is a spectrum that should represent the human experience of sound better than, say, a Fourier-domain spectrum would.

A gammatone filter has an impulse response that is a sine wave multiplied by a gamma distribution function. It is a common approach to modelling the auditory system.

The gammatone filterbank approach can be considered analogous (but not equivalent) to a discrete Fourier transform where the frequency axis is logarithmic. For example, a series of notes spaced an octave apart would appear to be roughly linearly spaced; or a sound that was distributed across the same linear frequency range would appear to have more spread at lower frequencies.

The real goal of this toolkit is to allow easy computation of the gammatone equivalent of a spectrogram — a time-varying spectrum of energy over audible frequencies based on a gammatone filterbank.

Slaney demonstrated his research with an initial implementation in MATLAB. This implementation was later extended by Dan Ellis, who found a way to approximate a "gammatone-gram" by using the fast Fourier transform. Ellis' code calculates a matrix of weights that can be applied to the output of a FFT so that a Fourier-based spectrogram can easily be transformed into such an approximation.

Ellis' code and documentation is here: Gammatone-like spectrograms

Interest

I became interested in this because of my background in science communication and my general interest in the teaching of signal processing. I find that the spectrogram approach to visualising signals is adequate for illustrating abstract systems or the mathematical properties of transforms, but bears little correspondence to a person's own experience of sound. If someone wants to see what their favourite piece of music "looks like," a normal Fourier transform based spectrogram is actually quite a poor way to visualise it. Features of the audio seem to be oddly spaced or unnaturally emphasised or de-emphasised depending on where they are in the frequency domain.

The gammatone filterbank approach seems to be closer to what someone might intuitively expect a visualisation of sound to look like, and can help develop an intuition about alternative representations of signals.

Verifying the port

Since this is a port of existing MATLAB code, I've written tests to verify the Python implementation against the original code. These tests aren't unit tests, but they do generally test single functions. Running the tests has the same workflow:

  1. Run the scripts in the test_generation directory. This will create a .mat file containing test data in tests/data.

  2. Run nosetest3 in the top level directory. This will find and run all the tests in the tests directory.

Although I'm usually loathe to check in generated files to version control, I'm willing to make an exception for the .mat files containing the test data. My reasoning is that they represent the decoupling of my code from the MATLAB code, and if the two projects were separated, they would be considered a part of the Python code, not the original MATLAB code.

Owner
Jason Heeris
Jason Heeris
Code for paper 'Audio-Driven Emotional Video Portraits'.

Audio-Driven Emotional Video Portraits [CVPR2021] Xinya Ji, Zhou Hang, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu [Project] [Paper] G

197 Dec 31, 2022
❤️ This Is The EzilaXMusicPlayer Advaced Repo 🎵

Telegram EzilaXMusicPlayer Bot 🎵 A bot that can play music on telegram group's voice Chat ❤️ Requirements 📝 FFmpeg NodeJS nodesource.com Python 3.7+

Sadew Jayasekara 11 Nov 12, 2022
Frescobaldi LilyPond Editor

README for Frescobaldi Homepage: http://www.frescobaldi.org/ Main author: Wilbert Berendsen Frescobaldi is a LilyPond sheet music text editor. It aims

Frescobaldi 600 Dec 29, 2022
Tune in is a Collaborative Music Playing Systems where multiple guests can join a room and enjoy the song being played

✨A collaborative music playing systems🎶 where multiple guests can join a room ➡🚪 and enjoy the song🎧 being played.

Vedansh Vijaywargiya 8 Nov 05, 2022
Speech recognition module for Python, supporting several engines and APIs, online and offline.

SpeechRecognition Library for performing speech recognition, with support for several engines and APIs, online and offline. Speech recognition engine/

Anthony Zhang 6.7k Jan 08, 2023
Automatically move or copy files based on metadata associated with the files. For example, file your photos based on EXIF metadata or use MP3 tags to file your music files.

Automatically move or copy files based on metadata associated with the files. For example, file your photos based on EXIF metadata or use MP3 tags to file your music files.

Rhet Turnbull 14 Nov 02, 2022
extract unpack asset file (form unreal engine 4 pak) with extenstion *.uexp which contain awb/acb (cri/cpk like) sound or music resource

Uexp2Awb extract unpack asset file (form unreal engine 4 pak) with extenstion .uexp which contain awb/acb (cri/cpk like) sound or music resource. i ju

max 6 Jun 22, 2022
kapre: Keras Audio Preprocessors

Kapre Keras Audio Preprocessors - compute STFT, ISTFT, Melspectrogram, and others on GPU real-time. Tested on Python 3.6 and 3.7 Why Kapre? vs. Pre-co

Keunwoo Choi 867 Dec 29, 2022
The project aims to develop a personal-assistant for Windows & Linux-based systems

The project aims to develop a personal-assistant for Windows & Linux-based systems. Samiksha draws its inspiration from virtual assistants like Cortana for Windows, and Siri for iOS. It has been desi

SHUBHANSHU RAI 1 Jan 16, 2022
Use python MIDI to write some simple music

Use Python MIDI to write songs

小宝 1 Nov 19, 2021
An Amazon Music client for Linux (unpretentious)

Amusiz An Amazon Music client for Linux (unpretentious) ↗️ Install You can install Amusiz in multiple ways, choose your favorite. 🚀 AppImage Here you

Mirko Brombin 25 Nov 08, 2022
Simple discord bot by @merive 🤖

Parzibot Powerful and Useful Discord Bot on Python. The source code of the bot is available to everyone. Parzibot uses English language. This is free

merive_ 3 Dec 28, 2022
This library provides common speech features for ASR including MFCCs and filterbank energies.

python_speech_features This library provides common speech features for ASR including MFCCs and filterbank energies. If you are not sure what MFCCs ar

James Lyons 2.2k Jan 04, 2023
A Python wrapper around the Soundcloud API

soundcloud-python A friendly wrapper around the Soundcloud API. Installation To install soundcloud-python, simply: pip install soundcloud Or if you'r

SoundCloud 84 Dec 31, 2022
Audio processor to map oracle notes in the VoG raid in Destiny 2 to call outs.

vog_oracles Audio processor to map oracle notes in the VoG raid in Destiny 2 to call outs. Huge thanks to mzucker on GitHub for the note detection cod

19 Sep 29, 2022
A small project where I identify notes and key harmonies in a piece of music and use them further to recreate and generate the same piece of music through Python

A small project where I identify notes and key harmonies in a piece of music and use them further to recreate and generate the same piece of music through Python

5 Oct 07, 2022
Python interface to the WebRTC Voice Activity Detector

py-webrtcvad This is a python interface to the WebRTC Voice Activity Detector (VAD). It is compatible with Python 2 and Python 3. A VAD classifies a p

John Wiseman 1.5k Dec 22, 2022
Okaeri-Music is a telegram music bot project, allow you to play music on voice chat group telegram.

Okaeri-Music is a telegram bot project that's allow you to play music on telegram voice chat group

Wahyusaputra 1 Dec 22, 2021
📺Headless全自动B站直播录播、切片、上传一体工具

DDRecorder Headless全自动B站直播录播、切片、上传一体工具 感谢 FortuneDayssss/BilibiliUploader 安装指南(Windows) 在Release下载zip包解压。 修改配置文件config.json 双击运行DDRecorder.exe (这将使用co

322 Dec 27, 2022
Python library for handling audio datasets.

AUDIOMATE Audiomate is a library for easy access to audio datasets. It provides the datastructures for accessing/loading different datasets in a gener

Matthias 121 Nov 27, 2022