Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
Wav2Keyword is a keyword spotting (KWS) model based on Wav2Vec 2.0. It achieves state-of-the-art results on the Speech Commands datasets V1 and V2.
Classify audio with neural nets on embedded systems like the Raspberry Pi
Kaggle Competitions: TensorFlow Speech Recognition Challenge
A library built for easier audio self-supervised training, downstream tasks evaluation
Attention-based model for keyword spotting
Generalized Deep Multiset Canonical Correlation Analysis for Multiview Learning of Speech Representations
Speech command recognition with a DenseNet, using transfer learning from UrbanSound8K, in Keras/TensorFlow
PyTorch reimplementation of DiffWave unconditional generation: a high-quality waveform synthesizer
PyTorch implementation of BiFSMN (IJCAI 2022)
Effective processing pipeline and advanced neural network architectures for small-footprint keyword spotting
Multi-class classification of speech command data. The dataset comes from the Kaggle speech recognition challenge; the model is implemented in PyTorch.
Zero-shot keyword spotting on a KWS test dataset using ImageBind
Small footprint, standalone, zero dependency, offline keyword spotting (KWS) CLI tool.
A Vocola 2 (DNS) extension for creating and editing mathematics (in LaTeX) by voice, using a ZOO interface (Zoomable Online Outliner) such as WorkFlowy or Dynalist.
This project is about spotting a keyword from the Google Speech Commands Dataset.
Audio Classification with AlexNet and Speech Commands dataset
Female Replacement Intelligent Digital Assistant Youth
A model-based agent for Chinese speech recognition
Speech recognition of keyword commands