About

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
Park, Daniel S. and Chan, William and Zhang, Yu and Chiu, Chung-Cheng and Zoph, Barret and Cubuk, Ekin D. and Le, Quoc V.
Interspeech 2019 [Paper]

This repository contains an implementation of the augmentation methodology proposed in the above paper.

Base Input

SpecAugmented Output (Policy = 'LB')
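To make the masking part of the method concrete, here is a minimal NumPy sketch of frequency and time masking. It is an illustration under stated assumptions, not the code in augment.py: the function name `mask_spectrogram` and the `POLICIES` table are hypothetical, the parameter values are the ones given for the LB, LD, SM and SS policies in Table 1 of the paper (worth checking against the paper before relying on them), and time warping is left out entirely.

```python
import numpy as np

# Masking parameters per policy, as listed in Table 1 of the SpecAugment paper
# (assumption: verify against the paper; the time-warp parameter W is omitted).
# F: max frequency-mask width, m_F: number of frequency masks,
# T: max time-mask width, p: cap on mask width as a fraction of the frames,
# m_T: number of time masks.
POLICIES = {
    "LB": dict(F=27, m_F=1, T=100, p=1.0, m_T=1),
    "LD": dict(F=27, m_F=2, T=100, p=1.0, m_T=2),
    "SM": dict(F=15, m_F=2, T=70, p=0.2, m_T=2),
    "SS": dict(F=27, m_F=2, T=70, p=0.2, m_T=2),
}


def mask_spectrogram(spec, policy="LD", rng=None):
    """Return a copy of `spec` (shape: mel bins x frames) with masks applied.

    Hypothetical helper for illustration only. Masked regions are set to 0,
    which equals the mean value if the input was normalised to zero mean.
    """
    rng = np.random.default_rng() if rng is None else rng
    cfg = POLICIES[policy]
    out = np.array(spec, dtype=float, copy=True)
    n_mels, n_frames = out.shape

    # Frequency masking: blank out f consecutive mel bins, f drawn from [0, F].
    for _ in range(cfg["m_F"]):
        f = int(rng.integers(0, min(cfg["F"], n_mels) + 1))
        f0 = int(rng.integers(0, n_mels - f + 1))
        out[f0:f0 + f, :] = 0.0

    # Time masking: blank out t consecutive frames, t drawn from [0, T],
    # with the width additionally capped at p * n_frames.
    max_t = min(cfg["T"], int(cfg["p"] * n_frames))
    for _ in range(cfg["m_T"]):
        t = int(rng.integers(0, max_t + 1))
        t0 = int(rng.integers(0, n_frames - t + 1))
        out[:, t0:t0 + t] = 0.0

    return out


# Example: mask a dummy 80-bin, 300-frame spectrogram with the LB policy.
dummy = np.ones((80, 300))
augmented = mask_spectrogram(dummy, policy="LB", rng=np.random.default_rng(0))
print(augmented.shape)
```

The input is copied rather than modified in place, so the same base spectrogram can be augmented repeatedly with different random masks.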

Requirements:

python3
librosa
libsndfile
audioread
ffmpeg
numpy
tensorflow
tensorflow_addons

Usage:

main.py [--dir] [--policy]

--dir | path/to/dataset | default='./LibriSpeech/'
--policy | augmentation policy to use from {'LB', 'LD', 'SS', 'SM'} | default='LD'

OR

Refer to demo/demo.ipynb for a Jupyter notebook demo.
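For readers who want to see how the two command-line options listed above map onto code, the following is a sketch of one way such an interface could be declared with argparse. The flag names, choices and defaults come from the usage table; the `build_parser` function and the help strings are assumptions, and the actual main.py may be organised differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented --dir and --policy options."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Apply SpecAugment to a dataset (illustrative sketch).",
    )
    parser.add_argument(
        "--dir",
        default="./LibriSpeech/",
        help="path to the dataset",
    )
    parser.add_argument(
        "--policy",
        default="LD",
        choices=["LB", "LD", "SS", "SM"],
        help="augmentation policy to use",
    )
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(["--policy", "LB"])
print(args.dir, args.policy)
```

Restricting `--policy` with `choices` means an unsupported policy name is rejected by the parser itself instead of failing later during augmentation.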

References:

@article{Park_2019,
  title={SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2680},
  DOI={10.21437/interspeech.2019-2680},
  journal={Interspeech 2019},
  publisher={ISCA},
  author={Park, Daniel S. and Chan, William and Zhang, Yu and Chiu, Chung-Cheng and Zoph, Barret and Cubuk, Ekin D. and Le, Quoc V.},
  year={2019},
  month={Sep}
}
