This is a simple LJSpeech Dataset Maker, based on LJSpeechTools. It splits the input WAV files into short segments and transcribes them. Under the hood, it uses Google Speech Recognition for the transcription. Single speaker only.
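
For a sense of what the pipeline does internally, here is a minimal sketch of the split-then-transcribe approach using the pydub and SpeechRecognition packages. The file names and thresholds are illustrative, and the real pipeline.py may be organized differently:

```python
# Illustrative sketch only -- not the actual pipeline.py.
# Assumes: pip install pydub SpeechRecognition (plus ffmpeg for pydub).
import speech_recognition as sr
from pydub import AudioSegment
from pydub.silence import split_on_silence

audio = AudioSegment.from_wav("input/example.wav")  # hypothetical input file

# Cut wherever the signal stays below -40 dBFS for at least 500 ms.
chunks = split_on_silence(audio, min_silence_len=500, silence_thresh=-40)

recognizer = sr.Recognizer()
for i, chunk in enumerate(chunks):
    path = f"chunk_{i:04d}.wav"
    chunk.export(path, format="wav")
    with sr.AudioFile(path) as source:
        data = recognizer.record(source)
    try:
        # Online transcription via the Google Web Speech API.
        print(f"{path}|{recognizer.recognize_google(data)}")
    except sr.UnknownValueError:
        pass  # unintelligible segment; skip it
```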
- Install the dependencies:

  ```
  pip install -r requirements.txt
  ```
- Place your WAV files into the `input` folder. Mono 22 kHz WAV is ideal.
- Run the pipeline:

  ```
  usage: pipeline.py [-h] [-p PARALLEL] [-l SEGMENT_LENGTH] [-s SILENCE_THRESHOLD]
                     [-d DELAY] [-u DISCARD_UNDER_SECOND]

  LJSpeech Dataset Maker

  options:
    -h, --help            show this help message and exit
    -p PARALLEL, --parallel PARALLEL
                          Number of running processes. Default is your
                          core/thread count minus 2.
    -l SEGMENT_LENGTH, --segment-length SEGMENT_LENGTH
                          The length of a segment, in seconds. Default is 12
                          seconds.
    -s SILENCE_THRESHOLD, --silence-threshold SILENCE_THRESHOLD
                          The silence threshold for splitting, in dBFS
                          (negative integer). Default is -40.
    -d DELAY, --delay DELAY
                          Add a delay to online transcription, in seconds.
                          Default is 0.1 second.
    -u DISCARD_UNDER_SECOND, --discard-under-second DISCARD_UNDER_SECOND
                          Discard any segment under this length, in seconds.
                          Default is 1 second.
  ```
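
  For example, a run that cuts roughly 10-second segments at a -35 dBFS silence threshold and drops clips shorter than 2 seconds (the values are illustrative):

  ```
  python pipeline.py -l 10 -s -35 -u 2
  ```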
- The output dataset will be in the `dataset` folder.
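
Once the pipeline finishes, you can sanity-check the result. The sketch below assumes the conventional LJSpeech layout (a `wavs/` directory plus a pipe-separated `metadata.csv`); the exact file names are an assumption, not confirmed by this README:

```python
# Verify that every metadata entry has a matching audio file.
# Assumes dataset/wavs/<id>.wav and dataset/metadata.csv with lines
# like "<id>|<transcription>" (the standard LJSpeech convention).
from pathlib import Path

dataset = Path("dataset")
with open(dataset / "metadata.csv", encoding="utf-8") as f:
    for line in f:
        clip_id, text = line.rstrip("\n").split("|", 1)
        wav = dataset / "wavs" / f"{clip_id}.wav"
        if not wav.exists():
            print(f"missing audio for {clip_id}")
```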