emoTag helps you train an emotion detection model for Chinese audio.
pip install -r requirement.txt
We used the Emotional Speech Dataset (ESD) for Speech Synthesis and Voice Conversion from HLT Singapore.
Use the command below to train a classifier. Adjust the training setup in conf/logfbank_train-emo.json.
python train.py --config conf/logfbank_train-emo.json --name task_trial_1
Models and logs will be saved in exp/.
usage: train.py [-h] [-c CONFIG] [-r RESUME] [-n NAME] [--lr LR] [--bs BS]
                [--train_utt2wav TRAIN_UTT2WAV] [--val_utt2wav VAL_UTT2WAV]
                [--blocks BLOCKS] [--optimizer OPTIMIZER]
                [--train_pad0 TRAIN_PAD0] [--devel_pad0 DEVEL_PAD0]
                [--pretrain PRETRAIN]

PyTorch Template

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
                        config file path (default: None)
  -r RESUME, --resume RESUME
                        path to latest checkpoint (default: None)
  -n NAME, --name NAME
  --lr LR, --learning_rate LR
  --bs BS, --batch_size BS
  --train_utt2wav TRAIN_UTT2WAV
  --val_utt2wav VAL_UTT2WAV
  --blocks BLOCKS
  --optimizer OPTIMIZER
  --train_pad0 TRAIN_PAD0
  --devel_pad0 DEVEL_PAD0
  --pretrain PRETRAIN
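For reference, the help text above is consistent with an argparse parser along the following lines. This is a reconstruction from the usage output only, not the actual contents of train.py: the argument types (float for --lr, int for --bs and --blocks, strings elsewhere), the None defaults, and the cli_overrides helper are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Rebuild a parser matching the train.py usage text (types are assumed)."""
    parser = argparse.ArgumentParser(prog="train.py", description="PyTorch Template")
    parser.add_argument("-c", "--config", default=None, type=str,
                        help="config file path (default: None)")
    parser.add_argument("-r", "--resume", default=None, type=str,
                        help="path to latest checkpoint (default: None)")
    parser.add_argument("-n", "--name", default=None, type=str)
    # The first long option sets the attribute name, so these land in args.lr / args.bs.
    parser.add_argument("--lr", "--learning_rate", default=None, type=float)
    parser.add_argument("--bs", "--batch_size", default=None, type=int)
    parser.add_argument("--train_utt2wav", default=None, type=str)
    parser.add_argument("--val_utt2wav", default=None, type=str)
    parser.add_argument("--blocks", default=None, type=int)
    parser.add_argument("--optimizer", default=None, type=str)
    parser.add_argument("--train_pad0", default=None, type=str)
    parser.add_argument("--devel_pad0", default=None, type=str)
    parser.add_argument("--pretrain", default=None, type=str)
    return parser


def cli_overrides(args):
    """Hypothetical helper: keep only the options that were set on the command line."""
    return {key: value for key, value in vars(args).items() if value is not None}


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(cli_overrides(args))
```

Options left unset stay at None, which makes it easy to tell a command-line override apart from a value that should come from the JSON config.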
python infer_label.py
Adjust the vad_file parameter and the code if necessary to adapt to new tasks. infer_label.py uses multiprocessing to increase CPU utilization and inference efficiency. See usage details below.
usage: infer_label.py [-h] [--vad_file VAD_FILE] [--model_dir MODEL_DIR]
                      [--data_dir DATA_DIR] [--output_dir OUTPUT_DIR]
                      [--process PROCESS]

parse model info

optional arguments:
  -h, --help            show this help message and exit
  --vad_file VAD_FILE
  --model_dir MODEL_DIR
  --data_dir DATA_DIR   data directory to be labelled
  --output_dir OUTPUT_DIR
  --process PROCESS     multiprocess number
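The multiprocessing approach mentioned above can be pictured with a small sketch. It is not the code in infer_label.py: label_all and dummy_label are hypothetical names, and the real script's per-file inference is replaced by a stand-in that returns a fixed label. The processes argument plays the role of --process.

```python
from multiprocessing import Pool


def label_all(paths, label_fn, processes=1):
    """Apply label_fn to every path, optionally across worker processes.

    label_fn must be a picklable, module-level function when processes > 1.
    """
    paths = list(paths)
    if processes <= 1:
        # Serial fallback: handy for debugging a new task before scaling up.
        return [label_fn(path) for path in paths]
    with Pool(processes=processes) as pool:
        # map() preserves input order, so results line up with paths.
        return pool.map(label_fn, paths)


def dummy_label(path):
    """Hypothetical stand-in for real model inference on one audio file."""
    return path, "neutral"


if __name__ == "__main__":
    print(label_all(["a.wav", "b.wav", "c.wav"], dummy_label, processes=2))
```

Keeping the serial path makes it possible to test changes to the labeling code with a single process first, then raise the process count once it works.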
infer_label.py automatically generates a kill_label.sh script while running. To stop the multiprocess labeling program, run:
bash kill_label.sh
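One plausible way a script like kill_label.sh could be generated is sketched below. This is an assumption about the mechanism, not the actual implementation: write_kill_script is a hypothetical helper, and the real script may use different signals or contents.

```python
import os
import stat


def write_kill_script(pids, path="kill_label.sh"):
    """Write a shell script that terminates the given process IDs."""
    lines = ["#!/bin/bash"]
    # One kill command per worker; int() rejects anything that is not a PID.
    lines += [f"kill {int(pid)}" for pid in pids]
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    # Mark the script executable for the owner; it also runs via "bash <path>".
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path
```

A parent process could call write_kill_script with its own PID and those of its workers right after starting them, so the script is available for as long as labeling runs.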
Inference failures are logged in bad_file.
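As an illustration of failure logging, the sketch below records every input that raises an exception and carries on with the rest. The function name, the tab-separated line format, and the default path "bad_file" are assumptions; the format actually written by infer_label.py may differ.

```python
def label_with_failure_log(paths, label_fn, bad_file="bad_file"):
    """Label each path; append paths that raise an exception to bad_file."""
    results = {}
    with open(bad_file, "a") as log:
        for path in paths:
            try:
                results[path] = label_fn(path)
            except Exception as error:
                # Record the failure and continue so one bad input does not stop the run.
                log.write(f"{path}\t{error!r}\n")
    return results
```

After a run, the paths listed in the log can be inspected or fed back in for a retry.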