By Zhandong Liu, Wengang Zhou and Houqiang Li.
This project contains source code for model training and testing, label generation for text center blocks (TCB) and word stroke regions (WSR), label augmentation, and sample trained models.
- Clone the repo
git clone https://github.com/lzd0825/AB-LSTM.git
cd ./AB-LSTM
- Requirements for Caffe and pycaffe (see: Caffe installation instructions). Note: Caffe must be built with support for Python layers!
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1
- Then build Caffe and pycaffe as follows:
cd ${AB-LSTM/Train_Test_ABLSTM/caffe/}
make -j
make pycaffe
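Because the Python layer flag has to be uncommented before the build, it can be worth checking Makefile.config before running (or re-running) make. The following is a minimal sketch, not part of the repository, that lists which `NAME := value` settings are actually active; the helper names enabled_flags and python_layer_enabled are hypothetical.

```python
import re
import sys

# Matches an active assignment such as "WITH_PYTHON_LAYER := 1  # note".
ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*:?=\s*(.*?)\s*(#.*)?$")


def enabled_flags(makefile_text):
    """Return {name: value} for every uncommented assignment line."""
    flags = {}
    for line in makefile_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or commented-out setting
        match = ASSIGNMENT.match(stripped)
        if match:
            flags[match.group(1)] = match.group(2)
    return flags


def python_layer_enabled(makefile_text):
    """True only if WITH_PYTHON_LAYER is uncommented and set to 1."""
    return enabled_flags(makefile_text).get("WITH_PYTHON_LAYER") == "1"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is an example): python check_config.py Makefile.config
    with open(sys.argv[1]) as handle:
        text = handle.read()
    print("WITH_PYTHON_LAYER enabled: %s" % python_layer_enabled(text))
    print("USE_CUDNN enabled: %s" % (enabled_flags(text).get("USE_CUDNN") == "1"))
```

The check is purely textual: it does not confirm that the existing build was compiled with these settings, only that the next build will see them.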
- Download TD_Total_Text_WSR_iter_175000.caffemodel, trained on the Total-text training dataset.
- Download TD_ICDAR2013_TCB_iter_50000.caffemodel, fine-tuned on the ICDAR2013 training dataset.
- Then change to the snapshot folder:
cd ../snapshot
- Put both trained caffemodels in the folder ${AB-LSTM/Train_Test_ABLSTM/snapshot}.
- Assuming you have downloaded the test datasets (e.g. ICDAR2013, MSRA-TD500), run the following commands to test the model on them:
cd ../Demo
python Demo_forword_TCB.py
python Demo_forword_WSR.py
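Before running the demo scripts, it can help to confirm that both downloaded caffemodels are in the snapshot folder. A minimal sketch, not part of the repository: missing_models is a hypothetical helper, and the file names are the two listed in the download steps.

```python
import os

# File names taken from the download steps in this README.
EXPECTED_MODELS = (
    "TD_Total_Text_WSR_iter_175000.caffemodel",
    "TD_ICDAR2013_TCB_iter_50000.caffemodel",
)


def missing_models(snapshot_dir, expected=EXPECTED_MODELS):
    """Return the expected model file names that are absent from snapshot_dir."""
    return [
        name
        for name in expected
        if not os.path.isfile(os.path.join(snapshot_dir, name))
    ]
```

For example, calling missing_models("../snapshot") from the Demo folder returns an empty list when both files are in place.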
Then proceed as follows:
cd ${AB-LSTM/Demo_Text_detection}
python fuse_thred.py
Next, run:
python Demo_region_word.py
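The four scripts above are run in a fixed order from two different folders, so a small driver can make the sequence repeatable. This is a sketch under assumptions, not the authors' tooling: the Demo folder is taken to be Train_Test_ABLSTM/Demo (inferred from the `cd ../Demo` step after the snapshot folder), and inference_steps and run_steps are hypothetical names.

```python
import os
import subprocess


def inference_steps(repo_root):
    """Return (working_directory, script) pairs in the order used in this README."""
    # Assumption: "cd ../Demo" from the snapshot folder lands in Train_Test_ABLSTM/Demo.
    demo_dir = os.path.join(repo_root, "Train_Test_ABLSTM", "Demo")
    detect_dir = os.path.join(repo_root, "Demo_Text_detection")
    return [
        (demo_dir, "Demo_forword_TCB.py"),
        (demo_dir, "Demo_forword_WSR.py"),
        (detect_dir, "fuse_thred.py"),
        (detect_dir, "Demo_region_word.py"),
    ]


def run_steps(steps, python="python", dry_run=False):
    """Run each script from its own folder, stopping at the first failure."""
    commands = []
    for cwd, script in steps:
        command = [python, script]
        commands.append((cwd, command))
        if dry_run:
            print("cd %s && %s" % (cwd, " ".join(command)))
        else:
            # check_call raises CalledProcessError on a non-zero exit code.
            subprocess.check_call(command, cwd=cwd)
    return commands
```

Calling run_steps(inference_steps("/path/to/AB-LSTM"), dry_run=True) prints the planned commands without executing anything, which is a cheap way to verify the paths first.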
Download the pretrained model vgg16convs.caffemodel and put it in ${AB-LSTM/Train_Test_ABLSTM/model/}.
Scripts for generating ground truth are provided in ${AB-LSTM/Label_generate}. You can use our code to generate your own training labels on different public datasets (e.g. ICDAR2013, MSRA-TD500, CTW1500, and Total-text).
We use “ImageDataGenerator” from “keras.preprocessing.image” for data augmentation.
cd ${AB-LSTM/Data_aug}
You must modify the parameters image_save_prefix and mask_save_prefix in the trainGenerator function. Note that you must use absolute paths, such as: image_save_prefix = "/data1/XXX/aug_dataset/Aug_example/train_aug/aug", mask_save_prefix = "/data1/XXX/aug_dataset/Aug_example/train_gt_aug/aug".
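One way to make sure both prefixes are absolute is to derive them from a single output folder with os.path.abspath. A minimal sketch, assuming the train_aug and train_gt_aug subfolder names and the "aug" prefix from the example above; save_prefixes is a hypothetical helper, not part of the repository.

```python
import os


def save_prefixes(output_root, tag="aug"):
    """Build absolute image and mask save prefixes under output_root."""
    root = os.path.abspath(output_root)  # resolves relative paths against the current folder
    image_save_prefix = os.path.join(root, "train_aug", tag)
    mask_save_prefix = os.path.join(root, "train_gt_aug", tag)
    return image_save_prefix, mask_save_prefix
```

The two returned strings can then be assigned to image_save_prefix and mask_save_prefix. Whether the target folders must already exist depends on the repository's trainGenerator, which this sketch does not inspect.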
Modify TD_ICDAR2013_TCB.py and TD_Total_Text_WSR.py in ${AB-LSTM/Train_Test_ABLSTM/} to configure your dataset name and dataset path, for example:
......
data_params['root'] = "./AB-LSTM/Train_Test_ABLSTM/datasets/Total_Text_WSR/"
data_params['source'] = "Total_Text_WSR.lst"
......
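A wrong root or list file name usually only surfaces once training starts, so a quick check of the two entries can save time. The sketch below is not part of the repository and assumes that 'source' is resolved relative to 'root'; that is an assumption about the data layer, and check_data_params is a hypothetical helper.

```python
import os


def check_data_params(data_params):
    """Return a list of problems found in data_params; empty if none."""
    problems = []
    root = data_params.get("root", "")
    source = data_params.get("source", "")
    if not os.path.isdir(root):
        problems.append("root directory not found: %s" % root)
    else:
        # Assumption: the list file lives directly under the root folder.
        list_path = os.path.join(root, source)
        if not os.path.isfile(list_path):
            problems.append("list file not found: %s" % list_path)
    return problems
```

For instance, check_data_params({"root": "./AB-LSTM/Train_Test_ABLSTM/datasets/Total_Text_WSR/", "source": "Total_Text_WSR.lst"}) returns an empty list when the folder and list file are both present.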
Then start training as follows:
cd ${AB-LSTM/Train_Test_ABLSTM/}
sh ./train_ICDAR2013_TCB.sh
sh ./train_Total_Text_WSR.sh
Use this bibtex to cite this repository:
@misc{liu_AB-LSTM_2018,
title={AB-LSTM: Attention-Based Bidirectional LSTM Model for Scene Text Detection},
author={Zhandong Liu and Wengang Zhou and Houqiang Li},
year={2018},
publisher={Github},
journal={GitHub repository},
howpublished={\url{https://github.com/lzd0825/AB-LSTM/}},
}