GL-RG: Global-Local Representation Granularity for Video Captioning

[Figure: GL-RG framework overview (framework.png)]

GL-RG exploits extensive vision representations from different video ranges to improve linguistic expression. We devise a novel global-local encoder to produce a rich semantic vocabulary, and with our incremental training strategy GL-RG leverages the global-local vision representation to achieve fine-grained captioning of video content.

Note: this branch includes data (>900 MB) and links; for a smaller version, please go to the min-branch/for-review branch (52.5 MB).

Dependencies

This repo was tested with Python 2.7, PyTorch 0.2.0 (or 1.0.1), cuDNN 6.0 (or 10.0), and CUDA 8.0. It should also be runnable with more recent PyTorch releases (>=1.0) as well as any version in the 0.2-1.0 range.

You can use Anaconda or Miniconda to install the dependencies:

conda create -n GL-RG-pytorch python=2.7 pytorch=0.2 scikit-image h5py requests
conda activate GL-RG-pytorch
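
To check that the environment resolved correctly, a quick sanity check is the one-liner below (the CUDA check only prints True on a GPU machine with a matching driver):

python -c "import torch, skimage, h5py; print(torch.__version__); print(torch.cuda.is_available())"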

Installation

First, clone this repository to any location using --recursive:

git clone --recursive https://github.com/goodproj13/GL-RG.git

Check that the coco-caption/, cider/, data/ and model/ directories are present in your working directory, as shown in the sketch below. If any are missing, please follow the detailed steps in INSTALL.md for installation and dataset preparation.
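
If everything is in place, the working tree should look roughly as follows (an illustrative sketch listing only the paths mentioned in this README; other files are omitted):

GL-RG/
├── coco-caption/        # evaluation toolkit (contains get_stanford_models.sh)
├── cider/               # CIDEr scorer
├── data/                # dataset features and annotations
├── model/               # trained checkpoints (see Model Zoo)
├── test.sh
└── ...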

Please run the following script to download the Stanford CoreNLP 3.6.0 models into coco-caption/:

cd coco-caption
./get_stanford_models.sh

Model Zoo

| Model | Dataset | Exp. | B@4 | M | R | C | Download Link |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GL-RG | MSR-VTT | XE | 45.5 | 30.1 | 62.6 | 51.2 | GL-RG_XE_msrvtt |
| GL-RG | MSR-VTT | DXE | 46.9 | 30.4 | 63.9 | 55.0 | GL-RG_DXE_msrvtt |
| GL-RG + IT | MSR-VTT | DR | 46.9 | 31.2 | 65.7 | 60.6 | GL-RG_DR_msrvtt |
| GL-RG | MSVD | XE | 52.3 | 33.8 | 70.4 | 58.7 | GL-RG_XE_msvd |
| GL-RG | MSVD | DXE | 57.7 | 38.6 | 74.9 | 95.9 | GL-RG_DXE_msvd |
| GL-RG + IT | MSVD | DR | 60.5 | 38.9 | 76.4 | 101.0 | GL-RG_DR_msvd |

B@4, M, R and C denote BLEU-4, METEOR, ROUGE-L and CIDEr scores, respectively; "+ IT" marks models trained with the incremental training strategy.
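
The downloaded checkpoints are expected under model/ (see Installation). A hypothetical layout using the names from the table above (the exact file or folder names inside each download may differ):

model/
├── GL-RG_XE_msrvtt
├── GL-RG_DXE_msrvtt
├── GL-RG_DR_msrvtt
├── GL-RG_XE_msvd
├── GL-RG_DXE_msvd
└── GL-RG_DR_msvd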

Test

Make sure the trained model weights are under the model/ directory (following Installation), then run:

./test.sh

Note: please modify MODEL_NAME, EXP_NAME and DATASET in test.sh if the experiment setting changes. For more details, please refer to TEST.md.
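
As a rough illustration only (the values below are assembled from the Model Zoo table; the exact format test.sh expects is documented in TEST.md), the variables might be set like this to evaluate the MSR-VTT DR checkpoint:

MODEL_NAME=GL-RG_DR_msrvtt   # checkpoint name from the Model Zoo table (illustrative)
EXP_NAME=DR                  # one of XE, DXE, DR (illustrative)
DATASET=msrvtt               # msrvtt or msvd (illustrative)

After editing test.sh, re-run ./test.sh.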

License

GL-RG is released under the MIT license.

Acknowledgements

We are truly thankful for the prior efforts whose knowledge contributions and open-source repositories made this work possible.
