Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning

This is a PyTorch/GPU implementation of the SCL paper and its extended version, GLSCL.

Install

Our environment: CUDA 11.3, torch 1.11.0, torchvision 0.12.0.

pip install -r requirements.txt

For the further experiments in the journal extension, we use a newer environment: CUDA 11.7, torch 2.0, PyTorch Lightning 2.0.

The compatibility between PyTorch and Lightning versions can be found here.
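
To confirm which of the two environments above is installed, a small check along these lines may help. It is only a sketch: the distribution names (`torch`, `torchvision`, `pytorch-lightning`) are the usual PyPI names and are assumed here rather than taken from requirements.txt, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions as stated in this README. The package names are the usual PyPI
# distribution names (an assumption, not read from requirements.txt).
README_ENVIRONMENTS = {
    "original": {"torch": "1.11.0", "torchvision": "0.12.0"},
    "extension": {"torch": "2.0", "pytorch-lightning": "2.0"},
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches(installed, expected):
    """True if the installed version starts with the expected release numbers."""
    if installed is None:
        return False
    # Drop local build tags such as "+cu113" before comparing.
    release = installed.split("+")[0].split(".")
    wanted = expected.split(".")
    return release[: len(wanted)] == wanted


def check_environment(name):
    """Map each package of the named environment to (installed, expected, ok)."""
    report = {}
    for package, expected in README_ENVIRONMENTS[name].items():
        installed = installed_version(package)
        report[package] = (installed, expected, matches(installed, expected))
    return report


if __name__ == "__main__":
    for env in README_ENVIRONMENTS:
        print(env, check_environment(env))
```

The comparison is prefix-based, so an installed torch 2.0.1 counts as matching "2.0". The CUDA version is not checked.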

Weights and ALIGN-BENCH

Our pre-trained and fine-tuned model weights can be downloaded from huggingface/SCL.

Our cross-modal alignment benchmark is available at huggingface/ALIGN-BENCH.

Dataset Preparation

We follow ViLT and use pyarrow to serialize the datasets. See this link for details.

Pre-training

python run.py --task pretrain

Detailed settings, such as the pre-training datasets, optimization arguments, and input size, can be found in './scl/config.py'.

Note that 'plugins=[MyCluster(), MyDDPPlugin()]' in the pl.Trainer call (run.py) is used for multi-node DDP training.

Downstream Tasks

python run.py --task vqa/nlvr2/f30k/coco/msrvtt/lsmdc
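
The command above lists the available tasks separated by slashes. Assuming this means one task is chosen per run (the README does not say so explicitly), the individual invocations can be generated with a short helper. `build_command` is a hypothetical name used only for illustration.

```python
import shlex
import sys

# Task names copied from the README command. Treating the slash-separated
# list as "pick one task per run" is an assumption.
DOWNSTREAM_TASKS = ("vqa", "nlvr2", "f30k", "coco", "msrvtt", "lsmdc")


def build_command(task, python=sys.executable):
    """Return the argument list that runs run.py for one downstream task."""
    if task not in DOWNSTREAM_TASKS:
        raise ValueError(f"unknown downstream task: {task!r}")
    return [python, "run.py", "--task", task]


if __name__ == "__main__":
    # Print one command per task. Pass a list to subprocess.run to execute it.
    for task in DOWNSTREAM_TASKS:
        print(shlex.join(build_command(task, python="python")))
```

Each returned list can be passed directly to `subprocess.run` from the repository root.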

Visualization and Quantification

python visualize_global.py # global-local cross-modal visualization
python visualize_local.py # local-local cross-modal visualization
python align_global.py # global-local cross-modal alignment quantification on ALIGN-BENCH
python align_local.py # local-local cross-modal alignment quantification on ALIGN-BENCH

Acknowledgements

The code is based on METER and VLC.

IIGROUP/SCL

Folders and files

Latest commit

History

Repository files navigation

Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning

Install

Weights and ALIGN-BENCH

Dataset Preparation

Pre-training

Downstream Tasks

Visualization and Quantify

Acknowledgements

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages