This repository contains the implementation of the so-called Wasserstein Barycenter Transport Algorithm, explored in the following publications,
Eduardo F. Montesuma, Fred-Maurice Ngolè Mboula (2021, June). Wasserstein Barycenter Transport for Multi-Source Domain Adaptation. In 2021 IEEE conference on computer vision and pattern recognition. [Paper] [Supplementary]
Eduardo F. Montesuma, Fred-Maurice Ngolè Mboula, "Wasserstein Barycenter Transport for Domain Adaptation", International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2021. [IEEE Explore]
- 12/06/2021: CVPR publication is available at the CVF repository.
- 27/05/2021: GPU implementation for the Wasserstein Barycenter Transport/Sinkhorn algorithm is now available using torch. This implementation is preferred as it speeds up computation.
- 13/05/2021: ICASSP publication is on IEEE Explore. Access is free for a month.
In this repo we provide a single package that implements all tested domain adaptation algorithms. Especially, TCA and KMM were implemented using the libtlda toolbox and OT-related methods were implemented through the POT toolbox. The implementations can be found in the ./msda folder.
You can either use pre-extracted featuers (available on ./data folder) or download the samples and run the generation scripts provided in this repo.
- Music Speech Recognition Source Direct Link
- Noise Dataset Source
- GTZAN Music Genre Recognition Source Direct Link
- Noise Dataset Source
- Caltech-Office Decaf features Source Direct Link
- PIE Dataset Source
NOTE: on the ICASSP publication we explore solely Music-Speech Discrimination and Music-Genre Recognition. In the CVPR publication, we explore all four.
Method | Buccaneer2 | Destroyerengine | F16 | Factory2 |
---|---|---|---|---|
Baseline | 22.90 ± 0.84 | 38.25 ± 0.91 | 51.57 ± 1.11 | 47.80 ± 0.34 |
KMM | 21.75 ± 0.99 | 39.25 ± 0.66 | 49.81 ± 1.69 | 47.37 ± 0.71 |
TCA | 58.95 ± 1.27 | 60.67 ± 2.07 | 68.75 ± 2.11 | 59.82 ± 0.50 |
SinT | 56.35 ± 0.84 | 61.92 ± 1.64 | 66.72 ± 1.86 | 61.77 ± 1.65 |
SinTreg | 58.02 ± 1.45 | 60.47 ± 1.75 | 66.55 ± 1.60 | 63.87 ± 1.51 |
JCPOT | 35.87 ± 0.41 | 48.47 ± 2.97 | 51.92 ± 3.25 | 51.95 ± 1.75 |
JCPOT-LP | 36.40 ± 0.39 | 52.92 ± 1.32 | 56.30 ± 0.37 | 51.52 ± 2.28 |
WBT | 21.37 ± 2.25 | 24.30 ± 2.71 | 25.30 ± 6.02 | 22.70 ± 2.25 |
WBTreg | 70.60 ± 1.33 | 83.10 ± 1.64 | 83.92 ± 1.01 | 90.00 ± 0.86 |
Target-only | 67.43 ± 1.43 | 67.96 ± 2.91 | 66.86 ± 2.00 | 68.37 ± 1.87 |
Method | Buccaneer2 | Destroyerengine | F16 | Factory2 |
---|---|---|---|---|
Baseline | 82.43 ± 1.75 | 51.57 ± 2.56 | 88.89 ± 2.72 | 50.02 ± 2.21 |
KMM | 87.12 ± 2.79 | 52.35 ± 2.94 | 74.86 ± 5.58 | 50.41 ± 2.17 |
TCA | 90.43 ± 1.40 | 87.14 ± 4.99 | 95.12 ± 2.02 | 84.76 ± 3.30 |
SinT | 89.26 ± 1.56 | 82.84 ± 2.78 | 84.97 ± 3.09 | 91.21 ± 2.04 |
SinTreg | 87.28 ± 2.97 | 84.38 ± 1.76 | 86.14 ± 2.79 | 90.61 ± 1.68 |
JCPOT | 92.55 ± 2.11 | 87.89 ± 1.39 | 88.67 ± 1.67 | 82.41 ± 2.22 |
JCPOT-LP | 89.06 ± 1.38 | 84.97 ± 3.23 | 90.24 ± 1.71 | 86.13 ± 1.88 |
WBT | 56.88 ± 9.54 | 56.63 ± 6.88 | 56.63 ± 6.56 | 59.38 ± 2.61 |
WBTreg | 96.42 ± 1.48 | 92.79 ± 2.95 | 93.75 ± 0.97 | 95.31 ± 1.11 |
Target-only | 90.51 ± 3.98 | 93.07 ± 3.81 | 89.23 ± 4.25 | 92.30 ± 3.62 |
If you find this work useful in your research, please consider citing us using the bibtex below,
@InProceedings{montesuma2021cvpr,
author = {Montesuma, Eduardo Fernandes and Mboula, Fred Maurice Ngole},
title = {Wasserstein Barycenter for Multi-Source Domain Adaptation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2021},
pages = {16785-16793}
}
@INPROCEEDINGS{montesuma2021icassp,
author={Montesuma, Eduardo F. and Ngolè Mboula, Fred-Maurice},
booktitle={ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={Wasserstein Barycenter Transport for Acoustic Adaptation},
year={2021},
volume={},
number={},
pages={3405-3409},
doi={10.1109/ICASSP39728.2021.9414199}}