The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)
This is the PyTorch implementation of our paper:
Achieving Cross Modal Generalization with Multimodal Unified Representation
Yan Xia, Hai Huang, Jieming Zhu, Zhou Zhao
In NeurIPS 2023
git clone https://github.com/haihuangcode/CMG
cd CMG
# You do not need to install every library listed in requirements.txt; install only the ones you need.
# Python 3.7 is recommended, because some of the libraries used do not support newer Python versions.
conda create -n your_env_name python=3.7
# Activate the new environment so that pip installs into it.
conda activate your_env_name
pip install -r requirements.txt
cd CMG/code/src
./pretrain.sh
cd CMG/code/src
./ave.sh
cd CMG/code/src
./avvp.sh
cd CMG/code/src
./ave_avvp.sh
cd CMG/code/src
./ucf_vggsound.sh
cd CMG/code/AVSBench_downstream/avs_scripts/avs_s4
./train.sh
./test.sh
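Each group of commands above starts with a `cd` that is relative to wherever the repository was cloned. A minimal sketch of a wrapper that runs one script from its own directory inside a subshell, so the caller's working directory does not change. `run_stage` and `CMG_ROOT` are hypothetical names and not part of the repository, and the sketch assumes the scripts can be run with `bash`:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the CMG repository.
# CMG_ROOT is assumed to point at the cloned repository root.
CMG_ROOT="${CMG_ROOT:-$PWD}"

# run_stage <directory relative to CMG_ROOT> <script name>
run_stage() {
    local dir="$CMG_ROOT/$1" script="$2"
    if [ ! -f "$dir/$script" ]; then
        echo "missing script: $dir/$script" >&2
        return 1
    fi
    # Subshell: the cd does not leak into the calling shell.
    # Invoking via bash avoids relying on the executable bit (an assumption).
    ( cd "$dir" && bash "./$script" )
}

# Example usage, with paths taken from the commands above:
#   run_stage code/src pretrain.sh
#   run_stage code/src ave.sh
#   run_stage code/AVSBench_downstream/avs_scripts/avs_s4 train.sh
```

Because each call resolves its directory from `CMG_ROOT`, stages can be run in any order from any working directory.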
If you find this work useful, please consider citing it.
@article{xia2024achieving,
title={Achieving Cross Modal Generalization with Multimodal Unified Representation},
author={Xia, Yan and Huang, Hai and Zhu, Jieming and Zhao, Zhou},
journal={Advances in Neural Information Processing Systems},
volume={36},
year={2024}
}
Baidu Disk (password: 1234)
CMG
├── checkpoint
├── cnt.pkl
├── code
├── data
├── figs
├── paper
├── README.md
└── requirements.txt
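
Assuming the tree above describes the expected contents of the repository root, a short check like the following can confirm that every listed entry is present before running the scripts. `check_layout` is a hypothetical helper, not something shipped with the repository:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the CMG repository.
# check_layout [root]: verify that the entries from the tree above exist.
check_layout() {
    local root="${1:-.}" missing=0 entry
    for entry in checkpoint cnt.pkl code data figs paper README.md requirements.txt; do
        if [ ! -e "$root/$entry" ]; then
            echo "missing: $root/$entry" >&2
            missing=1
        fi
    done
    # Returns 0 when everything is present, 1 otherwise.
    return "$missing"
}

# Example usage from the directory that contains CMG:
#   check_layout CMG && echo "layout looks complete"
```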