[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
💐 Kaleido-BERT: Vision-Language Pre-training on Fashion Domain. (CVPR2021)
OpenAI GPT2 pre-training and sequence prediction implementation in Tenso...
[MIR-2023-Survey] A continuously updated paper list for multi-modal pre-...
[Survey] Masked Modeling for Self-supervised Representation Learning on ...
The official repo for [NeurIPS'23] "SAMRS: Scaling-up Remote Sensing Seg...
A simple and working implementation of Electra, the fastest way to pretr...
Collection of training data management explorations for large language m...
Autoregressive Predictive Coding: An unsupervised autoregressive model f...
An implementation of masked language modeling for PyTorch, made as conci...
A collection of Audio and Speech pre-trained models.
[CVPR 2024 Highlight] Visual Point Cloud Forecasting
Bamboo: 4 times larger than ImageNet; 2 times larger than Objects365; Buil...
Code and Data for EMNLP2020 Paper "KGPT: Knowledge-Grounded Pre-Training...
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirection...