🥶Vilio: State-of-the-art VL models in PyTorch & PaddlePaddle
State-of-the-art Visio-Linguistic Models 🥶
Vilio aims to replicate the organization of Hugging Face's transformers repo at: https://github.com/huggingface/transformers
/bash Shell files to reproduce hateful memes results
/data By default, directory for loading in data & saving checkpoints
/ernie-vil Ernie-vil sub-repository written in PaddlePaddle
/fts_lmdb Scripts for handling .lmdb extracted features
/fts_tsv Scripts for handling .tsv extracted features
/notebooks Jupyter Notebooks for demonstration & reproducibility
/py-bottom-up-attention Sub-repository for tsv feature extraction forked & adapted from here
/src/vilio All implemented models (also see below for a quick overview of models)
/utils Pandas & ensembling scripts for data handling
entry.py files Scripts used to access the models and apply model-specific data preparation
pretrain.py files Same purpose as the entry files, but for pre-training; the point of entry for pre-training
hm.py Training code for the hateful memes challenge; Main point of entry
param.py Args for running hm.py
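The /utils entry above mentions ensembling scripts. As a rough illustration of the general idea only, the sketch below averages per-example probabilities from several models. It is not taken from Vilio's own scripts: the function name `average_ensemble`, the model names, and the probabilities are all hypothetical, and simple averaging is an assumed technique that may differ from what the repository actually does.

```python
from statistics import mean


def average_ensemble(predictions):
    """Average per-example probabilities across models.

    `predictions` maps a model name to a dict of {example_id: probability}.
    This is a hypothetical helper, not part of Vilio.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    # Only ensemble the example ids that every model has scored.
    shared_ids = set.intersection(*(set(p) for p in predictions.values()))
    return {
        example_id: mean(p[example_id] for p in predictions.values())
        for example_id in sorted(shared_ids)
    }


# Made-up model names and probabilities, purely for illustration.
preds = {
    "model_a": {1: 0.9, 2: 0.2},
    "model_b": {1: 0.7, 2: 0.4},
}
ensembled = average_ensemble(preds)
```

Restricting the average to ids shared by all models avoids silently mixing examples that only some models scored; a real pipeline might instead raise an error on mismatched ids.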
Follow SCORE_REPRO.md for reproducing performance on the Hateful Memes Task.
Follow GETTING_STARTED.md for using the framework for your own task.
See the paper at: https://arxiv.org/abs/2012.07788
🥶 Vilio currently provides the following architectures with the outlined language transformers:
The code heavily borrows from the following repositories, thanks for their great work:
@article{muennighoff2020vilio,
title={Vilio: State-of-the-art visio-linguistic models applied to hateful memes},
author={Muennighoff, Niklas},
journal={arXiv preprint arXiv:2012.07788},
year={2020}
}