Code for Look for the Change paper published at CVPR 2022
This repository contains code for the CVPR'22 paper Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos.
Prerequisites
Download model weights
mkdir weights; cd weights
wget https://data.ciirc.cvut.cz/public/projects/2022LookForTheChange/look-for-the-change.pth
wget https://isis-data.science.uva.nl/mettes/imagenet-shuffle/mxnet/resnext101_bottomup_12988/resnext-101-1-0040.params
wget https://isis-data.science.uva.nl/mettes/imagenet-shuffle/mxnet/resnext101_bottomup_12988/resnext-101-symbol.json
mv resnext-101-symbol.json resnext-101-1-symbol.json
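As a quick sanity check after the downloads, the sketch below verifies that the three files named in the commands above are present and non-empty. The helper name `missing_weights` and its `weights_dir` argument are illustrative and not part of this repository; the script is assumed to be run from the repository root.

```python
from pathlib import Path

# File names as they should appear in weights/ after the wget and mv steps.
EXPECTED_WEIGHTS = (
    "look-for-the-change.pth",
    "resnext-101-1-0040.params",
    "resnext-101-1-symbol.json",
)


def missing_weights(weights_dir="weights"):
    """Return the expected weight files that are absent or empty.

    Hypothetical helper, not part of the repository.
    """
    root = Path(weights_dir)
    missing = []
    for name in EXPECTED_WEIGHTS:
        path = root / name
        # A zero-byte file usually means an interrupted download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_weights()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("All weight files found.")
```

An empty result means all three files are in place; a non-empty list names the downloads to repeat.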
Set up the environment
docker build -t look-for-the-change .
docker run -it --rm --gpus 1 -v $(pwd):$(pwd) -w $(pwd) look-for-the-change bash
Extract video features
python extract.py path/to/video.mp4
The script creates a path/to/video.pickle file with the extracted features.
Note: you may need to lower the memory_limit of TensorFlow in feature_extraction/tsm_model.py if you have less than 6 GB of VRAM.
Get predictions
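Before running prediction, it may be worth checking that the feature file written by extract.py loads correctly. This README does not describe the internal layout of the pickle, so the sketch below makes no assumption about it and only prints a generic summary of whatever object it contains. The helper names `describe` and `describe_pickle` are hypothetical and not part of the repository.

```python
import pickle


def describe(obj, depth=0, max_depth=3):
    """Return a short, indented summary of a nested Python object."""
    pad = "  " * depth
    shape = getattr(obj, "shape", None)
    if shape is not None:  # array-like objects such as NumPy arrays
        dtype = getattr(obj, "dtype", "?")
        return f"{pad}{type(obj).__name__} shape={tuple(shape)} dtype={dtype}"
    if isinstance(obj, dict) and depth < max_depth:
        lines = [f"{pad}dict ({len(obj)} keys)"]
        for key, value in obj.items():
            lines.append(f"{pad}  {key!r}:")
            lines.append(describe(value, depth + 2, max_depth))
        return "\n".join(lines)
    if isinstance(obj, (list, tuple)) and depth < max_depth:
        lines = [f"{pad}{type(obj).__name__} of length {len(obj)}"]
        if obj:
            # Only the first element is summarized to keep the output short.
            lines.append(describe(obj[0], depth + 1, max_depth))
        return "\n".join(lines)
    return f"{pad}{type(obj).__name__}"


def describe_pickle(path):
    """Load a pickle file and summarize its content.

    Only use this on files you created yourself: unpickling untrusted
    data can execute arbitrary code.
    """
    with open(path, "rb") as f:
        return describe(pickle.load(f))
```

For example, `print(describe_pickle("path/to/video.pickle"))` prints the container types and any array shapes found in the file. If the file holds NumPy arrays, NumPy must be installed for unpickling to succeed.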
python predict.py category path/to/video.pickle [--visualize --video path/to/video.mp4]
where category is the id of a dataset category, such as bacon for Bacon Frying. See ChangeIt dataset categories for all options.
The script creates path/to/video.category.csv with raw model predictions for each second of the original video.
Prerequisites
Dataset preparation
python extract.py path/to/video1.mp4 path/to/video2.mp4 ... --n_augmentations 10 --export_dir path/to/dataset_root/category_name
This script will create path/to/dataset_root/category_name/video1.pickle and path/to/dataset_root/category_name/video2.pickle files with the extracted features.
The dataset_root folder must contain category_name sub-folders that hold the individual video feature files.
Train a model
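Before training, it can help to confirm that dataset_root follows the layout described above, with one sub-folder per category and the .pickle feature files inside it. The sketch below lists what it finds; the helper name `list_feature_files` is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_feature_files(dataset_root):
    """Map each category sub-folder name to its sorted .pickle file names."""
    root = Path(dataset_root)
    layout = {}
    # Only direct sub-folders count as categories; stray files are ignored.
    for category_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        layout[category_dir.name] = sorted(
            f.name for f in category_dir.glob("*.pickle")
        )
    return layout


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        for category, files in list_feature_files(sys.argv[1]).items():
            print(f"{category}: {len(files)} feature file(s)")
```

A category that reports zero feature files most likely points to a wrong --export_dir in the extraction step.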
python train.py --pickle_roots path/to/dataset_root \
    --category category_name \
    --annotation_root path/to/annotation_root \
    --noise_adapt_weight_root path/to/video_csv_files \
    --noise_adapt_weight_threshold_file path/to/categories.csv
--annotation_root is the location of the annotations folder of the ChangeIt dataset, --noise_adapt_weight_root is the location of the videos folder of the dataset, and --noise_adapt_weight_threshold_file points to the categories.csv file of the dataset.
Tomáš Souček, Jean-Baptiste Alayrac, Antoine Miech, Ivan Laptev, and Josef Sivic. Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
@inproceedings{soucek2022lookforthechange,
title={Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos},
author={Sou\v{c}ek, Tom\'{a}\v{s} and Alayrac, Jean-Baptiste and Miech, Antoine and Laptev, Ivan and Sivic, Josef},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022}
}
The project was supported by the European Regional Development Fund under the project IMPACT (reg. no. CZ.02.1.01/0.0/0.0/15_003/0000468) and by the Ministry of Education, Youth and Sports of the Czech Republic through the e-INFRA CZ (ID:90140), the French government under management of Agence Nationale de la Recherche as part of the "Investissements d'avenir" program, reference ANR19-P3IA-0001 (PRAIRIE 3IA Institute), and Louis Vuitton ENS Chair on Artificial Intelligence. We would like to also thank Kateřina Součková and Lukáš Kořínek for their help with the dataset.