Official source code for "Continual 3D Convolutional Neural Networks for Real-time Processing of Videos" [ECCV2022]
Continual 3D Convolutional Neural Networks (Co3D CNNs) are a novel computational formulation of spatio-temporal 3D CNNs, in which videos are processed frame-by-frame rather than by clip.
In online processing tasks demanding frame-wise predictions, Co3D CNNs dispense with the computational redundancies of regular 3D CNNs, namely the repeated convolutions over frames that appear in multiple overlapping clips.
Co3D CNNs are weight-compatible with regular 3D CNNs, do not need further training, and reduce the floating point operations for frame-wise computations by more than an order of magnitude!
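The idea can be illustrated with a toy 1D temporal convolution. This is a minimal sketch in plain Python, not the repository's implementation: the names `clip_conv` and `ContinualConv` are hypothetical, and each frame is reduced to a single number. It shows that a frame-by-frame layer that buffers only the most recent frames produces the same outputs as a convolution over the whole clip, using the same weights.

```python
from collections import deque


def clip_conv(clip, kernel):
    """Regular temporal convolution over a whole clip (no padding)."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, clip[i:i + k]))
        for i in range(len(clip) - k + 1)
    ]


class ContinualConv:
    """Toy continual convolution (hypothetical): consumes one frame per step."""

    def __init__(self, kernel):
        self.kernel = list(kernel)
        # Only the most recent len(kernel) frames are kept in memory.
        self.buffer = deque(maxlen=len(kernel))

    def step(self, frame):
        self.buffer.append(frame)
        if len(self.buffer) < len(self.kernel):
            return None  # still warming up: not enough frames seen yet
        return sum(w * x for w, x in zip(self.kernel, self.buffer))


frames = [1.0, 2.0, 0.5, -1.0, 3.0, 4.0]  # stand-ins for per-frame features
kernel = [0.5, 1.0, -0.5]                 # the same weights are used by both

co = ContinualConv(kernel)
continual_out = [y for y in map(co.step, frames) if y is not None]
clip_out = clip_conv(frames, kernel)
print(continual_out == clip_out)
```

With a sliding clip window, every new frame would trigger a fresh `clip_conv` call over frames that were already processed in the previous clip, whereas `step` performs one kernel application per frame.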
Clone the project code
git clone https://github.com/LukasHedegaard/co3d
cd co3d
Create and activate conda environment (optional)
conda create --name co3d python=3.8
conda activate co3d
Install Python dependencies
pip install -e .[dev]
Fill in the information on your dataset folder path in the .env file:
DATASETS_PATH=/path/to/datasets
LOGS_PATH=/path/to/logs
CACHE_PATH=.cache
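As a quick sanity check, the three keys can be verified before running any scripts. The sketch below is an assumption for illustration only: `parse_env` is a hypothetical helper, and it does not reflect how the project itself loads the .env file.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and comments (hypothetical helper)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Example content mirroring the keys listed above.
example = """DATASETS_PATH=/path/to/datasets
LOGS_PATH=/path/to/logs
CACHE_PATH=.cache
"""

settings = parse_env(example)
required = ("DATASETS_PATH", "LOGS_PATH", "CACHE_PATH")
missing = [key for key in required if key not in settings]
print("missing keys:", missing)
```

To check a real file, pass its contents instead of `example`, e.g. the result of `open(".env").read()`.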
Download dataset using these instructions
CoX3D is the Continual-CNN implementation of X3D. In contrast to regular 3D CNNs, which take a whole video clip as input, Continual CNNs operate frame-by-frame and can thus speed up computation by a significant margin.
CoSlow is the Continual-CNN implementation of Slow.
CoI3D is the Continual-CNN implementation of I3D.
X3D [ArXiv, Repo] is a family of 3D variants of the EfficientNet architecture, which produce state-of-the-art results for lightweight human activity recognition.
R(2+1)D [ArXiv, Repo] is a CNN for activity recognition, which separates the 3D convolution into a spatial 2D convolution and a temporal 1D convolution in order to reduce the number of parameters and increase the network efficiency.
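The effect of the factorisation on parameter count can be worked out by hand. The sketch below is an illustration under stated assumptions: the function names are hypothetical, biases are ignored, and the intermediate channel count `c_mid` is set equal to the output channels, which is not necessarily the width used by the R(2+1)D authors or by this repository. The saving depends directly on that choice.

```python
def conv3d_params(c_in, c_out, t, d):
    """Weights in a full t x d x d 3D convolution (bias ignored)."""
    return c_in * c_out * t * d * d


def conv2plus1d_params(c_in, c_mid, c_out, t, d):
    """Weights in a 1 x d x d spatial conv followed by a t x 1 x 1 temporal conv."""
    spatial = c_in * c_mid * d * d
    temporal = c_mid * c_out * t
    return spatial + temporal


# Assumed layer sizes, chosen only for illustration.
full = conv3d_params(c_in=64, c_out=64, t=3, d=3)
factored = conv2plus1d_params(c_in=64, c_mid=64, c_out=64, t=3, d=3)
print(full, factored)
```

For these assumed sizes the full 3D kernel has 64 · 64 · 27 weights, while the factored pair has 64 · 64 · 9 + 64 · 64 · 3.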
I3D [ArXiv, Repo] is a 3D CNN for activity recognition, which "inflates" the weights of a 2D CNN pretrained on ImageNet to initialise the 3D CNN, thereby improving accuracy and reducing training time.
The implementation here is a port of the one found in the SlowFast Repo.
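Weight inflation can be sketched in a few lines. This is a minimal illustration using nested lists, not the code used in this repository or in the SlowFast repo; the helper names are hypothetical. It assumes the common scheme of repeating the 2D kernel along the time axis and dividing by the temporal size, so that a video made of identical frames gives the same response as the 2D kernel on a single frame.

```python
def inflate(kernel_2d, t):
    """Repeat a 2D kernel t times along time and rescale by 1/t."""
    return [[[w / t for w in row] for row in kernel_2d] for _ in range(t)]


def response_2d(kernel_2d, patch):
    """Dot product of a 2D kernel with an equally sized image patch."""
    return sum(
        w * x
        for k_row, p_row in zip(kernel_2d, patch)
        for w, x in zip(k_row, p_row)
    )


def response_3d(kernel_3d, patches):
    """Dot product of a 3D kernel with a stack of patches, one per time step."""
    return sum(response_2d(k, p) for k, p in zip(kernel_3d, patches))


kernel_2d = [[1.0, -2.0], [0.5, 4.0]]  # made-up "pretrained" 2D weights
patch = [[2.0, 1.0], [4.0, 0.5]]       # made-up image patch

kernel_3d = inflate(kernel_2d, t=4)
static_video = [patch] * 4             # the same frame repeated over time

print(response_2d(kernel_2d, patch), response_3d(kernel_3d, static_video))
```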
SlowFast [ArXiv, Repo] is a two-stream 3D CNN architecture for video recognition. The structure includes two pathways, with one pathway operating at a slower frame-rate than the other.
Slow is the "slow" branch of the SlowFast network [ArXiv, Repo].
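The two frame-rates can be pictured as two temporal subsamplings of the same video. The sketch below is only an illustration: `sample_pathways`, `fast_stride`, and `alpha` are hypothetical names, and the values are not taken from the configurations in this repository.

```python
def sample_pathways(frames, fast_stride, alpha):
    """Return (slow, fast) frame lists; slow keeps every alpha-th fast frame."""
    fast = frames[::fast_stride]
    slow = fast[::alpha]
    return slow, fast


frames = list(range(64))  # frame indices standing in for decoded images
slow, fast = sample_pathways(frames, fast_stride=2, alpha=8)
print(len(slow), len(fast))
```

Under these assumed values the fast pathway sees 32 frames and the slow pathway sees 4 of them; a Slow-only model would use just the `slow` list.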
The project code is written in PyTorch and uses Ride to provide implementations of training, evaluation, and benchmarking methods. A plethora of usage options are available, which are best explored in the Ride docs or the command-line help, e.g.:
python models/cox3d/main.py --help
This repository contains the implementation of Continual X3D (CoX3D), as well as a number of 3D-CNN baselines.
Each model has its own folder with a self-contained implementation, scripts, weight download utilities, hparams and profiling results. Overview tables of the scripts used to download weights, run the model test sequences, and benchmark throughput are found below:
Model | Dataset | Download |
---|---|---|
I3D-R50 | Kinetics | download |
R(2+1)D-18 | Kinetics | download |
SlowFast-8x8 | Kinetics | download |
SlowFast-4x16 | Kinetics | download |
Slow-8x8 | Kinetics | download |
(Co)X3D-XS | Kinetics | download |
(Co)X3D-S | Kinetics | download |
(Co)X3D-M | Kinetics | download |
(Co)X3D-L | Kinetics | download |
(Co)Slow-8x8 | Charades | download |
Evaluate the 1-clip accuracy of pretrained models. The scripts should be executed from project root.
Model | Script |
---|---|
(Co)Slow-8x8 | ./models/coslow/scripts/test/charades.sh |
@inproceedings{hedegaard2022continual,
title={Continual 3D Convolutional Neural Networks for Real-time Processing of Videos},
author={Lukas Hedegaard and Alexandros Iosifidis},
booktitle={European Conference on Computer Vision (ECCV)},
year={2022},
}
This work has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 871449 (OpenDR).