Library with dynamic audio/video composition and runtime control
This project ships with two parts: the FFdynamic library and applications built on FFdynamic.
Interactive Live (ial for short) is an application based on FFdynamic.
ial mixes multiple video and audio inputs, then streams the result out. It can run on phones or cloud servers.
ial gives flexible control over the mixing process (dynamic layout change, background picture change, mute/unmute, etc.), shown in the following gif:
This picture shows: 1. automatic layout change when a new stream joins (from 2 cells to 3 cells); 2. layout changes to 4 and 9 cells via HTTP request. The changes are smooth, without freezing or stalling, thanks to the audio/video sync message communication mechanism.
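As a rough illustration of how a cell layout could be derived from the number of streams, here is a minimal sketch. It assumes a square grid filled row by row; the `Cell` type and the `gridSide` and `layoutCells` functions are hypothetical names for this example and are not part of ial or FFdynamic, whose actual layouts may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cell rectangle on the mixing canvas (not an FFdynamic type).
struct Cell {
    int x, y, w, h;
};

// Smallest side such that side * side >= count (side of a square grid).
int gridSide(int count) {
    assert(count > 0);
    int side = 1;
    while (side * side < count) ++side;
    return side;
}

// Place `count` streams row by row on a side x side grid (assumed policy).
std::vector<Cell> layoutCells(int canvasW, int canvasH, int count) {
    const int side = gridSide(count);
    const int cellW = canvasW / side;
    const int cellH = canvasH / side;
    std::vector<Cell> cells;
    cells.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        cells.push_back(Cell{(i % side) * cellW, (i / side) * cellH, cellW, cellH});
    }
    return cells;
}
```

Under this assumption, 2, 3, and 4 streams all share a 2x2 grid, and 9 streams use a 3x3 grid, so a stream joining only needs one more cell to be filled in.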
Dynamic Detect is a playground in which one can change object detector types at run time while reading online video streams or local files. The detectors are loaded via the OpenCV API. Models of darknet yolo3, caffe vgg-ssd, and tensorflow mobilenet-ssd (all trained on the COCO dataset) are tested. Here is an output stream gif, which runs 2 detectors in parallel and draws boxes and text when they locate objects of interest.
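One common way to switch detector types at run time is a registry that maps a type name to a factory. The sketch below only illustrates that idea: `Detector`, `YoloStub`, `SsdStub`, `DetectorSwitch`, and `makeDemoSwitch` are hypothetical names, the stubs do no real detection, and none of this is taken from the Dynamic Detect source.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical detector interface; the real detectors are loaded via OpenCV.
struct Detector {
    virtual ~Detector() = default;
    virtual std::string name() const = 0;
};

// Stand-ins for real models; they only report their type name.
struct YoloStub : Detector {
    std::string name() const override { return "yolo3"; }
};
struct SsdStub : Detector {
    std::string name() const override { return "vgg-ssd"; }
};

// Maps a type name to a factory so the active detector can be replaced
// while the caller keeps running.
class DetectorSwitch {
public:
    using Factory = std::function<std::unique_ptr<Detector>()>;

    void registerType(const std::string& type, Factory factory) {
        assert(factory != nullptr);
        factories_[type] = std::move(factory);
    }

    // Returns false and keeps the current detector if `type` is unknown.
    bool switchTo(const std::string& type) {
        auto it = factories_.find(type);
        if (it == factories_.end()) return false;
        active_ = it->second();
        return true;
    }

    const Detector* active() const { return active_.get(); }

private:
    std::map<std::string, Factory> factories_;
    std::unique_ptr<Detector> active_;
};

// Builds a switch with the two stub detectors registered.
DetectorSwitch makeDemoSwitch() {
    DetectorSwitch sw;
    sw.registerType("yolo3", [] { return std::unique_ptr<Detector>(new YoloStub()); });
    sw.registerType("vgg-ssd", [] { return std::unique_ptr<Detector>(new SsdStub()); });
    return sw;
}
```

A request handler could call `switchTo` with the requested type name; an unknown name leaves the current detector in place.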
FFdynamic library Overview
Demux |-> Audio Decode -> |-> Audio Encode ------------------------------------------> |
      |                                                                                | -> Muxer
      |                   |-> Dehaze Filter -> |                                       |
      |-> Video Decode -> |                    | Mix original and dehazed ->| Encode ->|
                          | -----------------> |
As shown, after demuxing the input stream, we decode the video, whose output goes to two components: the 'Dehaze Filter' component and the 'mix video' component. After dehazing, the dehazed image is also sent to the 'mix video' component, where the original and dehazed images are mixed into one. The whole example is here. In general, one can freely combine components as long as the input data can be processed.
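The composition described above can be pictured as a directed graph of components. The following sketch models only that wiring, under the assumption that each component simply forwards its output to its downstream components; `Node`, `connect`, `collectPaths`, and `dehazePaths` are hypothetical names and do not reflect FFdynamic's actual classes or API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node type for illustration; FFdynamic's real classes differ.
struct Node {
    std::string name;
    std::vector<Node*> downstream;  // components that receive this node's output
};

// Connect `from`'s output to `to`'s input.
void connect(Node& from, Node& to) {
    assert(&from != &to);  // no self-loops in this sketch
    from.downstream.push_back(&to);
}

// Collect every path from `start` to a node with no downstream (a sink).
void collectPaths(const Node& start, std::string prefix, std::vector<std::string>& out) {
    prefix += start.name;
    if (start.downstream.empty()) {
        out.push_back(prefix);
        return;
    }
    for (const Node* next : start.downstream) collectPaths(*next, prefix + " -> ", out);
}

// Build the dehaze example graph and return all demux-to-muxer paths.
std::vector<std::string> dehazePaths() {
    Node demux{"Demux", {}}, aDec{"Audio Decode", {}}, aEnc{"Audio Encode", {}};
    Node vDec{"Video Decode", {}}, dehaze{"Dehaze Filter", {}}, mix{"Mix", {}};
    Node vEnc{"Video Encode", {}}, muxer{"Muxer", {}};
    connect(demux, aDec);
    connect(aDec, aEnc);
    connect(aEnc, muxer);
    connect(demux, vDec);
    connect(vDec, dehaze);  // decoded frames go to the dehaze filter
    connect(dehaze, mix);   // dehazed image goes to the mixer
    connect(vDec, mix);     // original image goes to the mixer as well
    connect(mix, vEnc);
    connect(vEnc, muxer);
    std::vector<std::string> paths;
    collectPaths(demux, "", paths);
    return paths;
}
```

Listing the paths makes the fan-out visible: the decoded video reaches the mixer twice, once directly and once through the dehaze filter.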
Customization: one can define their own components, for instance
Those components are plugins. Once they are done, they can be composed with other components.
In short, FFdynamic is a scaffold that allows developing complex audio/video applications in a higher-level and flexible manner.
It is suitable for two kinds of applications:
Do transcoding in a dozen lines of code, see here
Here we take the 'dehaze' example, mentioned in the 'Overview' part. We developed a dehaze algorithm and made it an FFdynamic component, then mixed the original and dehazed images together to check the result visually.
Refer to here for plugin source files.
Installation
protobuf3 is not well supported by the package managers of some Linux distributions; here is a script that compiles it manually (sudo required):
DIR=$(mktemp -d) && cd ${DIR} && \
git clone https://github.com/protocolbuffers/protobuf.git && cd protobuf && \
git submodule update --init --recursive && \
./autogen.sh && ./configure && \
make && make check && \
sudo make install && sudo ldconfig
Install FFmpeg as usual, then
apt install -y cmake3 libgflags-dev libgoogle-glog-dev libboost-all-dev
or
yum install -y glog-devel gflags-devel cmake3 boost-devel
Install FFmpeg as usual, then
brew install cmake glog gflags protobuf boost
Docker build
To simplify the build process, there is a Docker image with all dependencies installed that you can play with.
Under FFdynamic folder:
'sh build.sh' will build the FFdynamic library (sudo is needed for make install)
Under app/interactiveLive folder:
'sh build.sh' will build the FFdynamic library and the ial program.
Contribution
All contributions are welcome. Some TODOs: