Unsupervised clustering of movie posters with features extracted from Convolutional Neural Network
The visualization uses Flask for the backend and D3.js for the frontend.
This project is divided into 3 main scripts:
To get the parameter descriptions:
Extracting the features from the ConvNet is slow if you do not have a GPU. Computing the similarity between every pair of posters requires O(n^2) memory, which amounts to around 32 GB of RAM.
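To make the O(n^2) memory cost concrete, here is a minimal sketch, assuming the features are stored as a NumPy array with one row per poster. The function names `cosine_similarity_matrix` and `dense_matrix_gib` are hypothetical and are not taken from the project's scripts; the item count used in the example is illustrative, not the actual size of the dataset.

```python
import numpy as np


def cosine_similarity_matrix(features: np.ndarray) -> np.ndarray:
    """Return the dense n x n cosine similarity matrix of row-wise feature vectors."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    # Guard against division by zero for all-zero feature vectors.
    unit = features / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def dense_matrix_gib(n_items: int, bytes_per_value: int = 8) -> float:
    """Memory needed to hold a dense n x n matrix, in GiB."""
    return n_items * n_items * bytes_per_value / 2**30


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(5, 16))  # 5 hypothetical posters, 16-dim features
    print(cosine_similarity_matrix(features).shape)  # (5, 5)
    # Hypothetical n: 65,536 items stored as 8-byte floats.
    print(dense_matrix_gib(65_536))  # 32.0
```

Storing the matrix as 4-byte floats (`bytes_per_value=4`) would halve the estimate, at the cost of some precision.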
Clone the repository:
$ git clone https://github.com/adrz/movie-posters-convnet.git
$ cd movie-posters-convnet/
$ virtualenv -p python3 env
$ source env/bin/activate
$ pip install -r requirements-gpu.txt
Create the PostgreSQL database (assuming PostgreSQL is already installed):
$ psql -U postgres -c "CREATE USER movieposters;"
$ psql -U postgres -c "CREATE DATABASE movieposters;"
$ psql -U postgres -c "alter user movieposters with encrypted password 'yourpassword';"
$ psql -U postgres -c "grant all privileges on database movieposters to movieposters ;"
After cloning you can just launch the bash script that will:
$ python src/get_posters.py -c config/development.conf
$ python src/get_features_from_cnn.py -c config/development.conf
$ python src/get_data_visu.py -c config/development.conf
Then grab a coffee...
$ source env/bin/activate
$ export configapi=./config/development.conf
$ python app.py
Then open index.html in your favorite browser:
$ chromium 127.0.0.1:5000/index.html
or
$ chromium 127.0.0.1:5000/index_complete.html
Cherry-picked examples from the 200 closest pairs of posters (by cosine distance):
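Ranking pairs by cosine distance could be sketched as follows. This is a minimal illustration, assuming a NumPy array with one feature row per poster; `top_k_closest_pairs` is a hypothetical helper, not the project's own code, and it builds the full dense distance matrix, so it inherits the O(n^2) memory cost mentioned earlier.

```python
import numpy as np


def top_k_closest_pairs(features: np.ndarray, k: int = 200):
    """Return up to k (i, j, distance) tuples with the smallest cosine distance, i < j."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    distance = 1.0 - unit @ unit.T  # cosine distance = 1 - cosine similarity

    # Keep only the upper triangle so each unordered pair appears once
    # and self-pairs (distance 0) are excluded.
    rows, cols = np.triu_indices(len(features), k=1)
    pair_distances = distance[rows, cols]

    k = min(k, pair_distances.size)
    order = np.argsort(pair_distances)[:k]
    return [(int(rows[o]), int(cols[o]), float(pair_distances[o])) for o in order]


if __name__ == "__main__":
    # Four hypothetical 2-dimensional feature vectors.
    features = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [-1.0, 0.0]])
    for i, j, d in top_k_closest_pairs(features, k=3):
        print(i, j, round(d, 4))
```

The returned indices refer to rows of the feature array, so mapping them back to poster images requires keeping the posters in the same order as the features.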
This project is licensed under the MIT License. See the LICENSE.md file for details.