Bottom-up attention model for image captioning and visual question answering (VQA), based on Faster R-CNN and Visual Genome