Meshed-Memory Transformer for Image Captioning. CVPR 2020
Show, Control and Tell: A Framework for Generating Controllable and Grou...
A neural network to generate captions for an image using CNN and RNN wit...
CVPR 2018 - Regularizing RNNs for Caption Generation by Reconstructing T...
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model,...
[ECCV 2022] D3Net: A Unified Speaker-Listener Architecture for 3D Dense C...
PyTorch library for Visual-Semantic tasks
Deep CNN-LSTM for Generating Image Descriptions 😈