Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
Simple Swift class providing all the configurations you need to create a custom camera view in your app
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
TensorFlow Implementation of "Show, Attend and Tell"
Unofficial PyTorch implementation of "Self-critical Sequence Training for Image Captioning" and other papers.
Oscar and VinVL
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Official Pytorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
An open-source tool for sequence learning in NLP built on TensorFlow.
PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Meshed-Memory Transformer for Image Captioning. CVPR 2020
Complete Assignments for CS231n: Convolutional Neural Networks for Visual Recognition
Implementation of "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"
Image Captioning using InceptionV3 and beam search
Code for paper "Attention on Attention for Image Captioning". ICCV 2019
Transformer-based image captioning extension for pytorch/fairseq
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions. CVPR 2019
A reverse image search engine powered by Elasticsearch and TensorFlow
A modular library built on top of Keras and TensorFlow to generate a caption in natural language for any input image.
ML data annotations made super easy for teams. Just upload data, add your team and build training/evaluation dataset in hours.
Automatic image captioning model based on Caffe, using features from bottom-up attention.
Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]
Image Captions Generation with Spatial and Channel-wise Attention
A neural network to generate captions for an image using a CNN and an RNN with beam search.
Code for "Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner" in ICCV 2017
Video to Text: Natural language description generator for some given video. [Video Captioning]
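Several of the repositories above (e.g. the InceptionV3 captioner and the CNN + RNN captioner) decode captions with beam search. A minimal, framework-free sketch of the idea, where the hypothetical `next_log_probs` callback stands in for a real decoder's next-token distribution:

```python
import math

def beam_search(next_log_probs, start_token, end_token, beam_size=3, max_len=10):
    """Decode the highest-scoring sequence under a next-token model.

    next_log_probs(seq) returns a dict {token: log-probability} for the
    tokens that may follow seq. Sequences are scored by summed log-probs.
    """
    beams = [([start_token], 0.0)]  # (partial sequence, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, lp in next_log_probs(seq).items():
                if tok == end_token:
                    finished.append((seq + [tok], score + lp))
                else:
                    candidates.append((seq + [tok], score + lp))
        if not candidates:
            break
        # Keep only the beam_size best partial sequences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    finished.extend(beams)  # fall back to unfinished beams if nothing ended
    return max(finished, key=lambda c: c[1])[0]

# Toy next-token table (made up for illustration, not from any repo above).
_TABLE = {
    "<s>": {"a": math.log(0.4), "the": math.log(0.6)},
    "the": {"dog": math.log(0.5), "</s>": math.log(0.5)},
    "a":   {"cat": math.log(0.9), "</s>": math.log(0.1)},
    "cat": {"</s>": math.log(1.0)},
    "dog": {"</s>": math.log(1.0)},
}

def toy_next_log_probs(seq):
    return _TABLE.get(seq[-1], {})
```

With this table, greedy decoding would commit to "the" (p=0.6) at the first step, while a beam of size 2 or more also keeps "a" and finds the globally better caption "a cat" (0.4 x 0.9 = 0.36 vs. 0.6 x 0.5 = 0.30) — which is exactly why the captioning repos above prefer beam search over greedy decoding.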