Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
Updated Apr 5, 2020 (Jupyter Notebook)
Bilinear attention networks for visual question answering
PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"
Deep Modular Co-Attention Networks for Visual Question Answering
PyTorch implementation of the winner of the VQA Challenge Workshop at CVPR'17
A lightweight, scalable, and general framework for visual question answering research
Strong baseline for visual question answering
A PyTorch implementation of "A simple neural network module for relational reasoning", working on the CLEVR dataset
Bottom-up features extractor implemented in PyTorch.
Code for the paper titled "Learning Semantic Sentence Embeddings using Pair-wise Discriminator", COLING 2018
CNN+LSTM, Attention based, and MUTAN-based models for Visual Question Answering
Co-attending Regions and Detections for VQA.
PyTorch implementation of the paper "Visual Concept-Metaconcept Learner", NeurIPS 2019
PyTorch VQA implementation that achieved top performance in the ECCV 2018 VizWiz Grand Challenge: Answering Visual Questions from Blind People
Code to reproduce results in our ACL 2018 paper "Did the Model Understand the Question?"
Real-world photo sequence question answering system
TensorFlow implementation of the CNN-LSTM, Relation Network and text-only baselines for the paper "FigureQA: An Annotated Figure Dataset for Visual Reasoning"
Code for the NeurIPS 2019 paper "Self-Critical Reasoning for Robust Visual Question Answering"
Mid-level PyTorch Based Framework for Visual Question Answering.
PyTorch implementation of FiLM: Visual Reasoning with a General Conditioning Layer
The Easy Visual Question Answering dataset.
Visual Dialog
ROCK model for Knowledge-Based VQA in Videos
Visual Question Answering through modal dialogue (B.Tech Project) + API
A repository of vision and language papers (Still under construction... stay tuned!)
Implementation of the visual question answering model from the paper "Exploring Models and Data for Image Question Answering".
TensorFlow implementation of the paper "Stacked Attention Networks for Image Question Answering"