40 Hits in 2.6 sec

FigureQA: An Annotated Figure Dataset for Visual Reasoning [article]

Samira Ebrahimi Kahou, Vincent Michalski, Adam Atkinson, Akos Kadar, Adam Trischler, Yoshua Bengio
2018 arXiv   pre-print
In particular, we provide the numerical data used to generate each figure as well as bounding-box annotations for all plot elements.  ...  We introduce FigureQA, a visual reasoning corpus of over one million question-answer pairs grounded in over 100,000 images.  ...  ACKNOWLEDGMENTS We thank Mahmoud Adada, Rahul Mehrotra and Marc-Alexandre Côté for technical support, as well as Adam Ferguson, Emery Fine and Craig Frayne for their help with the human baseline.  ... 
arXiv:1710.07300v2 fatcat:uzhb2k4kpvdctdqlhgrdhirpcu

Answering Questions about Data Visualizations using Efficient Bimodal Fusion [article]

Kushal Kafle, Robik Shrestha, Brian Price, Scott Cohen, Christopher Kanan
2020 arXiv   pre-print
Despite its simplicity, PReFIL greatly surpasses state-of-the-art systems and human baselines on both the FigureQA and DVQA datasets.  ...  Chart question answering (CQA) is a newly proposed visual question answering (VQA) task where an algorithm must answer questions about data visualizations, e.g. bar charts, pie charts, and line graphs.  ...  We thank NVIDIA for gifting a GPU to C.K.'s lab. This work was supported in part by a gift from Adobe Research.  ... 
arXiv:1908.01801v2 fatcat:yoitczom7fgmnb36l32dg2qmyi

FigureNet: A Deep Learning model for Question-Answering on Scientific Plots [article]

Revanth Reddy, Rahul Ramesh, Ameet Deshpande, Mitesh M. Khapra
2019 arXiv   pre-print
We test our model on the FigureQA dataset which provides images and accompanying questions for scientific plots like bar graphs and pie charts, augmented with rich annotations.  ...  One area of interest is to tackle problems in reasoning and understanding, with an aim to emulate human intelligence.  ...  The FigureQA Dataset FigureQA [11] is a visual reasoning corpus which contains over a million question-answer pairs which are grounded in scientific style figures.  ... 
arXiv:1806.04655v2 fatcat:okpc5v6krzfmjoxuas7b3nnvvi

Graph-based Heuristic Search for Module Selection Procedure in Neural Module Network [article]

Yuxuan Wu, Hideki Nakayama
2020 arXiv   pre-print
Neural Module Network (NMN) is a machine learning model for solving the visual question answering tasks.  ...  Our experiments on FigureQA and CLEVR dataset show that our methods can realize the training of NMN without ground-truth programs and achieve superior efficiency over existing reinforcement learning methods  ...  FigureQA [6] is another Visual Reasoning dataset we focus on in this work. It provides questions in fifteen different templates asked on five different types of figures.  ... 
arXiv:2009.14759v1 fatcat:gi6pljwobzghjf25so6mfjuyo4

ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning [article]

Ahmed Masry, Do Xuan Long, Jia Qing Tan, Shafiq Joty, Enamul Hoque
2022 arXiv   pre-print
Charts are very popular for analyzing data. When exploring charts, people often ask a variety of complex reasoning questions that involve several logical and arithmetic operations.  ...  To address the unique challenges in our benchmark involving visual and logical reasoning over charts, we present two transformer-based models that combine visual features and the data table of the chart  ...  Acknowledgement The authors would like to thank the anonymous reviewers for their helpful comments.  ... 
arXiv:2203.10244v1 fatcat:34c5kw3hjrcvrhwz5mhmsep22u

Classification-Regression for Chart Comprehension [article]

Matan Levy, Rami Ben-Ari, Dani Lischinski
2022 arXiv   pre-print
We validate our design with extensive experiments on the realistic PlotQA dataset, outperforming previous approaches by a large margin, while showing competitive performance on FigureQA.  ...  CQA requires analyzing the relationships between the textual and the visual components of a chart, in order to answer general questions or infer numerical values.  ...  Acknowledgments: We thank Or Kedar and Nir Zabari for their assistance in parts of this research. We thank PlotQA [20] authors for sharing additional breakdowns.  ... 
arXiv:2111.14792v2 fatcat:fc5dyegefnfpjfdanyv4wvggnu

LEAF-QA: Locate, Encode & Attend for Figure Question Answering

Ritwick Chaudhry, Sumit Shekhar, Utkarsh Gupta, Pranav Maneriker, Prann Bansal, Ajay Joshi
2020 2020 IEEE Winter Conference on Applications of Computer Vision (WACV)  
We introduce LEAF-QA, a comprehensive dataset of 250,000 densely annotated figures/charts, constructed from real-world open data sources, along with 2 million question-answer (QA) pairs querying the structure  ...  The proposed architecture, LEAF-Net, also considerably advances the current state-of-the-art on FigureQA and DVQA.  ...  Further, LEAF-Net advances the state-of-the-art on the DVQA and FigureQA datasets. LEAF-QA is an advancement towards reasoning from figures and charts emulating real-world data.  ... 
doi:10.1109/wacv45572.2020.9093269 dblp:conf/wacv/ChaudhrySGMBJ20 fatcat:tnpinbop25alxjo3eupcwo5uae

Linguistically Driven Graph Capsule Network for Visual Question Reasoning [article]

Qingxing Cao and Xiaodan Liang and Keze Wang and Liang Lin
2020 arXiv   pre-print
In this work, we aim to combine the benefits of both sides and overcome their limitations to achieve an end-to-end interpretable structural reasoning for general images without the requirement of layout annotations.  ...  FigureQA [30] is also a synthesized dataset. This dataset contains 100,000 images and 1,327,368 questions for training. In contrast to CLEVR, the images are scientific-style figures.  ... 
arXiv:2003.10065v1 fatcat:2ayydjpwhbazdb2mn3jogsdeca

PlotQA: Reasoning over Scientific Plots [article]

Nitesh Methani, Pritha Ganguly, Mitesh M. Khapra, Pratyush Kumar
2020 arXiv   pre-print
Existing synthetic datasets (FigureQA, DVQA) for reasoning over plots do not contain variability in data labels, real-valued data, or complex reasoning questions.  ...  Consequently, proposed models for these datasets do not fully address the challenge of reasoning over plots.  ...  In the FigureQA dataset [13], all questions are binary, wherein answers are either Yes or No (see Figure 1a for an example).  ... 
arXiv:1909.00997v3 fatcat:kkkzgbcypjfvfijtka5vnxc4le

Interpretable Visual Question Answering by Reasoning on Dependency Trees [article]

Qingxing Cao, Bailin Li, Xiaodan Liang, Liang Lin
2019 arXiv   pre-print
Collaborative reasoning for understanding image-question pairs is a very critical but underexplored topic in interpretable visual question answering systems.  ...  Thus, PTGRN is capable of building an interpretable visual question answering (VQA) system that gradually derives image cues following question-driven parse-tree reasoning.  ...  For the CLEVR and FigureQA datasets, the model is trained with the Adam optimizer [47]. The base learning rate is 0.0003 for CLEVR, and 0.0001 for FigureQA. The batch size is 64.  ... 
arXiv:1809.01810v2 fatcat:zf5kr4zbd5ggnf63jg74wz7dta

Chart Question Answering: State of the Art and Future Directions [article]

Enamul Hoque, Parsa Kavehzadeh, Ahmed Masry
2022 arXiv   pre-print
Information visualizations such as bar charts and line charts are very common for analyzing data and discovering critical insights.  ...  Chart Question Answering (CQA) systems typically take a chart and a natural language question as input and automatically generate the answer to facilitate visual data analysis.  ...  We thank anonymous reviewers for their valuable comments and suggestions.  ... 
arXiv:2205.03966v2 fatcat:mvbob5ietfdmndogcr5dzmqosq

Towards Assisting the Visually Impaired: A Review on Techniques for Decoding the Visual Data from Chart Images

K C Shahira, A Lijiya
2021 IEEE Access  
The survey also contains comparisons and analyses of relevant study datasets.  ...  In this method, even images can be conveyed by reading out the figure captions.  ...  FigureQA [32] is a visual reasoning corpus made of synthetic plots in five classes along with reasoning.  ... 
doi:10.1109/access.2021.3069205 fatcat:xlean3gmpfb6tpvnye7ougmyye

ChartNet: Visual Reasoning over Statistical Charts using MAC-Networks

Monika Sharma, Shikha Gupta, Arindam Chowdhury, Lovekesh Vig
2019 2019 International Joint Conference on Neural Networks (IJCNN)  
A particular application area of interest from an accessibility perspective is that of reasoning over statistical charts such as bar and pie charts.  ...  Despite the improvements in perception accuracies brought about via deep learning, developing systems combining accurate visual perception with the ability to reason over the visual percepts remains extremely  ...  DATASET We created our own synthetic datasets of bar and pie-charts for visual reasoning purposes.  ... 
doi:10.1109/ijcnn.2019.8852427 dblp:conf/ijcnn/SharmaGCV19 fatcat:zx47z5pfffafrierux34haaeiu

Visuo-Linguistic Question Answering (VLQA) Challenge [article]

Shailaja Keyur Sampat, Yezhou Yang, Chitta Baral
2020 arXiv   pre-print
Each dataset item consists of an image and a reading passage, where questions are designed to combine both visual and textual information, i.e., ignoring either modality would make the question unanswerable  ...  We believe that VLQA will be a good benchmark for reasoning over a visuo-linguistic context. The dataset, code and leaderboard are available at https://shailaja183.github.io/vlqa/.  ...  Acknowledgments We are thankful to the anonymous reviewers for the feedback. This work is partially supported by the National Science Foundation grant IIS-1816039.  ... 
arXiv:2005.00330v3 fatcat:ss4t4exf6ze5fjnmsvj3yfq6ny

Structured Multimodal Attentions for TextVQA [article]

Chenyu Gao and Qi Zhu and Peng Wang and Hui Li and Yuliang Liu and Anton van den Hengel and Qi Wu
2021 arXiv   pre-print
To grant our method an upper bound and make a fair testing base available for further works, we also provide human-annotated ground-truth OCR annotations for the TextVQA dataset, which were not given in  ...  The code and ground-truth OCR annotations for the TextVQA dataset are available at https://github.com/ChenyuGAO-CS/SMA  ... 
arXiv:2006.00753v2 fatcat:4lk4yloglnhdxftkzrrlcs3ztq
Showing results 1 — 15 out of 40 results