An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA

Yang, Zhengyuan; Gan, Zhe; Wang, Jianfeng; Hu, Xiaowei; Lu, Yumao; Liu, Zicheng; Wang, Lijuan

doi:10.1609/aaai.v36i3.20215

Article · Proceedings of the AAAI Conference on Artificial Intelligence · Jun 28, 2022 · Diamond OA

Microsoft Research (United Kingdom)

Indexed in Crossref

Abstract

Knowledge-based visual question answering (VQA) involves answering questions that require external knowledge not present in the image. Existing methods first retrieve knowledge from external resources, then reason over the selected knowledge, the input image, and question for answer prediction. However, this two-step approach could lead to mismatches that potentially limit the VQA performance. For example, the retrieved knowledge might be noisy and irrelevant to the question, and the re-embedded knowledge features during reasoning might deviate from their original meanings in the knowledge base (KB). To address this challenge, we propose PICa, a simple yet effective method that Prompts GPT3 via the use of…

Citation impact

Total citations: 258
FWCI: 14.46
Percentile: 100%
References: 56

Authors: 7

Topics & keywords

Keywords

Computer science
Question answering
Context (archaeology)
Benchmark (surveying)
Knowledge base
Artificial intelligence
Knowledge extraction
Information retrieval