An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA

Microsoft Research (United Kingdom)

Indexed in Crossref

Abstract

Knowledge-based visual question answering (VQA) involves answering questions that require external knowledge not present in the image. Existing methods first retrieve knowledge from external resources, then reason over the selected knowledge, the input image, and question for answer prediction. However, this two-step approach could lead to mismatches that potentially limit the VQA performance. For example, the retrieved knowledge might be noisy and irrelevant to the question, and the re-embedded knowledge features during reasoning might deviate from their original meanings in the knowledge base (KB). To address this challenge, we propose PICa, a simple yet effective method that Prompts GPT3 via the use of…

Citation impact

Total citations: 258
FWCI: 14.46
Percentile: 100%
References: 56

Authors

7 authors

Topics & keywords

Keywords
  • Computer science
  • Question answering
  • Context (archaeology)
  • Benchmark (surveying)
  • Knowledge base
  • Artificial intelligence
  • Knowledge extraction
  • Information retrieval