Prompting Large Language Models with Answer Heuristics for Knowledge-Based Visual Question Answering

Shao, Zhenwei; Zhou, Yu; Wang, Meng; Yu, Jun

doi:10.1109/cvpr52729.2023.01438

Article · Jun 1, 2023 · Closed access

Zhenwei Shao · Yu Zhou · Meng Wang · Jun Yu

Hangzhou Dianzi University · Hefei University of Technology

Indexed in Crossref

Abstract

Knowledge-based visual question answering (VQA) requires external knowledge beyond the image to answer the question. Early studies retrieve required knowledge from explicit knowledge bases (KBs), which often introduces irrelevant information to the question, hence restricting the performance of their models. Recent works have sought to use a large language model (i.e., GPT-3 [3]) as an implicit knowledge engine to acquire the necessary knowledge for answering. Despite the encouraging results achieved by these methods, we argue that they have not fully activated the capacity of GPT-3 as the provided input information is insufficient. In this paper, we present Prophet, a conceptually simple framework designed to…

Citation impact

Total citations: 189

FWCI: 21.45
Percentile: 100%
References: 69

Authors: 4

Topics & keywords

Keywords

Heuristics
Question answering
Computer science
Task (project management)
Knowledge extraction
Artificial intelligence
Information retrieval
General knowledge

UN Sustainable Development Goals

Quality Education
