Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering

Liu, Yang; Li, Guanbin; Lin, Liang

doi:10.1109/tpami.2023.3284038

Article, IEEE Transactions on Pattern Analysis and Machine Intelligence, Jun 8, 2023, Closed access

Sun Yat-sen University

Indexed in: Crossref, PubMed

Abstract

Existing visual question answering methods often suffer from cross-modal spurious correlations and oversimplified event-level reasoning processes that fail to capture event temporality, causality, and dynamics spanning over the video. In this work, to address the task of event-level visual question answering, we propose a framework for cross-modal causal relational reasoning. In particular, a set of causal intervention operations is introduced to discover the underlying causal structures across visual and linguistic modalities. Our framework, named Cross-Modal Causal RelatIonal Reasoning (CMCIR), involves three modules: i) Causality-aware Visual-Linguistic Reasoning (CVLR) module for collaboratively…

Citation impact

Total citations: 330

FWCI: 16.43
Percentile: 100%
References: 110

Authors: 3

Topics & keywords

Keywords

Computer science
Causal reasoning
Question answering
Artificial intelligence
Natural language processing
Event (particle physics)
Spurious relationship
Visual reasoning
