Efficient Multimodal Transformer With Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis

Sun, Licai; Lian, Zheng; Liu, Bin; Tao, Jianhua

doi:10.1109/taffc.2023.3274829

Article · IEEE Transactions on Affective Computing · May 10, 2023 · Closed access

Chinese Academy of Sciences · Beijing Academy of Artificial Intelligence · +3 more institutions

Indexed in Crossref

Abstract

With the proliferation of user-generated online videos, Multimodal Sentiment Analysis (MSA) has attracted increasing attention recently. Despite significant progress, there are still two major challenges on the way towards robust MSA: 1) inefficiency when modeling cross-modal interactions in unaligned multimodal data; and 2) vulnerability to random modality feature missing which typically occurs in realistic settings. In this paper, we propose a generic and unified framework to address them, named Efficient Multimodal Transformer with Dual-Level Feature Restoration (EMT-DLFR). Concretely, EMT employs utterance-level representations from each modality as the global multimodal context to interact with local…

Citation impact

Total citations: 212
FWCI: 35.10
Percentile: 100%
References: 71

Authors: 4

Topics & keywords

Keywords

Computer science
Robustness (evolution)
Artificial intelligence
Feature (linguistics)
Feature learning
Machine learning
Sentiment analysis
Pattern recognition (psychology)

Funding

National Natural Science Foundation of China
Awards: U21B2010, 62276259, 61831022, 62201572