Deep Multimodal Data Fusion

Zhao, Fei; Zhang, Chengcui; Geng, Baocheng

doi:10.1145/3649447

Review | ACM Computing Surveys | Feb 24, 2024 | Bronze OA

Fei Zhao, Chengcui Zhang, Baocheng Geng

University of Alabama at Birmingham

Indexed in Crossref

Abstract

Multimodal Artificial Intelligence (Multimodal AI) generally involves various types of data (e.g., images, texts, or data collected from different sensors), feature engineering (e.g., extraction, combination/fusion), and decision-making (e.g., majority vote). As architectures become more sophisticated, multimodal neural networks can integrate feature extraction, feature fusion, and decision-making into a single model, and the boundaries between those processes are increasingly blurred. The conventional multimodal data fusion taxonomy (e.g., early/late fusion), which is based on the stage at which fusion occurs, is no longer suitable for the modern deep learning era. Therefore, based on the main-stream…

Citation impact

Total citations: 306

FWCI: 68.46
Percentile: 100%
References: 221

Authors: 3

Topics & keywords

Keywords

Computer science
Artificial intelligence
Fusion
Sensor fusion

UN Sustainable Development Goals

Peace, Justice and Strong Institutions
