Abstract
Multimodal Artificial Intelligence (Multimodal AI) generally involves multiple types of data (e.g., images, text, or data collected from different sensors), feature engineering (e.g., extraction and combination/fusion), and decision-making (e.g., majority vote). As architectures grow more sophisticated, multimodal neural networks can integrate feature extraction, feature fusion, and decision-making into a single model, and the boundaries between these processes are increasingly blurred. The conventional multimodal data fusion taxonomy (e.g., early/late fusion), which is based on the stage at which fusion occurs, is no longer suitable for the modern deep learning era. Therefore, based on the main-stream…
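The conventional early/late distinction mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's method; the function names (`early_fusion`, `majority_vote`) and the toy inputs are hypothetical, chosen only to show fusion at the feature level versus fusion at the decision level.

```python
from collections import Counter


def early_fusion(image_feats, text_feats):
    """Early (feature-level) fusion: concatenate per-modality features
    into one vector before any model makes a decision."""
    return list(image_feats) + list(text_feats)


def majority_vote(labels):
    """Late (decision-level) fusion: each modality's model predicts a label,
    and the most frequent label wins. Ties go to the label seen first."""
    counts = Counter(labels)
    return counts.most_common(1)[0][0]


# Hypothetical toy inputs, for illustration only.
fused = early_fusion([0.1, 0.2], [0.3])
decision = majority_vote(["cat", "dog", "cat"])
```

In a single end-to-end multimodal network these two steps are not cleanly separable, which is the blurring of boundaries the abstract describes.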
Citation impact
- Total citations: 306
- FWCI: 68.46
- Percentile: 100%
- References: 221
Authors: 3
Topics & keywords
Topics
Keywords
- Computer science
- Artificial intelligence
- Fusion
- Sensor fusion
UN Sustainable Development Goals
- Peace, Justice and Strong Institutions