Foundations & Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Indexed in Crossref
Abstract
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design computer agents with intelligent capabilities such as understanding, reasoning, and learning through integrating multiple communicative modalities, including linguistic, acoustic, visual, tactile, and physiological messages. With the recent interest in video understanding, embodied autonomous agents, text-to-image generation, and multisensor fusion in application domains such as healthcare and robotics, multimodal machine learning has brought unique computational and theoretical challenges to the machine learning community given the heterogeneity of data sources and the interconnections often found between…
Citation impact
- Total citations: 148
- FWCI: 32.98
- Percentile: 100%
- References: 300
Authors: 3
Topics & keywords
Topics
Keywords
- Computer science
- Artificial intelligence
- Machine learning
- Data science
- Human–computer interaction