Multi-Modal Self-Supervised Learning for Recommendation
University of Hong Kong · Brandeis University
Abstract
The rise of online multi-modal sharing platforms (e.g., TikTok, YouTube) is driving personalized recommender systems to incorporate multiple modalities (e.g., visual, textual, and acoustic) into latent user representations. While existing work on multi-modal recommendation exploits multimedia content features to enhance item embeddings, its representation capability is limited by heavy reliance on labels and weak robustness on sparse user behavior data. Inspired by recent progress in self-supervised learning for alleviating the label scarcity issue, we explore deriving self-supervision signals for effectively learning modality-aware user preferences and cross-modal dependencies. To this end, we…
Citation impact
- FWCI: 83.01
- Percentile: 100%
- References: 61
Authors
- 4
Topics & keywords
- Computer science
- Modal
- Artificial intelligence
- Recommender system
- Machine learning