Article · arXiv (Cornell University) · Jan 23, 2013 · Green OA

Probabilistic Latent Semantic Analysis

University of California, Berkeley · International Computer Science Institute

Indexed in: arXiv, DataCite

Abstract

Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of two-mode and co-occurrence data, with applications in information retrieval and filtering, natural language processing, machine learning from text, and related areas. Compared to standard Latent Semantic Analysis, which stems from linear algebra and performs a Singular Value Decomposition of co-occurrence tables, the proposed method is based on a mixture decomposition derived from a latent class model. This results in a more principled approach with a solid foundation in statistics. In order to avoid overfitting, we propose a widely applicable generalization of maximum likelihood model fitting by tempered…
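
The mixture decomposition mentioned in the abstract can be illustrated with a small sketch. It assumes the usual latent class form P(d, w) = sum over z of P(z) P(d|z) P(w|z), fitted by expectation-maximization on a count table. The function name `plsa`, its parameters, and the toy table are hypothetical. The `beta` exponent is one assumed way of tempering the E-step and is not taken from the truncated abstract; `beta=1.0` gives plain maximum likelihood EM.

```python
import random


def plsa(counts, n_topics, n_iter=50, beta=1.0, seed=0):
    """Fit P(d, w) = sum_z P(z) P(d|z) P(w|z) to a count table by EM.

    counts[d][w] is the co-occurrence count n(d, w).
    beta is an assumed tempering exponent; beta=1.0 is plain EM.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def random_distribution(n):
        # Strictly positive random vector, normalized to sum to 1.
        v = [rng.random() + 1e-3 for _ in range(n)]
        s = sum(v)
        return [x / s for x in v]

    p_z = random_distribution(n_topics)
    p_d_z = [random_distribution(n_docs) for _ in range(n_topics)]
    p_w_z = [random_distribution(n_words) for _ in range(n_topics)]

    for _ in range(n_iter):
        acc_z = [0.0] * n_topics
        acc_d = [[0.0] * n_docs for _ in range(n_topics)]
        acc_w = [[0.0] * n_words for _ in range(n_topics)]

        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: (tempered) posterior over latent classes for this cell.
                post = [
                    p_z[z] * (p_d_z[z][d] * p_w_z[z][w]) ** beta
                    for z in range(n_topics)
                ]
                total = sum(post)
                for z in range(n_topics):
                    r = n * post[z] / total  # expected count assigned to class z
                    acc_z[z] += r
                    acc_d[z][d] += r
                    acc_w[z][w] += r

        # M-step: renormalize the expected counts into probabilities.
        grand_total = sum(acc_z)
        for z in range(n_topics):
            if acc_z[z] > 0:
                p_d_z[z] = [x / acc_z[z] for x in acc_d[z]]
                p_w_z[z] = [x / acc_z[z] for x in acc_w[z]]
            p_z[z] = acc_z[z] / grand_total

    return p_z, p_d_z, p_w_z


# Toy document-by-word count table (made-up numbers for illustration only).
counts = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 0, 2, 6]]
p_z, p_d_z, p_w_z = plsa(counts, n_topics=2)
```

Unlike a Singular Value Decomposition, every factor returned here is a proper probability distribution: `p_z` sums to 1, and each row of `p_d_z` and `p_w_z` sums to 1.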

Citation impact

Total citations: 2,093
References: 13

Authors

1

Topics & keywords

Keywords
  • Probabilistic latent semantic analysis
  • Computer science
  • Latent class model
  • Overfitting
  • Latent semantic analysis
  • Artificial intelligence
  • Probabilistic logic
  • Semantic analysis (machine learning)
UN Sustainable Development Goals
  • Quality Education