Article · Jun 2, 2010 · Closed access

Automatic Evaluation of Topic Coherence

University of California, Irvine · Data61 · +1 more institution

Abstract

This paper introduces the novel task of topic coherence evaluation, whereby a set of words, as generated by a topic model, is rated for coherence or interpretability. We apply a range of topic scoring models to the evaluation task, drawing on WordNet, Wikipedia and the Google search engine, and existing research on lexical similarity/relatedness. In comparison with human scores for a set of learned topics over two distinct datasets, we show a simple co-occurrence measure based on pointwise mutual information over Wikipedia data is able to achieve results for the task at or nearing the level of inter-annotator correlation, and that other Wikipedia-based lexical relatedness methods also achieve strong results.…
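
The co-occurrence measure mentioned in the abstract can be illustrated with a minimal sketch. The abstract says only that the measure is based on pointwise mutual information over Wikipedia data, so everything else here is an assumption made for illustration: the function name `pmi_coherence`, the sliding-window counting, the window size, scoring never-co-occurring pairs as 0, and averaging over word pairs. A small tokenized corpus stands in for Wikipedia.

```python
import math
from itertools import combinations


def pmi_coherence(topic_words, documents, window=10):
    """Mean pairwise PMI of topic_words (hypothetical helper, not the paper's code).

    documents: iterable of token lists standing in for the reference corpus.
    window:    sliding-window width in tokens (an assumed value).
    """
    vocab = sorted(set(topic_words))
    vocab_set = set(vocab)
    single = {w: 0 for w in vocab}   # number of windows containing each word
    pair = {}                        # number of windows containing both words
    n_windows = 0

    for tokens in documents:
        # A document shorter than the window contributes a single window.
        n_steps = max(1, len(tokens) - window + 1)
        for i in range(n_steps):
            seen = sorted(vocab_set.intersection(tokens[i:i + window]))
            n_windows += 1
            for w in seen:
                single[w] += 1
            for a, b in combinations(seen, 2):
                pair[(a, b)] = pair.get((a, b), 0) + 1

    if n_windows == 0:
        return 0.0

    scores = []
    for a, b in combinations(vocab, 2):
        joint = pair.get((a, b), 0)
        if joint == 0:
            # Assumption: pairs that never co-occur contribute 0.
            scores.append(0.0)
            continue
        p_ab = joint / n_windows
        p_a = single[a] / n_windows
        p_b = single[b] / n_windows
        scores.append(math.log(p_ab / (p_a * p_b)))

    return sum(scores) / len(scores) if scores else 0.0


# Toy corpus: "cat" and "dog" always appear together, "cat" and "fish" never do.
docs = [["cat", "dog"], ["cat", "dog"], ["fish", "boat"], ["fish", "boat"]]
coherent = pmi_coherence(["cat", "dog"], docs, window=2)
incoherent = pmi_coherence(["cat", "fish"], docs, window=2)
```

In this toy setup the word set whose members co-occur receives the higher score, which is the behavior a coherence measure of this kind is meant to capture. How well such a score tracks human ratings is the question the paper evaluates.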

Citation impact

  • Total citations: 730
  • FWCI: 29.38
  • Percentile: 100%
  • References: 36

Authors

4 authors

Topics & keywords

Keywords
  • WordNet
  • Computer science
  • Interpretability
  • Coherence (philosophical gambling strategy)
  • Natural language processing
  • Task (project management)
  • Set (abstract data type)
  • Information retrieval
UN Sustainable Development Goals
  • Quality Education