Automatic Evaluation of Topic Coherence

Newman, David; Lau, Jey Han; Grieser, Karl; Baldwin, Timothy

Article · Jun 2, 2010 · Closed access

University of California, Irvine · Data61 · +1 more institution

Abstract

This paper introduces the novel task of topic coherence evaluation, whereby a set of words, as generated by a topic model, is rated for coherence or interpretability. We apply a range of topic scoring models to the evaluation task, drawing on WordNet, Wikipedia and the Google search engine, and existing research on lexical similarity/relatedness. In comparison with human scores for a set of learned topics over two distinct datasets, we show a simple co-occurrence measure based on pointwise mutual information over Wikipedia data is able to achieve results for the task at or nearing the level of inter-annotator correlation, and that other Wikipedia-based lexical relatedness methods also achieve strong results.…
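The co-occurrence measure named in the abstract can be illustrated with a minimal sketch. It scores a topic as the mean pointwise mutual information (PMI) over all pairs of its words. Several details here are assumptions rather than the paper's setup: probabilities are estimated from document-level co-occurrence in a small tokenised reference corpus instead of Wikipedia, the pair scores are aggregated by their mean, and the function name `pmi_coherence` and the smoothing constant `eps` are hypothetical.

```python
import math
from itertools import combinations


def pmi_coherence(topic_words, documents, eps=1e-12):
    """Mean pairwise PMI of the words in one topic.

    Assumptions of this sketch (not taken from the paper):
    - `documents` is a list of token lists acting as the reference corpus;
    - p(w) is the fraction of documents containing w;
    - p(w1, w2) is the fraction of documents containing both words;
    - `eps` keeps the logarithm finite when two words never co-occur.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2 = prob(w1), prob(w2)
        if p1 == 0 or p2 == 0:
            continue  # skip words that never occur in the reference corpus
        scores.append(math.log((prob(w1, w2) + eps) / (p1 * p2)))

    if not scores:
        raise ValueError("need at least two topic words seen in the corpus")
    return sum(scores) / len(scores)


# Toy reference corpus: "a" and "b" always co-occur, "a" and "c" never do.
docs = [["a", "b"], ["a", "b"], ["c"], ["d"]]
coherent = pmi_coherence(["a", "b"], docs)
incoherent = pmi_coherence(["a", "c"], docs)
```

Under these assumptions, a topic whose words tend to appear in the same documents scores higher than one whose words never co-occur, which is the intuition behind comparing such scores with human coherence ratings.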

Citation impact

Total citations: 730

FWCI: 29.38
Percentile: 100%
References: 36

Authors: 4

Topics & keywords

Keywords

WordNet
Computer science
Interpretability
Coherence (philosophical gambling strategy)
Natural language processing
Task (project management)
Set (abstract data type)
Information retrieval

UN Sustainable Development Goals

Quality Education
