Users need to understand a model's output before they can draw conclusions from it. Understanding can be supported through model visualization, interaction, and evaluation.
Display Topics
The words with the highest weight in a topic best explain what the topic is about.
- Word lists are a simple, effective way of showing topics; lists, sets, tables, or bar charts can convey each word's probability.
- Word clouds use word size to convey importance.
- Word clouds can also visualize word associations alongside topics (see "Concurrent visualization of relationships between words and topics in topic models").
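The word-list idea above can be sketched with a toy topic-word matrix (all names and values below are hypothetical, not from any specific paper or library):

```python
import numpy as np

# Hypothetical topic-word matrix phi: rows = topics, columns = vocabulary
# words; each row is a probability distribution over the vocabulary.
vocab = ["game", "team", "season", "market", "stock", "price"]
phi = np.array([
    [0.40, 0.30, 0.20, 0.05, 0.03, 0.02],   # a "sports"-like topic
    [0.02, 0.03, 0.05, 0.30, 0.35, 0.25],   # a "finance"-like topic
])

def top_words(phi_row, vocab, n=3):
    """Return the n highest-probability words with their probabilities."""
    idx = np.argsort(phi_row)[::-1][:n]
    return [(vocab[i], float(phi_row[i])) for i in idx]

for k, row in enumerate(phi):
    print(f"Topic {k}: {top_words(row, vocab)}")
```

The same (word, probability) pairs could feed a word cloud, with probability mapped to font size.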

Labeling Topics
Labeling focuses on showing not the original topic words but a clearer label, closer to the summary a human would write for the data.
- Internal Information (unsupervised)
- Internal Labeling: take prominent phrases from the topic and compare how consistent each phrase's context is with the topic distribution. This can be extended to hierarchies.
- External Knowledge (higher quality)
- Labeling with Supervised Labels: have human annotators pick the best topic word, then train an SVR to predict the most representative word.
- Labeling with Knowledge Bases: (1) align topic models with an external ontology; (2) build a graph, find matching words, and rank them with PageRank.
- Using Labeled Documents:
- Labeled LDA produces topics consistent with the target variable.
- Labeled Pachinko Allocation handles hierarchical label sets (e.g., the label Ecuador nests under Country).
- Supervised Anchor Words: use labels to compute anchor words, i.e., words that can define a topic.
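The internal-labeling idea (scoring how consistent a candidate phrase's context is with the topic distribution) can be sketched as below; the candidate phrases, distributions, and the use of KL divergence as the consistency score are illustrative assumptions, not the exact method of any cited paper:

```python
import numpy as np

# Hypothetical data: a topic's word distribution and, for each candidate
# label phrase, the empirical word distribution of the contexts in which
# that phrase appears in the corpus.
topic = np.array([0.50, 0.20, 0.15, 0.10, 0.05])
candidates = {
    "stock market": np.array([0.45, 0.25, 0.15, 0.10, 0.05]),  # similar contexts
    "world cup":    np.array([0.05, 0.10, 0.15, 0.25, 0.45]),  # dissimilar contexts
}

def kl(p, q, eps=1e-12):
    """KL divergence D(p || q); lower means q is more consistent with p."""
    p = p + eps
    q = q + eps
    return float(np.sum(p * np.log(p / q)))

# Rank candidate labels by how consistent their context distribution is
# with the topic distribution (lower divergence = better label).
ranked = sorted(candidates, key=lambda c: kl(topic, candidates[c]))
print(ranked)
```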
Displaying Models
- Find relevant documents: pick a topic and sort documents by their proportion of that topic, from largest to smallest.
- Plot how words are used within a topic
- Show how topics relate to each other (e.g., correlations or a topic hierarchy).
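The document-retrieval step above amounts to sorting one column of the document-topic matrix; a minimal sketch with made-up values:

```python
import numpy as np

# Hypothetical document-topic matrix theta: rows = documents, columns = topics;
# each row is that document's mixture over topics.
theta = np.array([
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
])

def top_documents(theta, topic_k, n=2):
    """Indices of the n documents with the largest proportion of topic_k."""
    return np.argsort(theta[:, topic_k])[::-1][:n].tolist()

print(top_documents(theta, topic_k=0))  # documents ranked by topic-0 weight
```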
Evaluation, Stability and Repair
Evaluate
- Held-out likelihood of a model: since a topic model is generative, it can predict unseen words; higher held-out likelihood indicates a better model.
Find errors
- Documents with poor held-out likelihood are candidates for inspection; they may be random noise or poorly fit by the model.
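Per-document held-out likelihood can be computed directly from the learned parameters; the sketch below assumes fitted theta (document-topic) and phi (topic-word) matrices with invented values, and scores held-out words as log p(w | doc) = log(theta_d . phi[:, w]):

```python
import numpy as np

# Hypothetical trained parameters.
theta = np.array([[0.9, 0.1],                # document 0's topic mixture
                  [0.1, 0.9]])               # document 1's topic mixture
phi = np.array([[0.70, 0.20, 0.05, 0.05],    # topic 0 over a 4-word vocabulary
                [0.05, 0.05, 0.20, 0.70]])   # topic 1

def doc_loglik(word_ids, doc_theta, phi):
    """Held-out log-likelihood: sum of log p(w | doc) over held-out words."""
    word_probs = doc_theta @ phi[:, word_ids]  # p(w | doc) for each word
    return float(np.sum(np.log(word_probs)))

# Held-out words drawn mostly from topic 0 (ids 0 and 1): document 0's
# mixture explains them well, document 1's does not. Sorting documents by
# this score surfaces poorly modeled (possibly noisy) documents.
ll_good = doc_loglik([0, 1, 0], theta[0], phi)
ll_bad = doc_loglik([0, 1, 0], theta[1], phi)
print(ll_good, ll_bad)
```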