One of the first attempts to find topics from data is Latent Semantic Analysis (LSA): find the best low-rank approximation of a document-term matrix, computed via SVD.
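A minimal sketch of LSA on a toy document-term count matrix: truncate the SVD to rank $k$ to get the best rank-$k$ approximation (the counts here are made up for illustration).

```python
import numpy as np

# Hypothetical 4-document, 5-term count matrix (rows = documents).
X = np.array([
    [2, 1, 0, 0, 0],
    [1, 2, 0, 0, 0],
    [0, 0, 1, 2, 1],
    [0, 0, 2, 1, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(X, full_matrices=False)

k = 2  # number of latent "topics"
X_k = U[:, :k] * s[:k] @ Vt[:k, :]  # best rank-k approximation (Eckart-Young)

# Each document's coordinates in the k-dimensional latent space.
doc_embeddings = U[:, :k] * s[:k]
```

Documents that share vocabulary (the first two rows, and the last two) end up close together in the latent space even when they don't share every term.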
Latent because we use probabilistic inference to infer the missing pieces of the generative story.
Dirichlet because Dirichlet priors encode sparsity. Allocation because the Dirichlet distribution is the prior for each document’s allocation over topics.
Story
Inference (random variables)
Topic Assignments
Document Allocation
$$ \theta_{d,i}\approx\frac{N_{d,i}+\alpha_i}{\sum_k\left(N_{d,k}+\alpha_k\right)} $$
Topics
$$ \phi_{i,v}\approx\frac{V_{i,v}+\beta_v}{\sum_w\left(V_{i,w}+\beta_w\right)} $$
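The two estimates above can be computed directly from the count matrices produced by a Gibbs sweep. A sketch with made-up counts and symmetric hyperparameters (the names `N`, `V`, `alpha`, `beta` mirror the formulas):

```python
import numpy as np

alpha, beta = 0.1, 0.01
n_docs, n_topics, n_vocab = 3, 2, 4

rng = np.random.default_rng(0)
N = rng.integers(0, 5, size=(n_docs, n_topics))   # N[d, i]: words in doc d assigned to topic i
V = rng.integers(0, 5, size=(n_topics, n_vocab))  # V[i, v]: times word v assigned to topic i

# Posterior-mean estimates: smoothed counts, normalized per row.
theta = (N + alpha) / (N + alpha).sum(axis=1, keepdims=True)  # document allocation over topics
phi = (V + beta) / (V + beta).sum(axis=1, keepdims=True)      # topic distribution over words
```

Each row of `theta` and `phi` is a proper probability distribution; the Dirichlet parameters act as pseudo-counts that keep zero-count entries from becoming exact zeros.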
Assign word to a particular topic
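A hedged sketch of one collapsed Gibbs resampling step for a single word, assuming symmetric hyperparameters and the count matrices from above: remove the word's current assignment from the counts, compute the conditional $p(z=i)\propto(N_{d,i}+\alpha)\,\frac{V_{i,v}+\beta}{\sum_w V_{i,w}+\lvert\mathcal{V}\rvert\beta}$, sample a new topic, and add the word back. All counts are illustrative.

```python
import numpy as np

def resample_word(d, v, z_old, N, V, alpha, beta, rng):
    """Resample the topic of one occurrence of word v in document d."""
    n_vocab = V.shape[1]
    # Remove this word's current assignment from the counts.
    N[d, z_old] -= 1
    V[z_old, v] -= 1
    # Unnormalized conditional probability of each topic.
    p = (N[d] + alpha) * (V[:, v] + beta) / (V.sum(axis=1) + n_vocab * beta)
    z_new = rng.choice(len(p), p=p / p.sum())
    # Add the word back under its new topic.
    N[d, z_new] += 1
    V[z_new, v] += 1
    return z_new

rng = np.random.default_rng(1)
N = np.array([[3, 2]])            # one document, two topics
V = np.array([[2, 1, 2],
              [1, 1, 1]])         # two topics, three vocabulary words
z_new = resample_word(d=0, v=0, z_old=0, N=N, V=V, alpha=0.1, beta=0.01, rng=rng)
```

Sweeping this step over every word position in the corpus, then reading `theta` and `phi` off the counts, is the core of collapsed Gibbs sampling for LDA.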