Probabilistic models

Start with some user-supplied relevance information about a “training set” of documents

Compute P(relevant | T) and P(non-relevant | T) based on the terms observed in the training set
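The estimation step above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method the notes prescribe: it assumes T is a single term, that the training set is a list of (terms, is_relevant) pairs, and that P(relevant | T) is estimated as the smoothed fraction of training documents containing T that the user marked relevant. The function name `term_relevance_probs` and the `smoothing` parameter are hypothetical.

```python
from collections import Counter


def term_relevance_probs(training_docs, smoothing=0.5):
    """Estimate (P(relevant | T), P(non-relevant | T)) for each term T.

    training_docs: iterable of (terms, is_relevant) pairs, where terms is
    an iterable of strings and is_relevant is the user-supplied judgment.
    smoothing: pseudo-count added to each class (an assumption made here
    to avoid probabilities of exactly 0 or 1 on small training sets).
    """
    docs_with_term = Counter()           # training documents containing T
    relevant_docs_with_term = Counter()  # ...that were also judged relevant

    for terms, is_relevant in training_docs:
        for term in set(terms):  # count each term once per document
            docs_with_term[term] += 1
            if is_relevant:
                relevant_docs_with_term[term] += 1

    probs = {}
    for term, n in docs_with_term.items():
        p_rel = (relevant_docs_with_term[term] + smoothing) / (n + 2 * smoothing)
        probs[term] = (p_rel, 1.0 - p_rel)
    return probs


# Toy training set with made-up relevance judgments.
training = [
    ({"jaguar", "car"}, True),
    ({"jaguar", "cat"}, False),
    ({"car", "engine"}, True),
]
probs = term_relevance_probs(training)
print(probs["jaguar"])  # (0.5, 0.5)
```

In this toy set, "jaguar" appears in one relevant and one non-relevant document, so both probabilities come out equal, while "car" appears only in relevant documents and is weighted toward relevance.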
