Evaluation of Foundation Models

Evaluation is about answering the question: How good is our model?

N-gram Probabilities and Markov assumption

\[\log(p_1 \times p_2 \times p_3 \times p_4) = \log p_1 + \log p_2 + \log p_3 + \log p_4\]
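
A minimal sketch of why sequence probabilities are usually accumulated in log space: the sum of log probabilities matches the log of the product, and it stays finite where the raw product underflows. The probability values below are made up for illustration and do not come from any trained model.

```python
import math

# Hypothetical per-word probabilities, chosen only for illustration.
probs = [0.1, 0.05, 0.2, 0.01]

product = math.prod(probs)                  # p_1 * p_2 * p_3 * p_4
log_sum = sum(math.log(p) for p in probs)   # log p_1 + ... + log p_4

# The two quantities agree up to floating-point rounding.
agree = math.isclose(math.log(product), log_sum)

# With many small factors, the raw product underflows to 0.0 in float64,
# while the sum of logs remains a finite, usable number.
many = [1e-5] * 100
underflow_product = math.prod(many)
stable_log_sum = sum(math.log(p) for p in many)

print(agree, underflow_product, stable_log_sum)
```

Working with the sum also replaces repeated multiplication by addition, and the probability can be recovered with `math.exp` whenever it is still representable.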

Extrinsic evaluation of N-gram models

Intrinsic evaluation

\[PP(W) = P(w_1 w_2 \ldots w_N)^{-1/N}\]
N-gram order    Unigram    Bigram    Trigram
Perplexity      963        170       109
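
A minimal sketch of computing perplexity as the inverse probability of a word sequence, normalized by its length N. The function name `perplexity` and the input probabilities are assumptions for illustration; in practice the per-word probabilities would come from an N-gram model evaluated on held-out text.

```python
import math

def perplexity(word_probs):
    """Perplexity of a sequence, given the model's probability for each word.

    Computed in log space as exp(-(1/N) * sum(log p_i)), which equals
    P(w_1 ... w_N) ** (-1/N) when the p_i multiply to the sequence probability.
    """
    n = len(word_probs)
    log_prob = sum(math.log(p) for p in word_probs)
    return math.exp(-log_prob / n)

# Hypothetical model that assigns probability 0.1 to every word.
pp_uniform = perplexity([0.1] * 20)

# Hypothetical model that assigns higher probability to the same words.
pp_better = perplexity([0.5] * 20)

print(pp_uniform, pp_better)
```

A model that assigns probability 0.1 to every word has a perplexity of about 10, and a model that assigns higher probabilities to the observed words gets a lower value. This matches the trend in the table, where perplexity falls as the N-gram order increases.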
