Language Models

How can we distinguish word salad, spelling mistakes, or grammatical errors from real language?

We can use a language model to help us.
A language model is a probability distribution over sequences of words.
It assigns a probability to every sentence, typically by predicting each word from the words that come before it.
For example, the probability of the sentence “The cute little puppy chases the yellow ball” is much higher than the probability of the sentence “The cute little puppy chases the yellow cat”.
N-gram language models, which predict each word from only the previous n-1 words, are among the simplest kinds of language model.
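
To make the idea concrete, here is a minimal sketch of a bigram (2-gram) model in Python. It is an illustration under stated assumptions, not a reference implementation: the three-sentence corpus, the boundary markers `<s>` and `</s>`, the add-one smoothing, and the function names `bigram_prob` and `sentence_logprob` are all choices made for this example.

```python
import math
from collections import Counter

# Toy training corpus (made up for illustration).
corpus = [
    "the cute little puppy chases the yellow ball",
    "the puppy chases the ball",
    "the little puppy likes the yellow ball",
]

BOS, EOS = "<s>", "</s>"  # assumed sentence-boundary markers


def tokenize(sentence):
    """Lowercase, split on whitespace, and add boundary markers."""
    return [BOS] + sentence.lower().split() + [EOS]


context_counts = Counter()  # how often each word appears as the previous word
bigram_counts = Counter()   # how often each (previous word, word) pair appears
vocab = set()
for sentence in corpus:
    tokens = tokenize(sentence)
    vocab.update(tokens)
    for prev, word in zip(tokens, tokens[1:]):
        context_counts[prev] += 1
        bigram_counts[(prev, word)] += 1


def bigram_prob(prev, word):
    """Estimate P(word | prev) with add-one smoothing so unseen pairs are not zero."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))


def sentence_logprob(sentence):
    """Log-probability of a sentence as the sum of its bigram log-probabilities."""
    tokens = tokenize(sentence)
    return sum(math.log(bigram_prob(p, w)) for p, w in zip(tokens, tokens[1:]))


fluent = sentence_logprob("the cute little puppy chases the yellow ball")
salad = sentence_logprob("ball the chases yellow puppy little cute the")
print(fluent > salad)  # True: the word salad uses only unseen word pairs
```

Both sentences contain exactly the same words, so the difference in score comes entirely from word order, which is what lets the model separate fluent text from word salad.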

Language models are useful for a variety of natural language processing tasks, such as part-of-speech tagging, and especially for tasks that produce text as output:
- Machine translation: return text in the target language
- Speech recognition: return a transcript of what was spoken
- Natural language generation (NLG): return natural language text
- Spell-checking: return corrected spelling of input
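
In tasks like these, a language model is often used to choose among several candidate outputs by keeping the most probable one. The sketch below shows that idea for spell-checking, assuming a tiny made-up corpus and a hypothetical set of candidate corrections; the names `logprob` and `pick_best` are illustrative, and a real system would also weigh how likely each candidate is given the misspelled input.

```python
import math
from collections import Counter

# Toy corpus standing in for real training text (an assumption for illustration).
corpus = "the puppy chases the ball . the puppy likes the ball .".split()

context_counts = Counter(corpus[:-1])             # counts of each word as a left context
bigram_counts = Counter(zip(corpus, corpus[1:]))  # counts of adjacent word pairs
vocab_size = len(set(corpus))


def logprob(sentence):
    """Add-one-smoothed bigram log-probability of a whitespace-tokenized sentence."""
    words = sentence.lower().split()
    return sum(
        math.log((bigram_counts[(p, w)] + 1) / (context_counts[p] + vocab_size))
        for p, w in zip(words, words[1:])
    )


def pick_best(candidates):
    """Return the candidate sentence that the toy model scores highest."""
    return max(candidates, key=logprob)


# Hypothetical corrections proposed for the typo in "the puppy chases the bal".
candidates = [
    "the puppy chases the bell",
    "the puppy chases the ball",
    "the puppy chases the bald",
]
print(pick_best(candidates))  # the puppy chases the ball
```

The same pattern applies to machine translation and speech recognition: another component proposes candidate texts, and the language model's score helps rank them.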