# Long Short-Term Memory (LSTM) models
This session continues the journey into neural language models. We identify potential problems with (simple) RNNs and introduce a more sophisticated class of recurrent sequence-processing models: LSTMs. On the practical side, we look at how to implement language models with PyTorch’s built-in modules. Doing so requires learning how to supply text data in the right format.
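As a preview of the practical part, here is a minimal sketch of such a model, built only from PyTorch's built-in modules (`nn.Embedding`, `nn.LSTM`, `nn.Linear`). The class name, layer sizes, and `batch_first` convention are illustrative choices, not necessarily the exact setup used in the notebooks:

```python
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    """Minimal LSTM language model from PyTorch's built-in modules (illustrative)."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, num_layers=1):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids, hidden=None):
        # token_ids: (batch, seq_len) tensor of integer token ids
        embedded = self.embedding(token_ids)          # (batch, seq_len, embed_dim)
        output, hidden = self.lstm(embedded, hidden)  # (batch, seq_len, hidden_dim)
        logits = self.fc(output)                      # (batch, seq_len, vocab_size)
        return logits, hidden
```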
## Learning goals for this session
- understand the limitations of (simple) RNNs
- understand how LSTMs improve on RNNs
- become able to use PyTorch's built-in modules for LMs
- learn how to feed text data to these modules (see the data-preparation sketch after this list)
- learn how to interpret the output of these modules (see the loss and probability sketch after this list)
- learn about different decoding schemes for trained LMs (sketched below):
  - pure sampling
  - greedy sampling
  - top-k and top-p sampling
  - softmax sampling
  - beam search
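To make the data-formatting goal concrete, here is one possible sketch of how raw text can be turned into the integer tensors these modules expect. The toy text, the character-level vocabulary, and the fixed-size chunking are illustrative assumptions:

```python
import torch

text = "the cat sat on the mat "  # toy corpus; the exercises use real text
# character-level vocabulary; a word-level tokenizer works the same way
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
ids = torch.tensor([stoi[ch] for ch in text])

# language-modeling targets are the inputs shifted by one position:
# from tokens 0..t the model must predict token t+1
seq_len = 8
inputs  = ids[:-1].unfold(0, seq_len, seq_len)  # (num_chunks, seq_len)
targets = ids[1:].unfold(0, seq_len, seq_len)   # (num_chunks, seq_len)
```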
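As for interpreting the output: an LSTM followed by a linear layer yields one unnormalized score (logit) per vocabulary item at every position. A sketch, reusing the hypothetical `LSTMLanguageModel` and the tensors from the snippets above:

```python
import torch.nn.functional as F

model = LSTMLanguageModel(vocab_size=len(vocab))
logits, _ = model(inputs)  # shape: (batch, seq_len, vocab_size)

# softmax turns the logits at each position into a next-token distribution
probs = F.softmax(logits, dim=-1)

# cross-entropy for training expects flat (N, vocab_size) scores and (N,) target ids
loss = F.cross_entropy(logits.reshape(-1, len(vocab)), targets.reshape(-1))
```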
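The sampling-based decoding schemes differ only in how they turn the logits at the final position into one chosen token. Below is a minimal sketch of those per-token choices; the function name and default parameters are made up for illustration, and beam search is omitted because it searches over whole sequences rather than choosing one token at a time:

```python
import torch
import torch.nn.functional as F

def sample_next_token(logits, strategy="pure", k=5, p=0.9, temperature=1.0):
    """Choose one token id from a 1-D tensor of logits of shape (vocab_size,)."""
    if strategy == "greedy":
        return int(logits.argmax())        # always take the most likely token
    if strategy == "softmax":
        logits = logits / temperature      # softmax (temperature) sampling
    probs = F.softmax(logits, dim=-1)
    if strategy == "top-k":
        top_probs, top_ids = probs.topk(k) # restrict to the k most likely tokens
        return int(top_ids[torch.multinomial(top_probs, 1)])
    if strategy == "top-p":
        sorted_probs, sorted_ids = probs.sort(descending=True)
        keep = sorted_probs.cumsum(dim=-1) <= p  # tokens covering mass <= p
        keep[0] = True                           # always keep at least the best token
        return int(sorted_ids[keep][torch.multinomial(sorted_probs[keep], 1)])
    return int(torch.multinomial(probs, 1))      # pure sampling from the full distribution
```

Note that greedy decoding is the limiting case of softmax sampling as the temperature goes to zero, and pure sampling is softmax sampling with temperature 1.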
## Slides
Here are the slides for this session.
## Practical exercises
There are three notebooks with hands-on exercises.