Long Short-Term Memory (LSTM) models

This session continues the journey into neural language models. We identify potential problems with (simple) RNNs and introduce a more sophisticated class of recurrent sequence-processing models: LSTMs. On the practical side, we look at how to implement language models with PyTorch’s built-in modules. Doing so requires learning how to supply text data in the right format.

Learning goals for this session

  1. understand the limitations of (simple) RNNs

  2. understand how LSTMs improve on RNNs

  3. learn to use PyTorch’s built-in modules for LMs (see the first sketch after this list)

    1. learn how to feed text data to these modules

    2. learn how to interpret the output of these modules

  4. learn about different decoding schemes for trained LMs (see the second sketch after this list)

    1. pure sampling

    2. greedy sampling

    3. top-k and top-p sampling

    4. softmax sampling

    5. beam search
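As a first orientation for goal 3, here is a minimal sketch of a language model built from PyTorch’s built-in modules (`nn.Embedding`, `nn.LSTM`, `nn.Linear`). The toy vocabulary and all concrete names are illustrative, not taken from the session materials; the notebooks cover the full pipeline. The key points are the input format (a batch of integer token IDs of shape `(batch, seq_len)`) and the output format (one logit vector over the vocabulary per position).

```python
import torch
import torch.nn as nn

# Toy setup for illustration only: a tiny made-up vocabulary.
vocab = {"<pad>": 0, "a": 1, "b": 2, "c": 3}
vocab_size = len(vocab)

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # batch_first=True: inputs/outputs are shaped (batch, seq_len, features)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids, hidden=None):
        embeds = self.embedding(token_ids)           # (batch, seq_len, embed_dim)
        outputs, hidden = self.lstm(embeds, hidden)  # (batch, seq_len, hidden_dim)
        logits = self.head(outputs)                  # (batch, seq_len, vocab_size)
        return logits, hidden

model = LSTMLanguageModel(vocab_size)

# Text is fed as integer token IDs: here a batch of two length-4 sequences.
batch = torch.tensor([[1, 2, 3, 1],
                      [2, 2, 1, 3]])

# For next-token prediction, the target at position t is the token at t+1,
# so inputs and targets are the same sequence shifted by one position.
inputs, targets = batch[:, :-1], batch[:, 1:]
logits, _ = model(inputs)                            # (2, 3, vocab_size)

# Each position's logits are unnormalized scores over the vocabulary;
# cross_entropy applies log-softmax internally.
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
print(loss.item())
```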
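For goal 4, the following sketches show how the first four decoding schemes differ, given a single vector of next-token logits. These are simplified illustrations under assumed names (the functions and the toy logits are made up); beam search, which keeps the B highest-scoring partial sequences instead of committing to one token at a time, is best studied in the slides and notebooks.

```python
import torch

def greedy(logits):
    # Always pick the single most probable next token (deterministic).
    return int(torch.argmax(logits))

def softmax_sample(logits, temperature=1.0):
    # Sample from the softmax distribution; temperature=1.0 is pure
    # sampling, lower temperatures sharpen it towards greedy behaviour.
    probs = torch.softmax(logits / temperature, dim=-1)
    return int(torch.multinomial(probs, num_samples=1))

def top_k_sample(logits, k=5):
    # Restrict sampling to the k highest-scoring tokens, renormalized.
    values, indices = torch.topk(logits, k)
    probs = torch.softmax(values, dim=-1)
    return int(indices[torch.multinomial(probs, num_samples=1)])

def top_p_sample(logits, p=0.9):
    # Nucleus sampling: keep the smallest set of top tokens whose
    # cumulative probability mass reaches p, renormalize, then sample.
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_indices = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    cutoff = int((cumulative < p).sum()) + 1     # number of tokens kept
    kept = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()
    return int(sorted_indices[torch.multinomial(kept, num_samples=1)])

# Made-up logits over a 4-token vocabulary, just to exercise the functions.
logits = torch.tensor([2.0, 1.0, 0.5, -1.0])
print(greedy(logits), softmax_sample(logits),
      top_k_sample(logits, k=2), top_p_sample(logits, p=0.9))
```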

Slides

Here are the slides for this session.

Practical exercises

There are three notebooks with hands-on exercises.