Long Short-Term Memory (LSTM) models

This session continues the journey into neural language models. We identify potential problems with (simple) RNNs and introduce a more sophisticated class of recurrent sequence-processing models: LSTMs. On the practical side, we look at how to implement language models with PyTorch’s built-in modules. Doing so requires learning how to supply text data in the right format.

Learning goals for this session

  1. understand the limitations of (simple) RNNs

  2. understand how LSTMs improve on RNNs

  3. learn to use PyTorch’s built-in modules for LMs (see the first sketch after this list)

    1. learn how to feed text data to these modules

    2. learn how to interpret the output of these modules

  4. learn about different decoding schemes for trained LMs (see the second sketch after this list)

    1. pure sampling

    2. greedy sampling

    3. top-k and top-p sampling

    4. softmax sampling

    5. beam search
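As a first orientation for goal 3, here is a minimal sketch of a language model built from PyTorch’s built-in modules (`nn.Embedding`, `nn.LSTM`, `nn.Linear`). The toy vocabulary and all concrete names are illustrative, not taken from the session materials; the notebooks cover the full pipeline. The key points are the input format (a batch of integer token IDs of shape `(batch, seq_len)`) and the output format (one logit vector over the vocabulary per position).

```python
import torch
import torch.nn as nn

# Toy setup for illustration only: a tiny made-up vocabulary.
vocab = {"<pad>": 0, "a": 1, "b": 2, "c": 3}
vocab_size = len(vocab)

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # batch_first=True: inputs/outputs are shaped (batch, seq_len, features)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids, hidden=None):
        embeds = self.embedding(token_ids)           # (batch, seq_len, embed_dim)
        outputs, hidden = self.lstm(embeds, hidden)  # (batch, seq_len, hidden_dim)
        logits = self.head(outputs)                  # (batch, seq_len, vocab_size)
        return logits, hidden

model = LSTMLanguageModel(vocab_size)

# Text is fed as integer token IDs: here a batch of two length-4 sequences.
batch = torch.tensor([[1, 2, 3, 1],
                      [2, 2, 1, 3]])

# For next-token prediction, the target at position t is the token at t+1,
# so inputs and targets are the same sequence shifted by one position.
inputs, targets = batch[:, :-1], batch[:, 1:]
logits, _ = model(inputs)                            # (2, 3, vocab_size)

# Each position's logits are unnormalized scores over the vocabulary;
# cross_entropy applies log-softmax internally.
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
print(loss.item())
```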
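For goal 4, the following sketches show how the first four decoding schemes differ, given a single vector of next-token logits. These are simplified illustrations under assumed names (the functions and the toy logits are made up); beam search, which keeps the B highest-scoring partial sequences instead of committing to one token at a time, is best studied in the slides and notebooks.

```python
import torch

def greedy(logits):
    # Always pick the single most probable next token (deterministic).
    return int(torch.argmax(logits))

def softmax_sample(logits, temperature=1.0):
    # Sample from the softmax distribution; temperature=1.0 is pure
    # sampling, lower temperatures sharpen it towards greedy behaviour.
    probs = torch.softmax(logits / temperature, dim=-1)
    return int(torch.multinomial(probs, num_samples=1))

def top_k_sample(logits, k=5):
    # Restrict sampling to the k highest-scoring tokens, renormalized.
    values, indices = torch.topk(logits, k)
    probs = torch.softmax(values, dim=-1)
    return int(indices[torch.multinomial(probs, num_samples=1)])

def top_p_sample(logits, p=0.9):
    # Nucleus sampling: keep the smallest set of top tokens whose
    # cumulative probability mass reaches p, renormalize, then sample.
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_indices = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    cutoff = int((cumulative < p).sum()) + 1     # number of tokens kept
    kept = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()
    return int(sorted_indices[torch.multinomial(kept, num_samples=1)])

# Made-up logits over a 4-token vocabulary, just to exercise the functions.
logits = torch.tensor([2.0, 1.0, 0.5, -1.0])
print(greedy(logits), softmax_sample(logits),
      top_k_sample(logits, k=2), top_p_sample(logits, p=0.9))
```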

Slides

Here are the slides for this session.

Practical exercises

There are three notebooks with hands-on exercises.