Long Short-Term Memory (LSTM)

LSTM is a type of Recurrent Neural Network (RNN) that can learn and memorize long-term dependencies. An LSTM aims to remember past information for long periods. LSTMs combat the exploding and vanishing gradient problems by introducing gates and an explicitly defined memory cell.

Each neuron has a memory cell and three gates: input, output, and forget. The function of these gates is to safeguard the information by stopping or allowing its flow.

  • The input gate determines how much of the information from the previous layer gets stored in the cell.
  • The output gate takes the job on the other end and determines how much of the state of this cell the next layer gets to know about.
  • The forget gate seems like an odd inclusion at first, but sometimes it’s good to forget: it controls how much of the previous cell state is discarded, for example when the input moves on to a new topic and old details are no longer relevant.

Each of these gates has a weight to a cell in the previous neuron, so LSTMs typically require more resources to run. A minimal sketch of one time step follows below.
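
To make the role of each gate concrete, here is a small sketch of a single LSTM time step in plain NumPy. The gate names (f, i, g, o), the weight/bias layout, and the toy sizes are illustrative assumptions, not any particular library’s API.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step (a sketch, not an optimized implementation)."""
    z = np.concatenate([x_t, h_prev])   # gates see the current input and the previous hidden state

    f = sigmoid(W["f"] @ z + b["f"])    # forget gate: how much of the old cell state to keep
    i = sigmoid(W["i"] @ z + b["i"])    # input gate: how much new information to store
    g = np.tanh(W["g"] @ z + b["g"])    # candidate values to write into the memory cell
    o = sigmoid(W["o"] @ z + b["o"])    # output gate: how much of the cell state to expose

    c = f * c_prev + i * g              # memory cell: forget some of the old, add some of the new
    h = o * np.tanh(c)                  # hidden state passed to the next layer / next time step
    return h, c

# Toy usage with made-up sizes
rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3
W = {k: rng.normal(scale=0.1, size=(hidden_size, input_size + hidden_size)) for k in "figo"}
b = {k: np.zeros(hidden_size) for k in "figo"}

h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):   # a sequence of 5 time steps
    h, c = lstm_step(x_t, h, c, W, b)
```
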
Info

The Long Short-Term Memory (LSTM) algorithm can be used in an Encoder-Decoder Architecture. This architecture is particularly useful for tasks where the input and output sequences are of different lengths and have a complex relationship between them.
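
As a rough illustration of this, assuming PyTorch and its nn.LSTM module, an encoder can compress the input sequence into its final hidden and cell states, which then seed a decoder that emits a sequence of a different length. All sizes here are made up for the example.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: a 7-step input sequence mapped to a 4-step output sequence.
src_len, tgt_len, batch, feat, hidden = 7, 4, 2, 8, 16

encoder = nn.LSTM(input_size=feat, hidden_size=hidden, batch_first=True)
decoder = nn.LSTM(input_size=feat, hidden_size=hidden, batch_first=True)

src = torch.randn(batch, src_len, feat)   # input sequence
tgt = torch.randn(batch, tgt_len, feat)   # target-side inputs (e.g. shifted targets during training)

# The encoder summarizes the whole input in its final (hidden, cell) state ...
_, (h, c) = encoder(src)

# ... which initializes the decoder, so input and output lengths need not match.
dec_out, _ = decoder(tgt, (h, c))
print(dec_out.shape)   # torch.Size([2, 4, 16])
```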


Notes: