Long Short-Term Memory (LSTM)

LSTM is a type of Recurrent Neural Network (RNN) that can learn and memorize long-term dependencies. An LSTM aims to remember past information for long periods. LSTMs combat the exploding and vanishing gradient problems by introducing gates and an explicitly defined memory cell.

Each neuron has a memory cell and three gates: input, output, and forget. The function of these gates is to safeguard the information by stopping or allowing its flow.

  • The input gate determines how much of the information from the previous layer gets stored in the cell.
  • The output gate takes the job on the other end and determines how much of the state of this cell the next layer gets to know about.
  • The forget gate seems like an odd inclusion at first, but sometimes it’s good to forget: it controls how much of the previous cell state is discarded, for example when the input moves on to a new topic and old details are no longer relevant.

Each of these gates has a weight to a cell in the previous neuron, so LSTMs typically require more resources to run. A minimal sketch of one time step follows below.
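
To make the role of each gate concrete, here is a small sketch of a single LSTM time step in plain NumPy. The gate names (f, i, g, o), the weight/bias layout, and the toy sizes are illustrative assumptions, not any particular library’s API.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step (a sketch, not an optimized implementation)."""
    z = np.concatenate([x_t, h_prev])   # gates see the current input and the previous hidden state

    f = sigmoid(W["f"] @ z + b["f"])    # forget gate: how much of the old cell state to keep
    i = sigmoid(W["i"] @ z + b["i"])    # input gate: how much new information to store
    g = np.tanh(W["g"] @ z + b["g"])    # candidate values to write into the memory cell
    o = sigmoid(W["o"] @ z + b["o"])    # output gate: how much of the cell state to expose

    c = f * c_prev + i * g              # memory cell: forget some of the old, add some of the new
    h = o * np.tanh(c)                  # hidden state passed to the next layer / next time step
    return h, c

# Toy usage with made-up sizes
rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3
W = {k: rng.normal(scale=0.1, size=(hidden_size, input_size + hidden_size)) for k in "figo"}
b = {k: np.zeros(hidden_size) for k in "figo"}

h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):   # a sequence of 5 time steps
    h, c = lstm_step(x_t, h, c, W, b)
```
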
Info

The Long Short-Term Memory (LSTM) algorithm can be used in an Encoder-Decoder Architecture. This architecture is particularly useful for tasks where the input and output sequences are of different lengths and have a complex relationship between them.
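
As a rough illustration of this, assuming PyTorch and its nn.LSTM module, an encoder can compress the input sequence into its final hidden and cell states, which then seed a decoder that emits a sequence of a different length. All sizes here are made up for the example.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: a 7-step input sequence mapped to a 4-step output sequence.
src_len, tgt_len, batch, feat, hidden = 7, 4, 2, 8, 16

encoder = nn.LSTM(input_size=feat, hidden_size=hidden, batch_first=True)
decoder = nn.LSTM(input_size=feat, hidden_size=hidden, batch_first=True)

src = torch.randn(batch, src_len, feat)   # input sequence
tgt = torch.randn(batch, tgt_len, feat)   # target-side inputs (e.g. shifted targets during training)

# The encoder summarizes the whole input in its final (hidden, cell) state ...
_, (h, c) = encoder(src)

# ... which initializes the decoder, so input and output lengths need not match.
dec_out, _ = decoder(tgt, (h, c))
print(dec_out.shape)   # torch.Size([2, 4, 16])
```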


Notes: