Recurrent Neural Networks (RNN)

RNNs are layered neural networks similar to Feed-Forward Neural Networks (FFNNs), but with a recurrent mechanism. Neurons in an RNN receive information (via context nodes) from the previous time step as well as from the previous layer, forming a chain in which each time step depends on the previous computation.
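In symbols (a standard formulation; the weight names below are illustrative, not from the original notes), the same update is applied at every time step t:

$$h_t = \tanh(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h), \qquad y_t = W_{hy}\, h_t + b_y$$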

RNNs are often used with time-series or sequence data (e.g., audio recordings or text). Because each internal state depends on the previous one, information is propagated through the network from the beginning of the sequence onward, creating a chain of dependent computations. This allows the model to learn from earlier parts of the sequence.

[Figure: Fully connected Recurrent Neural Network]

Parameter sharing is one of the most important aspects of RNNs: a single set of weights is shared across all time steps in the network. There is only one set of parameters that is used, and optimized, across all parts of the network. If those parameters were not shared, the model would have to learn separate parameters for each position in the input sequence and would have a much harder time generalizing to examples it had not seen. Sharing parameters also gives Recurrent Neural Networks the ability to handle inputs of different lengths while still producing predictions in an acceptable time frame, and it is particularly important for generalizing across sequences that share the same inputs at different positions.
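As a minimal sketch of what sharing looks like in practice (assuming NumPy; all variable names here are illustrative), the same weight matrices are applied at every step, so one loop handles sequences of any length:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
W_xh = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden weights
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))  # hidden -> hidden weights
b_h = np.zeros(n_hid)

def run(xs):
    h = np.zeros(n_hid)                         # initial hidden state
    for x in xs:                                # one iteration per time step
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)  # the SAME parameters every step
    return h

# Because the parameters are shared, the same code handles any sequence length:
h_short = run(rng.normal(size=(3, n_in)))
h_long = run(rng.normal(size=(50, n_in)))
```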

The network can have as many hidden states as you’d like, but there is one important constant: in each hidden state you always compute the same activation function. The output of every step is calculated using the same function.

Notes:

  • RNNs consume input as a time series and can generate output as a time series.
  • RNNs are universal approximators: They can approximate virtually any dynamical system.
  • RNN categories based on input and output shapes (see the sketch after this list):
    • Many-to-one
    • One-to-many
    • Many-to-many (synced)
    • Many-to-many (unsynced)
  • RNNs process sequential input data step by step, in order (unlike feed-forward networks, which see the whole input at once).
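As a rough sketch of the categories (with illustrative names, not from the original), many-to-one and synced many-to-many run the same recurrence and differ only in which per-step outputs are kept; one-to-many would feed each output back as the next input, and unsynced many-to-many (encoder-decoder) would consume the whole input before producing outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 8, 2
W_xh = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))  # hidden -> hidden
W_hy = rng.normal(scale=0.1, size=(n_out, n_hid))  # hidden -> output
b_h, b_y = np.zeros(n_hid), np.zeros(n_out)

def forward(xs):
    """Run the recurrence and return one output per time step."""
    h, ys = np.zeros(n_hid), []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        ys.append(W_hy @ h + b_y)
    return ys

ys = forward(rng.normal(size=(10, n_in)))
many_to_many_synced = ys   # keep every output (e.g., per-token tagging)
many_to_one = ys[-1]       # keep only the last (e.g., sequence classification)
```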

Applications:
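  • Time-series prediction (e.g., sensor readings, financial data)
  • Speech and audio processing (e.g., speech recognition from audio recordings)
  • Text and natural language processing (e.g., language modeling, machine translation)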


Code:
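The original notes leave this section empty; below is a minimal runnable sketch of a many-to-one RNN classifier, assuming PyTorch is available (class and variable names are illustrative):

```python
import torch
import torch.nn as nn

# Minimal many-to-one RNN: maps a sequence of feature vectors
# to a single class label.
class SimpleRNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):              # x: (batch, seq_len, input_size)
        out, h_n = self.rnn(x)         # out: (batch, seq_len, hidden_size)
        return self.fc(out[:, -1, :])  # read out only the last time step

model = SimpleRNN(input_size=4, hidden_size=16, num_classes=3)
x = torch.randn(2, 10, 4)              # batch of 2 sequences, 10 steps each
logits = model(x)                      # shape: (2, 3)
print(logits.shape)
```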