Java Deep Learning Projects

Recurrent neural networks 

A recurrent neural network (RNN) is a class of artificial neural network (ANN) in which connections between units form a directed cycle. The basic RNN idea dates back to the 1980s, while the widely used LSTM variant was introduced by Hochreiter and Schmidhuber in 1997. An RNN architecture is a standard MLP plus added loops (as shown in the following diagram), so it can exploit the powerful nonlinear mapping capabilities of the MLP while also keeping some form of memory:

RNN architecture

The preceding image shows a very basic RNN with an input layer, two recurrent layers, and an output layer. However, this basic RNN suffers from the vanishing and exploding gradient problems and cannot model long-term dependencies. Therefore, more advanced architectures have been designed to exploit the sequential information in the input data through cyclic connections among building blocks such as perceptrons. These architectures include Long Short-Term Memory (LSTM), Gated Recurrent Units (GRUs), bidirectional LSTM, and other variants.
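To make the idea of "an MLP plus a loop" concrete, the following is a minimal sketch of a single recurrent layer in plain Java. It is not taken from the book's code base: the class name SimpleRnnSketch, the method names step and run, and the tiny hand-picked weights are all assumptions made for illustration, and the tanh update shown is the common Elman-style formulation rather than any specific library's implementation.

```java
public class SimpleRnnSketch {

    // One recurrent step (hypothetical helper): h_t = tanh(Wx * x_t + Wh * h_{t-1} + b).
    public static double[] step(double[][] wx, double[][] wh, double[] b,
                                double[] x, double[] hPrev) {
        double[] h = new double[b.length];
        for (int i = 0; i < h.length; i++) {
            double sum = b[i];
            for (int j = 0; j < x.length; j++) {
                sum += wx[i][j] * x[j];      // feed-forward (MLP-like) part
            }
            for (int j = 0; j < hPrev.length; j++) {
                sum += wh[i][j] * hPrev[j];  // the added loop: the previous state feeds back in
            }
            h[i] = Math.tanh(sum);
        }
        return h;
    }

    // Runs the cell over a whole sequence and returns the final hidden state.
    public static double[] run(double[][] wx, double[][] wh, double[] b, double[][] sequence) {
        double[] h = new double[b.length];   // hidden state starts at zero
        for (double[] x : sequence) {
            h = step(wx, wh, b, x, h);
        }
        return h;
    }

    public static void main(String[] args) {
        // Illustrative weights only: one input unit, one hidden unit.
        double[][] wx = {{1.0}};
        double[][] wh = {{0.5}};
        double[] b = {0.0};

        double[] first = run(wx, wh, b, new double[][] {{1.0}, {0.0}});
        double[] second = run(wx, wh, b, new double[][] {{0.0}, {1.0}});

        System.out.println("final state for [1, 0]: " + first[0]);
        System.out.println("final state for [0, 1]: " + second[0]);
    }
}
```

The two sequences contain the same values in a different order, yet they end in different hidden states. That order sensitivity is the "memory" that a plain MLP lacks.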

LSTM and GRU networks can largely overcome the drawbacks of regular RNNs: the vanishing/exploding gradient problem and the inability to capture long-term dependencies. We will look at these architectures in chapter 2.
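The vanishing and exploding gradient problem mentioned above can be illustrated with a deliberately simplified sketch. It assumes a scalar, purely linear recurrence h_t = w * h_{t-1} (no activation function and no inputs), for which the gradient of the final state with respect to the initial state is simply w raised to the number of time steps. The class name GradientScaleSketch and the method gradientThroughTime are hypothetical names chosen for this example.

```java
public class GradientScaleSketch {

    // For the linear recurrence h_t = w * h_{t-1}, the chain rule gives
    // d h_T / d h_0 = w * w * ... * w (one factor per time step).
    public static double gradientThroughTime(double w, int steps) {
        double grad = 1.0;
        for (int t = 0; t < steps; t++) {
            grad *= w;
        }
        return grad;
    }

    public static void main(String[] args) {
        int steps = 50;
        // A recurrent weight below 1 shrinks the gradient at every step (vanishing).
        System.out.println("w = 0.5: " + gradientThroughTime(0.5, steps));
        // A recurrent weight above 1 grows the gradient at every step (exploding).
        System.out.println("w = 1.5: " + gradientThroughTime(1.5, steps));
        // A weight of exactly 1 keeps the gradient constant.
        System.out.println("w = 1.0: " + gradientThroughTime(1.0, steps));
    }
}
```

In a real RNN the per-step factor is a Jacobian matrix that also includes the derivative of the activation function, but the same multiplicative effect applies across long sequences.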