(the slides of this and other presentations are available in the PhDinfo restricted area)
The memory module in Deep Learning architectures is fundamental to solving sequential and
temporal tasks such as speech recognition, language modelling, and sentiment analysis.
Recurrent Neural Networks (RNNs), by analysing each element of a sequence recurrently,
are able to store and compact the information into a compressed hidden state. Because of the
vanishing gradient problem, the memory of an RNN has only short-term capacity.
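As a rough illustration of this idea, the following sketch (in plain NumPy, not tied to any particular library API; the weight shapes and initialisation are illustrative assumptions) shows how a vanilla RNN folds an entire input sequence into one dense hidden vector:

```python
import numpy as np

# Minimal vanilla RNN cell (illustrative sketch).
# Each input is mixed into a single hidden vector h, which acts as a
# compressed summary of everything seen so far in the sequence.
def rnn_step(h, x, W_h, W_x, b):
    return np.tanh(W_h @ h + W_x @ x + b)

rng = np.random.default_rng(0)
hidden, inp = 4, 3                    # hypothetical sizes
W_h = rng.normal(scale=0.1, size=(hidden, hidden))
W_x = rng.normal(scale=0.1, size=(hidden, inp))
b = np.zeros(hidden)

h = np.zeros(hidden)                  # initial hidden state
sequence = rng.normal(size=(5, inp))  # a toy sequence of 5 input vectors
for x in sequence:
    h = rnn_step(h, x, W_h, W_x, b)

# The whole sequence is now compressed into one dense vector of size 4.
print(h.shape)
```

Because gradients are multiplied through `W_h` and the tanh derivative at every step, contributions from early inputs shrink rapidly, which is the mechanism behind the short-term memory mentioned above.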
To remedy this, a new type of recurrent neural network was developed: Long Short-Term
Memory (LSTM). In addition to the hidden state, a cell state is able to memorize information
for a longer time.
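The cell-state mechanism can be sketched as follows (again plain NumPy with illustrative sizes; the single stacked weight matrix is a common convention, not a requirement): forget, input, and output gates decide what to erase from, write to, and expose from the cell state at each step.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Minimal LSTM cell (illustrative sketch). Besides the hidden state h,
# a cell state c carries information forward across many steps.
def lstm_step(h, c, x, W, b):
    z = W @ np.concatenate([h, x]) + b  # all four gate pre-activations at once
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)
    g = np.tanh(g)                      # candidate cell update
    c = f * c + i * g                   # long-term memory: gated, additive update
    h = o * np.tanh(c)                  # short-term hidden state read from c
    return h, c

rng = np.random.default_rng(1)
hidden, inp = 4, 3                      # hypothetical sizes
W = rng.normal(scale=0.1, size=(4 * hidden, hidden + inp))
b = np.zeros(4 * hidden)

h, c = np.zeros(hidden), np.zeros(hidden)
for x in rng.normal(size=(5, inp)):
    h, c = lstm_step(h, c, x, W, b)
print(h.shape, c.shape)
```

The additive update `c = f * c + i * g` is the key design choice: when the forget gate stays near 1, information in `c` can survive many steps without being repeatedly squashed, which is what gives the LSTM its longer-term memory.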
Both in RNNs and in LSTMs, the memory is a single dense vector, and the ability to address
individual elements is lacking.
To overcome these problems, Memory Networks have been introduced to combine inference
components with a long-term memory. The memory has a matrix-shaped structure and is element-wise addressable.
Neural Turing Machines were the first to use this type of memory to solve algorithmic tasks.
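A core ingredient of this addressable memory is content-based reading. The sketch below (in the spirit of Neural Turing Machine addressing, not a faithful reimplementation; the sharpness parameter `beta` and the toy memory are assumptions) compares a key vector to every row of a memory matrix and reads out a softmax-weighted combination:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Content-based read over a matrix-shaped memory M (N slots x width).
# The key is compared to each memory row by cosine similarity; the softmax
# of the similarities gives soft, differentiable addressing weights.
def content_read(M, key, beta=5.0):
    sims = (M @ key) / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-8)
    w = softmax(beta * sims)   # one weight per memory slot, summing to 1
    return w @ M, w            # weighted read vector and the address weights

M = np.eye(4)                  # toy memory: 4 slots of width 4
key = np.array([0.0, 1.0, 0.0, 0.0])
read, w = content_read(M, key)
print(w.argmax())              # slot 1 matches the key best
```

Because the read is a soft weighted sum rather than a hard lookup, the whole mechanism stays differentiable and can be trained end to end together with the controller network.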
Different Memory Networks have been developed to solve real-world tasks, such as MemN2N for
Question Answering and MANTRA for trajectory prediction.