Long short-term memory (1997) [pdf]
This paper introduces Long Short-Term Memory (LSTM), a recurrent neural network architecture that uses memory cells and gating mechanisms to overcome the vanishing gradient problem and learn long-term dependencies over extended time intervals.