LSTM
Long Short-Term Memory (Hochreiter & Schmidhuber, 1997). Addresses the vanishing gradient problem of vanilla RNNs with a gated architecture: a forget gate (what to discard from the cell), an input gate (what new information to store), an output gate (what to expose as the hidden state), and a cell state whose additive updates let gradients flow across long sequences.
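A minimal NumPy sketch of a single LSTM time step, to make the gating concrete. The function name `lstm_step`, the stacked weight layout `W` (forget, input, candidate, output projections concatenated row-wise), and the dimension names are illustrative assumptions, not from any particular library.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step (illustrative sketch).

    x: input vector, shape (d,)
    h_prev, c_prev: previous hidden and cell states, shape (n,)
    W: stacked weights, shape (4n, d + n), rows ordered [forget, input, candidate, output]
    b: stacked bias, shape (4n,)
    """
    n = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[:n])        # forget gate: what to discard from the cell
    i = sigmoid(z[n:2*n])     # input gate: what new information to store
    g = np.tanh(z[2*n:3*n])   # candidate values to write into the cell
    o = sigmoid(z[3*n:])      # output gate: what part of the cell to expose
    c = f * c_prev + i * g    # additive cell update: the stable gradient path
    h = o * np.tanh(c)        # new hidden state
    return h, c
```

The key line is the cell update `c = f * c_prev + i * g`: because it is additive rather than a repeated matrix multiplication, gradients along the cell state avoid the shrinking that causes vanilla RNNs to forget.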
Related
- Vanilla RNN (simpler but limited)
- GRU (simplified LSTM variant)
- Transformers (have largely replaced LSTMs in sequence modeling)