Residual Connections

Skip connections that add a layer's input directly to its output: output = F(x) + x, where F is the transformation the layer learns. The identity path gives gradients a shortcut around F during backpropagation, which makes very deep networks (100+ layers) trainable. Introduced in ResNet; used in virtually all modern architectures, including Transformers.
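
The gradient shortcut follows from the derivative of output = F(x) + x, which is F'(x) + 1: the constant 1 keeps each layer's factor from vanishing even when F'(x) is small. Below is a minimal sketch of that effect in plain Python, assuming a toy scalar branch F(x) = tanh(w * x). The names layer_fn, layer_fn_grad, and forward_and_grad, the weight 0.5, and the depth of 50 are illustrative assumptions, not part of any library or of ResNet itself.

```python
import math


def layer_fn(x, w):
    """Hypothetical residual branch F(x) = tanh(w * x) on a scalar input."""
    return math.tanh(w * x)


def layer_fn_grad(x, w):
    """Derivative dF/dx of the branch above."""
    return w * (1.0 - math.tanh(w * x) ** 2)


def forward_and_grad(x, weights, residual):
    """Run a stack of scalar layers; return (output, d output / d input).

    The gradient is accumulated with the chain rule, one factor per layer.
    """
    grad = 1.0
    for w in weights:
        f = layer_fn(x, w)
        df = layer_fn_grad(x, w)
        if residual:
            grad *= df + 1.0  # derivative of F(x) + x
            x = f + x         # output = F(x) + x
        else:
            grad *= df        # derivative of F(x) alone
            x = f             # output = F(x)
    return x, grad


# Assumed toy setup: 50 layers, all with weight 0.5, input 1.0.
weights = [0.5] * 50
_, plain_grad = forward_and_grad(1.0, weights, residual=False)
_, res_grad = forward_and_grad(1.0, weights, residual=True)

print("gradient without skip connections:", plain_grad)
print("gradient with skip connections:   ", res_grad)
```

In this toy setup every factor in the plain stack is at most 0.5, so the gradient reaching the input shrinks geometrically with depth, while every factor in the residual stack is at least 1. Real networks use vector-valued layers and normalization, so this only illustrates the mechanism, not actual training behavior.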


deep-learning residual-connections skip-connections