Deep Learning Theory Index
Navigation hub for these deep learning theory notes.
Notes
Foundations
- Neural Network Notation — layers, activations, forward pass notation
- Activation Functions — sigmoid, ReLU, GELU, Swish, universal approximation
- Weight Initialization — Xavier, Kaiming, symmetry breaking
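As a quick illustration of the initialization schemes listed above, a minimal NumPy sketch of Kaiming (He) and Xavier (Glorot) initialization; the helper names are illustrative, not taken from the linked notes:

```python
import numpy as np

def kaiming_init(fan_in, fan_out, rng=None):
    """He/Kaiming normal init for ReLU layers: std = sqrt(2 / fan_in)."""
    rng = np.random.default_rng(rng)
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def xavier_init(fan_in, fan_out, rng=None):
    """Xavier/Glorot uniform init: limit = sqrt(6 / (fan_in + fan_out))."""
    rng = np.random.default_rng(rng)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Random (non-constant) draws also break the symmetry that would make
# all units in a layer compute identical gradients.
W = kaiming_init(512, 256, rng=0)
print(W.std())  # empirical std should be close to sqrt(2/512) ≈ 0.0625
```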
Training
- Backpropagation — computational graph, chain rule, Jacobian chain
- Backpropagation Through Time — unrolled RNNs, gradient truncation
- Gradient Descent — SGD, mini-batch, learning rate schedules
- Adaptive Optimizers — Momentum, RMSProp, Adam, AdamW
- Gradient Checking — numerical gradient verification
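The gradient-checking note above can be sketched in a few lines: compare an analytic gradient against a central-difference estimate. This is a generic sketch, not the code from the linked note:

```python
import numpy as np

def numerical_grad(f, x, eps=1e-5):
    """Central-difference estimate: (f(x+eps) - f(x-eps)) / (2*eps) per coordinate."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        old = x.flat[i]
        x.flat[i] = old + eps
        fp = f(x)
        x.flat[i] = old - eps
        fm = f(x)
        x.flat[i] = old  # restore
        grad.flat[i] = (fp - fm) / (2 * eps)
    return grad

# Check f(x) = sum(x**2), whose analytic gradient is 2*x.
x = np.array([1.0, -2.0, 3.0])
num = numerical_grad(lambda v: np.sum(v ** 2), x)
assert np.allclose(num, 2 * x, atol=1e-6)
```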
Regularization and normalization
- Dropout — inverted dropout, MC dropout, ensemble interpretation
- Batch Normalization — normalize over the batch dimension, learned scale/shift, internal covariate shift
- Layer Normalization — normalize over the feature dimension, Pre-LN vs Post-LN, RMSNorm
- Residual Connections — skip connections, gradient highway, ensemble theory
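For the dropout entry above, a minimal sketch of inverted dropout (the variant named in the note, where scaling happens at train time so inference is a no-op); the function name is illustrative:

```python
import numpy as np

def inverted_dropout(x, p_drop, train=True, rng=None):
    """Inverted dropout: zero units with prob p_drop, scale survivors by 1/(1-p_drop)."""
    if not train:
        return x  # inference path: identity, no rescaling needed
    rng = np.random.default_rng(rng)
    mask = (rng.random(x.shape) >= p_drop) / (1.0 - p_drop)
    return x * mask

x = np.ones(10000)
y = inverted_dropout(x, p_drop=0.5, rng=0)
print(y.mean())  # expectation is preserved: close to 1.0
```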
Loss functions
- Cross-Entropy Loss — categorical and binary cross-entropy
- Triplet Loss — metric learning, anchor/positive/negative
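A compact sketch of the categorical cross-entropy listed above, computed from raw logits with a log-sum-exp shift for numerical stability (helper name and example values are illustrative):

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean categorical cross-entropy over a batch of integer labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # stability shift
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([0, 1])
loss = softmax_cross_entropy(logits, labels)
print(loss)
```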
Efficiency
- Vectorization and Broadcasting — batch operations, NumPy/PyTorch broadcasting
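Two small examples of the NumPy broadcasting the entry above refers to: adding a per-feature bias to a batch without a loop, and computing pairwise distances by inserting length-1 axes. These are generic illustrations, not code from the linked note:

```python
import numpy as np

# (batch, features) + (features,): the bias row is broadcast across the batch.
X = np.arange(6.0).reshape(2, 3)
b = np.array([10.0, 20.0, 30.0])
out = X + b  # shape (2, 3), no explicit tile or loop

# Pairwise squared distances via (n, 1, d) - (1, m, d) -> (n, m, d).
A = np.random.default_rng(0).normal(size=(4, 2))
D = ((A[:, None, :] - A[None, :, :]) ** 2).sum(axis=-1)
print(D.shape)  # (4, 4), with zeros on the diagonal
```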
Links
- Foundations
- Supervised Learning (→ 05_statistical_learning_theory)
- Logistic Regression (→ 03_probability_and_statistics)
Navigation: ← Statistical Learning Theory | Foundations Index