
Normalization

Techniques to standardize inputs or activations for more stable training.


Definition

Normalization techniques standardize data or layer outputs to improve training stability and speed.

**Types:**

- Batch Normalization: normalizes across the batch dimension
- Layer Normalization: normalizes across the feature dimension (used in transformers)
- Instance Normalization: normalizes each sample independently
- Group Normalization: normalizes over groups of channels
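
These variants differ mainly in which axes the mean and variance are computed over. The PyTorch sketch below makes that concrete; the tensor shape and module arguments are illustrative assumptions, not part of the definition above.

```python
import torch
import torch.nn as nn

# A batch of 8 feature maps: (batch, channels, height, width)
x = torch.randn(8, 16, 32, 32)

norms = {
    "batch": nn.BatchNorm2d(16),          # stats over (batch, H, W), per channel
    "layer": nn.LayerNorm([16, 32, 32]),  # stats over all features, per sample
    "instance": nn.InstanceNorm2d(16),    # stats over (H, W), per sample and channel
    "group": nn.GroupNorm(4, 16),         # stats over groups of 4 channels, per sample
}

for name, norm in norms.items():
    print(name, norm(x).shape)  # output shape is unchanged; only the statistics differ
```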

**Benefits:**

- Faster training convergence
- Allows higher learning rates
- Reduces internal covariate shift (the original motivation for batch normalization)
- Acts as a mild form of regularization

**In Transformers:**

- Layer normalization is the standard choice
- Pre-norm (normalize before each sub-layer) vs. post-norm (normalize after the residual) architectures
- RMSNorm, a simplified variant that drops mean-centering, is gaining popularity; see the sketch below
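
A minimal RMSNorm sketch, assuming a feature-last tensor layout; the class name, epsilon value, and learned per-feature gain follow common convention rather than anything mandated by the text:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Simplified layer norm: rescale by the RMS of the features, no mean-centering."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learned per-feature gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Root-mean-square over the last (feature) dimension
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).sqrt()
        return self.weight * (x / rms)

norm = RMSNorm(512)
x = torch.randn(2, 10, 512)  # (batch, sequence, features)
print(norm(x).shape)         # torch.Size([2, 10, 512])
```

Because no mean is subtracted, RMSNorm saves a reduction per call while keeping the scale-stabilizing effect that matters most in deep stacks.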

**Data Normalization:**

- Scale input features to similar ranges
- Zero mean, unit variance (standardization) is a common choice
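
A minimal standardization sketch in NumPy, assuming a feature matrix with one row per sample; the helper name and epsilon are illustrative:

```python
import numpy as np

def standardize(X: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Scale each feature column to zero mean and unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / (std + eps)  # eps guards against constant columns

X = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])
X_std = standardize(X)
print(X_std.mean(axis=0))  # approximately [0, 0]
print(X_std.std(axis=0))   # approximately [1, 1]
```

In practice the mean and standard deviation are computed on the training set only and reused for validation and test data, so no information leaks across the split.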

Examples

Layer normalization applied after each attention and feed-forward sub-layer in a transformer, as in the original post-norm architecture.
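
To show where that normalization sits, here is a minimal post-norm transformer block in PyTorch; the dimensions, head count, and feed-forward width are illustrative assumptions (a pre-norm design would instead apply each LayerNorm before its sub-layer):

```python
import torch
import torch.nn as nn

class PostNormBlock(nn.Module):
    """Original-Transformer-style block: layer norm after each residual sum."""
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.ff = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attn(x, x, x, need_weights=False)
        x = self.norm1(x + attn_out)    # normalize after attention + residual
        x = self.norm2(x + self.ff(x))  # normalize after feed-forward + residual
        return x

block = PostNormBlock()
x = torch.randn(2, 10, 512)  # (batch, sequence, features)
print(block(x).shape)        # torch.Size([2, 10, 512])
```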
