Definition
Dropout prevents overfitting by randomly setting neuron outputs to zero during training.
How It Works:
- During training: randomly zero out a fraction of neuron outputs (e.g., 20%)
- During inference: use all neurons and scale outputs so activations match their training-time expectation (see the sketch after this list)
- This forces the network not to rely on any single neuron
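Here is a minimal NumPy sketch of this mechanism. Note that modern frameworks typically implement "inverted" dropout, which applies the 1/(1-p) scaling during training so that inference needs no extra adjustment; that variant is shown here, and the function name and shapes are illustrative assumptions.

```python
import numpy as np

def dropout(x, p=0.2, training=True):
    """Inverted dropout: zero a fraction p of activations during training,
    scaling the survivors by 1/(1-p) so inference needs no rescaling."""
    if not training or p == 0.0:
        return x  # inference: all neurons active, outputs already calibrated
    mask = (np.random.rand(*x.shape) >= p).astype(x.dtype)
    return x * mask / (1.0 - p)

# Roughly 20% of activations are zeroed on each training call.
activations = np.random.randn(4, 8)
print(dropout(activations, p=0.2, training=True))
print(dropout(activations, training=False))  # unchanged at inference
```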
Benefits:
- Reduces overfitting
- Acts like training an ensemble of networks, since each step trains a different thinned sub-network
- Simple to implement
Typical Dropout Rates:
- 0.1-0.5 depending on the layer
- Higher rates for fully connected layers
- Lower rates for convolutional layers (see the sketch after this list)
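A hypothetical PyTorch model illustrating these conventions; the layer sizes and the assumption of 32x32 RGB inputs are mine, not from the source.

```python
import torch.nn as nn

class SmallConvNet(nn.Module):
    """Toy network with a lower dropout rate after the convolutional
    layer and a higher rate after the fully connected layer."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Dropout2d(0.1),   # lower rate for convolutional layers
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 256),  # assumes 32x32 inputs
            nn.ReLU(),
            nn.Dropout(0.5),     # higher rate for fully connected layers
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```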
In Transformers:
- Applied after the attention output
- Applied in the feed-forward layers
- Typically a 0.1 dropout rate (see the sketch after this list)
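A sketch of these placements in a single PyTorch transformer block. The dimensions and the post-norm residual layout are illustrative assumptions, not prescribed by the source.

```python
import torch.nn as nn

class TransformerBlock(nn.Module):
    """One encoder block with the two standard dropout placements:
    after the attention output and inside the feed-forward sublayer."""
    def __init__(self, d_model=512, n_heads=8, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(
            d_model, n_heads, dropout=dropout, batch_first=True
        )
        self.drop1 = nn.Dropout(dropout)   # applied after attention
        self.norm1 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.ReLU(),
            nn.Dropout(dropout),           # applied in the feed-forward layer
            nn.Linear(4 * d_model, d_model),
        )
        self.drop2 = nn.Dropout(dropout)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + self.drop1(attn_out))
        return self.norm2(x + self.drop2(self.ffn(x)))
```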
Examples
A network trained with a dropout rate of 0.2 randomly disables 20% of its neurons at each training step, then uses all of them at inference, as the sketch below shows.
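A brief PyTorch sketch of this example; the layer sizes are arbitrary. Calling `model.train()` activates the 0.2 dropout, and `model.eval()` disables it for inference.

```python
import torch
import torch.nn as nn

# Hypothetical two-layer model with 0.2 dropout between the layers.
model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(128, 10),
)

model.train()    # dropout active: ~20% of activations zeroed each step
x = torch.randn(8, 64)
train_out = model(x)

model.eval()     # dropout disabled: all neurons used at inference
with torch.no_grad():
    eval_out = model(x)
```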