Batchnorm
- Used to reduce the shift in the distribution of each layer's inputs during training of a deep network
- Calculate $\mu$ and $\sigma$ of each feature over a mini-batch
- Normalize the input features using the calculated $\mu$ and $\sigma$
- There are two learnable parameters: scale ($\gamma$) and shift ($\beta$)
- Compute a moving average of $\mu$ and $\sigma$ across all mini-batches seen during training and use these statistics during inference
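The steps above can be sketched as a NumPy forward pass; this is a minimal illustration (function and parameter names are my own, not from a specific library), assuming a 2-D input of shape (batch, features):

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, running_mean, running_var,
                      momentum=0.9, eps=1e-5, training=True):
    """Illustrative batchnorm over an (N, D) mini-batch."""
    if training:
        mu = x.mean(axis=0)    # per-feature mean of the mini-batch
        var = x.var(axis=0)    # per-feature variance of the mini-batch
        x_hat = (x - mu) / np.sqrt(var + eps)  # normalize each feature
        # update the moving averages used later at inference time
        running_mean = momentum * running_mean + (1 - momentum) * mu
        running_var = momentum * running_var + (1 - momentum) * var
    else:
        # inference: normalize with the accumulated moving averages
        x_hat = (x - running_mean) / np.sqrt(running_var + eps)
    out = gamma * x_hat + beta  # learnable scale and shift
    return out, running_mean, running_var
```

With $\gamma = 1$ and $\beta = 0$, the training-mode output of each feature has approximately zero mean and unit variance.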
Advantages
- Faster training and convergence
- Can use higher learning rate
- Reduced sensitivity to weight initialization
- Mild regularization effect, since each example is normalized with statistics that vary from mini-batch to mini-batch