2023-07-17

Batchnorm

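Used to reduce the shift in the distribution of each layer's inputs in a deep network
Calculate the mean $\mu$ and standard deviation $\sigma$ of each feature over a mini-batch
Normalize each input feature by subtracting $\mu$ and dividing by $\sigma$ (a small $\epsilon$ is added to the variance for numerical stability)
Two learnable parameters, scale ($\gamma$) and shift ($\beta$), are then applied to the normalized values
Keep a moving average of $\mu$ and $\sigma$ across the mini-batches seen during training, and use those averages in place of batch statistics during inference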
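
The steps above can be sketched in NumPy. This is a minimal illustration rather than any framework's implementation: the class name `BatchNorm1D`, the `momentum` and `eps` defaults, and the choice to track a running variance (instead of $\sigma$ directly) are assumptions made for the example. Only the forward pass is shown, so $\gamma$ and $\beta$ stay at their initial values.

```python
import numpy as np


class BatchNorm1D:
    """Hypothetical minimal batch norm for inputs of shape (batch, features)."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        # momentum and eps are assumed defaults, not values from the notes.
        self.gamma = np.ones(num_features)   # learnable scale
        self.beta = np.zeros(num_features)   # learnable shift
        self.running_mean = np.zeros(num_features)
        self.running_var = np.ones(num_features)
        self.momentum = momentum
        self.eps = eps

    def forward(self, x, training=True):
        if training:
            # Per-feature statistics of the current mini-batch.
            mu = x.mean(axis=0)
            var = x.var(axis=0)
            # Exponential moving average of the batch statistics.
            self.running_mean = (1 - self.momentum) * self.running_mean + self.momentum * mu
            self.running_var = (1 - self.momentum) * self.running_var + self.momentum * var
        else:
            # Inference: use the averages accumulated during training.
            mu, var = self.running_mean, self.running_var
        x_hat = (x - mu) / np.sqrt(var + self.eps)
        return self.gamma * x_hat + self.beta


rng = np.random.default_rng(0)
bn = BatchNorm1D(num_features=3)
x = rng.normal(loc=5.0, scale=2.0, size=(64, 3))

y_train = bn.forward(x, training=True)
print(y_train.mean(axis=0))  # close to 0 for every feature
print(y_train.std(axis=0))   # close to 1 for every feature

y_infer = bn.forward(x, training=False)  # normalized with running statistics
```

In training mode each feature of the output has roughly zero mean and unit variance, because the batch's own statistics are used. In inference mode the output depends only on the stored running averages, so a single example can be normalized without a batch.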

Advantages

Faster training and convergence
Allows higher learning rates
Reduced sensitivity to weight initialization
Mild regularization effect, since each example's normalization depends on the other examples in its mini-batch

References

AI Summer