1. Import common initialization methods
from torch.nn.init import xavier_uniform_, xavier_normal_
from torch.nn.init import kaiming_uniform_, kaiming_normal_
2. Analysis of various initialization methods
* xavier_uniform_(tensor, gain=1.0)
Note: Fills the input tensor with values drawn from a uniform distribution. The method is described in "Understanding the difficulty of training deep feedforward neural networks" - Glorot, X. & Bengio, Y. (2010). The resulting tensor's values are sampled from U(−a, a),
where a = gain × sqrt(6 / (fan_in + fan_out)).
Parameters:
tensor: an n-dimensional torch.Tensor
gain: an optional scaling factor
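A quick sketch of calling xavier_uniform_ and checking the U(−a, a) bound above; the 30×20 weight shape is a hypothetical example (so fan_in = 20, fan_out = 30):

```python
import math
import torch
from torch.nn.init import xavier_uniform_

# hypothetical weight tensor: fan_in = 20, fan_out = 30
w = torch.empty(30, 20)
xavier_uniform_(w, gain=1.0)

# bound from the formula: a = gain * sqrt(6 / (fan_in + fan_out))
a = 1.0 * math.sqrt(6.0 / (20 + 30))

# every sampled value should lie inside [-a, a]
assert w.abs().max().item() <= a + 1e-6
```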
* xavier_normal_(tensor, gain=1.0)
Note: Fills the input tensor with values drawn from a normal distribution. The method is described in "Understanding the difficulty of training deep feedforward neural networks" - Glorot, X. & Bengio, Y. (2010). The resulting tensor's values are sampled from N(0, std²),
where std = gain × sqrt(2 / (fan_in + fan_out)).
Parameters:
tensor: an n-dimensional torch.Tensor
gain: an optional scaling factor
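A similar sketch for xavier_normal_, empirically checking the std formula above on a hypothetical 200×300 weight tensor (fan_in = 300, fan_out = 200), where the sample standard deviation should land close to the theoretical value:

```python
import math
import torch
from torch.nn.init import xavier_normal_

# hypothetical weight tensor: fan_in = 300, fan_out = 200
w = torch.empty(200, 300)
xavier_normal_(w, gain=1.0)

# std from the formula: gain * sqrt(2 / (fan_in + fan_out)) ~= 0.0632
expected_std = 1.0 * math.sqrt(2.0 / (300 + 200))

# with 60,000 samples the empirical std should be very close
assert abs(w.std().item() - expected_std) < 0.01
```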
* kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
Note: Fills the input tensor with values drawn from a uniform distribution. The method is described in "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification" - He, K. et al. (2015). The resulting tensor's values are sampled from U(−bound, bound),
where bound = gain × sqrt(3 / fan_mode).
Parameters:
tensor: an n-dimensional torch.Tensor
a: the negative slope of the rectifier used after this layer (only used with "leaky_relu")
mode: either "fan_in" or "fan_out". Choosing "fan_in" preserves the magnitude of the weight variance in the forward pass; "fan_out" preserves it in the backward pass.
nonlinearity: the non-linear function. It is recommended to use only "relu" or "leaky_relu".
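A sketch of kaiming_uniform_ with nonlinearity="relu" on a hypothetical 30×20 weight tensor. For "relu" the gain is sqrt(2), so with mode="fan_in" (fan_in = 20) the bound formula above gives bound = sqrt(2) × sqrt(3 / 20) = sqrt(6 / 20):

```python
import math
import torch
from torch.nn.init import kaiming_uniform_

# hypothetical weight tensor: fan_in = 20 when mode='fan_in'
w = torch.empty(30, 20)
kaiming_uniform_(w, mode='fan_in', nonlinearity='relu')

# bound = gain * sqrt(3 / fan_in), with gain = sqrt(2) for relu
bound = math.sqrt(2.0) * math.sqrt(3.0 / 20)

# every sampled value should lie inside [-bound, bound]
assert w.abs().max().item() <= bound + 1e-6
```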
* kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
Note: Fills the input tensor with values drawn from a normal distribution. The method is described in "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification" - He, K. et al. (2015). The resulting tensor's values are sampled from N(0, std²),
where std = gain / sqrt(fan_mode).
Parameters: same as kaiming_uniform_.
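Finally, a sketch of kaiming_normal_ with mode="fan_out" on a hypothetical 200×300 weight tensor (fan_out = 200). With the relu gain of sqrt(2), the std formula above gives std = sqrt(2) / sqrt(200) = 0.1, which the sample standard deviation should approximate:

```python
import math
import torch
from torch.nn.init import kaiming_normal_

# hypothetical weight tensor: fan_out = 200 when mode='fan_out'
w = torch.empty(200, 300)
kaiming_normal_(w, mode='fan_out', nonlinearity='relu')

# std = gain / sqrt(fan_out), with gain = sqrt(2) for relu
expected_std = math.sqrt(2.0) / math.sqrt(200)

# with 60,000 samples the empirical std should be very close to 0.1
assert abs(w.std().item() - expected_std) < 0.01
```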