npdl.initialization

Functions to create initializers for parameter variables.

Examples

>>> from npdl.layers import Dense
>>> from npdl.initialization import GlorotUniform
>>> l1 = Dense(n_out=300, n_in=100, init=GlorotUniform())

Initializers

Zero : Initialize all weights to zero.
One : Initialize all weights to one.
Uniform([scale]) : Sample initial weights from the uniform distribution.
Normal([std, mean]) : Sample initial weights from the Gaussian distribution.
Orthogonal([gain]) : Initialize weights as an orthogonal matrix.

Detailed description

class npdl.initialization.Initializer[source]

Base class for parameter tensor initializers.

The Initializer class represents a weight initializer used to initialize weight parameters in a neural network layer. It should be subclassed when implementing new types of weight initializers.

call(size)[source]

Sample initial weight values. Implementations should return a numpy.ndarray of the requested shape.

Parameters:

size : tuple or int

Integer or tuple specifying the shape of the returned array.

Returns : numpy.ndarray

Array of the requested shape containing the initial weight values.
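
A new initializer can be written by subclassing Initializer and overriding call. The following is a minimal sketch assuming the call(size) contract described above; the ScaledConstant class is a hypothetical example, not part of npdl.

>>> import numpy as np
>>> from npdl.initialization import Initializer
>>> class ScaledConstant(Initializer):
...     """Hypothetical initializer that fills weights with a constant value."""
...     def __init__(self, value=0.1):
...         self.value = value
...     def call(self, size):
...         # Return a NumPy array of the requested shape, as required by the
...         # base-class contract documented above.
...         return np.full(size, self.value, dtype='float32')
>>> l2 = Dense(n_out=300, n_in=100, init=ScaledConstant(0.1))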

class npdl.initialization.Zero[source]

Initialize all weights to zero.

class npdl.initialization.One[source]

Initialize all weights to one.
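
As a short usage sketch (assuming the call(size) method documented above returns a NumPy array):

>>> import numpy as np
>>> from npdl.initialization import Zero, One
>>> zeros = Zero().call((2, 3))   # 2x3 array filled with 0.0
>>> ones = One().call((2, 3))     # 2x3 array filled with 1.0
>>> bool(np.all(zeros == 0.0) and np.all(ones == 1.0))
True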

class npdl.initialization.Normal(std=0.01, mean=0.0)[source]

Sample initial weights from the Gaussian distribution.

Initial weight parameters are drawn from a Gaussian distribution with the given mean and standard deviation std.

Parameters:

std : float

Std of initial parameters.

mean : float

Mean of initial parameters.
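
For illustration, a sketch of drawing weights with this initializer (assuming the call(size) method documented above; the exact sample statistics vary from run to run):

>>> from npdl.initialization import Normal
>>> w = Normal(std=0.01, mean=0.0).call((100, 200))
>>> w.shape
(100, 200)
>>> # The sample mean and standard deviation should be close to 0.0 and 0.01.
>>> round(float(w.mean()), 2), round(float(w.std()), 2)
(0.0, 0.01)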

class npdl.initialization.Uniform(scale=0.05)[source]

Sample initial weights from the uniform distribution.

Parameters are sampled from U(a, b).

Parameters:

scale : float or tuple

If scale is a float, the weights are sampled from U(-scale, scale). If scale is a tuple, the weights are sampled from U(scale[0], scale[1]).
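
A short sketch of both forms of scale, assuming the call(size) method documented above:

>>> from npdl.initialization import Uniform
>>> w1 = Uniform(scale=0.05).call((100, 200))         # drawn from U(-0.05, 0.05)
>>> w2 = Uniform(scale=(-0.1, 0.1)).call((100, 200))  # drawn from U(-0.1, 0.1)
>>> bool(w1.min() >= -0.05 and w1.max() <= 0.05)
True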

class npdl.initialization.Orthogonal(gain=1.0)[source]

Initialize weights as an orthogonal matrix.

Orthogonal matrix initialization [R2]. For n-dimensional shapes where n > 2, the n-1 trailing axes are flattened. For convolutional layers, this corresponds to the fan-in, so this makes the initialization usable for both dense and convolutional layers.

Parameters:

gain : float or ‘relu’

Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units, and to sqrt(2/(1+alpha**2)) for leaky rectified linear units with leakiness alpha. Other transfer functions may need different factors.
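
As a sketch of what this produces: with gain=1.0 the returned matrix should have approximately orthonormal columns, so W.T.dot(W) is close to the identity (assuming an implementation along the lines of [R2] and the call(size) method documented above):

>>> import numpy as np
>>> from npdl.initialization import Orthogonal
>>> w = Orthogonal(gain=1.0).call((200, 100))
>>> bool(np.allclose(w.T.dot(w), np.eye(100), atol=1e-4))
True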

References

[R2] Saxe, Andrew M., James L. McClelland, and Surya Ganguli. “Exact solutions to the nonlinear dynamics of learning in deep linear neural networks.” arXiv preprint arXiv:1312.6120 (2013).