Adam
class Adam(learningRate: Float, beta1: Float, beta2: Float, epsilon: Float, useNesterov: Boolean, clipGradient: ClipGradientAction) : Optimizer
Adam optimizer.
Updates the variable according to the following formulas:
lr_t := learning_rate * sqrt(1 - beta_2^t) / (1 - beta_1^t)
m_t := beta_1 * m_{t-1} + (1 - beta_1) * g
v_t := beta_2 * v_{t-1} + (1 - beta_2) * g * g
variable := variable - lr_t * m_t / (sqrt(v_t) + epsilon)
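As an illustration of the update rule above, here is a minimal standalone Kotlin sketch that applies one Adam step to a single scalar variable. It is not the library's implementation; the class name ScalarAdam and the hyperparameter values used as defaults are made up for the example.

import kotlin.math.pow
import kotlin.math.sqrt

// Illustrative scalar Adam step following the formulas above.
// Not the library's implementation; names and defaults are for the sketch only.
class ScalarAdam(
    private val learningRate: Float = 0.001f,
    private val beta1: Float = 0.9f,
    private val beta2: Float = 0.999f,
    private val epsilon: Float = 1e-7f
) {
    private var m = 0.0f   // first-moment estimate m_t
    private var v = 0.0f   // second-moment estimate v_t
    private var t = 0      // step counter

    /** Applies one Adam update to [variable] given gradient [g] and returns the new value. */
    fun step(variable: Float, g: Float): Float {
        t += 1
        val lrT = learningRate * sqrt(1 - beta2.pow(t)) / (1 - beta1.pow(t))
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        return variable - lrT * m / (sqrt(v) + epsilon)
    }
}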
It is recommended to leave the parameters of this optimizer at their default values.
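For reference, a fully explicit construction mirroring the signature above might look as follows. The hyperparameter values shown are the commonly cited Adam defaults rather than values taken from this page, and the import paths and the NoClipGradient class are assumptions about the surrounding optimizer package; in practice, prefer the parameterless Adam() as recommended.

import org.jetbrains.kotlinx.dl.api.core.optimizer.Adam
import org.jetbrains.kotlinx.dl.api.core.optimizer.NoClipGradient

// Explicit construction mirroring the constructor signature above.
// Values are the commonly used Adam hyperparameters, not necessarily the
// library's defaults; NoClipGradient is assumed to be the no-op strategy.
val adam = Adam(
    learningRate = 0.001f,
    beta1 = 0.9f,
    beta2 = 0.999f,
    epsilon = 1e-7f,
    useNesterov = false,
    clipGradient = NoClipGradient()
)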
Constructors
Properties
clipGradient
Strategy of gradient clipping, as a subclass of ClipGradientAction (see the sketch after the property list).
optimizerName
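The clipGradient parameter accepts any ClipGradientAction strategy. As a hedged example, assuming the package provides a value-clipping subclass named ClipGradientByValue and that the remaining constructor parameters have defaults (as the recommendation above implies), clipping could be enabled like this:

import org.jetbrains.kotlinx.dl.api.core.optimizer.Adam
import org.jetbrains.kotlinx.dl.api.core.optimizer.ClipGradientByValue

// Assumed example: ClipGradientByValue is one possible ClipGradientAction
// subclass (name assumed); check the optimizer package for the strategies
// actually available. Other parameters are left at their defaults.
val clippedAdam = Adam(clipGradient = ClipGradientByValue(0.5f))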