Adam
class Adam(learningRate: Float, beta1: Float, beta2: Float, epsilon: Float, useNesterov: Boolean, clipGradient: ClipGradientAction) : Optimizer
Adam optimizer.
Updates each variable according to the following formulas:
lr_t := learning_rate * sqrt(1 - beta_2^t) / (1 - beta_1^t)
m_t := beta_1 * m_{t-1} + (1 - beta_1) * g
v_t := beta_2 * v_{t-1} + (1 - beta_2) * g * g
variable := variable - lr_t * m_t / (sqrt(v_t) + epsilon)
It is recommended to leave the parameters of this optimizer at their default values.
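As a plain-Kotlin illustration of the update rule above (not part of this class's API), a single scalar step might look like the sketch below; adamStep is a hypothetical helper, and the default hyperparameter values shown are typical choices, not necessarily the library defaults.

import kotlin.math.pow
import kotlin.math.sqrt

// One Adam update step for a single scalar parameter, written out to mirror the formulas above.
// Returns the updated variable together with the new first and second moment estimates.
fun adamStep(
    variable: Float, gradient: Float,
    m: Float, v: Float, t: Int,
    learningRate: Float = 0.001f,
    beta1: Float = 0.9f,
    beta2: Float = 0.999f,
    epsilon: Float = 1e-7f
): Triple<Float, Float, Float> {
    // lr_t := learning_rate * sqrt(1 - beta_2^t) / (1 - beta_1^t)
    val lrT = learningRate * sqrt(1f - beta2.pow(t)) / (1f - beta1.pow(t))
    // m_t := beta_1 * m_{t-1} + (1 - beta_1) * g
    val mT = beta1 * m + (1f - beta1) * gradient
    // v_t := beta_2 * v_{t-1} + (1 - beta_2) * g * g
    val vT = beta2 * v + (1f - beta2) * gradient * gradient
    // variable := variable - lr_t * m_t / (sqrt(v_t) + epsilon)
    val updated = variable - lrT * mT / (sqrt(vT) + epsilon)
    return Triple(updated, mT, vT)
}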
Constructors
Properties
clipGradient
Strategy for gradient clipping, given as a subclass of ClipGradientAction (see the sketch after this property list).
optimizerName
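A minimal sketch of passing a clipping strategy to the constructor, assuming a value-clipping subclass named ClipGradientByValue is available in the same optimizer package; verify the class name and package against your library version.

import org.jetbrains.kotlinx.dl.api.core.optimizer.Adam
import org.jetbrains.kotlinx.dl.api.core.optimizer.ClipGradientByValue

// Adam with default hyperparameters, clipping each gradient element to [-1, 1]
// before the moment estimates and the variable update are computed.
val optimizer = Adam(clipGradient = ClipGradientByValue(1.0f))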