AdaDelta

class AdaDelta(learningRate: Float, rho: Float, epsilon: Float, clipGradient: ClipGradientAction) : Optimizer

Adadelta optimizer.

Updates each variable according to the following formula:

accum = rho * accum + (1 - rho) * grad.square();
update = (update_accum + epsilon).sqrt() * (accum + epsilon).rsqrt() * grad;
update_accum = rho * update_accum + (1 - rho) * update.square();
var -= update;
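
The same update can be sketched as a scalar Kotlin function; variable names mirror the formula above, and the sketch is purely illustrative rather than the actual graph-based implementation.

import kotlin.math.sqrt

// Scalar sketch of one Adadelta step. state[0] holds accum, state[1] holds update_accum.
fun adaDeltaStep(
    variable: Float,
    grad: Float,
    state: FloatArray,
    rho: Float = 0.95f,
    epsilon: Float = 1e-8f
): Float {
    state[0] = rho * state[0] + (1 - rho) * grad * grad
    val update = sqrt(state[1] + epsilon) / sqrt(state[0] + epsilon) * grad
    state[1] = rho * state[1] + (1 - rho) * update * update
    return variable - update
}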

Adadelta is a more robust extension of Adagrad that adapts learning rates based on a moving window of gradient updates instead of accumulating all past gradients, so it continues learning even after many updates have been made. In the original version of Adadelta there is no initial learning rate to set; in this implementation, as in most other Keras optimizers, an initial learning rate and decay factor can be set.

It is recommended to leave the parameters of this optimizer at their default values.

Constructors

AdaDelta
fun AdaDelta(learningRate: Float = 0.1f, rho: Float = 0.95f, epsilon: Float = 1e-8f, clipGradient: ClipGradientAction = NoClipGradient())
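
A minimal usage sketch is shown below. The import paths, the Losses and Metrics constants, and the previously built model are assumptions based on typical KotlinDL code, not part of this page.

import org.jetbrains.kotlinx.dl.api.core.loss.Losses
import org.jetbrains.kotlinx.dl.api.core.metric.Metrics
import org.jetbrains.kotlinx.dl.api.core.optimizer.AdaDelta

// model is assumed to be a Sequential model defined elsewhere.
model.compile(
    optimizer = AdaDelta(learningRate = 0.1f, rho = 0.95f, epsilon = 1e-8f),
    loss = Losses.SOFT_MAX_CROSS_ENTROPY_WITH_LOGITS,
    metric = Metrics.ACCURACY
)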

Properties

clipGradient
val clipGradient: ClipGradientAction

Gradient clipping strategy, supplied as a subclass of ClipGradientAction.
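
As an example, a clipping strategy can be supplied through the constructor. ClipGradientByValue is assumed here to be one of the available ClipGradientAction subclasses; substitute whichever implementation your version of the library provides.

// Hypothetical: ClipGradientByValue(0.1f) clips each gradient component to [-0.1, 0.1].
val clippedAdaDelta = AdaDelta(
    learningRate = 0.1f,
    clipGradient = ClipGradientByValue(0.1f)
)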

optimizerName
open override val optimizerName: String

Returns the optimizer name.