Adam (adaptive moment estimation)

class pints.Adam(x0, sigma0=0.1, boundaries=None)

Adam optimiser (adaptive moment estimation), as described in [1] (see Algorithm 1).

This method is a variation on gradient descent that maintains two “moments”, allowing it to overshoot and go against the gradient for a short time. This property can make it more robust against noisy gradients.

Pseudo-code is given below. Here the value of the j-th parameter at iteration i is given as p_j[i] and the corresponding derivative is denoted g_j[i]:

m_j[i] = beta1 * m_j[i - 1] + (1 - beta1) * g_j[i]
v_j[i] = beta2 * v_j[i - 1] + (1 - beta2) * g_j[i]**2

m_j' = m_j[i] / (1 - beta1**(1 + i))
v_j' = v_j[i] / (1 - beta2**(1 + i))

p_j[i] = p_j[i - 1] - alpha * m_j' / (sqrt(v_j') + eps)

The moments are initialised at zero, m_j[0] = v_j[0] = 0, and are then updated with decay rates beta1 and beta2. In this implementation, beta1 = 0.9 and beta2 = 0.999.

The terms m_j' and v_j' are “initialisation bias corrected” versions of m_j and v_j (see section 3 of the paper).

The parameter alpha is a step size, set to min(sigma0) in this implementation.

Finally, eps is a small constant used to avoid division by zero, set to eps = 1e-8 in this implementation.
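For illustration, the pseudo-code above can be written out directly in NumPy. The following is a standalone sketch of the update rule, not the PINTS implementation; the quadratic test function and the iteration count are arbitrary choices:

import numpy as np

def adam_minimise(grad, x0, alpha=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, n_iterations=1000):
    # Minimise a function using the Adam update rule sketched above
    x = np.array(x0, dtype=float)
    m = np.zeros_like(x)  # First moment: decaying average of gradients
    v = np.zeros_like(x)  # Second moment: decaying average of squared gradients
    for i in range(n_iterations):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        # Initialisation bias correction: at i = 0 this rescales m and v
        # exactly back to g and g**2, compensating for the zero start
        m_hat = m / (1 - beta1 ** (1 + i))
        v_hat = v / (1 - beta2 ** (1 + i))
        x -= alpha * m_hat / (np.sqrt(v_hat) + eps)
    return x

# Example: minimise f(x) = x[0]**2 + x[1]**2, with gradient 2 * x
print(adam_minimise(lambda x: 2 * x, [3.0, -2.0]))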

This is an unbounded method: any boundaries will be ignored.
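A minimal usage sketch is given below, assuming the standard PINTS pattern of passing this class to an OptimisationController. The QuadraticError measure is a hypothetical stand-in defined only for illustration; since Adam uses gradients, it implements evaluateS1():

import numpy as np
import pints

class QuadraticError(pints.ErrorMeasure):
    # Hypothetical error measure: f(x) = x[0]**2 + x[1]**2
    def n_parameters(self):
        return 2

    def __call__(self, x):
        return float(np.sum(np.asarray(x) ** 2))

    def evaluateS1(self, x):
        # Returns the error and its partial derivatives
        x = np.asarray(x, dtype=float)
        return float(np.sum(x ** 2)), 2 * x

opt = pints.OptimisationController(
    QuadraticError(), [3.0, -2.0], sigma0=0.1, method=pints.Adam)
x_found, f_found = opt.run()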

References

[1] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980. https://arxiv.org/abs/1412.6980

ask()

See Optimiser.ask().
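The optimiser can also be driven directly through the ask-and-tell interface, without a controller. A minimal sketch, reusing the hypothetical QuadraticError from the usage example above: because this method needs sensitivities, each reply passed to tell() (documented below) is a (value, gradient) pair.

error = QuadraticError()
opt = pints.Adam(np.array([3.0, -2.0]), sigma0=0.1)
for i in range(100):
    xs = opt.ask()                               # Positions to evaluate
    opt.tell([error.evaluateS1(x) for x in xs])  # (value, gradient) replies
print(opt.x_best(), opt.f_best())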

f_best()

See Optimiser.f_best().

f_guessed()

See Optimiser.f_guessed().

fbest()

Deprecated alias of f_best().

n_hyper_parameters()

See pints.TunableMethod.n_hyper_parameters().

name()

See Optimiser.name().

needs_sensitivities()

See Optimiser.needs_sensitivities().

running()

See Optimiser.running().

set_hyper_parameters(x)

Sets the hyper-parameters for the method with the given vector of values (see TunableMethod).

Parameters:

x – An array of length n_hyper_parameters used to set the hyper-parameters.

stop()

Checks if this method has run into trouble and should terminate. Returns False if everything’s fine, or a short message (e.g. “Ill-conditioned matrix.”) if the method should terminate.

tell(reply)

See Optimiser.tell().

x_best()

See Optimiser.x_best().

x_guessed()

See Optimiser.x_guessed().

xbest()

Deprecated alias of x_best().